15.7. Analysis of hash tables¶
The table below shows the average case Big-O efficiency of some basic unordered_map operations. For each operation that has average case constant time efficiency, the worst case complexity is \(O(n)\).
Operation |
Big-O Efficiency |
---|---|
assignment = |
O(1) |
insert() |
O(1) |
find() |
O(1) |
contains() |
O(1) |
erase() |
O(1) |
clear() |
O(n) |
The reason these operations may have \(O(n)\) complexity is because the performance of the container is ultimately controlled by the quality of the hash function for the key type in the container. If the hash function performs poorly (many collisions), then the benefits of hash tables are lost and we decay into list performance. When the hash function quality is high, then the performance is good.
When we discussed the messy and neat closets in Tree ADT concepts, we mentioned the primary motivation for non-sequential containers was search. Unlike even a sorted vector or a tree, hash tables provide constant time access tot he correct bucket containing our data and linear search is required only when collisions exist.
The following code shows what happens when searching in an unordered map vs a vector.
Try This!
The online compiler is limited in both memory and time allowed.
Run this example on your own computer with the loop enabled and with larger values and compare.
The vector is linear in std::distance(begin, end)
and as expected,
the hash table is constant time.
Running the previous code should produce results similar to this:
So what about the tree ADT? The std::set is generally implemented as a tree. The C++ standard guarantees logarithmic complexity in the size of the container.
How does std::set find compare to std::unordered_map find?
Although the std::set find is logarithmic complexity, from a practical sense, it compares favorably with the hash table. The graph below shows example output for values up to 1,000,000.
Try This!
The online compiler is limited in both memory and time allowed.
Run this example on your own computer with larger values and compare.
More to Explore
TBD