To hash means to grind up, and that’s essentially what hashing is all about. The
heart of a hashing algorithm is a hash function that takes your nice, neat data
and grinds it into some random-looking integer.
The idea behind hashing is that some data either has no inherent ordering (such
as images) or is expensive to compare (such as images). If the data has no inherent
ordering, you can’t perform comparison searches.
If the data is expensive to compare, the number of comparisons used even by a binary
search might be too many. So instead of looking at the data themselves, you’ll
condense (hash) the data to an integer (its hash value) and keep all the data with
the same hash value in the same place. This task is carried out by using the hash
value as an index into an array.
To search for an item, you simply hash it and look at all the data whose hash values
match that of the data you’re looking for. This technique greatly lessens the
number of items you have to look at. If the parameters are set up with care and
enough storage is available for the hash table, the number of comparisons needed
to find an item can be made arbitrarily close to one.
One aspect that affects the efficiency of a hashing implementation is the hash function
itself. It should ideally distribute data randomly throughout the entire hash table,
to reduce the likelihood of collisions. Collisions occur when two different keys
have the same hash value.