Tuesday, November 25, 2008

Algorithms & Data Structures - Hashing

Objective

At the end of the class learner will be able to answer the following:

  • Design and implement hash-list searches
  • Discuss the merits of different collision resolution algorithms


Hashed List Searches
  • The goal of a hashed search is to find the target data in only one test. After discussing some basic hashed list concepts, we discuss eight hashing methods.


Definition:
Hashing is a key-to-address mapping process.


Terms must be familiarized.
Synonyms: The set of keys that hash to the same location
Collision: A collision occurs when a hashing algorithm produces an address for an insertion key and that address is already occupied.
Home address: The address produced by the hashing algorithm is known as the home address.
Prime area: The memory that contains all of the home addresses is known as the prime area.
Probe: Each calculation of an address and test for success is known as a probe.


Hash Concept



Figure 13-7 shows collision Resolution concept



Hashing Methods
There are eight hashing methods they are:
  1. Direct method
  2. Substraction method
  3. Modulo-division
  4. Midsquare
  5. Digit extraction
  6. Rotation
  7. Folding
  8. Pseudorandom generation



Figure 13-8 shows Basic Hashing Techniques


Direct Method
In direct hashing the key is the address without any algorithmic manipulation.
Direct hashing is limited, but it can be very powerful because it guarantees that there are no synonyms and therefore no collision.

Example:



Modulo-division Method
  • This is also known as division remainder method.
  • This algorithm works with any list size, but a list size that is a prime number produces fewer collisions than other list sizes.
  • The formula to calculate the address is:
    Address = key MODULO listsize + 1
    Where listsize is the number of elements in the arrary.


Example:
Given data : 
Keys are : 137456 214562 140145
137456 % 19 +1 = 11
214562 % 19 + 1 = 15
140145 % 19 + 1 = 2



Digit-extraction Method
  • Using digit extraction selected digits are extracted from the key and used as the address.
Example:
  • Using six-digit employee number to hash to a three digit address (000-999), we could select the first, third, and fourth digits( from the left) and use them as the address.


The keys are:
379452 -> 394
121267 -> 112
378845 -> 388


Folding Method
Two folding methods are used they are:
  1. Fold shift
  2. Fold boundary



Fold Shift
In fold shift the key value is divided into parts whose size matches the size of the required address. Then the left and right parts are shifted and added with the middle part.


Fold boundary
In fold boundary the left and right numbers are folded on a fixed boundary between them and the center number. The two outside values are thus reversed.

Example:



Midsquare Method
  • In midsquare hashing the key is squared and the address is selected from the middle of the square number.
  • Limitation is the size of the key.

Example:
94522 = 89340304: address is 3403


Rotation Method
  • Rotation method is generally not used by itself but rather is incorporated in combination with other hashing methods.
  • It is most useful when keys are assigned serially.


Example:


Pseudorandom Hasing
A common random-number generator is shown below.
y= ax + c
To use the pseudorandom-number generator as a hashing method, we set x to the key, multiply it by the coefficient a, and then add the constant c. The result is then divided by the list size, with the remainder being the hashed address.

Example:
Y= ((17 * 121267) + 7) modulo 307
Y= (2061539 + 7) modulo 307
Y= 2061546
Y=41


Hashing Algorithm


Collision Resolution
There are several methods for handling collisions, each of them independent of the hashing algorithm.

Collision resolution methods

In this section I will discuss the Linear probe and Quadratic probe


Linear Probe
In a linear probe when data cannot be stored in the home address we resolve the collision by adding 1 to the current address.


Example for Linear Probe collision Resolution



Quadratic Collision Resolution
In the quadratic probe, the increment is the collision probe number squared. Thus for the first probe we add 12, for the second collision probe we add 22 etc.

12 comments:

Unknown said...

Nice one..
i was trying to revise the hashing technique..
this was one among the good ones...

rajib Cazz said...

hashing techniques are nicely explained here......!!!!!!!!

Addy said...

this is well explained

Unknown said...

simple and super

Syed Husnain Bukhari said...

here very nicely explain hashing algorithm. Thanks..

shivani srivastava said...

thanks u make it really easy to understand i was totally fedup with messy contents of the books

Afreen said...

useful content.........

Afreen said...

useful content....

Afreen said...

useful content....

Unknown said...

Hi, nice description about Algorithms & Data Structures in Hashing.Thanks for your help..

-Aparna
Theosoft

Unknown said...

Amazing. All the hashing techniques are explained very well. You have provided a brief introduction with proper examples to show how each of them executes.
e signatures

Unknown said...

Useful