Page 1 of 1

Hash Table and Its Usage how to use a simple hash function and table Rate Topic: -----

#1 Anarion  Icon User is offline

  • The Persian Coder
  • member icon

Reputation: 282
  • View blog
  • Posts: 1,456
  • Joined: 16-May 09

Post icon  Posted 03 November 2009 - 01:58 AM

Hi! this is my first tutorial and I'm going to explain how hashing and hash table works and how to use it. OK, but what is Hash Table actually and how can it benefit us? well, a Hash Table is a type of table in which we can use each record's id to retrieve the it's data easily. So, each record must have a unique id, which is going to be an int higher than or equal to 1. And another thing to mention, we use a value of -1 for id for each empty record's id.


Hash Function
For this tutorial, I used the following simple function to hash a given id:
hash(id) = id%size;

size is the length of our table, which has to be a prime number. I used 51 in this tutorial. The reason why it needs to be prime is that, if we don't use a prime number, the resulting hashed value will be zero for each id divideable by size. But if we use a prime number for size, then we decrease the chance of id being divideable by size.


Collisions
When inserting items into a hash table, it sometimes happens that an item to be inserted in table has a hash value that points to an already existing location in table. The new item typically is inserted somewhere else, though we must be sure that there is still a way to look up the item.

One of the simplest ways to overcome collisions is to use a rehash function:
rehash(id) = (id+1)%size;

Note: When we hash an id and check and see that it is already taken, we give the hashed id to rehash function.

OK, now here's an example program I wrote to implement a hash table and function. It's quite straight forward but I commented it too:

#include <iostream>
#include <string>
using namespace std;

//data structure to hold id and data, our data-structure we want to use
struct data {
public:
		int id; //to hold a unique id for each element
		int data; //the data for each element, I used a simple int
};

class hasher {
		data dt[51]; //the table to hold hashed data structs
		int numel; //number of elements in table, to check if it's full
public:
		hasher();
		int hash(int &id);
		int rehash(int &id);
		int add(data &d);
		int remove(data &d);
		void output();
};

/*this is the function to give hashed id, it's a simple one...
It's better that we use a prime number for our length, thats why I used 11
and the bigger possible, it's better because we reduce our collisions*/
int hasher::hash(int &id) {
		return (id%51);
}
/*in case of ay collision(a hashed value which is already occupied before)
we use rehash function instead of hash*/
int hasher::rehash(int &id) {
		return ((id+1)%51);
}

hasher::hasher() {
		//create an array of data structure
		int i;
		for(i=0;i<=50;i++) {
				dt[i].id = -1; //set all ids to -1 to show they're empty
				dt[i].data = 0; //set all data values to 0, which is default
		}
		numel = 0;
}

int hasher::add(data &d) {
		if(numel < 51) {
				//table has empty places...
				int hashed = hash(d.id);
				if(hashed >= 0 && hashed <= 50 && dt[hashed].id == -1) {
						//slot is empty, assign new data
						dt[hashed].id = d.id;
						dt[hashed].data = d.data;
						return 0;
				} else {
						//we need to rehash the id
			int i=0;
			//try every place in table to find an empty place
			while(i<=50) {
							hashed = rehash(hashed);
							if(dt[hashed].id == -1) {
									dt[hashed].id = d.id;
									dt[hashed].data = d.data;
									return 0;
							} else if(i==50) {
									//we couldn't find the empty place
									return -1; //terminate function with error
							}
				i++; //update value of i
			}
				}
		} else {
				//table is full
				return (-1);
		}
}

int hasher::remove(data &d) {
		int hashed = hash(d.id);
		if(dt[hashed].id == d.id) {
				//this is the right one to delete
				dt[hashed].id = -1;
				numel -= 1;
				return 0;
		} else {
				//we need a rehash to find the one
		int i=0;
		while(i<=50) {
					hashed = rehash(hashed);
					if(dt[hashed].id == d.id) {
							dt[hashed].id = -1; //set the id to -1 because it is going to be empty
							numel -= 1; //decrease 1 from number of elements
							return 0; //success
					} else if(i==50) {
							return -1; //terminate function
			}
			i++; //update the value of i
				}
		}
}

void hasher::output() {
		int i;
		for(i=0;i<51;i++) {
				cout<<i<<" ->  "<<dt[i].id<<"	 "<<dt[i].data<<endl;
		}
}

int main() {
		int id=45; //first id
		int ret;
		data d;
		d.data = 52005; //a value for all records, simply
		hasher h1;
		while(ret != -1) {
				d.id=id;
				ret = h1.add(d); //add this record to table
				id += (id/5); //update the id, I wanted to show how different ids are put in table, look at output
		}
		d.id = 271861; //set to one of the ids we added to table in above loop
		h1.remove(d); //remove that record from table
		h1.output(); //see the output, notice there is one empty record, in index number 33, that was the record with id of 271861
		return 0;
}


Final Thoughts
In this tutorial, I used a simple hash function which was good enough for working with integer values, but what if we want to hash strings ? Yup we need a more advanced function to do that. There are many ways to do that which is beyond this tutorial's goals. But at least you are now familiar with the way hashing works. Hope this tutorial helped you! :D


Is This A Good Question/Topic? 1
  • +

Replies To: Hash Table and Its Usage

#2 Guest_AlekseiB*


Reputation:

Posted 08 February 2010 - 01:44 PM

very simple and nice tutorial! 51=17*3, so it's not a prime number and numel is not incremented on adding data, but it is not important. thank you
Was This Post Helpful? 2

#3 Anarion  Icon User is offline

  • The Persian Coder
  • member icon

Reputation: 282
  • View blog
  • Posts: 1,456
  • Joined: 16-May 09

Posted 12 February 2010 - 04:20 AM

View PostAlekseiB, on 08 February 2010 - 11:14 PM, said:

very simple and nice tutorial! 51=17*3, so it's not a prime number and numel is not incremented on adding data, but it is not important. thank you

Oh! *facepalm* what a lame mistake :basecase:
Sorry, it was one of my first articles... :D Thanks for addressing it :^:

This post has been edited by Anarion: 12 February 2010 - 08:47 AM

Was This Post Helpful? 0
  • +
  • -

#4 Guest_Allen*


Reputation:

Posted 01 May 2010 - 02:28 PM

your code edited for it actually runs and doesn't give a run time error



#include <iostream>
#include <string>
using namespace std;

//data structure to hold id and data, our data-structure we want to use
struct datar {
public:
                int id; //to hold a unique id for each element
                int data; //the data for each element, I used a simple int
};

class hasher {
                datar dt[51]; //the table to hold hashed data structs
                int numel; //number of elements in table, to check if it's full
public:
                hasher();
                int hash(int &id);
                int rehash(int &id);
                int add(datar &d);
                int remove(datar &d);
                void output();
};

/*this is the function to give hashed id, it's a simple one...
It's better that we use a prime number for our length, thats why I used 11
and the bigger possible, it's better because we reduce our collisions*/
int hasher::hash(int &id) {
                return (id%51);
}
/*in case of ay collision(a hashed value which is already occupied before)
we use rehash function instead of hash*/
int hasher::rehash(int &id) {
                return ((id+1)%51);
}

hasher::hasher() {
                //create an array of data structure
                int i;
                for(i=0;i<=50;i++) {
					dt[i].id = -1; //set all ids to -1 to show they're empty
                                dt[i].data = 0; //set all data values to 0, which is default
                }
                numel = 0;
}

int hasher::add(datar &d) {
                if(numel < 51) {
                                //table has empty places...
                                int hashed = hash(d.id);
                                if(hashed >= 0 && hashed <= 50 && dt[hashed].id == -1) {
                                                //slot is empty, assign new data
                                                dt[hashed].id = d.id;
                                                dt[hashed].data = d.data;
                                                return 0;
                                } else {
                                                //we need to rehash the id
                        int i=0;
                        //try every place in table to find an empty place
                        while(i<=50) {
                                                        hashed = rehash(hashed);
                                                        if(dt[hashed].id == -1) {
                                                                        dt[hashed].id = d.id;
                                                                        dt[hashed].data = d.data;
                                                                        return 0;
                                                        } else if(i==50) {
                                                                        //we couldn't find the empty place
                                                                        return -1; //terminate function with error
                                                        }
                                i++; //update value of i
                        }
                                }
                } else {
                                //table is full
                                return (-1);
                }
}

int hasher::remove(datar &d) {
                int hashed = hash(d.id);
                if(dt[hashed].id == d.id) {
                                //this is the right one to delete
                                dt[hashed].id = -1;
                                numel -= 1;
                                return 0;
                } else {
                                //we need a rehash to find the one
                int i=0;
                while(i<=50) {
                                        hashed = rehash(hashed);
                                        if(dt[hashed].id == d.id) {
                                                        dt[hashed].id = -1; //set the id to -1 because it is going to be empty
                                                        numel -= 1; //decrease 1 from number of elements
                                                        return 0; //success
                                        } else if(i==50) {
                                                        return -1; //terminate function
                        }
                        i++; //update the value of i
                                }
                }
}

void hasher::output() {
                int i;
                for(i=0;i<51;i++) {
                                cout<<i<<" ->  "<<dt[i].id<<"    "<<dt[i].data<<endl;
                }
}

int main() {
                int id=45; //first id
                int ret;
				ret = 0;
                datar d;
                d.data = 52005; //a value for all records, simply
                hasher h1;
                while(ret != -1) {
                                d.id=id;
                                ret = h1.add(d); //add this record to table
                                //id += (id/5); //update the id, I wanted to show how different ids are put in table, look at output
                }
                d.id = 271861; //set to one of the ids we added to table in above loop
                h1.remove(d); //remove that record from table
                h1.output(); //see the output, notice there is one empty record, in index number 33, that was the record with id of 271861
                return 0;
}









Next time you submit a tutorial make sure it runs....
Was This Post Helpful? 0

#5 RuhOutlilehes  Icon User is offline

  • New D.I.C Head

Reputation: 0
  • View blog
  • Posts: 1
  • Joined: 05-February 11

Posted 05 February 2011 - 08:17 AM

то что я искал, спасибо
Was This Post Helpful? 0
  • +
  • -

#6 Dogstopper  Icon User is offline

  • The Ninjaducky
  • member icon



Reputation: 2857
  • View blog
  • Posts: 10,960
  • Joined: 15-July 08

Posted 05 February 2011 - 09:01 AM

Tranlation:
what I was looking for, thanks.

Please post in English because this is an English forum. Thanks.
Was This Post Helpful? 0
  • +
  • -

Page 1 of 1