A Programming Blog by a Full Sail Student February 8, 2010
  Home
  Blog Entries
  Browse Topics
  My Projects
  Tips
  About Me

  Latest Tweet:
  
@Tocom11 dude the roller coaster levels are so much fun. That series is definately top 10 for me

Follow me on Twitter!


Project: Boggle - Using Hash Tables
Posted: April 4, 2009, 11:51PM

One of my projects I had to do for my data structures class was a Boggle game in C++. It presented a very neat challenge of cataloging the English dictionary into the program. Throughout our data structures class, we found the best way to store the dictionary was through a hash table.

Hash tables are used to store very large amounts of information for fast searching, inserting, and removing. A hash table is compromised of an array of buckets. Each bucket is used to store a select set of information, usually within a singly linked list. You will want to write your hash table to work along with a singly linked list. In the case of the buckets being a singly linked list, you would use your array you made, and the [] operator, to access the appropriate singly linked list.

Check out how I constructed my Hash Table...

#include "SLList.h"     //I made my own Singly Linked List, check out my download to see more of this

template <typename Type = int>     //The class must be templated to create a proper SLL
class HTable
{
public:

HTable (unsigned int numOfBuckets, unsigned int (*hFunction) (const Type &v));
virtual ~HTable ();
void insert (const Type &v);
bool findAndRemove (const Type &v);
void clear ();
int find (const Type &v);
int numEmpty();     //I only made these next 4 functions to show stats on my Dictionary Stats page
unsigned int getNumBuckets();
unsigned int largestBucket();
unsigned int smallestBucket();

private:

SLList<Type>* arr;     //This is going to be dynamic memory so its going to be pointer
unsigned int (*hFunc) (const Type &v);     //This will store my hash function, I'll get into that soon
unsigned int numBuckets;

};

template <typename Type>
HTable<Type>::HTable(unsigned int numOfBuckets, unsigned int (*hFunction) (const Type &v))
{

arr = new SLList<Type>[numOfBuckets];     //Dynamic memory (our buckets), now we can use arr[ANUMBER] to access our SLL lists
hFunc = hFunction;
numBuckets = numOfBuckets;

}

After constructing a hash table I have all the buckets I need, and I can call my saved hash function to find out where any data sent into find, insert, and remove, would be.

Hash functions are written to find which bucket a specific item should be in. They need to be written in accordance to what type of information you will be storing. That is why you pass in the hash function that you will write alongside your program, rather than writing it within your Hash Table data structure.

Lets see how I need to write my hash function in this situation. I could have stored the entire dictionary within a 26 bucket hash table, with a bucket for each letter of the alphabet. Then if I needed to find if a word the user typed was real, I would grab the first letter of the input, head to the appropriate bucket, and see if it was there. Unfortunately a hash table like I just described would be very slow. There would be thousands of words in some buckets. It would take quite a bit of time to iterate through all of these words.

I needed to write something more efficient as one of my goals for the project. My hash function was completed after quite a bit of trial and error. Once I knew what had to be done it was all about finding which numbers would work out the best. In the end, this is what my hash function looked like...

unsigned int wordHash (const Type &word)
{
unsigned int bucket = 0;
short length = word.size();
for(int i = 0; i < length - 2; i++)
{
bucket += word[i] * word[i+2] + word[i+1];
}
return bucket % 16691;

}

I wanted to generate a large number based on nearly every character in the word in order to make it very unique. I did not want to spend too much time looping through the words so I decided to loop through the length minus 2. I made 16691 buckets to store this information. I chose this number for two reasons:

  • It is a prime number, meaning I can perform a modulus to it and get a number between 0 and 16690
  • It is a good number to evenly distribute 110307 words, leaving about 6 words per bucket on average
  • These are the results...

    I was amazed at how powerful that formula was! Out of 16691 buckets only 30 of them were empty and the largest bucket only held 18. Now my Boggle game can search the dictionary very quickly. It would only take a call to the hash function and iterating through a maximum of 18 words rather than searching through 110307 words.

    The rest of the game consisted of programming the grid and scoring. If you need the source or you need help with programming our own version of Boggle feel free to contact me at mikesprogramming@gmail.com

    Here is a link to the zipped up executable: Boggle.zip



    Delicious Bookmark this on Delicious  


    Last PostNext Post



    Related Links:


    Comments (most recent first):

    Post by: program_dude Posted on:2009-10-03 13:27:18
    where can i find the codes for this? this game got me really curious and i want to try it our for myself!haha.. where can i get the codes for this?


    Post by: C Posted on:2009-05-07 01:12:20
    Just finished creating my own Hash Table class over at lc. Used it to store a database of album names, artists, and catalog numbers for easy user searching. Glad to see that you're learning similar stuff.


    Post by: mikeisprettyfly Posted on:2009-04-07 12:52:04
    I didn't read about that. I'll have to check it out.


    Post by: Bruce Posted on:2009-04-07 11:40:07
    I was upset that we weren't allowed to use the formula given to us in that brown book. If you try that formula you'll have no empty buckets and no spikes, seriously.


    Post by: Wildex999 Posted on:2009-04-06 08:24:19
    Hey, this was quite an interesting read, always wondered how hash tables worked, but never realy got around to read up on them =P


    Post by: mikeisprettyfly Posted on:2009-04-05 16:31:59
    I removed the "color" code because that section of the code was given to me just to make these apps look better, thus I couldn't include it. Everything else is there for you to check out!


    Post by: mikeisprettyfly Posted on:2009-04-05 15:26:48
    I will be posting the actual source soon. I need to change some of the "color" code before I will be able to release it.




    Submit a comment about this entry:
    Name:
    Message:
    Contact Me | Privacy Policy