course-web-page-Fall-2021

Repository for the Fall 2021 course web page

View the Project on GitHub IUDataStructuresCourse/course-web-page-Fall-2021

Wrapping up AVL Trees

Sorting with AVL Trees:

Abstract data types that can be implemented with AVL Trees:

Lecture: Hash Tables

Java’s HashMap and HashSet classes are implemented with hash tables

Most modern languages have hash tables built-in or in the standard library.

The Map Abstract Data Type (aka. “dictionary”)

interface Map<K,V> {
   V get(K key);
   V put(K key, V value);
   V remove(K key);
   boolean containsKey(K key);
}
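As a rough illustration of these four operations in use, here is a small sketch built on Java's own java.util.Map and java.util.HashMap, which offer the same operations plus many more. The class name MapDemo and the method countWords are made up for this example.

```java
import java.util.HashMap;
import java.util.Map;

class MapDemo {
    // Counts how many times each word occurs, using only get, put, and containsKey.
    static Map<String, Integer> countWords(String[] words) {
        Map<String, Integer> counts = new HashMap<>();
        for (String w : words) {
            if (counts.containsKey(w)) {
                counts.put(w, counts.get(w) + 1);
            } else {
                counts.put(w, 1);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        Map<String, Integer> counts = countWords(new String[] {"a", "b", "a"});
        System.out.println(counts.get("a"));          // 2
        System.out.println(counts.remove("b"));       // 1 (remove returns the old value)
        System.out.println(counts.containsKey("b"));  // false
    }
}
```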

Motivation: maps are everywhere!

We could implement the Map ADT with AVL Trees; then get() is O(log(n)).

A simple Map implementation

If keys are integers, we can use an array.

Store items in an array indexed by key (draw picture); use null to indicate the absence of a key.

What’s good?

get() is O(1)

What’s bad?

  1. keys may not be natural numbers
  2. memory hog if the set of possible keys is huge, i.e., much larger than the number of keys stored in the dictionary.
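The array-based map can be sketched as follows. This is only an illustration: the class name ArrayMap and the bound maxKey are assumptions, keys must lie in 0..maxKey-1, and a null value cannot be told apart from a missing key.

```java
class ArrayMap<V> {
    private final Object[] slots;  // slots[k] holds the value for key k, or null if absent

    // maxKey is a hypothetical bound: valid keys are 0..maxKey-1.
    ArrayMap(int maxKey) {
        slots = new Object[maxKey];
    }

    @SuppressWarnings("unchecked")
    V get(int key) {
        return (V) slots[key];     // one array access: O(1)
    }

    V put(int key, V value) {
        V old = get(key);
        slots[key] = value;
        return old;
    }

    V remove(int key) {
        V old = get(key);
        slots[key] = null;
        return old;
    }

    boolean containsKey(int key) {
        return slots[key] != null;
    }
}
```

Note that the array has maxKey slots no matter how few keys are stored, which is exactly problem 2.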

Prehashing

Prehashing fixes problem 1 by mapping everything to integers. (Textbook calls this the creation of a hash code.)

In Java, o.hashCode() computes the prehash of object o.

Ideally: x.hashCode() == y.hashCode() iff x and y are the same object (but sometimes different objects have the same hash code)
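A small demonstration that different objects can share a hash code: with Java's built-in String.hashCode(), the strings "Aa" and "BB" both hash to 2112. The class name CollisionDemo and the helper sameHashCode are made up for this example.

```java
class CollisionDemo {
    // True when the two objects have the same prehash.
    static boolean sameHashCode(Object x, Object y) {
        return x.hashCode() == y.hashCode();
    }

    public static void main(String[] args) {
        System.out.println("Aa".equals("BB"));         // false
        System.out.println(sameHashCode("Aa", "BB"));  // true: both hash to 2112
    }
}
```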

User-definable: a class can override the hashCode method, and should do so whenever it overrides the equals method, so that equal objects have equal hash codes.
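A minimal sketch of overriding both methods together. The class GridPoint is a made-up example; Objects.hash from java.util is one convenient way to combine fields, not the only one.

```java
import java.util.Objects;

final class GridPoint {
    final int x;
    final int y;

    GridPoint(int x, int y) {
        this.x = x;
        this.y = y;
    }

    // Two points are equal when their coordinates match.
    @Override
    public boolean equals(Object other) {
        if (this == other) return true;
        if (!(other instanceof GridPoint)) return false;
        GridPoint p = (GridPoint) other;
        return x == p.x && y == p.y;
    }

    // Built from the same fields as equals, so equal points get equal hash codes.
    @Override
    public int hashCode() {
        return Objects.hash(x, y);
    }
}
```

Without the hashCode override, two equal points would usually land in different slots of a HashMap, and a lookup with one would not find the other.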

Algorithm for prehashing a string (aka. polynomial hash code): map each character to one digit in a number. But there are 256 different characters, not 10, so we use base 256 instead of base 10.

prehash_string('ab') == 97 * 256 + 98
prehash_string('abc') == 97 * (256^2) + 98 * (256^1) + 99
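In Java, the same computation might look like the sketch below. It assumes every character code is below 256 and uses a long accumulator, so long strings silently wrap around; the names Prehash and prehashString are made up here. (Java's own String.hashCode() uses the same polynomial idea with base 31.)

```java
class Prehash {
    // Treats each character as one base-256 digit, most significant digit first.
    static long prehashString(String s) {
        long h = 0;
        for (int i = 0; i < s.length(); i++) {
            h = h * 256 + s.charAt(i);  // Horner's rule: shift left one digit, add the next
        }
        return h;
    }

    public static void main(String[] args) {
        System.out.println(prehashString("ab"));   // 24930   = 97 * 256 + 98
        System.out.println(prehashString("abc"));  // 6382179 = 97 * 256^2 + 98 * 256 + 99
    }
}
```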

Hashing

Hashing fixes problem 2 (reduce memory consumption).

The word “hash” is from cooking: “a finely chopped mixture”.

Chaining fixes collisions.

Towards proving that the average case time is O(1).

Takeaway: need to grow the table size m as the number of stored keys n increases so that the load factor alpha = n/m stays small.
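A rough sketch of a hash table with chaining that grows to keep alpha small. Several choices here are assumptions made for illustration, not taken from the lecture: the class name ChainedHashMap, doubling the table when alpha exceeds 1.0, and inserting new entries at the front of the chain.

```java
import java.util.ArrayList;
import java.util.LinkedList;
import java.util.List;

class ChainedHashMap<K, V> {
    private static class Entry<K, V> {
        final K key;
        V value;
        Entry(K key, V value) { this.key = key; this.value = value; }
    }

    private List<LinkedList<Entry<K, V>>> table;  // one chain per slot
    private int n = 0;                            // number of stored keys

    ChainedHashMap(int m) {
        table = newTable(m);
    }

    private static <K, V> List<LinkedList<Entry<K, V>>> newTable(int m) {
        List<LinkedList<Entry<K, V>>> t = new ArrayList<>(m);
        for (int i = 0; i < m; i++) t.add(new LinkedList<>());
        return t;
    }

    // Division method applied to the prehash; floorMod keeps the index non-negative.
    private LinkedList<Entry<K, V>> chainFor(K key) {
        return table.get(Math.floorMod(key.hashCode(), table.size()));
    }

    V get(K key) {
        for (Entry<K, V> e : chainFor(key)) {
            if (e.key.equals(key)) return e.value;
        }
        return null;
    }

    V put(K key, V value) {
        for (Entry<K, V> e : chainFor(key)) {
            if (e.key.equals(key)) {       // key already present: replace its value
                V old = e.value;
                e.value = value;
                return old;
            }
        }
        chainFor(key).addFirst(new Entry<>(key, value));
        n++;
        if (loadFactor() > 1.0) grow();    // threshold 1.0 is an illustrative choice
        return null;
    }

    double loadFactor() {
        return (double) n / table.size();  // alpha = n / m
    }

    int tableSize() {
        return table.size();
    }

    // Doubles m and re-inserts every entry, since slots depend on m.
    private void grow() {
        List<LinkedList<Entry<K, V>>> old = table;
        table = newTable(2 * old.size());
        for (LinkedList<Entry<K, V>> chain : old) {
            for (Entry<K, V> e : chain) chainFor(e.key).addFirst(e);
        }
    }
}
```

Each get or put walks one chain, so its cost is proportional to the chain length, which is about alpha on average when keys spread evenly.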

Hash functions

division method: h(k) = k mod m

need to be careful about choice of table size m

if not, may not use all of the table

table size 4 (slots 0..3)
suppose the keys are all even: 0, 2, 4, ...

		0 -> 0           (0 mod 4 = 0)
		2 -> 2           (2 mod 4 = 2)
		4 -> 0           (4 mod 4 = 0)
		6 -> 2           (6 mod 4 = 2)
		8 -> 0           (8 mod 4 = 0)
		...

Slots 1 and 3 are never used.

Good to choose a prime number for m, not close to a power of 2 or 10.
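To see the effect of the table size, the sketch below counts how many distinct slots the even keys reach under the division method. The names DivisionMethod and slotsUsedByEvenKeys are made up, and m = 53 is just one example of a prime that sits between two powers of 2.

```java
import java.util.HashSet;
import java.util.Set;

class DivisionMethod {
    // h(k) = k mod m
    static int h(int k, int m) {
        return Math.floorMod(k, m);
    }

    // Number of distinct slots hit by the even keys 0, 2, ..., 2 * (count - 1).
    static int slotsUsedByEvenKeys(int count, int m) {
        Set<Integer> used = new HashSet<>();
        for (int i = 0; i < count; i++) {
            used.add(h(2 * i, m));
        }
        return used.size();
    }

    public static void main(String[] args) {
        System.out.println(slotsUsedByEvenKeys(100, 4));   // 2  (only slots 0 and 2)
        System.out.println(slotsUsedByEvenKeys(100, 53));  // 53 (every slot)
    }
}
```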

Multiply-Add-and-Divide (MAD) method

h(k) = ((a * k + b) mod p) mod m

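where p is a prime number larger than m, and a and b are integers chosen at random from [0, p-1], with a > 0.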
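A minimal sketch of the MAD method, assuming the usual textbook conditions: p is a prime larger than m, and a and b are integers in [0, p-1] with a > 0. The class name MadHash and the constants in main (a = 3, b = 1, p = 101, m = 4) are picked by hand for illustration; in practice a and b would be chosen at random.

```java
class MadHash {
    final long a;
    final long b;
    final long p;
    final int m;

    MadHash(long a, long b, long p, int m) {
        this.a = a;
        this.b = b;
        this.p = p;
        this.m = m;
    }

    // h(k) = ((a * k + b) mod p) mod m, computed in long arithmetic.
    int h(int k) {
        return (int) (Math.floorMod(a * k + b, p) % m);
    }

    public static void main(String[] args) {
        MadHash mad = new MadHash(3, 1, 101, 4);
        // Even keys, which reach only slots 0 and 2 under k mod 4, now reach all four slots.
        System.out.println(mad.h(0));   // 1
        System.out.println(mad.h(2));   // 3
        System.out.println(mad.h(34));  // 2
        System.out.println(mad.h(36));  // 0
    }
}
```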

Student exercise

Using the division method and chaining, insert the keys 4, 1, 3, 2, 0 into a hash table with table size 3 (m=3).

Solution:

0 -> [0,3]
1 -> [1,4]
2 -> [2]
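The solution can be checked with a short sketch. It assumes each new key goes at the front of its chain, which is the order the solution shows; the names ChainingExercise and insertAll are made up.

```java
import java.util.ArrayList;
import java.util.LinkedList;
import java.util.List;

class ChainingExercise {
    // Inserts each key at the front of chain (key mod m) and returns the table.
    static List<LinkedList<Integer>> insertAll(int[] keys, int m) {
        List<LinkedList<Integer>> table = new ArrayList<>();
        for (int i = 0; i < m; i++) table.add(new LinkedList<>());
        for (int k : keys) {
            table.get(Math.floorMod(k, m)).addFirst(k);
        }
        return table;
    }

    public static void main(String[] args) {
        List<LinkedList<Integer>> table = insertAll(new int[] {4, 1, 3, 2, 0}, 3);
        for (int i = 0; i < table.size(); i++) {
            System.out.println(i + " -> " + table.get(i));
        }
        // prints:
        // 0 -> [0, 3]
        // 1 -> [1, 4]
        // 2 -> [2]
    }
}
```

Appending to the end of each chain instead would give [3, 0] and [4, 1]; both are valid chaining tables.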