1 Java Classes and Operations on Hashtables
2 Using Hash Maps
2.1 A Design Aside: how to design add Customer
2.2 Back to the Banking Service
3 Defining Equality on Keys
4 Summary

Hashtables and Java HashMaps

Kathi Fisler

Our bank from the last few lectures has grown to have hundreds of thousands of customers and accounts. The system has long since outgrown the LinkedList data structure for these data; even the AVL-tree based implementation has gotten too slow for looking up customers. Ideally, the bank needs data structures that provide even faster techniques for getting from account numbers to Account objects and from names to Customers.

This lecture is about hashtables, a data structure that can perform lookups in near-constant time. A hashtable maps keys to values without traversing a data structure over all the keys. The diagram at the top right of the Wikipedia entry on hashtables illustrates the concept nicely. Every hashtable has a fixed number of "buckets" to which it maps keys. If you have more buckets than actual keys, hashtables provide a perfect match between keys and data. If you have more keys than buckets, keys sometimes collide, costing the accuracy of the data retrieved from the table. We won’t talk about hashtables in depth here, as you’ll see them in detail in Algorithms.

We will study hashtables in the context of providing a better data structure for implementing the IAcctSet interface in our banking system.

1 Java Classes and Operations on Hashtables

Java provides two data structures for hashtables: one called Hashtable and one called HashMap. HashMap is generally preferred, unless you have to deal with threads and synchronization (not a topic for this course). Hashtable is a legacy data structure from earlier versions of Java. We will therefore use the HashMap classes in Java as our implementation of the general hashtable data structure.

Hashtables are built into Java. To use them, include the line

  import java.util.HashMap;

at the top of your file.

HashMap is a generic class, defined over type variables for the key (K) and value (V). The most useful HashMap<K,V> methods are:

For the full list of HashMap methods, see the Java HashMap documentation.

2 Using HashMaps

Here’s a simple AcctSetHM class that implements IAccountSet using a HashMap:

  class AcctSetHM implements IAccountSet {

    private HashMap<Integer,Account> accounts =

      new HashMap<Integer,Account>();

  

    public Account findByNum(int num) {

      return accounts.get(num);

    }

  }

While this has the right spirit, it fails to properly handle cases in which the given account number is not found. The above description of the get method indicated that it returns null if the requested key isn’t in the hashtable. A null value should not leak out of findByNum. Instead, findByNum should check for it, and throw an exception if necessary:

  class AcctSetHM implements IAccountSet {

    private HashMap<Integer,Account> accounts =

      new HashMap<Integer,Account>();

  

    public Account findByNum(int num) throws AcctNotFoundException {

      Account acct = accounts.get(num);

      if (null == acct)

        throw new AcctNotFoundException(num);

      return acct;

    }

  }

For an example of put, we need a method that adds new accounts or new customers, neither of which we have including in the banking system to date. Let’s create a method to add customers.

What will a new customer method need to do? It will need to get the name of the new customer, generate an initial/default password, create a new Customer object, and add that object to the set of customers in the overall banking system. Where should we put such a method?

In the spirit of encapsulation, let’s first look at putting the bulk of the code in the classes that implement ICustSet. For a hashmap-based version, this might look like:

  class CustSetHM implements ICustSet {

    private HashMap<String,Customer> customers = new HashMap<String,Customer>();

  

    public void addCustomer(String name) {

      Customer newCust = new Customer(name, 12345);

      this.customers.put(name, newCust);

    }

  }

This has the put call in the right place, but arguably the default password decision does NOT belong in the CustSetHM class. Remember that we also had CustSetList from when we first did this example. We’d end up copying the newCust line, with the decision on how to generate default passwords, into each implementation of ICustSet. But that decision (a) shouldn’t be written more than once, and (b) really should be up to the BankingService, not the data structure classes. So while the put statement is in the right place, we should change the addCustomer method to take either the entire Customer object or the name and password as inputs.

2.1 A Design Aside: how to design addCustomer

Of course, this raises another design decision as to the arguments to addCustomer. Which of the following three addCustomer methods do you prefer?

  public void addCustomer(Customer newCust) {

    this.customers.put(newCust.getName(), newCust);

  }

  

  public void addCustomer(String name, int initpwd) {

    this.customers.put(name, new Customer(name, initpwd));

  }

  

  public void addCustomer(String name, Customer newCust) {

    this.customers.put(name, newCust);

  }

}

There is no clear winner among these three choices; each has drawbacks as just described. The second option somehow feels worse than the other two, since Customer creation really does belong in the BankingService. The first accommodates more data structures, so even at the expense of the getter, it might be the best of these options. It is’t clear though. The rest of the notes will work with the first option.

2.2 Back to the BankingService

The method in BankingService might appear as follows:

  class BankingService {

    ...

  

    // create a new customer with an initial password and

    //   add new customer to the customer set

    public Customer newCustomer (String custname) {

      Customer newC = new Customer(custname, this.genInitPwd());

      Customer oldC = this.customers.addCustomer(newC);

      return newC;

    }

  

    // generates default passwords for new customers.  Would not

    // generate the same password for all customers in a real system

    // Note: this is a good example of a private method

    private int genInitPwd() {

      return 1234;

    }

  }

For this to work, we have to add the addCustomer method to the ICustSet interface and each of its implementing classes.

  interface ICustSet {

    Customer findByName(String name) throws CustNotFoundException;

    Customer addCustomer(Customer newC);

  }

  

  class CustSetHM implements ICustSet {

    private HashMap<String,Customer> customers =

      new HashMap<String,Customer>();

  

    public Customer add(Customer newC) {

      return this.customers.put(newC.getUniqueID(), newC);

    }

  }

As we did when we called get in findByNum, we should augment this method to check for the null return value and throw an exception if a customer with this name already exists.

3 Defining Equality on Keys

Hashmaps store and retrieve values based on the keys. For hashmaps to work, Java must be able to tell whether data of the key type for a hashmap are the same. When we use integers or Strings as keys, Java can easily determine whether two keys are equal. What if we instead want to use objects of some class that we defined as the keys? For example, we might choose to use a hashmap to retrieve the accounts affiliated with each customer, rather than store a list of those accounts inside the customer.

If you need to create a hashmap using some class that you defined as the key, you need to tell Java two things:

You do this by adding two methods to classes that you will use as keys: equals takes another object of the same type and returns a boolean if they should be considered the same object; hashcode returns an int that must be unique (to high probability) across all the objects in the class. One good way to create a hashcode is to multiply the hashcode for each variable in your class by a different prime number, then sum the results. Here’s a simple example showing equals and hashcode methods for a non-built-in class.

If you are working in Eclipse, there is a menu item that helps you generate equals and hashcode based on essential fields in your class. Items 7 and 8 in Chapter 3 of Effective Java also discusses methods for writing your own hashcode methods.

4 Summary

Hashtables and hashmaps are a particular data structure that implement a more general ADT called a dictionary or map. These ADTs are used to map from keys to values. Hashtables and hashmaps are particularly fast implementations that achieves performance by converting keys to positive integers that are largely unique across the set of hashed items. Hashmaps themselves use these positive integers to directly access the locations where data is stored in memory (rather than having to search through memory for values).