531 views
# Lecture 13 -- Introduction to Hashmaps *Note: the first part of the lecture, on the code for implementing wrap-around arrays, is covered in the notes posted for lecture 12.* ## Motivating Question Arrays provide fast access to items based on their position. Can we use them for fast lookup in situations where the notion of position isn't a number starting at 0? ## Conceptual Study Questions - If we used an `ArrayList` instead of a `LinkedList` in our initial approach to problem 3, would we have been able to fetch information in constant time? ## Recap: The Principle of Mapping Indices In studying wrap-around/cyclic arrays, we saw a useful technique in data structure design: ## Three Design Problems Consider the following three design problems. They are all similar, but they differ in the nature of the labels on which we want to look up information (see italics across the problems). **Design problem #1:** A professor is trying to manage enrollments for several lab sections (*numbered 01 through 14*). For each lab, the professor needs to store the capacity of the room and the number of students in the lab. Propose specific data structures to organize this information. **Design problem #2:** A department is trying to manage enrollments several courses (*numbered 1000 through 1999*). For each course, the department needs to store the capacity of the room and the number of students in the course. Propose specific data structures to organize this information. **Design problem #3:** A professor is trying to manage enrollments for multiple lab sections, each labeled with the day of week and start time (*such as Mon 8-10, Tues 4-6, etc*). For each lab, the professor needs to store the room where the lab is meeting. **Activity:** Propose data structures for the information in each of these problems. Think in particular about the nature of the labels and how they affect your choices. *Actually try this before looking ahead to the answers below* ### Design problem 1 An array of course objects seems an obvious choice for this problem. The lab section numbers start close to 0, which means they align naturally with array indices. ```=java= class CourseData { int seats; int enrollment; } class Main { CourseData[] labs = new CourseData[15] // 15 so allow 14 to be a valid index // look up the data for a lab section public CourseData get(int labnum) { return CourseData[labnum]; } } ``` In this design, we wouldn't use index 0 (the value would remain `null`). A single memory slot isn't a big deal, so this is fine. ### Design problem 2 Perhaps we can use the same approach as for problem 1, but make the array large enough to support all the course numbers as indices: ```=java= CourseData[] courses = new CourseData[2000] ``` While this works, now we're leaving 1000 memory slots unused (indices 0 through 999 would all contain `null`). This is getting wasteful. But, think about how we used an expression to convert between a programmer's positions in a list to array indices with cyclic arrays. We're going to do the same thing here: ```=java= class Main { // only create as many slots as we need CourseData[] courses = new CourseData[1000] // look up the data for a course public CourseData get(int coursenum) { return CourseData[coursenum - 1000]; // adjust index } } ``` Notice the similar pattern to what we did with cyclic arrays. The user of our data structure (here, the department managing courses) has an index scheme in mind, based on numbers between 1000 and 1999. We use a formula to translate from their scheme to one that we will use internally to reduce the space needed for the underlying array. The formula gets used within the square brackets used to access array data. We could have done something similar in problem one to avoid wasting the slot with index 0, by instead writing ```=java= public CourseData get(int labnum) { return CourseData[labnum - 1]; // shift indices down } ``` ### Design problem 3 The twist in problem #3 is that the labels that the professor wants to use are strings, not numbers. This suggests that we cannot use arrays (which only have integer indices). One option would be to just make a list of lab objects and provide a `getLab` method that searches the list: ```=java= class Lab { String label; // like "Mon 4-6" String room; // like "CIT 219" } class Main { LinkedList<Lab> allLabs; public Lab getLab(String withLabel) { for (lb in this.allLabs) { if (lb.label.equals(withLabel)) { return lb; } } throw new RuntimeException("No lab labeled " + withLabel); } } ``` For this problem, the use of the loop means that `getLab` is linear in the number of labs; the `get` method for each of problems #1 and #2 was constant. *This is a good time to consider the study question on ArrayLists vs LinkedLists.* Let's think about the insight from the first two problems: we were able to use an array effectively by having a formula that translated the labels that the professor/department wanted to use into array indices. *If such a function exists, we can use arrays and restore constant time for `get`.* If we could do such a thing, the code might have looked like: ```=java= class Lab { String label; // like "Mon 4-6" String room; // like "CIT 219" } class Main { Lab[] allLabs; public Lab getLab(String withLabel) { // we'd need to write translateToInt return allLabs[translateToInt(withLabel)] } } ``` The question is, how might we do such a thing? ## Introduction to HashMaps Hashmaps (called dictionaries in some languages) are a built-in data structure that do exactly this: they store data with arbitrary labels in arrays, converting those labels into indices to allow for constant time `get` and lookup of data. Before we learn how learn how this is done, let's get some practice using built-in hashmaps as users. Here's an example of using HashMaps in Java. This example is for problem #3 above: we want to map lab times to rooms. ```=java= // Map lab times to rooms HashMap<String, String> labRooms = new HashMap<String, String>(); // Associate this key with this value labRooms.put("Mon 4-6", "CIT219"); labRooms.put("Tue 6-8", "CIT501"); labRooms.get("Mon 4-6"); // Returns "CIT219" // Changes the value mapped to this key labRooms.put("Mon 4-6", "CIT444"); labRooms.get("Mon 4-6"); // Returns "CIT444" labRooms.get("Wed 8-10"); if(labRooms.containsKey("Mon 4-6")) { // . . . } ``` We set up a `HashMap` by providing two type parameters: the first is for the *Key* (what we have been calling to the label) and the second is for the *Value* that we will store under that key. Here are some other examples of declaring `HashMaps`: - Map from ID numbers to student names: - `HashMap<Integer,String>` - Map from ID numbers to student objects: - `HashMap<Integer,Student>` - Map from course names to lists of enrolled students: - `Hashmap<String,LinkedList<Student>>` Keys and values can be objects of any types. Lines 6 and 7 show how to store values under labels in the HashMap (using the `put` method). Line 10 shows how to look up the value that has been stored with a key (using the `get` method). A hashmap stores only one value per unique key, so when we `put` a new room for the `"Mon 4-6"` lab on line 14, the old value (`"CIT219"`) gets kicked out and replaced with `"CIT444"`. We show this with the `get` command on line 15. If we ask `get` for the value for a key that has not yet been `put` into the hashmap, `get` will return `null`. Sometimes, we want to find out whether a specific key has a value in the hashmap. To do this, we use the `containsKey` method as illustrated on line 21. Next lecture, we will learn how all of this works by starting to implement our own HashMap class.