3449 views
# Mutable Lists -- Behavior and Implementation ## Motivating Question What classes do we need to create to implement mutable lists (that remember addition and deletion of elements)? ## Conceptual Study Questions - why do we need a separate `MutableList` class for mutable lists but not for immutable lists? - why don't we have an `EmptyList` class with `MutableList` ## Programming patterns and Mutation Imagine that Akbar wants to maintain his grocery list using a program. Every time he thinks of something he needs, he wants to throw it onto the list (which he'll look at when he gets to the store). He writes the following program (which for now pretends that `NodeList` can contain strings rather than ints -- we'll get back to that later in this lecture): ```=java public class Groceries { ... NodeList buy = new EmptyList(); buy.addFirst("bread"); buy.addFirst("apple"); buy.addFirst("cheese"); ... } ``` When Akbar gets to the store and looks at the contents of `buy`, what will he see? That the list is empty! What happened to all of the items that he added to the list? Did Java lose track of them? Not at all. To see what happened, let's draw out the memory diagram for this program. As a reminder, here is the code for `addFirst`: ```=java public NodeList addFirst(int newElt) { return new NodeList(newElt, this); } ``` ![shopping list memory diagram](https://docs.cs200.io/uploads/c87a301b-6ba7-4353-b654-809396e4ce0e.png =450x500) Each time the program calls `buy.addFirst`, a new `NodeList` item is created and returned as the result of the method. Nothing in the program says "change `buy` to now refer to the updated list". The name `buy` remains mapped to the initial `EmptyList` object. ### An Aside: Garbage in Memory As a side note: observe that none of the `NodeList` objects are actually reachable from any name in the environment. This means that no program can use those objects! You use variable names in your code to refer to objects. The environment is the only way that programs can access data in the heap/memory. If there is no way to follow references/arrows from some name in the environment to a specific object in the heap, that item cannot be used by the program. These stranded objects are known as "garbage". Languages often use a process called "garbage collection" to get those unusuable objects out of memory. We'll study garbage collection later this semester. ## Retaining List Updates If Akbar had instead written his program as ```=java public class Groceries { ... NodeList buy = new EmptyList(); buy = buy.addFirst("bread"); buy = buy.addFirst("apple"); buy = buy.addFirst("cheese"); ... } ``` then as each new `NodeList` object was returned, the `buy =` part of the code would tell Java "and update the environment so that `buy` now refers to the new value". With that code, we would have gotten the following picture: ![shopping list memory diagram with buy updated](https://docs.cs200.io/uploads/f16e94c9-8e59-415e-8e1f-a8901f8b2e8c.png =450x500) This picture uses color to show what happened on each call to `addFirst`: when the first call (blue) happens, `next` is set up to refer to the `buy` object at that point, then the name `buy` is redirected to the resulting `NodeList`. The second call (pink) does the same thing: this time, `buy` referred to the `NodeList` object from the previous call, not the original `EmptyList`. This repeats one last time (green). Now, if Akbar asked what is on the list `buy`, we'd follow the reference from the environment to see items "cheese", "apple", and "bread", in that order. By manually updating `buy` to refer to the new objects each time, the name `buy` reflects changes to the list. ## Wouldn't it be nice to avoid `buy =`? Ideally, we want to write the first version of Akbar's groceries program but have it behave like the second. In order to make that work, we have to avoid having to redirect `buy` after each `addFirst`. We do this by making one small change to our design. If we leave `buy` referring to the same object each time we call `addFirst`, then Akbar could write the first program rather than the second. In other words, we want the picture to look like the following: ![diagram with mutable list object and how it evolves over two calls to addFirst](https://docs.cs200.io/uploads/589b3302-48bb-4946-9853-d73d82d8d4f8.png) In other words, we create a `MutableList` object, and insert new nodes "behind" that object. The `MutableList` object provides a fixed reference for the name `L`, while the list of objects changes behind it. Here's what it looks like in code: ```=java public class MutableList { private IList start; public MutableList() { this.start = new EmptyList(); } public void addFirst(int newElt) { NodeList newNode = new NodeList(newElt, this.start); this.start = newNode; } } ``` Note that `addFirst` still creates a new `NodeList` object, just as we did before. This time, however, instead of returning the new node, we make it the start of the list stored in the `MutableList` class. ## Alternative Design: No EmptyList Class In practice, `LinkedList` implementations often elide the `EmptyList` class entirely and instead rely on a special value which is called `null` in Java (other languages may have different names for the same concept). `null` is a valid value under every type that expects an object. Here's a revised picture of the `MutableList` using `null` instead of the `EmptyList`. ![A MutableList memory diagram variation with only a chain of Node objects, but no EmptyList object](https://docs.cs200.io/uploads/4f83cb07-930d-4432-bc0d-08873d8c01f2.png) In this diagram, we changed the name of the inner class to `Node` (rather than `NodeList`) to avoid confusion when contrasting the two approaches. How does using `null` change our implementation? Let's start with our existing code for `MutableList`. :::info **Stop and Try**: Mark up the original code (below) to match our revised diagram. If you aren't sure where to start, remember that the idea is to replace the `EmptyList` object with `null`, so find uses of `EmptyList` and go from there. ::: ```=java= interface IList { public NodeList addFirst(int newItem); public int length(); } public class AbsList implements IList { public NodeList addFirst(int newItem) { newNode = new NodeList(newItem, this); return newNode; } } public class EmptyList extends AbsList {} public class NodeList extends AbsList { int first; IList next; } public class MutableList { IList start; // front of the list public MutableList() { // constructor this.start = new EmptyList(); } public void addFirst(int newItem) { this.start = this.start.addFirst(newItem); } public int length() { return this.start.length(); } } ``` What changes did you find? - rename `NodeList` to `Node` - eliminate the `EmptyList` on line 24 by replacing it with `null` - since we changed the value assigned to `this.start`, check the type of `this.start` on line 21. It says `IList`, but we created that interface to allow for either a node or an empty list. Between that insight and the new diagram, we can change the type of `start` to just `Node` - remove the `EmptyList` class - with just a `Node` class, there's no need for `AbsList` or `IList` With these changes, our code now looks as follows. This version also moves the `addFirst` code that used to be in the `NodeList` class into the `MutableList` class because it is the list itself that should have an `addFirst` method. ```=java= public class Node { int first; Node next; } public class MutableList { Node start; // front of the list public MutableList() { // constructor this.start = null; } public void addFirst(int newItem) { this.start = new Node(newItem, this.start) } public int length() { return this.start.length(); } } ``` ### Fixing the `length` method One issue remains with our new code: the `length` method in `MutableList` calls the `length` method in the `Node` class, but what if `this.start` (in `MutableList`) or `this.next` (in `Node`) is `null`? The `null` value has no methods, so this would yield an error. To fix the error, we need to check whether `this.start` and `this.next` are `null` before trying to call the `length` method. Here's the fix in the `Node` class: ```=java public class Node { private int data; private Node next; public int length() { if (this.next == null) { return 1; } else { return 1 + this.next.length(); } } } ``` This method has the same general pattern as in the `NodeList` class, except we first check whether we're at the end of the list. :::info **Try it:** Make a similar modification to the `length` method in the `MutableList` class ::: Note that there are two `length` methods: one in the `MutableList` class, which calls the corresponding (recursive) method in the `Node` class. This is a common pattern when implementing `MutableList` methods. ## Contrasting our List Implementations Let's step back and contrast the features and design choices across our two list implementations: | | NodeList | MutableList | | -------- | -------- | -------- | | Edits visible through environment name | no | yes | | End of list marker | `EmptyList` | `null` | | Methods calls check for emptyness | no | yes | | Needs interface over nodes | yes | no | :::info **Stop and Think:** Which of these features are related to one another? ::: The bottom three features are all related, while the first is independent: if you choose to use `null` to represent the end of the list, any method calls must first check whether a value is `null`. Thus, we have really studied just two high level design choices (one on each axis): | | Mutable | Immutable | | -------- | -------- | -------- | |**`null` for empty** | `MutableList`, `LinkedList` | Doable but we didn't write this combo | | **`EmptyList` for empty** | Our first try at `MutableList` | `NodeList` | Which choices are better? - The choice between mutable and immutable lists is based on your programming problem. Sometimes we want one, and sometimes we want the other (we'll see more examples of this choice through the semester) - The choice between `null` and `EmptyList` is one of internal design. - The `EmptyList` approach respects the philosophy of object-oriented programming because there are different classes for different types of data (the `EmptyList` class holds the computation for empty lists, rather than rely on an `if` statement). Pure OO programs (almost) never use `if` statements to check what type of data you have. - The `null` approach is what you are more likely to encounter in practice. People like it because it involves fewer classes and interfaces, despite having to check for `null` values everywhere. In this course, we teach you both because we want you to know both the pure OO approach (which is very powerful in many settings) while also preparing you for the code you are likely to see in the real world. That said, if you use coding assistants to generate code for you, make sure that you are getting the design you intended if you want the pure OO solutions. ## Defining Classes over Different Types (Generics) What if we wanted lists of strings rather than ints? In Java, we can add a "type parameter" to a class. This is what you use when you write something like the following to use plain Java Linked Lists. ```=java new LinkedList<Integer> ``` Here, `Integer` is the type of the list contents. To set up our `MutableList` class to work on any kind of list, we add a similar notation (the <T> that follows the name of the class). We then use the type T anywhere that we want to type parameter to appear (such as the type of the `data` field in `Node`). ```=java public class MutableList<T> { // reference to the first node in the list private Node<T> start; public void addFirst(T newElt) { Node<T> newNode = new Node<T>(newElt, this.start); this.start = newNode; } } public class Node<T> { private T data; private Node next; public Node(T newElt, Node next) { this.data = newElt; this.next = next; } } ```