Lecture 8 - Mutable Lists

# Mutable Lists -- Behavior and Implementation ## Motivating Question What classes do we need to create to implement mutable lists (that remember addition and deletion of elements)? ## Conceptual Study Questions - why do we need a separate `MutableList` class for mutable lists but not for immutable lists? - why don't we have an `EmptyList` class with `MutableList` ## Programming patterns and Mutation Imagine that Akbar wants to maintain his grocery list using a program. Every time he thinks of something he needs, he wants to throw it onto the list (which he'll look at when he gets to the store). He writes the following program (which for now pretends that `NodeList` can contain strings rather than ints -- we'll get back to that later in this lecture): ```=java public class Groceries { ... NodeList buy = new EmptyList(); buy.addFirst("bread"); buy.addFirst("apple"); buy.addFirst("cheese"); ... } ``` When Akbar gets to the store and looks at the contents of `buy`, what will he see? That the list is empty! What happened to all of the items that he added to the list? Did Java lose track of them? Not at all. To see what happened, let's draw out the memory diagram for this program. As a reminder, here is the code for `addFirst`: ```=java public NodeList addFirst(int newElt) { return new NodeList(newElt, this); } ``` ![shopping list memory diagram](https://docs.cs200.io/uploads/c87a301b-6ba7-4353-b654-809396e4ce0e.png =450x500) Each time the program calls `buy.addFirst`, a new `NodeList` item is created and returned as the result of the method. Nothing in the program says "change `buy` to now refer to the updated list". The name `buy` remains mapped to the initial `EmptyList` object. ### An Aside: Garbage in Memory As a side note: observe that none of the `NodeList` objects are actually reachable from any name in the environment. This means that no program can use those objects! You use variable names in your code to refer to objects. The environment is the only way that programs can access data in the heap/memory. If there is no way to follow references/arrows from some name in the environment to a specific object in the heap, that item cannot be used by the program. These stranded objects are known as "garbage". Languages often use a process called "garbage collection" to get those unusuable objects out of memory. We'll study garbage collection later this semester. ## Retaining List Updates If Akbar had instead written his program as ```=java public class Groceries { ... NodeList buy = new EmptyList(); buy = buy.addFirst("bread"); buy = buy.addFirst("apple"); buy = buy.addFirst("cheese"); ... } ``` then as each new `NodeList` object was returned, the `buy =` part of the code would tell Java "and update the environment so that `buy` now refers to the new value". With that code, we would have gotten the following picture: ![shopping list memory diagram with buy updated](https://docs.cs200.io/uploads/f16e94c9-8e59-415e-8e1f-a8901f8b2e8c.png =450x500) This picture uses color to show what happened on each call to `addFirst`: when the first call (blue) happens, `next` is set up to refer to the `buy` object at that point, then the name `buy` is redirected to the resulting `NodeList`. The second call (pink) does the same thing: this time, `buy` referred to the `NodeList` object from the previous call, not the original `EmptyList`. This repeats one last time (green). Now, if Akbar asked what is on the list `buy`, we'd follow the reference from the environment to see items "cheese", "apple", and "bread", in that order. By manually updating `buy` to refer to the new objects each time, the name `buy` reflects changes to the list. ## Wouldn't it be nice to avoid `buy =`? Ideally, we want to write the first version of Akbar's groceries program but have it behave like the second. In order to make that work, we have to avoid having to redirect `buy` after each `addFirst`. We do this by making one small change to our design. If we leave `buy` referring to the same object each time we call `addFirst`, then Akbar could write the first program rather than the second. In other words, we want the picture to look like the following: ![diagram with mutable list object](https://docs.cs200.io/uploads/bc3c66a3-6e7d-4158-b83a-661bc457ec0a.png =500x200) In other words, we create a `MutableList` object, and insert new nodes "behind" that object. Here's what it looks like in code: ```=java public class MutableList { // reference to the first node in the list private Node start; public void addFirst(int newElt) { Node newNode = new Node(newElt, this.start); this.start = newNode; } } public class Node { private int data; private Node next; public Node(int newElt, Node next) { this.data = newElt; this.next = next; } } ``` Note that `addFirst` still creates a new `Node` object, just as we did before. This time, however, instead of returning the new node, we make it the start of the list stored in the `MutableList` class. ## Initializing `start` The above code fails to show one small detail: in the constructor, what is the initial value of the `start` field? We have two options: 1. have `start` refer to a new `EmptyList` object 2. have `start` refer to `null` () The first option would mimic what we did with teh `NodeList`/immutable implementation. From a pure object-oriented perspective, that would be the correct approach to use here, because our list operations still need different behavior depending on whether the list is empty or non-empty That said, using `null` in such situations is common in industrial practice. We don't want to leave you unprepared for what you might see in practice, so we will follow that approach here. If we do that, the `MutableList` class will look like: ```=java public class MutableList { // reference to the first node in the list private Node start; public MutableList() { this.start = null; } public void addFirst(int newElt) { Node newNode = new Node(newElt, this.start); this.start = newNode; } } ``` ## Methods on MutableLists What would something like the `length` method look like on a `MutableList`? ```=java public int length() { return this.start.length(); } ``` This needs a corresponding method in the `Node` class: ```=java public class Node { private int data; private Node next; public int length() { if (this.next == null) { return 1; } else { return 1 + this.next.length(); } } } ``` This method has the same general pattern as in the `NodeList` class, except we first check whether we're at the end of the list. The other pattern to note here is that there are two `length` methods: one in the `MutableList` class, which calls the corresponding (recursive) method in the `Node` class. You'll follow this for every method that you implement on a `MutableList`. ## Defining Classes over Different Types (Generics) What if we wanted lists of strings rather than ints? In Java, we can add a "type parameter" to a class. This is what you use when you write something like the following to use plain Java Linked Lists. ```=java new LinkedList<Integer> ``` Here, `Integer` is the type of the list contents. To set up our `MutableList` class to work on any kind of list, we add a similar notation (the <T> that follows the name of the class). We then use the type T anywhere that we want to type parameter to appear (such as the type of the `data` field in `Node`). ```=java public class MutableList<T> { // reference to the first node in the list private Node<T> start; public void addFirst(T newElt) { Node<T> newNode = new Node<T>(newElt, this.start); this.start = newNode; } } public class Node<T> { private T data; private Node next; public Node(T newElt, Node next) { this.data = newElt; this.next = next; } } ```