Lecture 19 -- Graphs Traversal and Shortest Paths

# Lecture 19 -- Graphs Traversal and Shortest Paths ## Motivating Questions ## Conceptual Study Questions :::danger These notes are still being filled in, but they have the essence of what you need to start looking at project 2. They will be done by Saturday ::: # An Alternate Way to Write `canReach` Our current recursive version of `canReachVisit` works fine, but we could also write this method non-recursively. The key to seeing how it works is to realize that the recursive calls and the for loop set up an order in which the vertices will be explored. Let's look at the program trace again: ```=java= man.canReachVisited(har, {}) bos.canReachVisited(har, {man}) pvd.canReachVisited(har, {man, bos}) return false wor.canReachVisited(har, {man, bos, pvd}) har.canReachVisited(har, {man, bos, pvd, wor}) return true ``` When pvd gets called, Java remembers that the for-loop still needs to process wor. Rather than let Java remember that, let's maintain an explicit list of the vertices to process in order. We'll call this list `toCheck`. When we get to the neighbors, we'll add them to `toCheck`. Then, we keep taking vertices off of `toCheck` and process them until there are none left. Here's the code. It uses a stack to store `toCheck` (recall from lab that a stack is like a list in which we only add elements to the front and remove them from the front): ```=java= public class CityVertex { ... public boolean canReachStack(CityVertex dest) { HashSet<CityVertex> visited = new HashSet<CityVertex>(); Stack<CityVertex> toCheck = new Stack<CityVertex>(); // put the start node into the toCheck Stack toCheck.push(this); // as long as we have unvisited nodes to check, keep processing them while (!toCheck.isEmpty()) { CityVertex check = toCheck.pop(); visited.add(check); if (check == dest) { // found a solution! Cancel the while loop and return return true; } else { // add each unvisited node to the stack for (CityVertex neighbor : check.toCities) { if (!visited.contains(neighbor)) toCheck.push(neighbor); } } } return false; } } ``` Note that this program is not recursive. But it parallels what happens in the recursive call by maintaining the list of vertices that need to be processed, then processing them all in turn. Why would we both writing this version? Because being explicit about the order in which the vertices are explored turns out to be important in understanding how to process graphs. We'll see why in a moment. ## Finding Shortest Paths Up until now, we've asked "is there a route". We haven't asked for the route itself. In many cases, however, we do want to know how to get from point A to point B in a graph. Moreover, we often want to get there in as few steps as possible. Thus, we are turning our attention to finding a *shortest path* between two cities. To work through this problem, let's consider a different graph. Assume we are trying to find the shortest route from A to F. We determine "shortest" by the number of edges in the route. ![new graph](https://docs.cs200.io/uploads/6b600ba3-06e8-452b-ad46-636db40872a7.png =250x290) By inspecting the graph, we see that there are no routes from A to F of length 1. There are, however, two routes from A to F of length 2: A-C-F and A-B-F. Either one of these could be returned as a shortest route from A to F. ### Computing shortest paths As we have done recently, let's try working out a plan for computing shortest routes. Here's one proposal: **Plan to Find Shortest Route by Enumerating All Paths** 1. Generate a list of all routes in the graph 2. Find those that go from A to F 3. Sort them in order of length 4. Return one of the shortest ones What do we think of this approach? Generating a list of all the routes seems complicated, as well as overkill. In a large graph, it would also be a pretty long list, and sorting that list would get expensive (either $r^2$ or $r * log(n)* where $r$ is the number of routes). This really isn't viable, but we show it here just to show the contrast with a better approach. **Plan to Find Shortest Route by Checking Paths in Order of Length** If we instead try to mimic the explanation we gave just below the graph, we might try an approach like the following: 1. Generate all paths of length 1 from A; check whether any goes to F 2. Generate all paths of length 2 from A; check whether any goes to F 3. repeat at increasing lengths until path found or out of paths This is an improvement -- by checking the routes in order of length until we find one that works, we avoid having to generate all of them or having to sort them. We just need a way to generate the paths in order of length. How might we go about this? We start with the start vertex (here A). The routes of length 1 from A can be generated by taking each neighbor of A and creating a path from A to that vertex. This would yield routes A-D, A-C, and A-B. The routes of length 2 could be computed by taking each route of length 1 and extending it with the neighbors of the last vertex on the route. Thus: - A-D would yield new route A-D-E - A-C would yield new routes A-C-G and A-C-F - A-B would yield new route A-C-F Summing up, we would like to consider routes in this order: A, A-D, A-C, A-B, A-D-E, A-C-G, A-C-F (stop here, since have route from A to F) **Stop and make sure you understand this approach** Actually creating and storing all of these paths would end up using up a fair bit of memory. It would be nice to optimize this. Rather than building the whole path, we will take two steps: 1. have `toCheck` store the vertices from which to expand potential routes, not the full paths - vertices will be explored in the order A, D, C, B, E, G, F 2. track which vertex was responsible for adding each new vertex to `toCheck`. This information looks like: - D came in from A - C came in from A - B came in from A - E came in from D - G came in from C - F came in from C For step 1, all we have to do is switch the Stack in our previous code to a Queue: that way, we would explore the nodes closer to A before nodes farther from A. Here's the code with step 1 implemented: ```=java= public class CityVertex { ... // different approach to canReach -- we make the call stack explicit public boolean canReachQueue(CityVertex dest) { HashSet<CityVertex> visited = new HashSet<CityVertex>(); Queue<CityVertex> toCheck = new LinkedList<>(); // put the start node into the toCheck Stack toCheck.add(this); // as long as we have unvisited nodes to check, keep processing them while (!toCheck.isEmpty()) { CityVertex check = toCheck.remove(); visited.add(check); if (check == dest) { // found a solution! Cancel the while loop and return return true; } else { // add each unvisited node to the (end of the) queue for (CityVertex neighbor : check.toCities) { if (!visited.contains(neighbor)) toCheck.add(neighbor); } } } return false; } } ``` For step 2, we have to add some data structure to our code that will store the "came from" entries. We're leaving this as an exercise for you on the project. ### Reconstructing Paths Assume we had stored the following information. - D came in from A - C came in from A - B came in from A - E came in from D - G came in from C - F came in from C To reconstruct the route, we start with F in this list of information and work back to A, noting each vertex we encounter. Here, we see that F came in from C, which came in from A. Thus, our path must be A-C-F. Our implementation to construct a path would be a separate method that we call once we determine that a path exists. Here's an outline of the code one last time: ```=java= public class CityVertex { ... // different approach to canReach -- we make the call stack explicit public LinkedList<CityVertex> shortestRoute(CityVertex dest) { HashSet<CityVertex> visited = new HashSet<CityVertex>(); Queue<CityVertex> toCheck = new LinkedList<>(); ??? cameFrom; // a data structure storing how vertices came in // put the start node into the toCheck Stack toCheck.add(this); // as long as we have unvisited nodes to check, keep processing them while (!toCheck.isEmpty()) { CityVertex check = toCheck.remove(); visited.add(check); if (check == dest) { // found a solution! // Cancel the while loop and reconstruct the route return this.reconstructRoute(dest, cameFrom); } else { // add each unvisited node to the (end of the) queue for (CityVertex neighbor : check.toCities) { if (!visited.contains(neighbor)) toCheck.add(neighbor); } } } // return empty list or throw exception if no route found return new LinkedList<>(); } } ``` Figuring out what `cameFrom` looks like and how to reconstruct the route from it is part of your work for project 2, so we won't give you the details here.