CS 170 Project 1 Minimum Spanning Trees: Prim's Algorithm and Dynamic Recomputation Due 4pm Friday, March 16, 2001 This is an individual project. Copy the Project 1 directory by doing the following, starting from your home directory. Don't forget the "-r" switch in the cp command. mkdir pj3 cd pj3 cp -r ~cs170/hw/pj1/* . The Project =========== In this project, you will program two implementations of Prim's algorithm. The binaries should be called PrimDense and PrimSparse. PrimDense will use an adjacency matrix graph representation and an array-based priority queue. It should run in O(n^2) time. PrimSparse will use an adjacency list graph representation and a binary heap- based priority queue. It should run in O(e log n) time. It will also include a dynamic MST algorithm which recomputes the MST in O(n + e) time when one edge is added to or removed from the graph, in a manner similar to the solution of Problem 4 of Homework 5. Your implementation should run faster (on large graphs) than computing the MST from scratch. Both programs must be written in C or C++. They read a graph from the standard input, compute its MST, and write the MST to the standard output. PrimSparse can also read information about what edges are to be dynamically added or removed, and writes the MST of the original graph and the MST after each edge addition or removal. Both programs should work correctly even if the input graph is not connected. This necessitates a small change to Prim's algorithm as given in the lecture notes. CLR has it right: all the vertices are initially on the priority queue, but all but one have a priority of infinity. With this change, Prim's algorithm computes the minimum spanning forest, which contains one spanning tree for each connected component of the input graph. The input graphs may have self-edges. Obviously, however, these will have no effect on the MST. Both programs should divide computation into three separate stages: input, MST computation, and output. Part of your task is to measure the running time of the MST computation stage, so make sure that the (possibly much slower) input and output stages are completely separated from the computation stage so as not to compromise your timings. The dynamic MST recomputation in PrimSparse must also be divided into these three separate stages, but you may rerun all three stages once for each changed edge. (You do not need to read _all_ the edge changes at the beginning, or save the intermediate MSTs to be printed at the end.) It's up to you to figure out the algorithms for dynamically adding or removing an edge in linear time, but here are some hints. Prim's algorithm expresses the MST as a directed graph (where each vertex has a parent pointer). To dynamically add and remove edges, you will need an undirected graph representation (unless you are very, very clever) so that you can determine adjacencies and run DFS on the MST in linear time. Make sure that when you convert the MST to an undirected representation, the conversation takes O(n) time. Adding an edge to the graph requires you to check if that edge should be added to the MST, and if so, what other edge should be removed. This was partly covered in Homework 5, but don't forget to consider the case where the graph is not connected and the new edge connects two previously disconnected components (and therefore no edge should be removed). (Don't worry; you won't need a connected components implementation to check this. You will need DFS, though.) Removing an edge from the graph requires you to determine whether that edge is in the MST, and if so, what edge replaces it. (It is possible that no edge replaces it, and the number of connected components increases by one.) This was covered in Homework 5, and is also closely related to Problem 4 on Midterm I. Your project may not call any external data structure libraries. Provide your own code for all functions, including the graph data structures, the binary heap, and all graph algorithms. Graph Format ============ The input to your programs will have the following format. [First section specifies the original graph:] First line: <# of vertices> <# of edges> One line for each edge: [Second section specifies changes made dynamically to the graph:] One line: <# of edge changes> For each edge change, one of the following two lines: 0 1 If there are n vertices, they are labeled from 0 to n-1. The weights are integers. In the edge changes, a prefix of 0 specifies an edge deleted from the graph. A prefix of 1 specifies an edge added to the graph. Both PrimDense and PrimSparse read the first section and compute the MST. PrimDense ignores the second section altogether, but PrimSparse should read and process it. Two random graph generators are provided. (Although they are both random, they always start with the same random number seed to make it easier to replicate bugs in your code and more reasonable to compare timings.) Their source code is in MakeDense.c and MakeSparse.c. Compile them using the provided make file "makemake" with the command make -f makemake MakeDense and MakeSparse are invoked by running them with one or two command- line parameters. The first parameter is the number of vertices in the graph, and the optional second parameter is the number of edge changes that should be generated. For example, "MakeSparse 10 2" will generate a sparse ten-vertex graph, followed by two edge changes. If not specified, the second parameter defaults to zero. Both programs write a graph to the standard output. By running these programs and observing the output, you can confirm your understanding of the graph format. It should be possible to pipe the output of the random graph generators to both your implementations of Prim's algorithm, for instance by typing MakeDense 500 | PrimDense MakeSparse 100 8 | PrimSparse MakeDense and MakeSparse rarely generate disconnected graphs, so you will want to generate a few disconnected graphs by hand to test your code's correctness. Incidentally, the scanf() function should do everything you need to read graphs from the standard input. MST Format ========== Your programs' output will have the following format. First line: <# of vertices> <# of edges> One line for each edge: One line: Again, the n vertices are labeled from 0 to n-1. Here is an example. If the input graph is 4 6 1 3 88 2 0 1 0 3 77 2 1 28 3 3 61 0 1 0 0 then the output MST is 4 3 0 1 0 0 2 1 0 3 77 78 Note that the order of the edges in the MST does not matter, as long as they're all there. Also note that number of edges in the MST is equal to the number of vertices minus the number of connected components in the input graph. Your PrimSparse implementation should also process the sequence of edge additions and deletions that follows the original graph, and output one additional MST for each change to the graph. Each MST is printed in the same format as the first. For example, if the input graph is 3 4 1 0 58 2 2 9 0 0 48 0 2 83 2 1 1 2 1 0 0 1 then the output is 3 2 0 1 58 0 2 83 141 3 2 0 1 58 1 2 1 59 3 2 2 1 1 0 2 83 84 Timing ====== Measure the speed of your MST algorithms in milliseconds. (Hint: read the man page for the Unix gettimeofday() procedure.) Your measurements should time Prim's algorithm and the dynamic MST algorithm only, and not include any input or output operations. Draw graphs (the other kind of graph--with an x-axis for vertices and a y-axis for running time) illustrating the running times of PrimDense and PrimSparse, each on graphs generated by both MakeDense and MakeSparse. (That's 4 separate graphs.) For each of the four combinations, find the running times for graphs of 100, 200, 500, 1,000, 2,000, and if possible 5,000 vertices. When running PrimSparse on a graph generated by MakeSparse, you should also be able to time 10,000-, 20,000-, and perhaps 50,000-vertex graphs. You may draw these graphs by hand or use any plotting program. Hand them in the same way you hand in the homeworks, unless you have them in electronic form and prefer to submit them electronically. Make sure your name, login ID, student ID, and discussion section number are written on any paper items you submit. Which is faster on a large dense graph--your implementation of PrimSparse or of PrimDense? Explain why (in the GRADER file). Which is faster on a large sparse graph--your implementation of PrimSparse or of PrimDense? Explain why. On several graph sizes, time your algorithm for dynamically recomputing the MST (in PrimSparse) when you add an edge to the graph. There are two cases: the MST changes, or it does not; give timings for both cases. How much faster is this than recomputing the graph from scratch? On several graph sizes, time your algorithm for dynamically recomputing the MST (in PrimSparse) when you remove an edge from the graph. There are two cases: the MST changes, or it does not; give timings for both cases. How much faster is this than recomputing the graph from scratch? Make sure that the versions of PrimDense and PrimSparse you submit do NOT print timings to the standard output! They should only print the MST. Submitting your Solution ======================== Edit the file called GRADER to answer the questions in it. Your code must be submitted electronically. You may resubmit as often as you like; only the last submission will be graded. Figuring out how to submit the project is part of the project, so we recommend you try submitting your solution several days in advance (even if your project is incomplete) just to make sure you can get the submission software to work correctly. If you find out two minutes before the project is due that you can't run "submit" because your environment variables aren't set right, we won't return the points you lose for being late. Before submitting, you must have a make file named "makefile" that completely compiles your project. (However, your makefile should NOT compile the graph generators MakeDense and MakeSparse.) Make sure that your project compiles and runs correctly on the lab machine mingus.eecs just before you submit it, because we plan to run the grading software on that machine. There are two ways to submit. One way is through the Web page http://saidar.eecs.berkeley.edu:8080/~register/ . The other way is by using the submit program from one of the lab machines. Make sure that your shell's environment has a MASTERDIR variable set to /home/ff/cs170 : in sh : MASTERDIR=/home/ff/cs170 in csh: setenv MASTERDIR /home/ff/cs170 Change (cd) to your pj1 directory, which should contain your GRADER, your makefile, and all the other files needed to compile PrimDense and PrimSparse. Type "submit pj1". If the submit program isn't found on your path, try running "/share/b/grading/bin/submit pj1". (If that doesn't work, consult with root@cory about where to find "submit" for your architecture. Be sure to tell them which lab machine you are trying to submit from.) Make sure that ALL the files your project needs to compile are in your pj1 directory and get submitted! You will lose 1% of your earned score for each hour your project is late.