CS 61B: Data Structures (Spring 2002)
Final Exam

### Solutions

Problem 1. (9 points) Quickies.

b. Some examples of acceptable answers:

• A reference is a pointer to a pointer.
• Pointers can be assigned to ints, or vice versa. References cannot.
• Pointer arithmetic is permitted. References have no analogue.

c. 6 7 4 - / 5 3 1 * + *

d. Θ(n2).

e.

```
void increment(int *ptr) {
*ptr = foo(*ptr);
}
```

f. The insert operation negates each key before inserting it into the min-heap. The removeMax operation removes the minimum key from the min-heap and negates it before returning it.

g. Replace SensorIn with a stub that generates faked earthquake data.

Problem 2. (9 points) Operations on Data Structures.

a. findElement(1), then findElement(3).

b.
 x 1 3 4 8 6 5 7 9

c. Insert 1, 2, and 3.
 1       2       3

Insert 1, 2, 3, and 4; then delete 4.

2
/    \
1      3

d. All the keys are equal.

Problem 3. (6 points) Sorting.

a.

7 2 9 3 0 8 4 1
2 7 9 3 0 8 4 1
2 7 3 9 0 8 4 1
2 3 7 9 0 8 4 1
2 3 7 9 0 8 1 4
2 3 7 9 0 1 4 8
0 1 2 3 4 7 8 9

b. Here are two acceptable answers. There might be others.

5 4 1 3 2
4 5 3 1 2

c. Radix sort the array. Choose the key with median index.

Problem 4. (8 points) Hash Tables.

a. When the table is resized, the keys might no longer collide.

b. If the board is empty except one chip, then moving the chip along a diagonal on which i - j is constant doesn't change the board's hash code. Therefore, collisions are likely.

c. Θ(n).

d. Θ(n2). Let N be the number of buckets. The table is resized when n/N > 1/n, which implies that n2 is just slightly greater than N. Resizing increases the size of the table by a factor of n, so right after a resize, the table has Θ(n3) buckets. These buckets take Θ(n3) time to create.

The last resizing is bigger than all the previous ones put together, so the total time for all the resizing operations to date is in Θ(n3). Since the number of items is n, at least n insertItem operations have taken place, so the average time per insertItem operation is in Θ(n2) time.

Problem 5. (7 points) Disjoint Sets.

a. Θ(u + f + fu). (We accepted Θ(fu) for full marks, but it's nice to write Θ(u + f + fu) because it encompasses the cases where u = 0 as f approaches infinity, and vice versa.)

b. Θ(u + f log u).

c. Assign each parent pointer a timestamp, an extra field in each node that states the ``time'' at which its parent reference was set to a non-null value. Each union operation creates one non-null parent reference with a timestamp. The find operation never traverses a parent reference whose timestamp is newer than the parameter time.

Problem 6. (4 points) Weighted Multigraphs.

a. Here are two acceptable answers.

• Use a hash table to map vertex pairs to linked lists of edge weights.

b. Use the data structure above, with the addition of one of these two ideas.

• Maintain a doubly-linked list of all the edge weight lists.
• Maintain a doubly-linked list of edges, in which the edges for any vertex pair are contiguous. For each vertex pair (e.g. entry in the adjacency matrix), maintain references to the first and last edges for that pair, so you can splice edges out in O(1) time.

Problem 7. (4 bonus points) Bonus Questions!

a. Let T be a random variable equal to the running time of the algorithm. Assume that the actual time to choose a random index i and check z[i] is at most \$1. The probability that the algorithm will guess right on the first try is 1/n, so the probability that it will guess wrong is 1 - 1/n. If the algorithm guesses wrong, then it has to start over from scratch. Therefore, the running time of the algorithm is \$1 for the first guess, plus an additional cost which is zero if the guess is right, or E[T] is the guess is wrong. So the expected running time is

E[T] = \$1 + (1 - 1/n) E[T].
Solving this equation for E[T] gives
E[T] = n dollars.
Here's another way to look at it. Each round costs \$1. We always do the first round. The probability that we need to do a second round is 1 - 1/n; the probability that we need to do a third round is (1 - 1/n)2; the probability that we need to do a fourth round is (1 - 1/n)3; and so on. So the expected running time is
E[T] = \$1 + (1 - 1/n) ⋅ \$1 + (1 - 1/n)2 ⋅ \$1 + (1 - 1/n)3 ⋅ \$1 + …
= n dollars.
b.

Run Kruskal's algorithm and find the MST T.
For each edge (u, v) in G but not in T {
Use DFS to find the path p linking u to v in T.
// Note that (u, v) and p form a cycle in which (u, v) has the maximum weight.
Find the largest edge weight in p that is strictly less than the weight of (u, v).
Let du,v = the weight of (u, v) minus the weight of the next-largest edge.
}
Let d = minu,v { du,v }.
Let (u, v) and (w, x) be the two edges responsible for d.
Remove (w, x) from T and replace it with (u, v).

Problem 8. (7 points) A Prefix Expression Evaluator.

```
if (next.isNumber()) {
while (top().isNumber()) {
Stacker operand1 = (Stacker) pop();
Stacker operator = (Stacker) pop();
next = operator.doOeration(operand1, next);
}
}
push(next);
```

Mail inquiries to cs61b@cory.eecs