Midterm 2 solutions ------------------- Question 1 ---------- Question: Given the following BST, show its value after deleting 95. 95 / \ 40 113 / \ / \ 15 42 110 125 / \ 41 112 Answer: 110 / \ 40 113 / \ / \ 15 42 112 125 / 41 Question 2 ---------- Question: Given a BST, there are some keys that, if inserted, will increase the height of the tree. For the BST below, list all the integer keys (distinct from those already in the tree) for which this is true. If no such keys exists, explain why not. (Each key should increase the height if inserted alone, not with other keys.) 95 / \ 40 113 / \ / \ 37 42 111 125 \ \ 39 112 Answer: 38 Question 3 ---------- Question: For a given set of keys there are many possible BSTs. Right and left rotations (as defined for Red-Black trees), produce legal BSTs from others. For the BST given below, give another BST with the same keys that CANNOT be formed by performing a sequence of left and right roations on T. If no such tree exists, explain why not. T = 95 / \ 40 113 / \ / \ 15 42 110 125 / \ 41 112 Answer: No such BST exists. One way to see this (although it is not a proof), is that you can perform rotations until the BST is a chain in one direction, and from there, get back to any other BST through a series of rotations, including a BST that is a chain in the opposite direction. Note that the question asks you to perform a sequence of left and right rotations on any node, not just from the root. Furthermore, note that we can't get *any* T with the same elements using rotations, only those that are also BSTs. Question 4 ---------- Question: Given the following Red-Black tree, show its value after inserting the key 29. (Mark the red nodes with an "R" and the black nodes with "B" and show the black nulls at the leaves as in your homework assignment.) 25B / \ 18B 34R / \ 30B 36B / 28R Answer: (this omits the black nulls, due to limits in our ability to draw pictures with text): 25B / \ 18B 34R / \ 29B 36B / \ 28R 30R Question 5 ---------- Question: Given the following array that represents a heap, show the value of the heap after removing an element: -- 20 12 18 7 3 6 4 Answer: -- 18 12 6 7 3 4 -- A Remove( ) operation on a heap removes the first element from the heap, and then restores the heap properies of the data structure by 1) placing the last element in the heap into the root position, then 2) swapping that element with the greater of its two children until the element reaches a position where it is greater than both of its children initial array: -- 20 12 18 7 3 6 4 tree representation: 20 / \ 12 18 / \ / \ 7 3 6 4 after removing root: -- -- 12 18 7 3 6 4 tree representation: -- / \ 12 18 / \ / \ 7 3 6 4 after swapping last element into root: -- 4 12 18 7 3 6 -- tree representation: 4 / \ 12 18 / \ / \ 7 3 6 -- after swapping with largest child: -- 18 12 4 7 3 6 -- tree representation: 18 / \ 12 4 / \ / \ 7 3 6 -- after swapping again: -- 18 12 6 7 3 4 -- tree representation: 18 / \ 12 6 / \ / \ 7 3 4 -- now the element (4) has no children greater than it (it has no children at all), so we're done Question 6 ---------- Question: Given a k-ary complete tree stored as an array, give the formulas for the j-th child and j-th parent of the i-th node. Answer: Conceptually, it might be easier to find formulas for all the children first. Assume that the root resides at index 1, and that the first child of i corresponds to j == 1. (The assumption that the root is at index 1 was true in the textbook for Fall 97. Many other books uses 0 as the index, which complicated the indexing expressions, but doesn't require wasting the zero-th location in the array.) i indices of all of i's children --- ------------------------------ 1 2, ..., k+1 2 k+2, ..., 2k+1 3 2k+2, ..., 3k+1 4 3k+2, ..., 4k+1 ... i (i-1)k+2, ..., ik+1 The last line in the table gives the indices for all of the children of i; to get the j-th child, simply add j-1 to the first index: j-th child of node i = (i-1)k + j + 1 To find a formula for the parent, one approach might be "trial and error," i.e., draw some trees and guess a formula. This worked for many of you. Here is a more systematic way to do it. Notice that the formula above gives us the index of a child given the parent index i; so it should seem reasonable that we could try solving the above equation for i --- that would give us a formula for the parent. Let x be the j-th child of the i-th node: x = (i-1)k + j + 1 Solve for x's parent, i: x - 1 - j x - 1 k - j i = --------- + 1 = ----- + ----- k k k ^ ^ | | A B Let's consider the two terms labeled A and B. Notice that B will always be between 0 and 1, i.e., B is a fraction. But we know that A and B should add up to an integer. For example, if A is 3.7, then B has to be .3. In other words, we should *round A up* to the nearest integer. In a more mathematical notation, take the "ceiling" of A. So the final answer for the parent i of a node x is: +-------+ | x - 1 | i = | ----- | (that's supposed to be the notation for "ceiling") | k | ------ EXTRA: A lot of you came up with different formulas that turn out to be equivalent to the formula given above. We claim (without proof) that the formula above is equivalent to the following floor formula: +-------+ | x - 1 | | x - 1 | | ----- | = | ----- + c | | k | | k | +-----------+ where 'c' must satisfy the following conditions: 0 <= c < 1 if ((x-1) % k) == 0 AND (x-1) % k (x-1) % k 1 - --------- <= c < 2 - --------- if ((x-1) % k) is non-zero k k If you think your formula is correct, check it against the conditions above--- it must satisfy BOTH conditions to be correct for all x, k. Question 7 ---------- Question: Consider the function f(x) = x * x as a probe function in an open-addressed hash table. For what inputs will this function fail to work properly? Answer: The answer that was sought was: "if x = 0, then x * x = 0, and the probe function will fail to probe to an empty slot. The probe function should be modified to be max(x * x, 1), so that it will never return 0." Answers that we reluctantly accepted included: "x * x could produce a number that is larger than the table size, and produce an index that is out of bounds." Because the problem didn't explicitly state that bounds checking would be performed on the index returned by the probe function, we accepted this answer. Normally however, you should assume that bounds checking is *not* done inside the hash or probe functions, but in the place in the code where you are actually selecting the next slot to check. For example: nextHashTableSlotToCheck = (thisSlot - probe(HashKey)) % TableSize; This way, you have more flexiblility in choosing your hash and probe functions, and you can assume that the result of your functions will be brought back into range before any actual table location is checked. "x * x could produce a number that is a multiple of the table size, and therefore could produce a probe that fails to check all table locations." This is a better answer than the above, and can occur even if you've taken all the right precautions such as selecting a prime table size and doing modulo arithmetic to keep the "nextHashTableSlotToCheck" value within the table bounds. Imagine you have a table size of 11 (nice and prime) and your hash key for an object being inserted is, coincidentally, 11 (or some multiple thereof). Your probe function would always produce a multiple of 11, and (11 * n) % 11 will always be 0, so your probe function will again be effectively zero. This means that, in fact, you *do* need to do some bounds checking inside your probe function, at least to make sure that the function will not return a multiple of the table size. Standish suggests that you restrict the probe function to return values in the range 1 <= probe(key) < TableSize, to avoid this problem as well as the problem stated above in reference to the "correct" answer. -------------------------- INCORRECT ANSWERS A lot of people demonstrated confusion about the difference between a hash function and a probe function. There were a lot of answers like: "x*x will always select the same numbers (1,4,9,16,etc.), so most of the positions in the table will never be used." This would be a problem if you used x*x as your HASH function, but it is NOT a problem if you use x*x as your PROBE function. If you're confused by this, please review the difference between hash and probe functions carefully. Question 8 ---------- Question: Describe a data structure that: (1) provides access to data via multiple (2) keys (2) has constant time key lookup, insertion, and deletion Draw a picture of your data structure. Consider for example an IRS database keyed by name or SS#. The constant time lookup (and insertion and deletion) needs to work when you are given a SS# or given a name (not both). Answer: Property 2 strongly suggests that a hash table should be the basis of this structure. However, a single hash table cannot function with two separate keys, because any such table that did utilize two keys would always require both as input to the hash function. Thus, two hash tables are necessary, both storing the same data but using the different keys. Both hash tables should point to the same objects - or if multiple copies exist, they should refer to each other. Question 9 ---------- Question: Give code for right rotation on the left child of a TreeNode p. class TreeNode { TreeNode left; TreeNode right; Object data } Answer: There were a few ways to write this, but probably the easiest to follow uses 2 temporaries. Many people misunderstood the question tried to rotate at the root (p), rather than at the left child of p. This doesn't correctly rotate anything, because p is a reference to a treenode passed intot he rotate method, and if it is reassigned, the change is not reflected outside of the method. static void rotateRightOnLeftChild ( TreeNode P ) { TreeNode tmp1 = p.left; TreeNode tmp2 = p.left.left.right; p.left = p.left.left; p.left.right = tmp1; p.left.right.left = tmp2; } P becomes P / \ / \ L R A R / \ / \ A B C L / \ / \ C D D B Question 10 ----------- Question: Assuming you are sorting an array of 9 elements that contains the numbers 0 through 8 in some order. What order will produce worst-case running time for quicksort. Answer: In general, the worst case for quicksort happens when the pivot is repeatedly chosen to be the largest or smallest elements of the array. In that case, each recursive call is smaller by only one element. This problem depends on exactly what implementation of the quicksort algorithm you are using, in particular, how the pivot element is chosen. In the book used this year (Fall 1997), the quicksort algorithm chose the middle element, i.e., n/2 in an n-element array. This element is then swapped it with the first element before partitioning, and swapped with the element between the two partitions after pivoting. In the scenario below, the second swap is included in the partition step, and the partition itself never has any effect, because the pivot is always the largest remaining element. [ 7 2 4 6 8 3 1 5 0 ], n = 9, n/2 = 4, a[n/2] = 8 [ 8 2 4 6 7 3 1 5 0 ], after swap [ 0 2 4 6 7 3 1 5 8 ], after partitioning [ 0 2 4 6 7 3 1 5 8 ], n = 8, n/2 = 4, a[n/2] = 7 [ 7 2 4 6 0 3 1 5 8 ], after swap [ 5 2 4 6 0 3 1 7 8 ], after partition [ 5 2 4 6 0 3 1 7 8 ], n = 7, n/2 = 3, a[n/2] = 6 [ 6 2 4 5 0 3 1 7 8 ], after swap [ 1 2 4 5 0 3 6 7 8 ], after partition [ 1 2 4 5 0 3 6 7 8 ], n = 6, n/2 = 3, a[n/2] = 5 [ 5 2 4 1 0 3 6 7 8 ], after swap [ 3 2 4 1 0 5 6 7 8 ], after partition [ 3 2 4 1 0 5 6 7 8 ], n = 5, n/2 = 2, a[n/2] = 4 [ 4 2 3 1 0 5 6 7 8 ], after swap [ 0 2 3 1 4 5 6 7 8 ], after partition [ 0 2 3 1 4 5 6 7 8 ], n = 4, n/2 = 2, a[n/2] = 3 [ 3 2 0 1 4 5 6 7 8 ], after swap [ 1 2 0 3 4 5 6 7 8 ], after partition [ 1 2 0 3 4 5 6 7 8 ], n = 3, n/2 = 1, a[n/2] = 2 [ 2 1 0 3 4 5 6 7 8 ], after swap [ 0 1 2 3 4 5 6 7 8 ], after partition [ 0 1 2 3 4 5 6 7 8 ], n = 2, n/2 = 1, a[n/2] = 1 [ 1 0 2 3 4 5 6 7 8 ], after swap [ 0 1 2 3 4 5 6 7 8 ], after partition [ 0 1 2 3 4 5 6 7 8 ], n = 1, stop There are other inputs as well that will produce worst-case behavior. We were lenient in grading, and gave solutions 2 out of 3 possible points, as long as they have zero or eight as the middle elements.