CS267 Assignment 3: Knapsack

Due Wednesday, March 21st, 2007 at 11:59 PM


The goal of this assignment is to expose you to UPC, a global address space language, and to its programming paradigm. The problem highlights the challenges of irregular load balancing and communication. You will write UPC code to solve a 0-1 knapsack problem by dynamic programming.

Background

Imagine that you have N old textbooks with integer weights W[i], for i from 1 to N. You also have a backpack with an integer weight capacity of C. There is a used book store down the street that will give you P[i] dollars for book i. How much money can you make in one trip?

That's the 0-1 knapsack problem in a nutshell. The obvious solution of trying every possible combination is silly. A slightly better method is known as branch-and-bound; you run a breadth-first search of the combination space, but prune branches that cannot lead to optimal solutions.

A much, much better solution follows from expressing the problem as a recurrence relation and taking a dynamic programming approach. For the non-CS people (and the CS people who have forgotten), dynamic programming algorithms store common subproblems in a table and fill the table until they reach the solution.

We will construct a C-by-N table T indexed by a capacity and a book number. Entry T[i, j] is the maximum profit that can be obtained with capacity i using books 1 through j. Suppose we have a backpack of capacity i and want to know the maximum profit using books 1 through j. Well, we can either use book j or not. If we use book j, the profit is P[j] plus the maximum profit T[i-W[j], j-1] that can be attained with the remaining capacity and the remaining books. If we do not use book j, the maximum profit is simply T[i, j-1]. We just take the better of the two possibilities. Thus we can compute T[i, j] recursively:

T[i, j] = max( T[i, j-1], P[j] + T[i-W[j], j-1] )

You have a choice when the two quantities are equal. The sample code leaves book j out in that case. Each entry of the C-by-N table takes constant time to compute, so the algorithm runs in O(CN) time (which is pseudo-polynomial: linear in the value of C, but exponential in the number of bits needed to write C down).
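To make the recurrence concrete, here is a minimal serial sketch in plain C. It is not the provided sample code. The function name `knapsack_profit`, the extra zero row and column (capacity 0 and "no books"), the column-by-column storage, and the 1-based `W` and `P` arrays are all assumptions made for illustration.

```c
#include <assert.h>
#include <stdlib.h>

/* Returns the best total profit for capacity C using books 1..N.
 * Assumed layout: a (C+1)-by-(N+1) table stored column by column, with
 * column 0 ("no books") left at zero. W[1..N] and P[1..N] are 1-based;
 * element 0 of each array is unused. Weights are assumed non-negative. */
int knapsack_profit(int N, int C, const int *W, const int *P)
{
    assert(N >= 0 && C >= 0);
    size_t rows = (size_t)C + 1;
    int *T = calloc(rows * ((size_t)N + 1), sizeof *T);
    assert(T != NULL);

    for (int j = 1; j <= N; j++) {
        const int *prev = T + rows * (size_t)(j - 1); /* column j-1 */
        int *cur = T + rows * (size_t)j;              /* column j   */
        for (int i = 0; i <= C; i++) {
            int best = prev[i];                       /* leave book j out */
            /* Book j can only be considered if it fits in capacity i. */
            if (W[j] <= i) {
                int take = P[j] + prev[i - W[j]];
                if (take > best)                      /* ties: leave it out */
                    best = take;
            }
            cur[i] = best;
        }
    }

    int result = T[rows * (size_t)N + (size_t)C];
    free(T);
    return result;
}
```

With three hypothetical books of weights 3, 4, 2 and profits 4, 5, 3, a call such as `knapsack_profit(3, 5, W, P)` considers every capacity from 0 to 5 for each book in turn. The two nested loops are where the O(CN) cost comes from.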


Figure 1: To decide on packing a book, compare the previous total profit at the current capacity to the book's profit plus the profit at the current capacity minus the book's weight (here, capacity - 3). If the latter is larger, include the book.

There are a few initial cases to think about... What if i-W[j] is negative? What is T[1, j]? What is T[i, 1]?

After all this, the total profit is in entry T[C, N]. You can find the books that achieve that profit by backtracking over the choices.
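The backtracking step can be sketched as follows, again in plain C and again as an illustration rather than the sample code. The name `knapsack_solve`, the caller-supplied `used` array, and the zero-padded column-by-column table are assumptions. The sketch relies on the tie rule described earlier: book j is packed only when it strictly improves the profit, so an entry that differs from its neighbor in the previous column means book j was used.

```c
#include <assert.h>
#include <stdlib.h>

/* Fills the table, then walks back from T[C, N] to recover the choices.
 * On return, used[j] is 1 if book j is packed and 0 otherwise, for j in 1..N
 * (used must have room for N+1 ints; used[0] is not touched).
 * W[1..N] and P[1..N] are 1-based. Returns the best total profit. */
int knapsack_solve(int N, int C, const int *W, const int *P, int *used)
{
    assert(N >= 0 && C >= 0);
    size_t rows = (size_t)C + 1;
    int *T = calloc(rows * ((size_t)N + 1), sizeof *T);
    assert(T != NULL);

    /* Forward pass: column j depends only on column j-1. */
    for (int j = 1; j <= N; j++) {
        const int *prev = T + rows * (size_t)(j - 1);
        int *cur = T + rows * (size_t)j;
        for (int i = 0; i <= C; i++) {
            cur[i] = prev[i];
            if (W[j] <= i && P[j] + prev[i - W[j]] > cur[i])
                cur[i] = P[j] + prev[i - W[j]];
        }
    }
    int best = T[rows * (size_t)N + (size_t)C];

    /* Backward pass: start at capacity C and undo one book at a time. */
    int i = C;
    for (int j = N; j >= 1; j--) {
        const int *prev = T + rows * (size_t)(j - 1);
        const int *cur = T + rows * (size_t)j;
        used[j] = (cur[i] != prev[i]);   /* changed => book j was taken */
        if (used[j])
            i -= W[j];                   /* continue with remaining capacity */
    }

    free(T);
    return best;
}
```

Note that backtracking needs the whole table (or at least the choice made at every entry), which matters when deciding how to distribute the table across processors.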

Code and Ideas

A pair of sample programs is provided in C and UPC (also in tar.gz format). You may also choose to use Titanium instead of UPC. If you plan to use Titanium, you will need to write your own code from scratch.

The UPC program is horribly inefficient when using a network back-end. The data is distributed as shown by the colors in Figure 1, ensuring too much communication at every step.

The UPC code can be compiled to use various communication backends: SMP (using pthreads), MPI, various networks (on top of which MPI is usually implemented), etc. On CITRIS, you can use the SMP and GM versions, while on Bassi you can use the SMP, MPI, and LAPI versions. On Jacquard, the VAPI version of the code is suggested. See the Makefile for details.

An efficient implementation will likely give each processor a contiguous block of bag capacities and access its columns via local pointers. You may put all the book profits on each processor if that's the most efficient (for large numbers of books). You may also want to investigate pipelining the computation and fetching the necessary total profits in bulk.
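As a rough illustration of the contiguous-block idea, the sketch below computes which capacities each thread would own. It is plain C index arithmetic, not UPC, and it does not claim to match the layout rule of any UPC `shared` declaration. The names `cap_range`, `owned_capacities`, and `owner_of`, the inclusion of capacity 0, and the policy of giving the first few threads one extra capacity are all assumptions.

```c
#include <assert.h>

/* Half-open range [lo, hi) of capacities assigned to one thread. */
typedef struct {
    int lo;
    int hi;
} cap_range;

/* Splits capacities 0..C into nthreads contiguous blocks whose sizes differ
 * by at most one, and returns the block belonging to thread `me`. */
cap_range owned_capacities(int C, int nthreads, int me)
{
    assert(C >= 0 && nthreads > 0 && me >= 0 && me < nthreads);
    int total = C + 1;
    int base = total / nthreads;    /* minimum block size */
    int extra = total % nthreads;   /* first `extra` threads get one more */
    cap_range r;
    r.lo = me * base + (me < extra ? me : extra);
    r.hi = r.lo + base + (me < extra ? 1 : 0);
    return r;
}

/* Returns the thread that owns capacity i under the same split. */
int owner_of(int C, int nthreads, int i)
{
    assert(C >= 0 && nthreads > 0 && i >= 0 && i <= C);
    int total = C + 1;
    int base = total / nthreads;
    int extra = total % nthreads;
    int cut = extra * (base + 1);   /* first capacity owned by a small block */
    if (i < cut)
        return i / (base + 1);
    return extra + (i - cut) / base;
}
```

Under such a split, the entry T[i, j] needs T[i-W[j], j-1], which lives on thread `owner_of(C, nthreads, i - W[j])`. Since i-W[j] is never larger than i, that owner is either the same thread or a lower-numbered one, so data only ever flows from lower capacity blocks to higher ones.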


Figure 2: Pipelining the computation of totals may increase the computation per communication.

Requirements

You may work in groups of 2 or 3. One person in your group should be a non-CS student if possible. You have the option of using either UPC or Titanium for your knapsack code.
For UPC users:

For Titanium users: In addition, you should briefly describe the key optimizations/techniques/algorithms used to get a good performance from your code.

UPC Compiler

There is a prebuilt version of the UPC compiler available on the three machines we'd like you to run on. If you run into any issues that you think are compiler bugs, send them to me and I will forward them to the UPC group.

Titanium Compiler

Submission

Your group should put together a write-up describing the changes you have made to the programs and their effects on performance. Be sure to answer the questions raised in the Requirements section. Mail me a URL to a tar file containing the report, the modified programs, and any necessary or useful additional information.

Resources


Last updated March 7, 2007