CS267 Assignment 0

Quadratic Assignment Problem

Problem Description

In the Quadratic Assignment Problem (QAP), we are given a set of n locations and n facilities, and told to assign each facility to a location. There are n! possible assignments. To measure the cost of each possible assignment, we multiply the prescribed flow between each pair of facilities by the distance between their assigned locations, and sum over all the pairs. Our aim is to find the assignment that minimizes this cost.

Mathematically, we can formulate the problem by defining two n by n matrices: a flow matrix F whose (i,j)-th element represents the flow between facilities i and j, and a distance matrix D whose (i,j)-th element represents the distance between locations i and j. We represent an assignment by the vector p, which is a permutation of the numbers {1, 2, ..., n}; p(j) is the location to which facility j is assigned. With these definitions, the QAP can be written as

$$\min_{p} \; \sum_{i=1}^{n} \sum_{j=1}^{n} F_{ij} \, D_{p(i)\,p(j)}$$
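
The cost of an assignment, as defined above from F, D, and p, can be sketched in a few lines of Python. This is a minimal illustration only: the function names and the 3-facility instance are made up for the example (they are not taken from Nugent et al.), and indices are 0-based rather than 1-based. With symmetric matrices the double sum counts each unordered pair twice, which does not change the minimizer.

```python
from itertools import permutations

def qap_cost(F, D, p):
    """Sum of F[i][j] * D[p[i]][p[j]] over all ordered facility pairs.

    p[i] is the (0-based) location assigned to facility i.
    """
    n = len(p)
    return sum(F[i][j] * D[p[i]][p[j]] for i in range(n) for j in range(n))

def brute_force_qap(F, D):
    """Try all n! assignments; only feasible for very small n."""
    n = len(F)
    best = min(permutations(range(n)), key=lambda p: qap_cost(F, D, p))
    return list(best), qap_cost(F, D, best)

# Hypothetical instance with made-up flows and distances.
F = [[0, 5, 2],
     [5, 0, 3],
     [2, 3, 0]]
D = [[0, 1, 4],
     [1, 0, 2],
     [4, 2, 0]]

best_p, best_cost = brute_force_qap(F, D)
print(best_p, best_cost)
```

Enumeration of this kind is what makes the n! growth concrete: it is fine for a handful of facilities and hopeless long before n reaches 30.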

QAP is NP-hard, and it is arguably the most difficult NP-hard combinatorial optimization problem. Solving problems of size greater than 30 (i.e., more than 900 0-1 variables) is computationally impractical. Among the exact algorithms used to solve QAP, branch and bound has been the most successful. However, the lack of a sharp lower bound is one of the major difficulties: either the bound is too loose, or the time needed to compute it is prohibitive.

Computational History

In 1968, Nugent, Vollman, and Ruml posed a set of problem instances of size 5, 6, 7, 8, 12, 15, 20, and 30 noted for their difficulty [1]. (From now on, a problem instance of size 8 will be called Nug8, and so on.) These QAP instances have multiple global optima. Even worse, these globally optimal solutions are at the maximum possible distance from other globally optimal solutions.

In their 1968 paper, Nugent, Vollman, and Ruml solved the first four of these instances by enumerating all possible solutions on a GE 265 computer. Between 1978 and 1980, Nug12 and Nug15 were solved using branch and bound heuristics on a CDC-CYBER76 computer. In the 1990s, techniques such as simulated annealing, genetic algorithms, and randomized adaptive search were used to solve instances up to Nug24 on supercomputers such as the Cray. However, it was thought that Nug30 was beyond the capability of existing computing resources.

How Nug30 Was Solved

In 2000, the team of Kurt Anstreicher, Nathan Brixius (University of Iowa), Jean-Pierre Goux (Northwestern University), and Jeff Linderoth (Argonne National Laboratory) succeeded in solving Nug30 exactly. Their recipe for success was:

A state-of-the-art sequential solver
The method of choice was the branch-and-bound technique. It consists of dividing the initial search space into smaller pieces and bounding the best possible solution that could be found in each of these smaller regions. To minimize the total time required by the algorithm, a trade-off has to be found between the quality and the speed of these decisions. Mediocre decisions perform very well on small problems but do not cut the complexity of large problems. Excellent decisions, in turn, are too expensive for large problems. The QAP has several classic bounds of increasing complexity. The team, however, developed a new quadratic programming bound that had an excellent cost/quality ratio and was the best candidate to tackle Nug30. Details on this bound can be found in [2].
In addition, the team developed advanced branching techniques such that the exponential growth of the computational time was better than that of any other known algorithm. The different branching rules and the sequential branch-and-bound algorithm are described in [3].
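
To make the divide-and-bound idea concrete, here is a minimal branch-and-bound sketch for the QAP in Python. It is not the team's solver: the bound used is simply the cost already incurred by the facilities placed so far, which is a valid lower bound only under the assumption that all flows and distances are non-negative, and it is far weaker than the quadratic programming bound of [2]. The branching order is naive, and the function names and the 4-facility instance are hypothetical. The last lines cross-check the result against exhaustive enumeration.

```python
from itertools import permutations

def cost(F, D, p):
    """Cost of the facilities placed so far; p[i] is facility i's location."""
    k = len(p)
    return sum(F[i][j] * D[p[i]][p[j]] for i in range(k) for j in range(k))

def branch_and_bound_qap(F, D):
    n = len(F)
    best = {"cost": float("inf"), "perm": None, "nodes": 0}

    def search(p, used):
        best["nodes"] += 1
        # Lower bound: cost of the partial assignment (assumes entries >= 0).
        bound = cost(F, D, p)
        if bound >= best["cost"]:
            return  # prune: this subtree cannot beat the incumbent
        if len(p) == n:
            best["cost"], best["perm"] = bound, list(p)
            return
        # Branch: place the next facility at each free location in turn.
        for loc in range(n):
            if loc not in used:
                p.append(loc)
                used.add(loc)
                search(p, used)
                p.pop()
                used.remove(loc)

    search([], set())
    return best["perm"], best["cost"], best["nodes"]

# Hypothetical 4-facility instance with made-up numbers.
F = [[0, 3, 0, 2],
     [3, 0, 0, 1],
     [0, 0, 0, 4],
     [2, 1, 4, 0]]
D = [[0, 2, 5, 5],
     [2, 0, 4, 6],
     [5, 4, 0, 3],
     [5, 6, 3, 0]]

bb_perm, bb_cost, bb_nodes = branch_and_bound_qap(F, D)
exhaustive_cost = min(cost(F, D, p) for p in permutations(range(4)))
print(bb_perm, bb_cost, bb_nodes, exhaustive_cost)
```

A sharper bound prunes more subtrees but costs more per node, which is the trade-off between quality and speed described above.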

A careful parallel implementation on a powerful computational pool
The sequential solver by itself was not powerful enough to solve Nug30 in a reasonable amount of time. Estimates for solving this problem on the best existing desktop workstations were around 7 years. Therefore, the team decided to work on a parallel implementation of the branch-and-bound algorithm to reduce the wall clock time.

The dramatic improvements in the performance of networks and in the computational power of individual workstations and computers made it possible to assemble geographically dispersed resources into a powerful "computational grid". Members of this computational pool can be seen in the following table:

Number  Arch/OS        Location
   414  Intel/Linux    Argonne
    96  SGI/Irix       Argonne
  1024  SGI/Irix       NCSA
    16  Intel/Linux    NCSA
    45  SGI/Irix       NCSA
   246  Intel/Linux    Wisconsin
   146  Intel/Solaris  Wisconsin
   133  Sun/Solaris    Wisconsin
   190  Intel/Linux    Georgia Tech
    94  Intel/Solaris  Georgia Tech
    54  Intel/Linux    Italy (INFN)
    25  Intel/Linux    New Mexico
    12  Sun/Solaris    Northwestern
     5  Intel/Linux    Columbia U.
    10  Sun/Solaris    Columbia U.
Table 1: Computational Pool

In June 2000, the 1024 SGI/Irix processors at NCSA ranked as the 52nd fastest supercomputer in the world according to Top500.org, with a maximum rate of 264.00 GFlops and a peak rate of 327.00 GFlops. However, it was a very heavily used machine at the time, and only 41 processors were acquired at any given time.

Tools to distribute and manage the computations
Grid computing platforms are very complex to use. While the available resources are potentially much larger, they may go away at any time without notice, their number can vary considerably over time, and the connectivity between them may be very bad. Advanced tools are therefore needed to harness the raw power provided by grid computing platforms. The Condor system, developed at the University of Wisconsin, was used to detect idle workstations and match these available resources with Condor users' requests. To parallelize algorithms on the fault-prone and dynamic platforms provided by Condor, the team used the MW master-worker system described in [4]. MW is also able to handle supercomputing resources joining the computational pool during the course of the computations. This was made possible by the glide-in tool developed by the Condor and Globus teams.
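
The master-worker pattern on an unreliable pool can be illustrated with a small Python sketch. This is not MW's API and does not involve Condor: it uses local threads, and the names `WorkerLost`, `master`, and `flaky_square` are hypothetical. A worker "disappearing" is simulated by raising an exception, and the master simply re-queues the affected task. In the real computation a task would be a piece of the branch-and-bound tree rather than a number to square.

```python
import threading
from concurrent.futures import ThreadPoolExecutor, as_completed

class WorkerLost(Exception):
    """Simulates a machine leaving the pool before finishing its task."""

def master(tasks, work, max_workers=4):
    """Hand out tasks, collect results, and re-queue tasks whose worker was lost."""
    results = {}
    pending = list(tasks)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        while pending:
            futures = {pool.submit(work, t): t for t in pending}
            pending = []
            for fut in as_completed(futures):
                task = futures[fut]
                try:
                    results[task] = fut.result()
                except WorkerLost:
                    pending.append(task)  # try again in the next round
    return results

attempts = {}
attempts_lock = threading.Lock()

def flaky_square(task):
    """Toy work function: every third task fails on its first attempt."""
    with attempts_lock:
        attempts[task] = attempts.get(task, 0) + 1
        first_try = attempts[task] == 1
    if first_try and task % 3 == 0:
        raise WorkerLost(task)
    return task * task

results = master(range(10), flaky_square)
print(results)
```

The sketch works in rounds for simplicity; a production system would reassign a lost task as soon as another worker becomes idle.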

Nug30 Computation Statistics

Nug30 was solved in 7 days. The optimal solution to the Nug30 QAP instance is:
14,5,28,24,1,3,16,15,10,9,21,2,4,29,25,22,13,26,17,30,6,20,19,8,18,7,27,12,11,23.

To prove the optimality of this solution, 11,892,208,412 nodes of a branch-and-bound tree were explored. Solving the associated node subproblems and computing the branching information required 574,254,156,532 Frank-Wolfe iterations.

On average, there were 653 machines participating in the computation, with a maximum of 1009. One of the most remarkable features of the run was that almost 1 million linear assignment problems (LAPs) were solved each second during the course of the run. (One LAP must be solved for each Frank-Wolfe iteration.)
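
The "almost 1 million LAPs per second" figure can be checked from the numbers quoted above. The short Python calculation below assumes the run lasted exactly 7 days of wall clock time, which is a rounding of the stated duration; the variable names are chosen for this example only.

```python
# Figures quoted in the text.
frank_wolfe_iterations = 574_254_156_532
nodes_explored = 11_892_208_412
average_machines = 653

# Assumption: exactly 7 days of wall clock time.
wall_clock_seconds = 7 * 24 * 60 * 60

# One LAP is solved per Frank-Wolfe iteration.
laps_per_second = frank_wolfe_iterations / wall_clock_seconds
iterations_per_node = frank_wolfe_iterations / nodes_explored
machine_days = average_machines * 7

print(f"LAPs per second:      {laps_per_second:,.0f}")
print(f"Iterations per node:  {iterations_per_node:.1f}")
print(f"Machine-days of work: {machine_days}")
```

Under that assumption the rate comes out at roughly 950,000 LAPs per second, with about 48 Frank-Wolfe iterations per node of the tree.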

Today's supercomputing power is greater by two orders of magnitude and should therefore solve Nug30 even faster. The team did a remarkable job of parallelizing the branch-and-bound algorithm.

References

  1. Nugent, Vollman, and Ruml (1968). An Experimental Comparison of Techniques for the Assignment of Facilities to Locations. Operations Research, pp. 150-173.
  2. Anstreicher and Brixius (1999). A New Bound for the Quadratic Assignment Problem Based on Convex Quadratic Programming.
  3. Brixius and Anstreicher (2000). Solving Quadratic Assignment Problems Using Convex Quadratic Programming Relaxations.
  4. Goux, Kulkarni, Linderoth, and Yoder (2000). An Enabling Framework for Master-Worker Computing Applications on the Computational Grid.

Other useful links

  1. Nug30 press coverage
  2. QAPLIB
  3. Condor
  4. Globus