Before diving into all the details of Arctic's functional testing system, it will be helpful to look at some of the recent thoughts and experiences of others who have addressed this problem. The ideas presented in this chapter are relevant to any VLSI testing effort, and they will help to explain why I made some of the choices that will be presented later in this thesis. First, I will look at some abstract functional testing methods, and then I will focus on more practical methods.
The purpose of testing a system is to prove that the system is ``correct,'' and it is natural, therefore, that abstract thought on testing is concerned mostly with methods that prove the correctness of a system. Some theoreticians have created logical languages for specifying hardware and software systems with which it is possible, given a specification and a set of axioms, to prove certain properties about each hardware module or block of code. This might be thought of as a way of proving ``correctness,'' and such methods have been used successfully with hardware systems. It is arguable, however, that these proving languages are not as useful with very large and complex systems, and even their supporters agree that this ``testing'' method cannot replace other testing methods [6]. All of the interesting theoretical work that relates to functional testing of integrated circuits focuses on randomization.
Randomized testing methods seek to automatically generate and test a subset of a hardware system's possible states with the intent of finding all design bugs with a certain probability. This descends from the tried and true ``brute force'' testing method. In ``brute force,'' every possible state of a system is entered, and in each state it is determined whether or not the system is functioning properly. This assumes, of course, that there exists some method for determining whether or not the system is functioning properly in every state, but such methods do exist for many simple systems. The difficult problem is how to enter every possible state of a hardware system. As hardware systems continue to get more complex, the number of possible states in a typical system is growing exponentially. For many modern systems, even if each state were entered for as little as a nanosecond, it would not be possible to enter all possible states in any one person's lifetime. This makes the brute force method impractical for most VLSI systems.
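To give a sense of scale, the short Python sketch below works through the arithmetic. It assumes, purely for illustration, a system whose state is held in n independent bits (so that it has 2^n states) and a tester that spends one nanosecond in each state. The function name and the chosen bit counts are hypothetical.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 60 * 60  # Julian year, an assumption


def years_to_enumerate(state_bits, seconds_per_state=1e-9):
    """Years needed to visit all 2**state_bits states exactly once."""
    total_states = 2 ** state_bits
    return total_states * seconds_per_state / SECONDS_PER_YEAR


if __name__ == "__main__":
    for bits in (32, 64, 128):
        print(f"{bits:4d} bits of state: {years_to_enumerate(bits):.3g} years")
```

Under these assumptions, 32 bits of state can be enumerated in a few seconds, but 64 bits already require nearly six centuries, and every additional bit doubles the time.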
It hardly seems necessary, though, to enter every possible state in a hardware system. Logic is frequently replicated, and from a design standpoint, if one piece of a datapath is working, it is frequently the case that all the other pieces are working, too. In fact, the only states that need to be entered are those where design bugs can be detected, and, together, these states form a small subset of the possible states. Randomized testing methods generate a small set of states with a pseudo-random number generator. If the random number generator is implemented carefully, the designer can be sure that, for every bug in the system, the generator is capable of producing at least one state in which that bug can be detected. The trick, then, is to run the random number generator long enough to generate a subset of states so large that it includes every needed state. Unfortunately, it is generally impossible to determine how long is long enough. It is comforting to know, though, that the probability of detecting every bug gets higher as more tests are run. Also, since many bugs can propagate their effects over large portions of a chip, they can be detected in many different states. Chances are high, then, that one of these states will be entered early in testing if the random signal generators routinely exercise all parts of the chip. This can make random testing a very practical method, as well.
This probabilistic argument for proving correctness may seem a bit suspicious, but this method has been successfully used to test some very large and complicated chips. Wood, Gibson, and Katz generated random memory accesses to test a multiprocessor cache controller, and with it found every logical bug ever detected in the system except for two that were missed because of oversights in their testing system [3]. Clark tested a VAX implementation successfully by executing random instruction sequences on a simulation of the new implementation and cross-checking the results by executing the same instructions on the VAX that was running the simulation [2]. These are some of the examples that have proven randomization to be a powerful technique in testing hardware systems.
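The cross-checking idea common to both projects can be sketched in a few lines. The example below is a toy and not a reconstruction of either group's tools: it assumes a hypothetical 8-bit accumulator machine with three instructions, a trusted reference model, and an implementation under test into which a bug has been deliberately injected. Random instruction sequences are run on both, and the first sequence on which the final states disagree is reported.

```python
import random

OPS = ("ADD", "SUB", "XOR")  # hypothetical instruction set


def reference_step(acc, op, operand):
    """Trusted model of one instruction on an 8-bit accumulator."""
    if op == "ADD":
        return (acc + operand) & 0xFF
    if op == "SUB":
        return (acc - operand) & 0xFF
    return acc ^ operand


def buggy_step(acc, op, operand):
    """Implementation under test; SUB fails to wrap around on borrow."""
    if op == "SUB" and operand > acc:
        return 0  # injected bug: clamps to zero instead of wrapping
    return reference_step(acc, op, operand)


def run(step, program):
    """Execute a program from a zeroed accumulator and return the final value."""
    acc = 0
    for op, operand in program:
        acc = step(acc, op, operand)
    return acc


def random_program(rng, length):
    """Random sequence of (opcode, 8-bit operand) pairs."""
    return [(rng.choice(OPS), rng.randrange(256)) for _ in range(length)]


def cross_check(step_under_test, num_programs=1000, length=20, seed=0):
    """Return the first program on which the two models disagree, or None."""
    rng = random.Random(seed)
    for _ in range(num_programs):
        program = random_program(rng, length)
        if run(step_under_test, program) != run(reference_step, program):
            return program
    return None
```

Calling `cross_check(buggy_step)` returns a failing instruction sequence that can be replayed for debugging, while `cross_check(reference_step)` returns `None`. Fixing the seed makes every run reproducible, which matters when a failure has to be investigated later.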
Hardware systems were being built long before any formal methods were developed to prove their correctness, and as we have noted, formal methods are not always practical for the largest and most complicated chips. How, then, is testing done in the real world? A designer who is presented with the task of performing functional testing on a chip usually has to figure this out for himself or herself, but there are a few general methods that have been successful, historically. Most designers who wish to test their chips have the ability to simulate their design in some way. Let us assume that all testing projects have this in common and look at how the designers might choose the tests that are run on this simulation to show its correctness.
One way to choose tests is to attempt to list all the functions of the chip that need to be tested and try them out, one by one. This is a very satisfying method because there is a clear point when it can be said that testing is complete [2]. It is difficult to make this method work in modern designs, however. The number of cases will be huge, and it is likely that some tests that should be run will be overlooked. This method remains, however, as the naive approach to functional testing.
A slightly different approach to testing is to carry out a focused search for bugs. This differs from the previous method in that testers are not merely trying out every function of a chip, but rather they are experimenting with it to see how it might break. There are several advantages to this method. Often, the tests run on the system using this testing method will be very similar to the actual inputs the system will be given when it is in regular use, giving the testers greater confidence that the system will work for its intended use. Also, the testers will be consciously trying to make the system break, and a conscious effort to find a problem with a system will probably uncover more bugs since many bugs tend to be related. Another big advantage of this method is that it has no set completion time. This is an annoying trait from the manager's point of view, but from the designer's, it emphasizes the fact that testing should stop only when no new bugs have been found for a long time, and not when some possibly incomplete list of tests has been finished. This testing method is fairly common, and it makes up at least part of the testing strategy of several successful chip projects, such as MIT's Alewife project [7]. It can be very time consuming and exhausting, however, and may not be appropriate for smaller design teams or teams with a tight schedule.
The above methods can be categorized as ``directed testing,'' where a human is responsible for choosing the tests run on the system. We have seen before, however, that the computer can take responsibility for this task, which brings us back to the randomized testing ideas from the previous section. Most VLSI systems will not be as regular as a cache controller or a microprocessor, however, and they may not be so easily tested with randomization. With these devices, it sufficed to randomize memory accesses or instruction sequences, and this gave a set of inputs that was sufficiently random to check nearly all the functions. Other systems may have many disjoint sets of state, all of which can have a significant effect on the behavior of a system. For example, imagine a DSP chip that can process four signals at once, each one with a different function, and imagine that the four signals can even be combined to make new signals. The way a testing system should go about randomizing inputs becomes less clear.
The answer, again, is to make the system capable of randomizing every piece of state in the system, but now the programmer must be very careful that the testing system does not inadvertently introduce some kind of pattern into the input sequences. Since all inputs are pseudo-random, there is always a danger of introducing a pattern and missing some important group of inputs, and the more disjoint groups of state are, the more likely that the programmer will slip up and miss something. If the four signal processors in our DSP example above were each given the same function at the same time, for example, the test would hardly be as complete as a test where the function of each processor was independent of the other three.
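This pitfall can be illustrated with the DSP example. The sketch below assumes four processing units and a hypothetical set of four function names; neither generator is taken from a real testing system. The first generator draws one function and gives it to all four units, which is the kind of accidental pattern described above, while the second draws each unit's function independently.

```python
import random

FUNCTIONS = ("filter", "scale", "mix", "delay")  # hypothetical DSP functions
NUM_UNITS = 4


def correlated_config(rng):
    """Flawed: one draw is shared, so all units always get the same function."""
    function = rng.choice(FUNCTIONS)
    return (function,) * NUM_UNITS


def independent_config(rng):
    """Each unit's function is drawn separately from the others."""
    return tuple(rng.choice(FUNCTIONS) for _ in range(NUM_UNITS))


def distinct_configs(generator, num_tests=5000, seed=0):
    """Set of distinct configurations a generator produces over num_tests draws."""
    rng = random.Random(seed)
    return {generator(rng) for _ in range(num_tests)}


if __name__ == "__main__":
    total = len(FUNCTIONS) ** NUM_UNITS
    for gen in (correlated_config, independent_config):
        seen = len(distinct_configs(gen))
        print(f"{gen.__name__}: {seen} of {total} configurations exercised")
```

However long it runs, the correlated generator can reach at most 4 of the 256 possible configurations, so a bug that appears only when two units run different functions is never exercised. Counting the distinct configurations actually generated is one inexpensive way to notice such a pattern.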
There is another difficulty a designer encounters when building a random testing system. How does the system determine ``correct'' behavior of the chip? In the directed testing methods, the user defines what the outputs of the chip should be for each test, and this problem does not arise. The random tester, however, must be able to define ``correct'' outputs by itself. This was simple for the cache controller and VAX implementation mentioned above, because models of correct behavior were easy to come by. In most cases, though, such a model is not so easy to find, and the only solution is to limit the range of behaviors that the system can simulate, so that the model can be made simpler. As a result, the random testing system loses the ability to detect certain kinds of bugs.
It seems, then, that each method has its own advantages and problems. A designer facing the task of functionally testing a chip design might be disappointed with the options, but some combination of the above ideas can lead to a fairly reasonable approach to testing. When designing Arctic's testing system, for example, we chose to combine a random testing strategy with a focused search for bugs. The resulting system is capable of both random and directed testing, where each approach makes up for the failings of the other. Since our design team is small, the random tester can find the subtle bugs we do not have the time to search for. Since the random tester cannot find all the bugs, we can search for the ones it cannot find with the directed tester. This is an approach to testing that fits well with the constraints of an ASIC design team, and it is this combination of ideas that has actually been used in the Alewife cache controller and VAX examples mentioned above [2,7]. In the chapters that follow, we will see how this approach is used to build Arctic's testing system.