In this chapter, I will discuss the implementation of Arctic's functional testing system. There are two conceptually separate parts to Arctic's testing system, the ``hardware'' that connects to the Arctic chip and the ``software'' used to control that hardware. However, since the entire system is being simulated in the Verilog HDL, this distinction is blurred. I will begin by discussing the hardware side of the design, which consists of a number of signal generators and monitors connected to the Arctic module, and then describe the software side, which manages the tests themselves.
Since a great deal of attention was paid to the speed of this system, I will also describe one of the system's special operating modes, quick mode. In this mode, operations that read or modify configuration and control information inside the chip can be completed almost instantaneously, speeding up most tests by an order of magnitude. I will explain how this mode is entered and show how the addition of such a feature is made easy because Verilog is used as the implementation language.
Ideally, this system would be entirely contained within a Verilog simulation, but unfortunately, Verilog is not powerful enough to perform all the functions this testing system needs to perform. For example, Verilog has very primitive file input capabilities, and it completely lacks the ability to allocate memory dynamically. This necessitates an additional pre-simulation step which can be thought of as completely separate from the other parts, and I will therefore present it at the end of this chapter.
This organization keeps functionally distinct units separate. In an actual system, each of Arctic's input ports would be connected to an output port (either on another Arctic chip or on some other network interface unit) and vice versa, and the maintenance interface would be connected to a JTAG controller. In this system, each smaller module imitates each separate functional unit. This gives the system a clear structure, making it easier to expand and modify it.
It is also important to note that this structure is very flexible. Some testing systems have kept signal generators and monitors in separate C or Lisp processes that communicate with the hardware simulation through Unix sockets [7]. By keeping each unit in a Verilog module, we are able to modify the interface between the module and Arctic easily, should Arctic change, and we can easily modify the interaction between these modules.
Each of these smaller modules is really more like a stub than an actual functional unit. In some of our early experiments, we hooked up a model of an entire output port to communicate with each input port; in this system, we tried to keep each of these ``stubs'' as simple as possible. All contain logic that generates encoded signals (such as clocks and Manchester encoded signals), records output patterns or signals, and reports an error if an incorrect pattern is received. Some contain functions that allow the user of the testing system to utilize the specific functions of the ``stub.'' This gave more flexibility than using fully functional modules, which would have been easier to implement but would not have allowed us to generate unusual (or erroneous) input patterns or record outputs in any way we saw fit. Following this reasoning, let us refer to a module connected to an input port as the input port's stub, and to a module connected to an output port as the output port's stub. The module connected to the maintenance interface will likewise be called the maintenance interface's stub.
The recording machinery mentioned above is actually some of the most complex and varied logic in the entire system. Each stub receives Arctic's rather complicated output patterns and reduces them to small pieces of usable information, such as ``packet X was received at time Y from port Z,'' or ``signal B has just generated a Manchester encoding error at time T.'' Information that needs to be recorded, such as the received packet above, is stored as a record in some appropriate log file, to be scanned later when the system checks that all packets were received (this step will be discussed in the next section). Information that does not need to be recorded, such as the Manchester encoding error above, is error information and can stop the simulation immediately.
Most of the logic that transmits input signals to Arctic, on the other hand, is very simple. Most of the transmitters are simply registers or simple clock generators. The real complexity here lies in the control for these transmitters, which I consider software that is bound to the particular module in which it is used. This will be discussed further in the next section.
I believe this design offered the best flexibility and modularity of any organization. The different modules could be easily divided among the workers for implementation and maintenance, and the complexity of the system could be managed by observing the module boundaries. Another subtle advantage of this organization is that the Arctic module itself can be removed and replaced with any other model. This makes it easier to update the system when new releases of Arctic's design are made available, even if the new design is a gate-level model. This allows the testing system to be useful at every point in the design process.
We decided to organize all tests in small groups. These groups were to be the smallest independent testing units in the system. In other words, the smallest piece of code that can accomplish a useful test is a ``test group.'' This ``test group'' (called a ``test sequence'' in the user's manual in Appendix B) consists of a configuration phase, where Arctic is placed in a known state, an execute phase, where a number of actions are carried out on the chip, and a check phase, where the state of Arctic is checked against a specification of what the state should be. This must be the smallest unit of a test, because it is the smallest unit that can allow the user to attempt to put the chip in a certain state and check whether it worked or not.
This organization mimics several other functional testing systems [3,2,7]. Most of these systems have some kind of testing unit that consists of an initial state for the system, a set of actions to be performed, and a specification of the final state of the system. It does seem to be the one point that designers of such systems agree on, because it forces the users of the system to think about testing in a structured way, and it gives a convenient way to organize groups of related tests, any number of which can be run in one simulation, if desired.
In our system, test groups are specified with a number of files, each named testxxx.type, where ``xxx'' is the number of the test group and ``type'' is an extension such as ``code'' or ``config.'' In the following sections, I will describe each phase of a test group and describe the file types that are relevant to it.
Before the test actually begins, however, a little bookkeeping needs to be done. Several files that will be used to record the outputs and state of the chip during the execute phase need to be opened for writing. These files will be closed in the check phase or if the simulation dies suddenly due to an error.
These remaining actions are predominantly calls to other functions that perform tasks such as writing Arctic's control register, sending a set of packets through the system, or reading and storing Arctic's statistics information. The system is very versatile, though, and allows the user to write any block of sequential Verilog code in this file, making it possible to specify any desired input pattern or check for any error that might not be detected in the check phase. This kind of versatility was necessary if the system was to meet our generality goal. Arctic is very complex, and it would be difficult to build special functions to generate all possible input patterns or detect every possible error. With the ability to use actual Verilog code, the user can test all those functions that cannot be tested with the existing input/output machinery.
Most of the actions performed during a test group do, however, conform to a relatively small number of patterns. This suggested that a set of functions to facilitate the generation of these patterns would be worthwhile. Many of these actions, such as writing a certain value to the control register or resetting the chip, can be implemented simply, but one action that deserves some explanation is the action that sends a set of packets into the chip.
Sending packets into Arctic is the most common and most complex input pattern. Transmitting packets is the basic function of the chip, but it requires the coordination of many encoded signals for a long period of time. For this reason, there is considerable machinery in this testing system that is devoted to sending packets.
First of all, all packets in the universe of the testing system have a unique number called a packet identifier. This identifier is actually part of the payload of the packet, which reduces some of the generality of the system by fixing those bits of the payload to be a certain value. This was necessary, however, in order to track the packets as they go through Arctic. A library of packets is read in at the beginning of a simulation.
These packets are used by the function send_packets(file_name), where ``file_name'' is a file specifying a list of packet send commands. Each of these commands specifies a packet to be sent, the time it should be sent, and the input port it should be sent to. The name of this file is testxxx.y.pkts, where ``pkts'' indicates that this is a list of packet send commands, and the ``y'' can actually be any letter. This extra letter is useful because it is often necessary in a test group to use this function to send several sets of packets, and the files used by the different calls need to be distinguished in some way. As one might imagine, this is the most frequently used function, and it greatly simplifies the problem of sending packets through Arctic. Whether or not it helps enough is debatable, as we shall see in Chapter 6.
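As an illustration of the packet send commands described above, the following Python sketch parses such a list. The on-disk format is not given here, so the layout assumed below (one command per line, three whitespace-separated integers in the order time, port, packet identifier, with ``#'' comments) is purely hypothetical, as are the names SendCommand, pkts_file_name, and parse_send_commands. The three-digit group number is inferred from names such as test003.

```python
from typing import NamedTuple


class SendCommand(NamedTuple):
    time: int       # simulation time at which the packet should be sent
    port: int       # input port the packet should be sent to
    packet_id: int  # identifier of a packet from the loaded libraries


def pkts_file_name(group: int, letter: str) -> str:
    """Build a name of the form testxxx.y.pkts (three-digit xxx is assumed)."""
    return f"test{group:03d}.{letter}.pkts"


def parse_send_commands(text: str) -> list[SendCommand]:
    """Parse the assumed 'time port packet_id' line format."""
    commands = []
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        time, port, packet_id = (int(field) for field in line.split())
        commands.append(SendCommand(time, port, packet_id))
    # Tuples sort field by field, so this orders the commands by send time.
    return sorted(commands)
```

A test group that sends two sets of packets would then refer to, say, pkts_file_name(3, "a") and pkts_file_name(3, "b"), which is the role the extra letter plays in the naming scheme.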
Up to this point, I have detailed many ways that the user can specify input patterns to Arctic, but I have not discussed how the testing system records Arctic's outputs. Recall that during the configure phase of the test group, several files were opened for output. Each of these files has a name of the form testxxx.log.logtype, where ``logtype'' is either ``pkts,'' ``stats,'' or ``errs.'' Each file is used to store a specific type of output from the chip so that, in the check phase, this output may be checked against another file, testxxx.chk.logtype, which contains a specification for how the log should appear. The information in these log files is gathered either automatically or at the user's request.
The only information that is gathered automatically is a set of records specifying packets that are received at each output port's stub. Whenever a stub receives a packet, the stub checks that the packet matches the packet that was transmitted, and then a record specifying the packet's identifier, time of receipt, and output port is placed in the file testxxx.log.pkts. This file will be checked against testxxx.chk.pkts in the check phase to make sure that the correct packets emerged from the correct ports.
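The comparison of testxxx.log.pkts against testxxx.chk.pkts might be sketched as follows. This is only a model in Python, not the system's actual check: it assumes that each log record is a (packet identifier, time, port) triple, that the specification lists expected (packet identifier, port) pairs, and that time of receipt is not compared. The function name check_packet_log is hypothetical.

```python
from collections import Counter


def check_packet_log(log_records, expected):
    """Compare received packets with the specification.

    log_records: iterable of (packet_id, time, port) triples from the log.
    expected:    iterable of (packet_id, port) pairs from the specification.
    Returns (missing, unexpected), each a sorted list of (packet_id, port).
    """
    seen = Counter((packet_id, port) for packet_id, _time, port in log_records)
    want = Counter(expected)
    # Counter subtraction keeps only positive counts, so duplicates are
    # handled: a packet expected twice but seen once is reported missing once.
    missing = sorted((want - seen).elements())
    unexpected = sorted((seen - want).elements())
    return missing, unexpected
```

Under these assumptions, a test group passes the packet check when both returned lists are empty.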
The other two kinds of log files gather information only when the user specifies. The function write_stats reads all the statistics counters in Arctic and stores their values in the file testxxx.log.stats. This information can be gathered any number of times during a test group, so that the user can monitor statistics before and after some action is taken. The function write_errs_contr is a similar function that records all the bits of state not accounted for in any other check and writes them to testxxx.log.errs. It reads the Arctic control register and the error counters in addition to some other small bits of information. As with write_stats, this information can be gathered any number of times, and once the execute phase has ended, these two files are checked in the check phase against the files testxxx.chk.stats and testxxx.chk.errs.
With this carefully defined structure of a test group, we have completed a clear definition of the hardware and software portions of the testing system. This defined, it is much easier to decide which new functions are possible, and how they should be added to the system.
The truly aggravating thing about these long maintenance interface operations is that they did not seem to be accomplishing any ``real'' work. The ability of the chip to scan in control and configuration information needed to be tested, but most of the time, these functions were only being used to set up the chip for another test and weren't performing any ``interesting'' actions. If there were a way for us to bypass this long configuration step, we thought, we could significantly shorten the time needed to run a test.
This observation resulted in the creation of ``quick mode.'' In this mode, any ``uninteresting'' actions can be abstracted away, i.e., they can take place without being simulated. Operations performed through the maintenance interface, for example, can be done instantly, without using the maintenance interface at all. These operations can be given an extra flag as an argument. If this flag is 1, then the function will bypass its maintenance interface operation. By ``bypassing,'' here, we mean that the operation will directly read from or write to the registers holding the state information that needs to be manipulated. With Verilog, any register buried deep inside a module can be accessed directly with a hierarchical name, which makes it easy for each function to read or modify any of Arctic's registers. Therefore, any function that manipulates state through the maintenance interface can be told which registers to access, giving it the ability to run in quick mode. The user need only keep in mind that, after executing an operation in quick mode, the system will not necessarily be in exactly the same state it would have been in had the operation been performed normally. In most cases, though, the similarity is close enough to make a simple test run smoothly.
Since quick mode is implemented by accessing state inside Arctic, it might stop working whenever the model of Arctic is changed significantly. Since this was likely to happen often, and we did not want the entire testing system to stop functioning when a change was made, we needed a way to turn off quick mode for the entire simulation. We defined a global variable, SPEED, that would be set at the beginning of a simulation to determine whether it should be run in quick mode or not. Even if the simulation were running in quick mode, however, we knew it was possible that the user might not want to run every possible function quickly. For this reason, every function that can run quickly is passed the extra argument SPEED that determines whether or not the function should run quickly for each separate call. In this system, then, it is possible to turn quick mode on and off globally, and it is possible to control it locally, at each function call.
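The interplay between the global setting and the per-call argument can be modeled with a small Python sketch. It is not the Verilog implementation: ChipModel is a toy stand-in with a single 8-bit control register behind a serial scan path, and write_control and the cycle counter are hypothetical names. The sketch also assumes that a caller forces the slow path for one call by passing 0 in place of the global SPEED value.

```python
class ChipModel:
    """Toy stand-in for the chip: one control register, loaded serially."""

    WIDTH = 8  # assumed register width, for illustration only

    def __init__(self):
        self.control = 0
        self.cycles = 0  # simulated clock cycles consumed so far

    def scan_in_bit(self, bit):
        # Slow path: shift one bit in through the maintenance interface.
        mask = (1 << self.WIDTH) - 1
        self.control = ((self.control << 1) | bit) & mask
        self.cycles += 1


SPEED = 1  # global switch, set once at the start of a simulation


def write_control(chip, value, speed):
    """Write the control register, quickly if speed is nonzero."""
    if speed:
        # Quick mode: write the register directly, bypassing the scan path.
        chip.control = value
    else:
        # Normal mode: shift the value in one bit per cycle, MSB first.
        for i in reversed(range(chip.WIDTH)):
            chip.scan_in_bit((value >> i) & 1)


chip = ChipModel()
write_control(chip, 0xA5, SPEED)  # follows the global setting
write_control(chip, 0x3C, 0)      # forces the full scan for this call only
```

Both calls leave the requested value in the register; only the second consumes simulated cycles, which is the cost that quick mode removes.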
This quick mode seems like a simple idea, but it is a very powerful one because it reduces the time needed to run a simple test from 45 minutes down to about 5 minutes, roughly an order of magnitude improvement. For other tests that deal almost exclusively in maintenance interface operations, such as the ones described in Appendix Section A.2, the improvement can be as much as 50 to 1.
../packets/lib1
../packets/lib2
../packets/lib3
../packets/lib5
sequences
test003
test009
test010
test051
test082
Figure 4-2: Master Simulation File
The user actually runs the system by typing run_test filename where ``filename'' is the name of a master simulation file such as the one in Figure 4-2 that contains a list of packet libraries followed by a list of tests that need to be run. The libraries contain all the packets that the system can use, and must be generated before the simulation, either by hand or with some generation tool. This organization gives the user precise control over the packets generated, and easily accommodates new groups of packets. A list of tests follows this list of libraries. This list specifies which tests will be run and what order they will be run in.
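A reader for a master simulation file laid out like Figure 4-2 could be sketched as below. The sketch assumes, from the figure alone, that the line ``sequences'' separates the library list from the test list; the function name parse_master_file is hypothetical, and the real pre-simulation tool may be stricter or accept more syntax.

```python
def parse_master_file(text):
    """Split a master simulation file into (libraries, tests).

    Lines before the 'sequences' keyword are treated as packet library
    paths; lines after it are treated as test names, in run order.
    """
    libraries, tests = [], []
    target = libraries
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line == "sequences":
            target = tests  # everything that follows is a test name
            continue
        target.append(line)
    return libraries, tests
```

Applied to the contents of Figure 4-2, this yields four library paths and the five tests test003 through test082 in the order listed.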
The pre-simulation step sets up the Verilog simulation by instructing it to begin by loading in the specified libraries of packets. It then scans each of the tests listed and instructs the Verilog simulation to run each test. This is done by splicing the testxxx.code file for each test into the top level simulation. After this step completes, the Verilog simulation begins and performs all the tests that were specified in the master simulation file.
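One way the splicing could be done is to generate a small piece of top-level Verilog text that pulls in each testxxx.code file in order. The Python sketch below does this with the standard Verilog `include directive. Whether the actual tool uses `include or pastes the file contents directly is not stated here, so the generated text, the surrounding initial block, and the name splice_tests are all assumptions made for illustration.

```python
def splice_tests(tests):
    """Return Verilog text that runs each test's code file in sequence."""
    lines = ["initial begin"]
    for name in tests:
        # Each testxxx.code file holds sequential Verilog for one test group.
        lines.append(f'  `include "{name}.code"')
    lines.append("  $finish;")
    lines.append("end")
    return "\n".join(lines) + "\n"
```

Because the included files contain ordinary sequential statements, placing them back to back inside one block is enough to run the listed tests one after another in a single simulation.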
This pre-simulation step solves another rather difficult problem. Verilog does not have the ability to allocate memory dynamically, and the size of every data structure needs to be specified before the simulation can begin. This is difficult to deal with in this system, because the size of the buffers needed to hold log files or lists of packet send commands will vary widely from test to test, and unless we used some gross overestimate of the maximum space needed, the simulation would risk running out of memory. To avoid this problem, and to avoid wasting memory, the pre-simulation step scans through the tests to be run and sets up some important data structures in the simulation to have just the right size. This makes the system a bit more difficult to understand, but it does make it much easier to use when the user can abstract away the details of this step.
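The sizing pass might look something like the following Python sketch. It assumes that one buffer is reused across tests and therefore must hold the longest list of packet send commands, that each non-blank line of a .pkts file is one command, and that the size is handed to the simulation as a `define macro. The macro name MAX_SEND_CMDS and the function names are hypothetical.

```python
def count_records(text):
    """Count non-blank lines, each assumed to be one record."""
    return sum(1 for line in text.splitlines() if line.strip())


def size_defines(pkts_files):
    """Emit a Verilog `define sized for the largest command list.

    pkts_files: mapping of file name -> file contents for every
    packet send command file used by the tests to be run.
    """
    largest = max(
        (count_records(text) for text in pkts_files.values()),
        default=0,  # no files at all: an empty buffer suffices
    )
    return f"`define MAX_SEND_CMDS {largest}\n"
```

The simulation could then declare its command buffer with this macro as the upper bound, so that the buffer is exactly as large as the selected tests require and no larger.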
This completes a full picture of Arctic's functional testing system. The system has other operating modes which the user has access to, but the information given above is sufficient for a user to begin writing useful tests. The amount of detail presented here may seem excessive, but I believe it is useful as an example of the problems that arise when designing such a system. For those readers desiring even more details about how this system is used, Appendix A presents many example test groups. The first example presented there is particularly helpful, demonstrating how the different parts of the system interact in a simple case. Appendix B contains the user's manual for this system, and may also be of interest for readers desiring greater detail.