In the remaining parts of this thesis, I will explore the problem of functional testing by describing the functional testing system of the Arctic router chip. Arctic chips will form the fat tree network that will allow the processors in the *T multiprocessor to communicate with each other. In this chapter, I will go over the goals the design team had in mind when designing the testing system for Arctic, but before I begin, it may be helpful to give an overview of Arctic itself.
Figure 3-1: The Arctic Router
Figure 3-1 (above) shows a block diagram of Arctic [5]. Arctic consists of four input ports connected to four output ports by a crossbar, and a maintenance interface section through which Arctic is controlled. Message packets enter an input port and can exit from any output port. Since all "links" in an Arctic network are bidirectional, each input port is paired with the output port that is connected to the same device. Packets vary in size and can be buffered upon arrival in each input port. Flow control in an Arctic network is accomplished with a sliding window protocol similar to the one used by TCP/IP. A transmitter (output port) and a receiver (input port) are both initialized with a starting count of buffers, and the receiver notifies the transmitter whenever a buffer is freed, so that the transmitter sends packets only when buffers are available.
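The buffer-counting scheme described above can be illustrated with a small software model. This is only a sketch under simplifying assumptions, not Arctic's actual logic: the class and method names (Receiver, Transmitter, buffer_freed) are hypothetical, and every packet is assumed to occupy exactly one buffer, even though real Arctic packets vary in size.

```python
from collections import deque


class Receiver:
    """Hypothetical model of an input port with a fixed number of buffers."""

    def __init__(self, num_buffers):
        self.num_buffers = num_buffers
        self.buffers = deque()

    def accept(self, packet):
        # If the transmitter's count is correct, this can never overflow.
        if len(self.buffers) >= self.num_buffers:
            raise OverflowError("receiver buffers overrun")
        self.buffers.append(packet)

    def drain_one(self):
        """Forward the oldest buffered packet, freeing its buffer."""
        return self.buffers.popleft()


class Transmitter:
    """Hypothetical model of an output port that tracks free receiver buffers."""

    def __init__(self, receiver, num_buffers):
        self.receiver = receiver
        self.free_buffers = num_buffers  # same starting count as the receiver
        self.pending = deque()           # packets waiting for a free buffer

    def send(self, packet):
        self.pending.append(packet)
        self._pump()

    def buffer_freed(self):
        """Notification from the receiver that one buffer became available."""
        self.free_buffers += 1
        self._pump()

    def _pump(self):
        # Send only while the receiver is known to have room.
        while self.pending and self.free_buffers > 0:
            self.free_buffers -= 1
            self.receiver.accept(self.pending.popleft())


if __name__ == "__main__":
    rx = Receiver(num_buffers=2)
    tx = Transmitter(rx, num_buffers=2)
    for name in ["p0", "p1", "p2"]:
        tx.send(name)
    print(list(rx.buffers), list(tx.pending))  # ['p0', 'p1'] ['p2']
    rx.drain_one()        # receiver forwards p0 ...
    tx.buffer_freed()     # ... and tells the transmitter a buffer is free
    print(list(rx.buffers), list(tx.pending))  # ['p1', 'p2'] []
```

In this model the transmitter never needs to query the receiver. It holds back the third packet on its own, because its local count of free buffers has reached zero, and releases it only after the freed-buffer notification arrives.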
So that the system can tolerate any clock skew on incoming signals, each input port runs on its own clock, which is transmitted with the data. Data has to be synchronized into a local clock domain before it can be sent out. In addition, data on the links is transmitted at 100 MHz, though the chip itself operates at 50 MHz, which adds further complexity. Other sources of complexity include an extensive set of error checking and statistics counting functions, two levels of priority for packets, flow control functions such as block-port and flush-port, a "mostly-compliant" JTAG test interface, and manufacturing test rings that are accessible in system (not just during manufacturing tests).
Most of these details about Arctic can be ignored unless the reader wishes to dive into the examples given in Appendix A or the user's manual in Appendix B. The details are mentioned here only to give the reader an impression of Arctic's complexity. Arctic falls into that large category of ASICs that have many complex functions and for which there is no obvious way to design a functional testing system. We chose to begin by implementing a directed testing system, and approached the problem by first drawing up a set of goals as guidelines to help us with our implementation. The sections that follow list each goal we had for our system and explain why we found that goal important.
An additional reason to speed up the system was the group's lack of computing resources. The members of the design team were sharing a fairly small number of workstations. We hoped to keep the load on these machines to a minimum by making simulations take as little time as possible.
Arctic was being designed in Verilog and compiled to gates with Synopsys, so, in a sense, all simulations were at a very low level. The Verilog model described every interaction between the sub-modules in full detail, and the gate description was generated automatically. We decided to follow Clark's wisdom to the letter and chose to make our system capable of simulating both the pre-compiled Verilog description and the compiled, gate-level description, which could itself be represented as Verilog code. This, we felt, would be a more rigorous test, and since we hoped to have a working chip in only three months, such a rigorous test was necessary.
In the next chapters we will see that it is nearly impossible to reach all of these goals simultaneously. The desire to make the system general, for example, is almost diametrically opposed to the desire to make it easy to use, because the addition of functions always complicates a system. After taking a close look at the implementation of Arctic's testing system, we will return to this set of goals and evaluate the system's performance with respect to each of them.