Figure 1 presents the programming model for SPACE. Data is stored and processed in an ordered array of 36b associative memory words. Two 36b control registers, the Write Enable Register ( wr) and the Mask Register ( mr), modify the effect of write and search operations as described below. The 36b data width was chosen to match current 32b microprocessors, allowing a full 32 bits of address or data to be stored along with a few tag bits in each word.
Figure 1: SPACE Programming Model.
Each word w of the associative memory has a single flag bit, f[w ] . The flag bits are used to select the words that will participate in an instruction and can be conditionally set or reset as a side effect of most instructions. Each word w has connections to the flags of the word before, f[w-1 ] , and the word after, f[w+1 ] , and these flag values can be used in place of f[w ] to select the words that will be active during a given instruction. This flag chain connects the words in a linear array. Multi-word operations to be constructed through sequences of primitive instructions. This simple linear communications topology performs well for algorithms of interest and is easily scaled to large numbers of processors. The first word in the array (word 0) has its flag input from the previous word ( f[-1 ] ) set to 0.
The architecture includes a priority resolution tree that can activate only the first (lowest numbered) selected word for an instruction. Alternatively the resolution tree can be used to update the flags of all words after the first matching word during a search operation. Data words in SPACE are addressed only by their contents, or by the contents of their neighbours.
SPACE instructions are encoded in seven bits as an orthogonal combination of an opcode, a select mode and a flag update mode. The opcode occupies four bits and is encoded as shown in Table 1.
Table 1: Opcode encoding, x indicates don't care.
Two further bits in the instruction encode the select mode. These are the Adjacent/Own Flag bit, AOF and the Previous/Next Flag bit, PNF . Select modes are SPACE's equivalent of the addressing modes found in conventional processors and determine the flag value used to select participating words as shown in Table 2.
Table 2: Select Mode Encoding
New Flag, NF , is the final bit in the instruction format and this encodes the flag update mode. The two modes are set
() and clear (). In the assembler syntax either s or c is appended to the opcode plus select mode. For read and write operations, the flags of active words are updated to hold NF . For search operations with , flags of search hits are set and flags of all non-hits are cleared. When , a search operation clears the flags of hits and leaves other flags unaffected.
Instructions can be divided by the CD bit into those that act on the control registers and those that act on the array. Array instructions can be further divided into three major categories, search, read and write.
An early decision in the SPACE design was to specify search operand don't cares and write-enabled bit columns through separately loaded control registers rather than by adding a second 36b operand to search and write instructions. In many of the applications we have investigated, the same mask and write-enable values are reused over a number of instructions. By adding programmer visible registers for these values we reduce off-chip operand bandwidth requirements substantially.
The wwr , wmr , rwr and rmr instructions allow a 36b value to be written to and read from a control register. The wbr instruction allows both registers to be written with the same 36b value in one cycle.
Search instructions take a single 36b operand, the search key, and modify the values of flag bits. Don't cares in the search key are specified with mr. Any bit i, for which mr[i ]=0, is a don't care that always matches.
As well as allowing don't care values to be specified in the search key, SPACE also allows don't cares to be stored with the data. One of our applications is Prolog pre-unification, which relies heavily on stored don't cares to represent unification variables in database clause heads [Asanovic and Howe1989]. To reduce the cost of implementing stored don't cares, we adopted a compromise whereby a data word can be masked in units of a byte rather than bit by bit. Software can be used to simulate stored don't cares, but this simulation is expensive. Simulating the four maskable fields per word in software would take 49 SPACE instructions.
The 36b data word is grouped into four 8b data bytes, D0 -- D31 , three tag data bits, D32 -- D34 and an Exact/Masked bit, EM . The value of the EM bit stored in a word affects the way the word responds to searches. If the EM
bit of a stored data word is set, all 36 bits must match the corresponding masked search key bits. This is an Exact word with no stored don't cares; all 35 data bits are available for data storage. If , the word is Masked and byte-wise masking is enabled. The most-significant bit of each of the four data bytes controls whether that byte is a don't care. All four bytes and the four tag bits must match the search key to produce a hit. The 8b size of the maskable field provides sufficient granularity for Prolog pre-unification and is convenient for storing and searching 7b ASCII text, while minimising don't care storage overhead.
There are two search instructions: smo is search for match only, smf is search for match and following. Smo compares all selected words with the masked search key and all matching words are considered hits . Smf compares all selected words with the masked search key and the first matching word and all words following are considered hits . The smf instruction is typically used to flag blocks of words, first flagging all words following a block header word, then clearing all flags following a block trailer word.
If , then the flags of hits are set and the flags of all non-hits are cleared. If , then the flags of hits are cleared and the flags of all non-hits are unchanged. This behaviour differs from that of read and writes, which simply set the flags of active words to NF . The scheme used for searches allows the results of successive searches to be AND-ed together, whereas employing the read/write flag update semantics would have OR-ed results instead. We introduced this asymmetry on the basis of our experience in writing various associative algorithms. Especially when coding multi-word operations, it is more common to AND together the results of search operations. If required, search OR-ing can be readily accomplished with a simple series of instructions using a data bit as temporary flag storage.
Write instructions take a single 36b operand and update the value of array data words, modifying flag bits as a side effect. The Write-Enable Register determines which bits of array words are updated in a write instruction. Only those bit positions, i, for which wr[i ]=1 are written; others are unaffected. This makes it possible to address specific bit columns as well as specific words. This ability to selectively write individual bit columns, together with the ability to write multiple selected words in parallel, distinguishes a content addressable parallel processor (CAPP) such as SPACE from a less powerful content addressable memory (CAM) [Foster1976]. There are two write instructions: wal writes to all selected words, wfi writes only to the first selected word (if any). The flags of the written words are updated to the value of NF .
There is a single read instruction rfi which returns the 36b value of the first selected word in the array. The read word's flag bit is assigned the value of NF . A value of all 1s is returned if there are no selected words. In many cases, this allows the readout of the last in a sequence of flagged words to be determined without separate explicit checks of flag status.
The read status instruction rst
returns a single bit indicating
if any flagged words would be selected by a given select mode.