Two trends call into question the current practice of fabricating microprocessors and DRAMs as different chips on different fab lines: 1) the gap between processor and DRAM speed is growing at 50% per year; and 2) the size and organization of memory on a single DRAM chip are becoming awkward to use, yet size is growing at 60% per year.
Intelligent RAM, or IRAM, merges processing and memory into a single chip to lower memory latency, increase memory bandwidth, improve energy efficiency, and reduce size. Surprisingly, integrating the processor, cache, and memory of IRAM with high-speed serial I/O lines may also lead to very good I/O performance.
This talk explores some of the opportunities and challenges for IRAMs and suggests that to realize IRAM's potential we need better ideas than "Let's build a bigger cache!" We propose reviving vector architectures to leverage the high-bandwidth, low-latency memory of IRAM.
I conclude by speculating on applications for a DRAM-size chip in 2-3 years that consumes 1-5 watts of power, contains 16-24 MBytes of memory, has about 1 GByte/sec of I/O, and crunches at the rate of 2-4 GFLOPS (64-bit floating point) and 16-32 GOPS (8-bit fixed point).
Today, the semiconductor industry is sharply divided into processor and memory camps. If IRAM proves successful, unification may come to the semiconductor industry. In such a future, it's unclear who will ship the most memory and who will ship the most processors.