Abstract:
Current methods for improving microprocessor performance require significant investments in time and yield larger, more complicated designs. This paper explores a transformation called C-slow retiming that quickly and automatically converts a standard single-threaded microprocessor into a multithreaded microprocessor with improved performance. Our experiments demonstrate multithreaded instruction throughput improvements of 21% on a 2-slow design and 31% on a 3-slow design, with minimal area cost.