Mechanisms for Parallelism Specialization for the DySER Architecture
- TR1773 (1.706Mb PDF)
- University of Wisconsin-Madison Department of Computer Sciences
- Jun 13, 2012
- Specialization is a promising direction for improving processor energy efficiency. With functionality specialization, hardware is designed for application-specific units of computation. With parallelism specialization, hardware is designed to exploit abundant data-level parallelism. The hardware for these specialization approaches have similarities including many functional units and the elimination of per-instruction overheads. Even so, previous architectures have focused on only one form of specialization. Our goal is to develop mechanisms that unify these two approaches into a single architecture. We develop the DySER architecture to support both, by Dynamically Specializing Execution Resources to match program regions. By dynamically specializing frequently executing regions, and applying a set of judiciously chosen parallelism mechanisms--namely region growing, vectorized communication, and region virtualization--we show DySER provides efficient functionality and parallelism specialization. It outperforms an OOO-CPU, SSE-acceleration, and GPU-acceleration by up to 4.1x, 4.7x and 4x respectively, while consuming 9%, 86%, and 8% less energy. Our full-system FPGA prototype of DySER integrated into OpenSPARC demonstrates an implementation is practical.
- Permanent link
- Export to RefWorks