Serializing Instructions in System-Intensive Workloads: Amdahl's Law Strikes Again

Date
2007
Author
Wells, Philip
Sohi, Gurindar
Publisher
University of Wisconsin-Madison Department of Computer Sciences
Abstract
To maintain a reasonable level of complexity, processor implementations contain Serializing Instructions (SIs): instructions, such as those that write control registers, that cannot be executed out-of-order (OoO). Maintaining sequential semantics may force SIs to serialize the pipeline and execute as the only instruction in the window.
We examine the frequency of SIs in three ISAs, SPARC V9, X86-64, and PowerPC, for several system-intensive workloads. Across ISAs, we observe 2–8 SIs per thousand instructions for most workloads. As explained by Amdahl's Law, such frequent SIs, which create serial regions within the instruction-level parallel execution of a single thread, can have a significant impact on performance. For the SPARC ISA (after removing TLB and register window effects), we observe a 4–17% performance difference between a modest out-of-order processor and a hypothetical processor which idealizes serializing instructions.
We examine the consumption of values produced by several SIs, and observe that most values are consumed, but that the values are Effectively Useless (EU), i.e., they do not actually change the execution of the consuming instructions. To improve the performance of such SIs, we propose EU prediction, which can allow younger instructions to proceed, possibly reading a stale value, and yet still correctly execute. This simple technique improves the performance of five of our seven workloads by 8–12%.
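The Amdahl's Law argument in the abstract can be illustrated with a back-of-the-envelope calculation. The sketch below is not the report's performance model: it assumes a simple additive model in which every SI costs a fixed number of stall cycles, and the names `si_slowdown`, `base_ipc`, and `stall_cycles_per_si` as well as the chosen values are hypothetical. Only the range of 2 to 8 SIs per thousand instructions comes from the abstract.

```python
def si_slowdown(si_per_kilo_instr, base_ipc, stall_cycles_per_si):
    """Fraction of extra cycles caused by serializing instructions.

    Assumed additive model (not taken from the report): each SI adds a
    fixed number of stall cycles on top of the cycles an idealized
    processor would need for the same 1000 instructions.
    """
    # Cycles for 1000 instructions if SIs cost nothing extra.
    base_cycles = 1000.0 / base_ipc
    # Extra serial cycles introduced by the SIs.
    stall_cycles = si_per_kilo_instr * stall_cycles_per_si
    return stall_cycles / base_cycles


if __name__ == "__main__":
    # SI frequencies span the range quoted in the abstract.
    # base_ipc and stall_cycles_per_si are assumed values, not measurements.
    for si in (2, 4, 8):
        extra = si_slowdown(si, base_ipc=1.0, stall_cycles_per_si=20)
        print(f"{si} SIs per 1000 instructions: {extra:.1%} more cycles")
```

Under this model the cost of SIs grows with both their frequency and the baseline IPC, which is one way to see why a small number of serial regions can matter more as the rest of the pipeline gets faster.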
Permanent Link
http://digital.library.wisc.edu/1793/60578
Type
Technical Report
Citation
TR1606
