- Student Information Sheet – please fill out
this form, attach a recent recognizable photograph, and turn it in Wednesday,
February 1st.
This section lists the papers referenced in class. Some links may require you to log in with your UT EID when accessed from off campus.
Simultaneous Multithreading
- Burton Smith. Architecture and applications of the HEP multiprocessor computer system. Proc. SPIE, vol. 298, Real-Time Signal Processing IV, 1981, pp. 241-248.
- Mario Nemirovsky, Forrest Brewer, Roger C. Wood. DISC: Dynamic Instruction Stream Computer. MICRO'91, 1991.
- Hirata, H.; Kimura, K.; Nagamine, S.; Mochizuki, Y.; Nishimura, A.; Nakase, Y.; Nishizawa, T. An Elementary Processor Architecture with Simultaneous Instruction Issuing from Multiple Threads. ISCA-19, 1992.
- Donalson, D.; Serrano, M.; Wood, R.; Nemirovsky, M. DISC: dynamic instruction stream computer - an evaluation of performance. Proceedings of the Twenty-Sixth Hawaii International Conference on System Sciences, 1993.
- Yamamoto, W.; Serrano, M.J.; Talcott, A.R.; Wood, R.C.; Nemirovsky, M. Performance estimation of multistreamed, superscalar processors. Proceedings of the Twenty-Seventh Hawaii International Conference on System Sciences, 1994.
- D.M. Tullsen, S.J. Eggers, H.M. Levy. Simultaneous Multithreading: Maximizing On-Chip Parallelism. Proceedings of ISCA-22, June 1995.
- Robert S. Chappell, et al. Simultaneous subordinate microthreading (SSMT). ISCA-26, 1999.
Branch Prediction
- James E. Smith. A Study of Branch Prediction Strategies. ISCA-8, 1981.
- J.K.F. Lee, Alan J. Smith. Branch Prediction Strategies and Branch Target Buffer Design. IEEE Computer Vol. 17, Iss. 1, 1984.
- Tse-Yu Yeh and Yale Patt. Two-Level Adaptive Training Branch Prediction. MICRO-24, 1991.
- Tse-Yu Yeh and Yale Patt. Alternative implementations of two-level adaptive branch prediction. ISCA-19, 1992.
- Shien-Tai Pan, Kimming So, Joseph T. Rahmeh. Improving the accuracy of dynamic branch prediction using branch correlation. ASPLOS-V, 1992.
- Scott McFarling. Combining Branch Predictors. WRL Technical Note TN-36, 1993.
- Ravi Nair. Dynamic path-based branch correlation. MICRO-28, 1995.
- Eric Sprangle, et al. The Agree Predictor: A Mechanism For Reducing Negative Branch History Interference. ISCA-24, 1997.
- Daniel A. Jiménez and Calvin Lin. Dynamic Branch Prediction with Perceptrons. HPCA-7, 2001.
- Andre Seznec. Analysis of the OGEHL predictor. ISCA-32, 2005.
- Andre Seznec, Pierre Michaud. A case for (partially) tagged Geometric History Length Branch Prediction. Journal of Instruction Level Parallelism, Feb. 2006.
Predication
Out-of-Order and Superscalar
- Tomasulo, R. M. An Efficient Algorithm for Exploiting Multiple Arithmetic Units. IBM Journal of Research and Development, 1967.
- Joseph A. Fisher. Very Long Instruction Word architectures and the ELI-512. ISCA-10, 1983.
- James E. Smith. Decoupled Access/Execute Computer Architectures. 1984. (revised journal version)
- Yale Patt, Wen-mei Hwu, and Michael Shebanow. HPS, a new microarchitecture: rationale and introduction. MICRO-18, 1985.
- Yale Patt, Stephen W. Melvin, Wen-mei Hwu, and Michael Shebanow. Critical issues regarding HPS, a high performance microarchitecture. MICRO-18, 1985.
Dynamic Instruction Scheduling
Trace Cache
- Stephen W. Melvin and Yale N. Patt. Performance benefits of large execution atomic units in dynamically scheduled machines. ICS 3, 1989.
- Alexander Peleg and Uri Weiser. Dynamic flow instruction cache memory organized around trace segments independent of virtual address line. U.S. Patent 5381533, 1994.
- Daniel H. Friendly, Sanjay J. Patel, and Yale N. Patt. Alternative Fetch and Issue Policies for the Trace Cache Fetch Mechanism. MICRO'97, 1997.
- Sanjay J. Patel, Marius Evers, and Yale N. Patt. Improving trace cache effectiveness with branch promotion and trace packing. ISCA 25, 1998.
- Eric Rotenberg, Jim Smith, and Steve Bennett. Trace Cache: a Low Latency Approach to High Bandwidth Instruction Fetching. MICRO'96, 1996.
- Eric Rotenberg, Quinn Jacobson, Yiannakis Sazeides, and Jim Smith. Trace processors. MICRO'97, 1997.
- Bryan Black, Bohuslav Rychlik, and John Paul Shen. The block-based trace cache. ISCA 26, 1999.
Cache Management Techniques
- Moinuddin K. Qureshi, David Thompson, and Yale N. Patt. The V-Way Cache: Demand-Based Associativity via Global Replacement. ISCA, 2005.
- Moinuddin K. Qureshi, Daniel N. Lynch, Onur Mutlu, and Yale N. Patt. A Case for MLP-Aware Cache Replacement. ISCA, 2006.
- Moinuddin K. Qureshi, Aamer Jaleel, Yale N. Patt, Simon C. Steely Jr., and Joel Emer. Adaptive Insertion Policies for High Performance Caching. ISCA, 2007.
Block-Structured ISA
Superblocks and Hyperblocks
Runahead Execution
- James Dundas and Trevor Mudge. Improving data cache performance by pre-executing instructions under a cache miss. ICS-11, 1997.
- Onur Mutlu, Jared Stark, Chris Wilkerson, and Yale N. Patt. Runahead Execution: An Alternative to Very Large Instruction Windows for Out-of-order Processors. HPCA-9, 2003.
- Onur Mutlu, Hyesoon Kim, and Yale N. Patt. Techniques for Efficient Processing in Runahead Execution Engines. ISCA-32, 2005.
- Onur Mutlu, Hyesoon Kim, and Yale N. Patt. Address-Value Delta (AVD) Prediction: Increasing the Effectiveness of Runahead Execution by Exploiting Regular Memory Allocation Patterns. MICRO, 2005.