#Number TR-PDS-1998-007
#Title Architectural Support for Relaxed Memory Consistency Models
#Author Brian Grayson and Craig Chase
#Abstract Shared-memory multiprocessors frequently allow reordering of memory accesses to help hide the latency of remote accesses. A ``memory consistency model'' describes which reorderings of memory accesses are allowed. Researchers have investigated several memory consistency models that provide a high degree of latency tolerance around synchronization operations, which typically incur the highest latency. However, most processors implement stronger models. This paper presents a set of instruction-ordering techniques that allow a processor to implement a highly aggressive Release Consistency model. The same techniques can be used to support even weaker memory models, such as one that allows two critical sections to be performed out of order. Our analysis shows that the biggest impediment to instruction-level parallelism around synchronization operations is the spin loop inherent in a traditional lock acquisition. Weakening the memory model from a typical Release Consistency model to a very aggressive Release Consistency model provides a smaller but still significant improvement. Simulations using all of the techniques described in this paper show savings of dozens to hundreds of cycles per critical section for some SPLASH-2 benchmarks, and a speedup of 6 or more for a fine-grain microbenchmark.
#Bib
@techreport{YourBibNameHere,
  Author = "Brian Grayson and Craig Chase",
  Title = "Architectural Support for Relaxed Memory Consistency Models",
  Number = "TR-PDS-1998-007",
  Institution = "Department of Electrical and Computer Engineering, The University of Texas at Austin",
  month = "July",
  year = "1998",
  note = "Available from {\tt http://maple.ece.utexas.edu/TechReports/1998/TR-PDS-1998-007.ps.Z}."
}