Presented at the 1998 ACM/IEEE International Symposium on Microarchitecture

Evaluating MMX Technology Using DSP and Multimedia Applications

Ravi N. Bhargava, Lizy K. John, Brian L. Evans and Ramesh Radhakrishnan

Department of Electrical and Computer Engineering, Engineering Science Building, The University of Texas at Austin, Austin, TX 78712-1084 USA



Many current general-purpose processors use extensions to the instruction set architecture to enhance the performance of digital signal processing (DSP) and multimedia applications. In this paper, we evaluate the x86 architecture's multimedia extension (MMX) instruction set on a set of benchmarks. Our benchmark suite includes kernels (filtering, fast Fourier transforms, and vector arithmetic) and applications (JPEG compression, Doppler radar processing, imaging, and G.722 speech encoding). Each benchmark has at least one non-MMX version in C and an MMX version that makes calls to an MMX assembly library. The versions differ in the implementation of filtering, vector arithmetic, and other relevant kernels. The observed speedup for the MMX versions of the suite ranges from less than 1.0 to 6.1. In addition to quantifying the speedup, we perform detailed instruction-level profiling using Intel's VTune profiling tool. Using VTune, we profile static and dynamic instructions, microarchitecture operations, and data references to isolate the specific reasons for speedup or lack thereof. This analysis shows which aspects of native signal processing instruction sets are most useful, what their current limitations are, and how they can be used most efficiently.

The full paper is available in PDF, PostScript, and GNU-compressed PostScript formats.

The source code for the performance evaluation is available at

Last Updated 01/20/99.