Presented at the 1998
ACM/IEEE International
Symposium on Microarchitecture
Evaluating MMX Technology Using DSP and Multimedia Applications
Ravi N. Bhargava,
Lizy K. John,
Brian L. Evans
and
Ramesh
Radhakrishnan,
Department of Electrical and Computer Engineering,
Engineering Science Building,
The University of Texas at Austin,
Austin, TX 78712-1084 USA
ravib@ece.utexas.edu -
ljohn@ece.utexas.edu -
bevans@ece.utexas.edu -
radhakri@ece.utexas.edu
Presentation
Abstract
Many current general purpose processors are using
extensions to the instruction set architecture to enhance
the performance of digital signal processing (DSP) and
multimedia applications. In this paper, we evaluate the
X86 architecture's multimedia extension (MMX) instruction
set on a set of benchmarks. Our benchmark
suite includes kernels (filtering, fast Fourier transforms, and
vector arithmetic) and applications (JPEG compression,
Doppler radar processing, imaging, and G.722
speech encoding). Each benchmark has at least one
non-MMX version in C and an MMX version that
makes calls to an MMX assembly library. The versions
differ in the implementation of filtering, vector
arithmetic, and other relevant kernels. The observed
speedup for the MMX versions of the suite ranges from
less than 1.0 to 6.1. In addition to quantifying the
speedup, we perform detailed instruction level
profiling using Intel's VTune profiling tool. Using VTune,
we profile static and dynamic instructions, microarchitecture
operations, and data references to isolate the
specific reasons for speedup or lack thereof. This
analysis allows one to understand which aspects of native
signal processing instruction sets are most useful, the
current limitations, and how they can be utilized most
efficiently.
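As a rough illustration of the kind of vector arithmetic kernel the abstract mentions, the sketch below contrasts a plain C loop with a version that processes four 16-bit elements per step, the number of 16-bit words that fit in one 64-bit MMX register. It is not taken from the paper's benchmark suite or from any MMX assembly library: the function names (`saturate16`, `vec_add_sat_scalar`, `vec_add_sat_packed4`) are hypothetical, and the four-lane loop only mimics in portable C what a packed saturating add such as `PADDSW` would do in a single instruction.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Clamp a 32-bit intermediate result to the signed 16-bit range,
 * as a saturating add would (hypothetical helper). */
static int16_t saturate16(int32_t x)
{
    if (x > INT16_MAX) return INT16_MAX;
    if (x < INT16_MIN) return INT16_MIN;
    return (int16_t)x;
}

/* Non-MMX style baseline: one element per loop iteration. */
void vec_add_sat_scalar(const int16_t *a, const int16_t *b,
                        int16_t *out, size_t n)
{
    assert(a && b && out);
    for (size_t i = 0; i < n; i++)
        out[i] = saturate16((int32_t)a[i] + (int32_t)b[i]);
}

/* Packed-style version: four 16-bit lanes per step. On MMX hardware the
 * inner lane loop would correspond to one packed add; here it is
 * emulated lane by lane so the sketch stays portable (assumption). */
void vec_add_sat_packed4(const int16_t *a, const int16_t *b,
                         int16_t *out, size_t n)
{
    assert(a && b && out);
    size_t i = 0;
    for (; i + 4 <= n; i += 4) {
        for (size_t lane = 0; lane < 4; lane++)
            out[i + lane] = saturate16((int32_t)a[i + lane] +
                                       (int32_t)b[i + lane]);
    }
    /* Leftover elements that do not fill a group of four. */
    for (; i < n; i++)
        out[i] = saturate16((int32_t)a[i] + (int32_t)b[i]);
}
```

Both functions produce identical results. The packed form only pays off when the data is laid out contiguously in 16-bit words and the vector is long enough to amortize the leftover-element handling and any call overhead, which is one plausible reason a speedup can fall below 1.0.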
The full paper is available in
PDF -
Postscript -
GNU-Compressed Postscript
formats.
The source code for the performance evaluation is available at
http://www.ece.utexas.edu/~ljohn/mmxdsp/.
Last Updated 01/20/99.