EE382C Embedded Software Systems - Pentium vs. TMS320C62

Prof. Brian L. Evans, Dept. of Electrical and Computer Engineering, The University of Texas at Austin

The choice of processor(s) depends largely on the constraints on size, area, volume, power, throughput, weight, and so forth of the target system. Below, I compare Intel Pentium MMX "mobile" processors with Texas Instruments TMS320C62x processors. As of August 1998, the TMS320C62 is the fastest, most power-hungry digital signal processor on the market. While scoring higher on a set of digital signal processing benchmarks, the 150 MHz TMS320C6211 has about 1/8th the interrupt service routine latency, 1/3rd the power consumption, 1/2 the price, and 1/74th the volume when compared to the Pentium 266 MMX processor.

Processor    MHz  Peak MIPS  DSP Benchmarks  ISR Latency  Power          Price  Dimensions (in)     Volume
Pentium MMX  233  466        49 BDTImarks    1.14 us      4.25 W         $25    5.5 x 2.47 x 0.647  8.789 in^3
Pentium MMX  266  532        56 BDTImarks    1.00 us      4.85 W         $40    5.5 x 2.47 x 0.647  8.789 in^3
TMS320C6211  150  1200       74 BDTImarks    0.12 us      1.45 W (est.)  $21    1.3 x 1.3 x 0.07    0.1183 in^3
TMS320C6201  200  1600       99 BDTImarks    0.09 us      1.94 W         $96    1.3 x 1.3 x 0.07    0.1183 in^3
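
The ratios between two rows of the table can be recomputed directly from the listed values. The following is a minimal sketch in Python; the dictionary layout and the function names are illustrative assumptions, and only the numbers come from the table.

```python
# Values copied from the table rows; the dictionary layout is illustrative only.
pentium_266 = {"bdtimarks": 56, "isr_us": 1.00, "power_w": 4.85,
               "price_usd": 40, "dims_in": (5.5, 2.47, 0.647)}
c6211_150 = {"bdtimarks": 74, "isr_us": 0.12, "power_w": 1.45,
             "price_usd": 21, "dims_in": (1.3, 1.3, 0.07)}


def volume_in3(dims):
    """Package volume in cubic inches from (length, width, height) in inches."""
    length, width, height = dims
    return length * width * height


def ratios(dsp, ref):
    """Return each figure of `dsp` divided by the same figure of `ref`."""
    return {
        "bdtimarks": dsp["bdtimarks"] / ref["bdtimarks"],
        "isr_latency": dsp["isr_us"] / ref["isr_us"],
        "power": dsp["power_w"] / ref["power_w"],
        "price": dsp["price_usd"] / ref["price_usd"],
        "volume": volume_in3(dsp["dims_in"]) / volume_in3(ref["dims_in"]),
    }


r = ratios(c6211_150, pentium_266)
for name, value in r.items():
    print(f"{name}: {value:.3f}")
```

A ratio below 1 means the TMS320C6211 figure is smaller than the Pentium MMX 266 figure. For BDTImarks, a ratio above 1 means the TMS320C6211 scores higher.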

References:

Digital Signal Processing (DSP) Benchmarks are given for fixed-point algorithms in units of BDTImarks. BDTImarks were developed by Berkeley Design Technology, Inc. (BDTI), an independent company. According to BDTI: "The BDTImark is a measure of a processor's execution speed on DSP-intensive algorithms. Higher BDTImark scores indicate faster execution times." The DSP-intensive algorithms include real block FIR, complex block FIR, real single-sample FIR, LMS adaptive FIR, IIR, vector dot product, vector add, vector maximum, convolutional encoder, and finite state machine. The algorithms use on-chip memory and do not account for the ability to get data in and out of the processor. That ability is instead characterized by the time required to process interrupts, which is given by the interrupt service routine (ISR) latency.
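
To give a feel for the kind of kernel named in this list, here is a minimal Python sketch of a real block FIR filter and a vector dot product. It is not BDTI's benchmark code. The function names are hypothetical, and plain Python integers stand in for the fixed-point arithmetic a DSP would use.

```python
def block_fir(samples, coeffs):
    """Real block FIR: y[n] = sum over k of coeffs[k] * x[n - k], with x[n] = 0 for n < 0."""
    out = []
    for n in range(len(samples)):
        acc = 0  # accumulator, as in a multiply-accumulate loop
        for k, h in enumerate(coeffs):
            if n - k >= 0:
                acc += h * samples[n - k]
        out.append(acc)
    return out


def dot_product(a, b):
    """Vector dot product of two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    return sum(x * y for x, y in zip(a, b))


print(block_fir([1, 2, 3, 4], [1, 1]))   # [1, 3, 5, 7]
print(dot_product([1, 2, 3], [4, 5, 6]))  # 32
```

Both kernels are dominated by multiply-accumulate operations on data already in memory, which is why the benchmark results say little about getting data in and out of the processor.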

Interrupt Service Routine (ISR) Latency is taken to be the total time required to service an interrupt: the overhead time required to save registers and flush pipelines, plus the time to service the ISR. ISR latency measures how quickly a processor can process a newly arrived data sample, sample buffer, or image frame. The above ISR latency figures assume that either no operating system is running on the processor or a minimal real-time operating system is running. For Intel processors running a real-time operating system, typical latencies are 1.38 us (200 MHz Pentium), 1.84 us (100 MHz Pentium), 7.54 us (33 MHz 486), and 14.25 us (33 MHz 386), according to QNX. However, these latency times are not deterministic and can vary by as much as 10% depending on the situation. For the TMS320C62 processor, the exact ISR latency (including overhead) is 18 cycles, according to page 7-24 of the TMS320C6000 CPU and Instruction Set Reference Guide, which corresponds to 0.09 us for a 200-MHz clock rate and 0.12 us for a 150-MHz clock rate. Ole Wolf at BDTI points out that "interrupts are turned off on the 'C62xx when a branch instruction is in its pipeline". Since the latency of a branch instruction is 5 cycles and branches are fairly common, it may be difficult to interrupt the C62 processor.
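
The conversion from the 18-cycle figure to microseconds is just the cycle count divided by the clock rate in MHz (cycles per microsecond). A minimal Python sketch, with a hypothetical function name, is:

```python
def isr_latency_us(cycles, clock_mhz):
    """Convert a cycle count to microseconds; a clock of N MHz runs N cycles per microsecond."""
    return cycles / clock_mhz


C62_ISR_CYCLES = 18  # figure quoted above for the TMS320C62

for clock_mhz in (200, 150):
    latency = isr_latency_us(C62_ISR_CYCLES, clock_mhz)
    print(f"{clock_mhz} MHz: {latency:.2f} us")
```

This gives 0.09 us at 200 MHz and 0.12 us at 150 MHz. It does not model the extra delay that can occur while a branch instruction is in the pipeline.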

Under Windows '95, a typical interrupt latency on a Pentium 133 is 20 us, according to Jeff Michalski at Concur Systems. Scaling this figure by the ratio of the QNX latencies quoted above for the 200 MHz and 100 MHz Pentiums gives a reasonable approximation for the typical ISR latency of the Pentium 233 under Windows '95: (1.38 us / 1.84 us) * 20 us = 15 us. Under Windows NT, the latency worsens to 50 us because it implements a "polling interrupt" that occurs at a high, fixed frequency (20,000 Hz), according to QNX. This latency is imposed by the operating system and is independent of the processor.
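
The two operating-system figures can be reproduced with a few lines of arithmetic. The Python sketch below uses hypothetical constant names and only the numbers quoted in the text. Treating the 200 MHz to 100 MHz QNX ratio as a proxy for the Pentium 233 to Pentium 133 ratio is the same rough assumption made in the estimate itself.

```python
QNX_P200_US = 1.38     # QNX latency, 200 MHz Pentium
QNX_P100_US = 1.84     # QNX latency, 100 MHz Pentium
WIN95_P133_US = 20.0   # typical Windows '95 latency on a Pentium 133
NT_POLL_HZ = 20_000    # Windows NT polling interrupt frequency

# Scale the Windows '95 figure by the ratio of the QNX latencies.
scale = QNX_P200_US / QNX_P100_US
win95_p233_estimate_us = scale * WIN95_P133_US

# Period of the polling interrupt, in microseconds.
nt_poll_period_us = 1e6 / NT_POLL_HZ

print(f"Windows '95 estimate: {win95_p233_estimate_us:.0f} us")
print(f"Windows NT polling period: {nt_poll_period_us:.0f} us")
```

The polling period works out to 50 us, which matches the quoted Windows NT latency and illustrates why that figure does not depend on the processor.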

Price is the price per unit for a volume order. Pentium MMX processor prices are given for Intel processors. AMD versions are up to 30% less expensive. Volume means 1,000 units for the Pentium MMX processors, 2,500 units for 150-MHz TMS320C6211 processors, and 10,000 units for 200-MHz TMS320C6201 processors.

The author would like to acknowledge Jeff Michalski at Concur Systems and Ole Wolf at Berkeley Design Technology, Inc. for their help in putting this information together.




Copyright (c) 1997-1999 by Brian L. Evans. Last updated 01/18/00.