Chair: Prof. Brian L. Evans, The University of Texas at Austin, bevans@ece.utexas.edu
Methods: papers 1-2
Tools: papers 3-4
Case Studies: papers 5-7
Contact author: Mr. Bilung Lee, Cory Hall, Dept. of EECS, University of California, Berkeley, CA 94720-1770, Voice: (510) 642-0395, Fax: (510) 642-2739, bilung@eecs.berkeley.edu
Hierarchical concurrent finite state machines (HCFSMs) dramatically increase the usability of finite state machines (FSMs). However, most formalisms that support HCFSMs, such as Statecharts and its variants, tightly integrate the concurrency semantics with the FSM semantics. We, in contrast, allow FSMs to be combined with multiple concurrency models, enabling selection of the most appropriate concurrency semantics for the problem at hand. A key issue in such combinations is defining, without ambiguity, how FSMs interact with the various concurrency models. In this paper, we focus on the interaction of FSMs with three concurrency models: synchronous dataflow, discrete-event, and synchronous/reactive.
Contact author: Mr. Jose Luis Pino, Measurement and Test Division, HP EEsof, 5601 Lindero Canyon Road, West Lake Village, CA 91362, Voice: (818) 879-6351, Fax: (818) 879-6394, jpino@wlv.hp.com
This paper introduces timed synchronous dataflow (TSDF), which enables the codesign of the synchronous DSP and analog RF portions of an application. The semantics and scheduling techniques of TSDF are detailed. A 16-QAM modem is demonstrated in which a synthesizable DSP QAM transmitter cosimulates with an RF modulator and an RF power amplifier.
Contact author: Dr. Vojin Zivojnovic, AXYS Design Automation Inc., 135 Santa Louisa, Irvine, CA 92606, Voice: (714) 478-4787, Fax: (949) 653-8097, vz@axys.de
The high complexity and development cost of processor designs continually force designers to develop and make intensive use of processor abstractions, in the form of abstract processor models, for the specification, construction, and verification of processor hardware. Even after the physical device is fabricated and tested, abstract processor models help the system designer to understand the details of the processor. At the same time, processor models provide the necessary machine-related information to software and hardware design tools.
Processor models differ in their extent and level of detail. In this paper, a taxonomy of DSP and embedded processor models is presented. Although built using the same building blocks at the physical level, general-purpose and DSP/embedded processor models exhibit differences, especially at higher abstraction levels. These differences, which are most visible at software-oriented abstraction levels, result from the different applications in which general-purpose and DSP/embedded processors are used.
Contact author: Hugo Andrade, National Instruments Corp., 11500 N. MoPac Expressway, Austin, TX 78759, Voice: (512) 433-8518, Fax: (512) 433-8641, andrade@natinst.com
The "G" programming language, as implemented in the National Instruments product "LabVIEW", allows the user to describe a program with a graphical dataflow representation. LabVIEW is widely used by scientists and engineers in the instrumentation industry. This paper studies how well the techniques and concepts from recent research on software synthesis from dataflow models for embedded software design apply and adapt to the G programming language and the LabVIEW development environment.
Contact author: Dr. S. Sriram, Texas Instruments, MS 446, P.O. Box 655474, Dallas, TX 75265-5474, sriram@hc.ti.com
In this paper, we examine the various functions that are performed in MPEG-2 decoding, and discuss implications for software implementation of these functions on the TI C6x architecture. We include a brief description of the C6x DSP, but we mainly concentrate on implementation of MPEG-2 decoding kernels on the DSP. By "MPEG-2" we mean main level, main profile, CCIR 601 format video at 30 frames per second (fps). The decoding functions include bitstream parsing, variable length decoding (VLD), dequantization, inverse discrete cosine transform (IDCT), and motion compensation. We present cycle count estimates for various functions implemented on the C6x, and discuss the system design issues for a set-top box designed around this processor. Our cycle count estimates are based on optimized, functionally accurate implementations in some cases, and on analysis of C implementations of the functions in other cases; we describe in detail how we arrive at these estimates. We compare our proposed software implementation with published benchmarks for MPEG-2 decoding implementations on other software platforms, such as Pentium and Sparc. Finally, based on the cycle count estimates, we propose a system configuration for performing MPEG-2 decoding using two 200 MHz C6x processors.
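To give a feel for the scale of the cycle counts discussed in this abstract, the following is a rough budget calculation, not the authors' estimate. It assumes that CCIR 601 video at 30 fps means 720 x 480 luminance pixels per frame divided into 16 x 16 macroblocks; that frame size and all names in the code are illustrative assumptions, while the two 200 MHz processors come from the abstract.

```python
# Back-of-the-envelope cycle budget for a two-processor decoder.
# Assumption (not stated in the abstract): a CCIR 601 frame at 30 fps
# is 720 x 480 luminance pixels, split into 16 x 16 macroblocks.
WIDTH, HEIGHT = 720, 480
FPS = 30
MACROBLOCK = 16
NUM_PROCESSORS = 2          # from the abstract
CLOCK_HZ = 200_000_000      # 200 MHz, from the abstract


def macroblocks_per_second(width, height, fps, mb_size=MACROBLOCK):
    """Number of macroblocks that must be decoded each second."""
    per_frame = (width // mb_size) * (height // mb_size)
    return per_frame * fps


def cycles_per_macroblock(num_processors, clock_hz, mb_rate):
    """Total cycles available per macroblock across all processors."""
    return num_processors * clock_hz / mb_rate


mb_rate = macroblocks_per_second(WIDTH, HEIGHT, FPS)
budget = cycles_per_macroblock(NUM_PROCESSORS, CLOCK_HZ, mb_rate)
print(f"{mb_rate} macroblocks/s, about {budget:.0f} cycles per macroblock")
```

Under these assumptions the decoder must handle 40500 macroblocks per second, leaving somewhat under 10,000 cycles per macroblock for parsing, VLD, dequantization, IDCT, and motion compensation combined. The figure ignores memory stalls and any work that does not parallelize across the two processors.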
Contact author: Mr. Gregory E. Allen, Applied Research Laboratories, The University of Texas at Austin, P.O. Box 8029, Austin, TX 78713-8029, Voice: (512) 835-3487, Fax: (512) 835-3259, gallen@arlut.utexas.edu
Traditionally, expensive custom hardware has been required to implement data-intensive sonar beamforming algorithms in real time. We develop a real-time sonar beamformer in software by merging the following recent technologies: (1) symmetric multiprocessing on Unix workstations, (2) lightweight POSIX threads, and (3) the Process Network model of computation. Process Network is a concurrent model that guarantees determinate execution (i.e., no artificial deadlock) in bounded memory if a bounded memory realization exists. Lightweight threads provide a low-overhead, high-performance, scalable framework. Symmetric multiprocessing guarantees efficient utilization of multiple processors, as scheduling of threads is dynamically handled by the operating system. We compare the performance of batch-mode and process network beamformers. We find that it is feasible for a 4-GFLOP digital interpolation process network beamformer to run in real time on a workstation with 16 UltraSPARC-II processors running at 300 MHz. The software beamformer reduces manufacturing costs, development costs, and development time by a factor of three, and volume and weight by a factor of two, over an equivalent modern hardware beamformer.
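The Process Network idea described in this abstract can be illustrated with a minimal sketch: each process runs in its own thread and communicates only through FIFO channels, with blocking reads and, because the channels are bounded, blocking writes. This is not the authors' beamformer or their POSIX thread framework. The three processes, the sentinel value, and the channel capacity are hypothetical, and Python threads do not show the multiprocessor speedup the abstract reports.

```python
import queue
import threading

SENTINEL = None  # hypothetical end-of-stream marker; samples must not be None


def source(out_q, samples):
    """Write every sample to the output channel, then signal end of stream."""
    for s in samples:
        out_q.put(s)          # blocks while the bounded channel is full
    out_q.put(SENTINEL)


def scale(in_q, out_q, gain):
    """Multiply each incoming token by a constant gain."""
    while True:
        s = in_q.get()        # blocks until a token is available
        if s is SENTINEL:
            out_q.put(SENTINEL)
            return
        out_q.put(gain * s)


def sink(in_q, results):
    """Collect tokens in arrival order until end of stream."""
    while True:
        s = in_q.get()
        if s is SENTINEL:
            return
        results.append(s)


def run_network(samples, gain=2, capacity=4):
    """Connect source -> scale -> sink with bounded FIFO channels and run it."""
    a = queue.Queue(maxsize=capacity)
    b = queue.Queue(maxsize=capacity)
    results = []
    threads = [
        threading.Thread(target=source, args=(a, samples)),
        threading.Thread(target=scale, args=(a, b, gain)),
        threading.Thread(target=sink, args=(b, results)),
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


print(run_network(range(10)))
```

Because each process only blocks on channel reads and writes, the result is the same for any thread interleaving and any channel capacity of at least one, which is the determinacy property the abstract relies on.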
Contact author: Mr. Moinul Khan, School of Electrical and Computer Engineering, Georgia Institute of Technology, Atlanta, GA 30332-0250, mhkhan@ece.gatech.edu
Temporal models can be used to capture the timing behavior of a system. During the design and selection of a system architecture for high performance applications, it is becoming necessary to model the application's temporal execution characteristics on the target architecture in order to evaluate the number of processors, communication fabric, and partitioning trade-offs required for the complex design. The message passing interface (MPI) standard has recently evolved to facilitate architecture-independent representations of the computational concurrency and the communication structures present in many COTS (Commercial-Off-The-Shelf) systems, ranging from supercomputers to multiprocessor DSP (Digital Signal Processing) systems. The behavior of the communication primitives within these systems uniquely affects the overall system performance. To evaluate the performance of such systems, it is essential that the temporal models reflect the effects of the communication primitives on the system performance. In our proposed VHDL-based (VHSIC Hardware Description Language) modeling environment, we model the hardware as well as its accompanying software, which contains MPI communication primitive calls. We capture the overall system performance of the application and communications software executing on the hardware platform. By incorporating MPI, we also increase the flexibility and architecture independence of our modeling environment. This allows both the application developer and the system engineers to cooperate efficiently without requiring intimate knowledge of each other's domain.
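As a rough illustration of how a communication primitive can enter a temporal model, the sketch below uses a common first-order approximation: transfer time equals a fixed latency plus message size divided by bandwidth. It is not the VHDL-based environment described in the abstract. The function names, the parameter values, and the idealized assumption that a non-blocking send overlaps fully with computation are all hypothetical.

```python
def transfer_time(num_bytes, latency_s, bandwidth_bytes_per_s):
    """First-order message cost: fixed latency plus serialization time."""
    return latency_s + num_bytes / bandwidth_bytes_per_s


def stage_time(compute_s, num_bytes, latency_s, bandwidth_bytes_per_s,
               blocking=True):
    """Time for one compute-then-send stage.

    Assumption: a blocking send serializes with computation, while a
    non-blocking send overlaps with it perfectly (an idealization).
    """
    comm_s = transfer_time(num_bytes, latency_s, bandwidth_bytes_per_s)
    if blocking:
        return compute_s + comm_s
    return max(compute_s, comm_s)


# Hypothetical parameters: 50 microsecond latency, 100 MB/s link,
# a 1 MB message, and 20 ms of computation per stage.
LATENCY_S = 50e-6
BANDWIDTH = 100e6
MESSAGE_BYTES = 1_000_000
COMPUTE_S = 0.02

blocking_s = stage_time(COMPUTE_S, MESSAGE_BYTES, LATENCY_S, BANDWIDTH)
overlapped_s = stage_time(COMPUTE_S, MESSAGE_BYTES, LATENCY_S, BANDWIDTH,
                          blocking=False)
print(f"blocking: {blocking_s * 1e3:.2f} ms, "
      f"overlapped: {overlapped_s * 1e3:.2f} ms")
```

Even in this simplified form, the choice of primitive changes the stage time by about a third for the assumed parameters, which is the kind of effect the abstract says a temporal model needs to reflect.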