This Report was presented to the Faculty of the Graduate School of the University of Texas at Austin in partial fulfillment of the requirements for the degree of Master of Science in Engineering
Abstract
Implementation of a 3-D Sonar Beamformer Using the Computational Process Network Model on a Synergy Quad PowerPC G4 with AltiVec Board
Young Hyun Cho, M.S.E.
The University of Texas at Austin, May 2001
Supervisor: Brian L. Evans
Reader: Lizy K. John
Three-dimensional real-time digital sonar beamforming requires 4 to 12 GFLOPS, 1 to 2 GB of memory, and 100 to 200 MB/s of I/O bandwidth. Allen and Evans have implemented a 4-GFLOP sonar beamformer in real time on a Sun UltraSPARC-II server with sixteen 336-MHz processors by utilizing the Visual Instruction Set (VIS) single-instruction multiple-data (SIMD) extensions and the Computational Process Network (CPN) dataflow model. In this report, I rewrite the horizontal and vertical beamforming kernels to use the Motorola AltiVec SIMD extension for the PowerPC. Then I develop a scalable beamforming software system using the CPN model on a Synergy Quad 333-MHz PowerPC G4 symmetric multiprocessing (SMP) board.
While the SPARC VIS offers performance increases for signal processing kernels, AltiVec offers better performance due to its wider SIMD registers. In addition to SIMD integer operations, AltiVec can execute up to four 32-bit floating-point multiply-accumulate (MAC) operations per instruction. To utilize the full capability of the 128-bit AltiVec registers, data prefetching and permutation instructions are necessary. For example, transposing matrices in the 3-D sonar beamformer is handled without computational overhead using permutation instructions. I evaluate the performance of the vertical and horizontal beamforming kernels on the PowerPC and the UltraSPARC-II to compare the impact of the compiler, SIMD word alignment, and cache block alignment on performance.
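As a rough illustration of the four-wide floating-point MAC mentioned above, the following portable C sketch keeps four independent accumulators, one per 32-bit lane of a 128-bit register. It is not the report's kernel: the function name weighted_sum_4lane and the use of a plain weighted sum as a stand-in for a beamforming inner loop are assumptions made for illustration. On AltiVec hardware, the inner lane loop would typically collapse into a single vector multiply-add such as vec_madd.

```c
#include <stddef.h>

#define LANES 4  /* four 32-bit floats fit in one 128-bit SIMD register */

/* Hypothetical helper (not from the report): weighted sum of n samples.
 * The four accumulators mimic the four lanes of one SIMD MAC instruction;
 * each pass of the outer loop corresponds to one such instruction. */
float weighted_sum_4lane(const float *samples, const float *weights, size_t n)
{
    float acc[LANES] = {0.0f, 0.0f, 0.0f, 0.0f};
    size_t i = 0;

    /* Main loop: process four samples per iteration, one per lane. */
    for (; i + LANES <= n; i += LANES) {
        for (size_t lane = 0; lane < LANES; ++lane) {
            acc[lane] += samples[i + lane] * weights[i + lane];
        }
    }

    /* Horizontal reduction of the four lane accumulators. */
    float sum = (acc[0] + acc[1]) + (acc[2] + acc[3]);

    /* Scalar tail for any samples left over when n is not a multiple of 4. */
    for (; i < n; ++i) {
        sum += samples[i] * weights[i];
    }
    return sum;
}
```

Because the lanes accumulate independently, the summation order differs from a simple scalar loop, so results can differ in the last bits for general floating-point data.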
For computationally intensive applications such as the 3-D sonar beamformer, scalability is a key aspect of the system. Thus, I use the Computational Process Network model as the design framework of the beamforming system. This programming model decouples the computation processes (nodes) from the communication processes (queues). In a 3-D beamforming system, the nodes consist of the sonar sensors, the vertical beamforming kernels, and the horizontal beamforming kernels. These nodes communicate through preallocated memory regions that work as FIFO queues. On an UltraSPARC-II multiprocessor system, the CPN 3-D sonar beamformer shows near-linear speedup up to 16 processors.
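The separation of nodes from queues can be sketched in a few lines of C. This is a single-threaded toy under stated assumptions, not the CPN implementation used in the report: the names fifo_queue, fifo_put, fifo_get, and scale_node_fire are hypothetical, the capacity is fixed at eight samples, and operations return 0 instead of blocking when a queue is full or empty.

```c
#include <stddef.h>

#define QUEUE_CAPACITY 8  /* assumed size; storage is allocated up front */

/* Hypothetical FIFO queue backed by preallocated memory (a ring buffer). */
typedef struct {
    float  data[QUEUE_CAPACITY];
    size_t head;   /* index of the oldest stored sample */
    size_t count;  /* number of samples currently stored */
} fifo_queue;

void fifo_init(fifo_queue *q)
{
    q->head = 0;
    q->count = 0;
}

/* Append one sample; returns 1 on success, 0 if the queue is full. */
int fifo_put(fifo_queue *q, float value)
{
    if (q->count == QUEUE_CAPACITY) {
        return 0;
    }
    q->data[(q->head + q->count) % QUEUE_CAPACITY] = value;
    q->count++;
    return 1;
}

/* Remove the oldest sample; returns 1 on success, 0 if the queue is empty. */
int fifo_get(fifo_queue *q, float *out)
{
    if (q->count == 0) {
        return 0;
    }
    *out = q->data[q->head];
    q->head = (q->head + 1) % QUEUE_CAPACITY;
    q->count--;
    return 1;
}

/* Hypothetical node: consumes one sample, scales it, produces one sample.
 * The node touches only its input and output queues, so it knows nothing
 * about which nodes sit upstream or downstream. */
int scale_node_fire(fifo_queue *in, fifo_queue *out, float gain)
{
    float x;
    if (in->count == 0 || out->count == QUEUE_CAPACITY) {
        return 0;  /* cannot fire: no input available or no output space */
    }
    fifo_get(in, &x);
    fifo_put(out, x * gain);
    return 1;
}
```

In a multiprocessor setting each node would run in its own thread and the queue operations would block rather than return 0; that synchronization is omitted here to keep the sketch short.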
I port the CPN 3-D sonar beamformer from the Sun to the Quad PowerPC G4 SMP board using the new beamforming kernels and transposed queues. On the PowerPC board, I discover performance limitations due to the cache hierarchy. I evaluate the importance of the interconnect in determining scalable performance for applications that require relatively high memory bandwidth.
This document is available in PDF format.
For more information contact Young Cho at young@ece.utexas.edu