Current Research

There are many ways in which our previous research could be expanded to develop new search methods and low-power consumption techniques. The following description is taken from Section 6.2 of [1].

Advanced Wordlength Search Algorithms

This research has developed search methods based on (1) gradient search and (2) genetic algorithms. The genetic algorithm requires considerable running time, but it is not trapped in local optima. Gradient-based search algorithms take less running time, but they do become trapped in local optima. In case studies, the gradient-based search methods (once converged) find a solution comparable to genetic algorithms but with two orders of magnitude fewer system simulations. However, with additional system simulations, genetic algorithms produces far better solutions. More important, genetic algorithms naturally produce a signal quality vs. implementation complexity tradeoff curve.

Hybrid wordlength optimization: The various search methods could be used together to compensate for their disadvantages. One way to combine the two approaches follows. The first step is use a gradient-based method until it converges to a solution. The second step is to use the trajectory of feasible solutions obtained by the gradient-based search (either with or without the initialization phase solutions included) as the initial generation for the genetic algorithm. The third and final step is to run the genetic algorithm with a high mutation rate to create a genetically diverse population. Another way to combine the two approaches is to run a few generations of the genetic algorithm, and then run the gradient-based search in parallel on each solution obtained from the genetic algorithm. This is probably the less promising of the two ways to combine the methods.

Dynamic wordlength bounds: During wordlength searching process, the bounds of wordlength are fixed to a lower bound and an upper bound. In arithmetic operations, the output bound can be changed according to the input wordlength bound. The upper bound of the wordlength at the multiplier output is the sum of the upper bounds at the multiplier inputs. Thus, the bound at input or output can be dynamically changed according to the wordlength bound at input or output. This approach could reduce the running time due to the reduced search space.

Variable reduction: The running time in the wordlength search algorithms is proportional to the number of variables. Given that all variables in a floating-point program are to be converted to fixed-point variables, the optimum wordlength of each variable is searched. Trivial variables can be removed from the list of wordlength optimization for greater speed. One variable is sufficient in the delay block, which has input and output variables. The wordlength output of the adder can be removed from the list since the maximum wordlength at output is at most one more than the input wordlength.

Adaptive step size: The step size of update directions in gradient-based search algorithms is an integer value. The value 1 is used for the integer step size. If the step size is larger than 1, wordlength set could reach a neighbor of the optimum. After that, the step size of value 1 can be used for refinement. This approach can reduce running time.

Further Analysis on Search Algorithms

Analysis in GA with different genetic parameters: Genetic algorithms mimic the process of plant and animal evolution. There are many options to realizing a genetic algorithm. The gene can be encoded as a real, integer, or binary number. Since wordlength is expressed as an integer number in this research, an integer encoding method is employed. Thus, any number within bounds is selected during generation. If the binary encoding method is used, the genetic operation could be refined and results would be different. A comparison of the different options in GA would provide valuable information. It is worthy of comparing different options in GA.

Weighted sum approach in GA: Multi-objective optimizations have more than one objective. Weighted sum approaches assign weights on each objective to derive a single objective. Pareto ranking approaches assign rank on each candidate by calculating the Pareto rank. We use the Pareto ranking approach because Rohling [2] shows many disadvantages in the weighted-sum approach. Research is needed to demonstrate the performance degradation of the weighted-sum approach compared to the Pareto ranking approach in wordlength optimization.

Comparison with other optimization algorithms: Simulated annealing (SA) and genetic algorithms (GA) are two stochastic methods currently in widespread used for difficult optmization problems. We use GAs to optimize wordlength because it naturally provides a signal distortion vs. implementation tradeoff curve. Some papers show that a GA can outperform a SA for design in some applications. However, it would be useful for future research to consider simulated annealing algorithm in wordlength optimization area.

Low Power Consumption

Low power consumption at the system level: Power consumption can be reduced by decreasing wordlength in the system, even though the hardware architecture is fixed. Chapter 4 shows the wordlength reduction techniques and estimates the power savings on multiplier units. Wordlength reduction can also reduce power consumption within other components such as memory. Powell and Chau show a model for estimating power dissipation in a class of DSP VLSI chips according to wordlength [3]. Thus, it may be useful for future research to explicitly consider power reduction with wordlength reduction at the system level.

Low-power in floating-point hardware: In some applications, the optimum wordlength in floating point can reduce power consumption by 66% [4]. Our research shows that power consumption in multipliers can be reduced by using wordlength reduction techniques. The multiplier would be a major power-consuming unit; however, units for addition and normalization in floating-point hardware could also consume considerable power. Thus, the analysis or estimation of power reduction in floating-point hardware could result in low-power consumption by wordlength reduction techniques.

Distribution of power consumption: Average power consumption is estimated by using a Xilinx Xpower tool in this research. Small multiplications on large Wallace and Booth multipliers could have different distribution of power consumption. Distribution information of power consumption could be helpful for algorithm developments to reduce power consumption.

Electronic Design Automation Software

Enhanced code generator: A simple parser that interprets one arithmetic operation at each line has been developed to implement the code generator our floating-point to fixed-point toolbox. Multiple operations do not work on current parsers. Thus, a given floating-point program should have at most one arithmetic operation for each line. Code generator could be enhanced by adding a decomposition of multiple arithmetic operations.

A parser can search a given commented symbol preceded with floating-point variables in floating-point programs and convert the variables to a fixed-point data type. Search engines search the optimum wordlength of the converted fixed-point variables; however, designers sometime want to set constraints, such as specific wordlength, and rounding methods. The code generator used in our toolbox does not handle fixed-point constraints in variables. Enhancements are necessary for generator to handle such constraints.

Range estimation: Range can be estimated by analytical or statistical approaches. An analytical approach offers faster results by analyzing relationship of operations in the dataflow. However, the estimated range results from an analytical approach are conservative. Furthermore, the estimated range in a feedback loop could grow infinitely. The statistical approach offers robust range estimation at the expense of long simulation times. Thus, the two approaches could be combined. The statistical approach could be used in a feedback loop part, and an analytical approach could be used in other parts.

Dataflow approach: We use simulation-based approaches in wordlength optimization. Range information has been monitored at each variable, and propagated quantization errors are measured at the output of systems. However, the range information and the propagated error can be obtained by analyzing dataflow graphs of systems. This approach could reduce time and reduce wordlength variables.

Optimum DSP Algorithms

Given an algorithm/system, we quantify the wordlength of variables in terms of signal quality vs. implementation complexity. Thus, the algorithm/system is not changed. Rearranging the algorithm/system in filters results in a better finite wordlength effect [5, 6].

The SPIRAL project optimizes the digital signal processing algorithm and automates software and hardware development [7]. SPIRAL is a generator of libraries for fast software implementation of signal processing transforms. These libraries are adapted to the computing platform and can be re-optimized as the hardware is upgraded or replaced [8]. The next level of abstraction is to develop a system that simplifies, rearranges, and expands the algorithm/system in search of a better tradeoff between signal quality and implementation complexity.

Area Model

We use the area estimation model for field programmable gate arrays (FPGAs) [9]. Adder, gain, and delay units are modeled in terms of wordlength. Other units such as transcendental function computation can be modeled by using series of arithmetic units. Transcendental signal generations such as sine and cosine waveforms can be modeled by using difference equations or lookup tables.

References

Kyungtae Han, A utomating Transformations from Floating-point to Fixed-point for Implementing Digital Signal Processing Algorithms, Dept. of Electrical and Computer Engineering, The University of Texas at Austin, Austin, TX 78712, August 2006.
Gregory Rohling, Multiple objective evolutionary algorithms for independent, computationally expensive objective evaluations, Ph.D. dissertation, School of Electrical and Computer Engineering, Georgia Institute of Technology, November 2004.
S. R. Powell and P. Chau, "A model for estimating power dissipation in a class of DSP VLSI chips", IEEE Trans. Circuits and Systems, vol. 38, no. 6, pp. 646-650, June 1991.
F. Fang, T. Chen, and R. Rutenbar, "Floating-point bit-width optimization for low-power signal processing applications", Proc. IEEE Int. Conf. on Acoustics, Speech, and Signal Proc., vol. 3, May 2002, pp. 3208-3211.
K. K. Parhi, VLSI Digital Signal Processing Systems. New York, NY: John Wiley and Sons, 1999.
A. V. Oppenheim, R. W. Schafer, and J. R. Buck, Discrete-Time Signal Processing. Upper Saddle River, NJ, USA: Prentice-Hall, 1998.
M. Puschel, J. M. F. Moura, J. Johnson, D. Padua, M. Veloso, B. W. Singer, J. Xiong, F. Franchetti, A. Gacic, Y. Voronenko, K. Chen, R. W. Johnson, and N. Rizzolo, "SPIRAL: Code generation for DSP transforms", Proceedings of the IEEE, vol. 93, no. 2, pp. 232-275, 2005.
M. Puschel, B. Singer, J. Xiong, J. M. F. Moura, J. Johnson, D. Padua, M. Veloso, and R. W. Johnson, "SPIRAL: A generator for platform-adapted libraries of signal processing algorithms", Int. Journal of High Performance Computing Applications, vol. 18, no. 1, pp. 21–45, 2004.
G. A. Constantinides, P. Y. Cheung, and W. Luk, "Wordlength optimization for linear digital signal processing", IEEE Trans. Computer-Aided Design, vol. 22, no. 10, pp. 1432-1442, Oct. 2003.

Mail comments about this page to bevans@ece.utexas.edu.