This Report was presented to the Faculty of the Graduate School of the University of Texas at Austin in partial fulfillment of the requirements for the degree of Master of Science in Engineering
Abstract
Time-Scale Modification of Audio Signals Using the Dual-Tree Complex Wavelet Transform
Jeffrey Blake Livingston, M.S.E.
The University of Texas at Austin, December 2006
Supervisors: Brian L. Evans and Bruce W. Pennycook
Reader: Russell F. Pinkston
The use of the wavelet transform in place of the short-time Fourier transform (STFT) in the phase-vocoder algorithm for time-scaling audio signals has been investigated in the past. The motivation is that the wavelet transform offers variable time-frequency resolution that can capture audio signal information efficiently and precisely, in a manner well matched to the characteristics of human auditory perception. Despite this, little has emerged in the audio processing literature, likely because of inherent limitations of traditional forms of the wavelet transform: the discrete wavelet transform (DWT) lacks phase information, and the continuous wavelet transform (CWT) has a high computational cost and lacks available inverse transform implementations. In this report, a new wavelet-transform-based phase-vocoder algorithm is presented that uses a new form of the DWT, the Dual-Tree Complex Wavelet Transform (DT-CWT), which overcomes many of the deficiencies of the older DWT forms.

A preliminary implementation of the algorithm in Matlab produced output that was time-stretched as desired, but with added erroneous frequency components. These components are due to instantaneous frequency estimation errors caused by the insufficiently narrow octave-band frequency resolution of the fully decimated DT-CWT. To remedy the artifacts resulting from inadequate frequency resolution, use of the wavelet packet transform (WPT), with logarithmically spaced subbands approximately 1/3 octave wide, is proposed in place of the octave-band, fully decimated DT-CWT.
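The link between subband width and instantaneous frequency error can be illustrated with a minimal sketch. It is not the Matlab implementation described above: ideal complex exponentials stand in for complex subband outputs, and the sample rate, the tone frequencies, and the helper name `instantaneous_frequency` are assumptions chosen only for illustration. A phase vocoder estimates frequency from the sample-to-sample phase advance of each complex subband signal. When a band holds one partial the estimate is steady, but when two partials share a band (1000 Hz and 1500 Hz both lie within one octave, yet would fall into different 1/3-octave bands) the estimate fluctuates.

```python
import numpy as np

def instantaneous_frequency(z, fs):
    """Estimate instantaneous frequency in Hz from a complex signal z.

    Hypothetical helper: uses the unwrapped phase difference between
    consecutive samples, scaled from radians/sample to Hz.
    """
    phase = np.unwrap(np.angle(z))
    return np.diff(phase) * fs / (2.0 * np.pi)

fs = 8000.0                      # assumed sample rate in Hz
t = np.arange(2048) / fs         # time axis in seconds

# A band containing a single partial at 1000 Hz.
one_partial = np.exp(2j * np.pi * 1000.0 * t)

# A band containing two partials (1000 Hz and 1500 Hz), as can happen
# when the band is a full octave wide.
two_partials = one_partial + 0.8 * np.exp(2j * np.pi * 1500.0 * t)

f_one = instantaneous_frequency(one_partial, fs)
f_two = instantaneous_frequency(two_partials, fs)

# Peak-to-peak spread of each estimate: near zero for the single
# partial, large when two partials interfere within the band.
spread_one = np.ptp(f_one)
spread_two = np.ptp(f_two)
print(f"single partial spread: {spread_one:.3f} Hz")
print(f"two partials spread:   {spread_two:.3f} Hz")
```

In this toy setting the single-partial estimate stays at 1000 Hz, while the two-partial estimate swings widely around it. Resynthesizing from such a fluctuating estimate is one way spurious frequency components could arise, which is consistent with the motivation for narrower subbands.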
This document is available in PDF format.
For more information contact: Jeff Livingston <jeff_livingston@alumni.utexas.net>