EE313 Linear Systems & Signals - Mini-Project #1 Hints

Mini-Project #1 assignment

Section 2.2 Recording the vowel sound. The recording uses a standard speech sampling rate of 8000 Hz, or equivalently 8000 samples/s. Since we record for one second, the recorded data has 8000 samples. We save the recorded data as a WAV file (.wav) so that no data is lost: WAV is a lossless format in which the original audio samples are fully retained. This is not the case for a lossy format such as MPEG-1 Layer 3 (mp3) or MPEG-4 audio (mp4a).
Section 2.3 Time-Domain Analysis. The MATLAB code introduces a variable N for the number of samples in the recorded speech. This will clash with the N used in the sum of sinusoids formula that appears in Sections 1 and 3. They are different quantities.
In this problem, you'll estimate the pitch period of the recorded voice. The pitch period is the fundamental period T₀. Inverting the pitch period gives the pitch frequency, which is also the fundamental frequency f₀ = 1 / T₀. You'll be able to validate the estimate of the pitch frequency from section 2.3 against the frequency-domain analysis in section 2.4. Typically, the first peak in the plot of the magnitude of the Fourier series coefficients for positive frequencies occurs at f₀.
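As a rough illustration of going from a pitch period to a pitch frequency, here is a minimal sketch that estimates T₀ automatically with an autocorrelation. This is a swapped-in technique and not necessarily the method the assignment intends; the test signal, the value f0True = 125 Hz, and the lag limits minLag and maxLag are all assumptions made up for the example.

```matlab
% Synthetic stand-in for a recorded vowel: harmonics of 125 Hz (assumed value)
fs = 8000;                      % sampling rate in samples/s
t = (0:fs-1)' / fs;             % one second of sample times
f0True = 125;                   % hypothetical pitch frequency in Hz
x = cos(2*pi*f0True*t) + 0.5*cos(2*pi*2*f0True*t) + 0.25*cos(2*pi*3*f0True*t);

% Autocorrelation for lags 0..maxLag, computed directly from its definition
maxLag = 200;                   % assumed search limit: 200 samples = 25 ms
r = zeros(maxLag + 1, 1);
for lag = 0:maxLag
    r(lag + 1) = sum(x(1:end-lag) .* x(1+lag:end));
end

% Skip the small lags next to the peak at lag 0, then take the strongest lag
minLag = 20;                    % assumed limit: 20 samples = 2.5 ms
[~, idx] = max(r(minLag + 1:end));
periodSamples = minLag + idx - 1;   % lag, in samples, of the strongest peak
T0 = periodSamples / fs;        % estimated pitch period in seconds
f0 = 1 / T0;                    % estimated pitch frequency in Hz
fprintf('T0 = %.4f s, f0 = %.1f Hz\n', T0, f0);
```

With a real recording, you would replace x by the recorded samples and still compare the result with what you read off the time-domain plot by eye.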
Section 2.4 Frequency-Domain Analysis. In part (c), you're asked to find "the approximate value of the strongest positive frequency component". You're looking for the brightest yellow patch in the spectrogram. Within the brightest yellow patches, you can use Data Tips, found under the Tools menu of the plot window, to read off values. After placing a Data Tip, you can use the up-down and left-right arrow keys to move the tip around. The Data Tip will tell you the time, frequency and power value at that point.
The spectrogram displays the values of a matrix that has time along the horizontal dimension and frequency along the vertical dimension. So, you could automate finding the maximum value in the spectrogram after executing UTAudioFreqDomainAnalysis.m:
```
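% Also return the frequency (Hz) and time (s) axes of the spectrogram matrix
[spectValues, freqs, times] = spectrogram(myRecording, blockSize, overlap, blockSize, fs);
[maxValue, maxIndex] = max(abs(spectValues), [], 'all', 'linear');
[row, col] = ind2sub(size(spectValues), maxIndex);
peakFreq = freqs(row);   % frequency in Hz of the strongest component
peakTime = times(col);   % time in s of the block that contains it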
```
The maximum value (maxValue) occurs at (row, col) in the matrix abs(spectValues), where row indexes frequency and col indexes time.
When computing the Fourier series coefficients for a continuous-time signal, we assume that the observed signal is one fundamental period of a periodic signal, whether it really is or not. This is also the case when working with a discrete-time signal, as we're doing with the recorded audio signal.
To compute the exponential Fourier series coefficients for our discrete-time signal, we'll use the Fast Fourier Transform (FFT). The FFT takes the vector of N samples of the recorded signal and returns the N exponential Fourier series coefficients. (MATLAB's fft function leaves out the 1/N normalization, so its output is N times the coefficients; this common scale factor doesn't move any peaks.) In the discrete-time case, we have a finite number of Fourier series coefficients, unlike the continuous-time Fourier series, which generally has infinitely many terms.
Here's the use of the fft function in MATLAB in line 11 of UTAudioFreqDomainAnalysis.m:
```
fourierSeriesCoeffs = fft(myRecording); 
```
The first half of the fourierSeriesCoeffs vector contains the Fourier series coefficients for non-negative frequencies, and the second half of the vector contains the Fourier series coefficients for negative frequencies.
Here are the first three elements of the fourierSeriesCoeffs vector. Recall that MATLAB starts indexing its vectors at index 1 instead of index 0:
- First element is at index 1. Contains A₀. Should be real-valued and small in magnitude.
- Second element is at index 2. Contains Fourier series coefficient for frequency f_s / N = 1 Hz.
- Third element is at index 3. Contains Fourier series coefficient for frequency 2 f_s / N = 2 Hz.
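To make the index-to-frequency mapping concrete, here is a small sketch on a made-up test tone instead of a real recording. The signal x, its amplitude 3, frequency 200 Hz and phase pi/4 are assumptions chosen so the answer is known in advance; the division by N accounts for fft being unnormalized.

```matlab
fs = 8000;                      % sampling rate in samples/s
N = 8000;                       % number of samples (one second)
t = (0:N-1)' / fs;
% Hypothetical test signal: amplitude 3 at 200 Hz with phase pi/4
x = 3 * cos(2*pi*200*t + pi/4);

X = fft(x);                     % unnormalized: N times the Fourier series coefficients
k = 0:N-1;                      % zero-based coefficient index
freqHz = k * fs / N;            % frequency in Hz (valid for the first half of X)

idx200 = 200 * N / fs + 1;      % MATLAB index of the 200 Hz coefficient
c200 = X(idx200) / N;           % Fourier series coefficient at +200 Hz
amplitude = 2 * abs(c200);      % cosine amplitude (factor 2 combines the +/-200 Hz terms)
phase = angle(c200);            % cosine phase in radians
fprintf('index %d -> %g Hz, amplitude %.3f, phase %.4f rad\n', ...
        idx200, freqHz(idx200), amplitude, phase);
```

Because f_s / N = 1 Hz here, the 200 Hz coefficient sits at index 201, and the recovered amplitude and phase should match the values used to build x.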
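When plotting the Fourier series coefficients computed by the FFT, we use the FFT shift command fftshift to swap the halves of the fourierSeriesCoeffs vector, so that the negative-frequency coefficients come first and the frequency axis runs from negative to positive.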
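Here is a minimal sketch of pairing fftshift with a matching frequency axis. The 300 Hz test tone and the variable names freqAxis and Xshifted are assumptions for illustration, and the axis formula below assumes N is even.

```matlab
fs = 8000;                      % sampling rate in samples/s
N = 8000;                       % number of samples (even)
t = (0:N-1)' / fs;
x = cos(2*pi*300*t);            % hypothetical test tone at 300 Hz

X = fft(x);
Xshifted = fftshift(X);         % negative frequencies first, then 0 Hz, then positive
freqAxis = (-N/2 : N/2 - 1)' * fs / N;   % matching axis in Hz for even N

% Locate the two strongest components and report their frequencies
[~, order] = sort(abs(Xshifted), 'descend');
peakFreqs = sort(freqAxis(order(1:2)));
disp(peakFreqs');               % the tone shows up at -300 Hz and +300 Hz

% plot(freqAxis, abs(Xshifted)); xlabel('Frequency (Hz)'); ylabel('Magnitude');
```

A real-valued signal always produces this mirror-image pair, which is why the hints focus on the positive-frequency half.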
Section 3.0 Synthesizing the Vowel Sound. Each gain A_k and each phase Phi_k is obtained from the appropriate element of the array fourierSeriesCoeffs.
A common approach to parts (a) and (b) is to identify the frequencies of the 10 peaks in the plot of the magnitude of the Fourier series coefficients where the frequencies are close to being harmonically related. For each frequency, you can look up the Fourier series coefficient in the vector fourierSeriesCoeffs as follows. We've recorded 1 s of speech at a sampling rate of 8000 samples/s, which gives a total of 8000 samples, so the spacing between coefficients is f_s / N = 1 Hz. Because of this, the element at index n+1 in fourierSeriesCoeffs corresponds to a frequency of n Hz for n in [0, 3999]. Once we have the Fourier series coefficient, we can compute its magnitude A_k using the abs command and its phase Phi_k using the angle command. For part (b), after computing the magnitude and phase of each Fourier series coefficient, each frequency is changed to a harmonic of an estimated value of the fundamental frequency f₀.
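The lookup-and-synthesize steps could be sketched as below. This is only an illustration under stated assumptions: x is a synthetic stand-in for the recording, peakFreqs holds three made-up peak frequencies (the assignment uses 10 that you find yourself), and the names coeff, Ak, Phik and y are hypothetical.

```matlab
fs = 8000;                      % sampling rate in samples/s
N = 8000;                       % number of samples (one second)
t = (0:N-1)' / fs;
% Stand-in for the recording: three harmonics of 150 Hz (hypothetical values)
x = 1.0*cos(2*pi*150*t + 0.3) + 0.6*cos(2*pi*300*t - 1.0) + 0.2*cos(2*pi*450*t + 2.0);
fourierSeriesCoeffs = fft(x);

peakFreqs = [150 300 450];      % peak frequencies in Hz, as read off the magnitude plot
y = zeros(N, 1);                % synthesized signal
for k = 1:numel(peakFreqs)
    coeff = fourierSeriesCoeffs(peakFreqs(k) + 1);  % index n+1 holds n Hz when fs = N
    Ak = 2 * abs(coeff) / N;    % cosine amplitude; 2/N is a common factor for all k
    Phik = angle(coeff);        % phase in radians
    y = y + Ak * cos(2*pi*peakFreqs(k)*t + Phik);
end

maxError = max(abs(y - x));     % near zero because this x is exactly periodic
fprintf('max reconstruction error = %.2e\n', maxError);
% soundsc(y, fs);               % uncomment to listen to the synthesized signal
```

The 2/N factor only sets the overall loudness, so it matters little if playback rescales the signal. For part (b), you would keep each Ak and Phik but replace peakFreqs(k) inside the cosine with the corresponding harmonic of your estimated f₀.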
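During playback of the synthesized signal, please use the soundsc command instead of the sound command. The soundsc command scales the signal to the full playback range before playing it, which avoids clipping when the sample values fall outside [-1, 1].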

Last updated 09/17/21. Send comments to bevans@ece.utexas.edu