env: MATLAB functions for spectral envelope estimation in sinusoidal models

Marine Campedel-Oudot, Olivier Cappé, Eric Moulines
ENST dpt. TSI / LTCI (CNRS-URA 820),
46 rue Barrault, 75634 Paris cedex 13, France.
cappe at tsi.enst.fr

About env

This directory contains MATLAB mfiles and data illustrating the paper "Estimation of the spectral envelope of voiced sounds using a penalized likelihood approach" by Marine Campedel-Oudot, Olivier Cappé, and Eric Moulines, IEEE Trans. Speech and Audio Processing, 9(5), pp. 469-481, July 2001.

The paper is available to IEEE subscribers from IEEE Xplore. A preprint version can also be downloaded from this site or from NEC's ResearchIndex database.

Copyright note: This material is presented to ensure timely dissemination of scholarly and technical work. Copyright and all rights therein are retained by authors or by other copyright holders. All persons copying this information are expected to adhere to the terms and constraints invoked by each author's copyright. In most cases, these works may not be reposted without the explicit permission of the copyright holder (see IEEE copyright policies).

Contents

The content of this directory will remain obscure unless you take a look at section IV.D of the paper. The four envelope estimation functions have the same names as the corresponding methods (AR, LS, WLS and OLC; see the beginning of section IV). Each MATLAB mfile has on-line help and the code is commented.

The signal is an 8 kHz, 16-bit quantized, 3.9 s recording of a young child uttering (in French) "Le petit lapin a mangé les carottes, les carottes". The pitch was determined using someone else's code, so only the results are provided, in the file signal.dat (a text file in which the first column is the timestamp in s, with a 5 ms hop, and the second column contains either the pitch frequency in Hertz or a NaN value when the signal was classified as unvoiced).
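For readers who want to inspect the pitch track outside MATLAB, the two-column layout described above can be parsed with a few lines of Python. This is only a sketch: it assumes the columns are separated by whitespace and that unvoiced frames are written literally as NaN, and the sample values below are invented for illustration (they are not taken from signal.dat). The function names are hypothetical.

```python
import math

def parse_pitch_track(text):
    """Parse lines of 'time pitch'; unvoiced frames carry NaN as the pitch.

    Assumes whitespace-separated columns (an assumption about signal.dat).
    """
    frames = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 2:
            continue  # skip blank or malformed lines
        t = float(fields[0])
        f0 = float(fields[1])  # float("NaN") yields a NaN value
        frames.append((t, f0))
    return frames

def voiced_frames(frames):
    """Keep only the frames whose pitch is a number (i.e. voiced frames)."""
    return [(t, f0) for t, f0 in frames if not math.isnan(f0)]

# Invented sample with a 5 ms hop, for illustration only.
sample = "0.000 NaN\n0.005 NaN\n0.010 312.5\n0.015 310.0\n"
frames = parse_pitch_track(sample)
voiced = voiced_frames(frames)

# To use the real file (assumed to be in the current directory):
# with open("signal.dat") as f:
#     frames = parse_pitch_track(f.read())
```

Filtering on NaN reproduces the voiced/unvoiced decision stored in the file; no pitch estimation is performed here.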

The main script is demo.m, which analyzes the complete signal (which is longer than the excerpt shown in the paper), plots the results obtained in each frame and computes two synthetic harmonic parts from the estimated envelopes: synth_ar for the AR method and synth_olc for the OLC method. The files in the subdirectory results correspond to these two signals: res_ar is the difference between signal and synth_ar, and synth_ar_2p is the signal obtained when the pitch frequency is reduced by 2% (and the amplitudes of the harmonics are recomputed from the envelope). The corresponding files for the OLC method carry _olc in place of _ar. The .jpeg files contain JPEG images of the color spectrograms of the results, with a normalized "jet" colormap that corresponds to a depth of 60 dB. When listening to the results, remember that the input signal is not voiced all the way, so the synthetic signals do sound strange in the sections that correspond to plosive consonants (this is only the harmonic part!).
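The idea behind the _2p files (lower the pitch by 2%, then read the harmonic amplitudes off the envelope again at the new harmonic frequencies) can be illustrated with a small Python sketch. It is not the code used in demo.m: toy_envelope is a made-up smooth curve standing in for an estimated envelope, the function names are hypothetical, phases are set to zero, and a single 5 ms frame is synthesized without any continuity between frames.

```python
import math

FS = 8000.0  # sampling rate of the recording, in Hz

def toy_envelope(freq_hz):
    """Hypothetical smooth amplitude envelope (NOT an estimate from the paper)."""
    return 1.0 / (1.0 + (freq_hz / 1000.0) ** 2)

def harmonic_amplitudes(f0, envelope, fs=FS):
    """Sample the envelope at every harmonic of f0 strictly below fs/2."""
    n_harm = math.ceil((fs / 2.0) / f0) - 1
    return [(k * f0, envelope(k * f0)) for k in range(1, n_harm + 1)]

def synthesize(partials, n_samples, fs=FS):
    """Sum of zero-phase cosines, one per (frequency, amplitude) pair."""
    return [
        sum(a * math.cos(2.0 * math.pi * f * n / fs) for f, a in partials)
        for n in range(n_samples)
    ]

f0 = 250.0                                             # illustrative pitch, in Hz
original = harmonic_amplitudes(f0, toy_envelope)
lowered = harmonic_amplitudes(0.98 * f0, toy_envelope)  # pitch reduced by 2%

frame = synthesize(lowered, n_samples=40)               # 40 samples = 5 ms at 8 kHz
```

Because the amplitudes are taken from the envelope rather than copied from the original harmonics, the spectral shape stays in place while the harmonics move; in this example the lowered pitch even admits one more harmonic below the Nyquist frequency.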

Running the script demo.m should take about 5 min when using the OLC method (default) and not more than 50 s if you use the WLS method (by commenting out the call to OLC and uncommenting the call to WLS at lines 84-87). Suppressing the plotting of the results in each frame will also speed up the computation. The script and functions have been tested using MATLAB 5.3, Signal Processing Toolbox 4.2 and Optimization Toolbox 2.0. They should work with earlier versions of these with some minor changes (see the help of demo.m), but some version of the Optimization Toolbox is required if you want to test the OLC method.


List of files

Name                   Description
LS.m                   Function for the LS method (mfile)
OLC.m                  Function for the OLC method (mfile)
OLC_eval.m             Function (mfile, should be called only through OLC)
WLS.m                  Function for the WLS method (mfile)
cepsval.m              Function for evaluating the envelope from the cepstrum (mfile)
demo.m                 Script to analyze the signal (mfile)
env.tar.gz             Archive that contains the whole dir. (unix, gzipped)
results/               ...
robnest.m              Function to estimate the noise psd (mfile)
signal.dat             Pitch values (text file: time in s, pitch freq. in Hz)
signal.raw             Signal (8kHz, 16 bit data in IEEE little endian short format)

results/res_ar.jpeg            Residual from the AR harmonic synthesis (jpeg color spectrogram)
results/res_ar.wav                            -                 (wave soundfile)
results/res_olc.jpeg           Residual from the OLC harmonic synthesis (jpeg color spectrogram)
results/res_olc.wav                           -                 (wave soundfile)
results/signal.jpeg            Original signal (jpeg color spectrogram)
results/signal.wav                            -                 (wave soundfile)
results/synth_ar.jpeg          AR harmonic synthesis (jpeg color spectrogram)
results/synth_ar.wav                          -                 (wave soundfile)
results/synth_ar_2p.jpeg       AR harmonic synthesis with pitch reduced by 2% (jpeg color spectrogram)
results/synth_ar_2p.wav                       -                 (wave soundfile)
results/synth_olc.jpeg         OLC harmonic synthesis (jpeg color spectrogram)
results/synth_olc.wav                         -                 (wave soundfile)
results/synth_olc_2p.jpeg      OLC harmonic synthesis with pitch reduced by 2% (jpeg color spectrogram)
results/synth_olc_2p.wav                      -                 (wave soundfile)
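
The file signal.raw listed above has no header, so the sample format must be supplied by the reader. As a sketch, the following Python code decodes such data under the assumption that "16 bit little endian short" means signed 16-bit integers, mono, at 8 kHz. The function names are hypothetical and the byte string used here is built in memory for illustration; it is not the content of signal.raw.

```python
import struct

def decode_pcm16le(raw_bytes):
    """Decode headerless little-endian signed 16-bit samples into a list of ints."""
    n = len(raw_bytes) // 2  # ignore a trailing odd byte, if any
    return list(struct.unpack("<%dh" % n, raw_bytes[: 2 * n]))

def duration_seconds(samples, fs=8000):
    """Duration of the sample sequence at sampling rate fs (Hz)."""
    return len(samples) / fs

# Four illustrative samples packed in the assumed format.
demo_bytes = struct.pack("<4h", 0, 1000, -1000, 32767)
samples = decode_pcm16le(demo_bytes)

# To use the real file (assumed to be in the current directory):
# with open("signal.raw", "rb") as f:
#     samples = decode_pcm16le(f.read())
```

If the assumptions hold, the duration computed from the real file should come out close to the 3.9 s stated for the recording.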

http://www.tsi.enst.fr/~cappe/env/

Olivier Cappé, May 2001.