
Tracking Partials of Polyphonic Sounds

Associated People: Mathieu Lagrange


This study addresses the problem of tracking partials, i.e. determining how the parameters of a given number of sinusoids evolve over time so that they follow the analyzed audio stream. We first show that the minimal frequency difference heuristic generally used to link local maxima of successive short-time spectra [Mac Aulay & al TASP'86] can be generalized, using the linear prediction formalism, to handle modulated sounds such as musical tones with vibrato.
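The contrast between the two linking rules can be illustrated with a minimal Python sketch. It is not the implementation used in this study: the function names (`lp_coefficients`, `predict_next`, `pick_peak`), the least-squares estimation of the predictor, and the prediction order are all assumptions made for illustration. The minimal frequency difference rule picks the peak closest to the last frequency of the partial, while the linear prediction rule picks the peak closest to the value extrapolated from the past trajectory.

```python
import numpy as np

def lp_coefficients(history, order):
    """Least-squares linear-prediction coefficients (illustrative estimator,
    assumed here; the source does not say how the predictor is estimated)."""
    x = np.asarray(history, dtype=float)
    # Each row holds `order` consecutive past values; the target is the next one.
    rows = np.array([x[i:i + order] for i in range(len(x) - order)])
    targets = x[order:]
    coeffs, *_ = np.linalg.lstsq(rows, targets, rcond=None)
    return coeffs

def predict_next(history, order=2):
    """Extrapolate the next value of a parameter trajectory."""
    x = np.asarray(history, dtype=float)
    if len(x) <= 2 * order:
        # Too little history for a reliable fit: fall back to the last value,
        # which is exactly the minimal frequency difference heuristic.
        return float(x[-1])
    mean = x.mean()
    centered = x - mean  # model the modulation around the mean frequency
    coeffs = lp_coefficients(centered, order)
    return mean + float(np.dot(coeffs, centered[-order:]))

def pick_peak(peaks, reference):
    """Return the candidate peak frequency closest to `reference`."""
    peaks = np.asarray(peaks, dtype=float)
    return float(peaks[np.argmin(np.abs(peaks - reference))])

# Hypothetical partial with vibrato: 440 Hz, 10 Hz depth, 20-frame period.
frames = np.arange(40)
track = 440.0 + 10.0 * np.sin(2 * np.pi * 0.05 * frames)
candidates = [435.0, 439.5]  # made-up peaks found in the next frame

print("nearest to last value :", pick_peak(candidates, track[-1]))
print("nearest to prediction :", pick_peak(candidates, predict_next(track)))
```

In this synthetic case the trajectory is rising again after the bottom of the vibrato cycle, so the prediction lands near 440 Hz and the two rules select different peaks.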

Next, the spectral properties of the time evolution of the partials' parameters are studied, to ensure that these parameters actually satisfy the slow time-varying constraint of the sinusoidal model. These two improvements are combined in a new algorithm designed for the sinusoidal modeling of polyphonic sounds.
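One possible way to look at the spectral content of a parameter trajectory is sketched below. The measure, the function name `high_band_energy_ratio` and the cutoff value are assumptions chosen for illustration; the criterion actually used by the algorithm is not specified here. The idea is simply that a slowly varying trajectory concentrates its energy at low modulation frequencies, whereas an erratic one does not.

```python
import numpy as np

def high_band_energy_ratio(trajectory, cutoff=0.25):
    """Fraction of the trajectory's energy at or above `cutoff`
    (in cycles per frame, between 0 and 0.5). Illustrative measure only."""
    x = np.asarray(trajectory, dtype=float)
    x = x - x.mean()                        # ignore the constant offset
    power = np.abs(np.fft.fft(x)) ** 2      # two-sided power spectrum
    freqs = np.abs(np.fft.fftfreq(len(x)))  # modulation frequency of each bin
    total = power.sum()
    if total == 0.0:
        return 0.0                          # constant trajectory
    return float(power[freqs >= cutoff].sum() / total)

frames = np.arange(64)
# Hypothetical trajectories: a slow vibrato and a frame-to-frame alternation.
smooth = 440.0 + 10.0 * np.sin(2 * np.pi * frames / 32)
erratic = 440.0 + 10.0 * (-1.0) ** frames

print("smooth :", high_band_energy_ratio(smooth))
print("erratic:", high_band_energy_ratio(erratic))
```

A ratio close to zero is consistent with the slow time-varying constraint, while a ratio close to one suggests a trajectory that jumps between unrelated peaks.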

Comparative tests show that the onsets and offsets of sinusoids, as well as closely spaced sinusoids, are better identified, and that stochastic components are better avoided.

Below you can listen to some examples of modeling strongly polyphonic content with 60 sinusoidal oscillators active at a time, in the framework of the SinuSoidal Codec (SSC) encoder developed at FTR&D, with no quantization.
Two algorithms are used: the well-known MAQ algorithm from [Mac Aulay & al TASP'86] and our proposed approach, the HFC algorithm.

Pop SC03
    original (low-pass filtered) .wav
    tracked using MAQ .wav
    tracked using HFC .wav
Classical SC02
    original (low-pass filtered) .wav
    tracked using MAQ .wav
    tracked using HFC .wav

The perceptual test performed at FTR&D demonstrated that using the HFC algorithm reduces the "smearing" effect (onsets and offsets are better perceived) and the "bubbling" effect (presence of wrongly identified partials).