
Modelling Musical Instruments using Artificial Neural Network

Associated People: Axel Roebel, Nicolas Obin
Date of Activity: 


Where: IRCAM – Analysis/Synthesis Team
Supervisors: Axel Roebel, Nicolas Obin (IRCAM - Analysis/Synthesis Team)

Context:

Modelling complete sound sample databases of individual musical instruments with state-of-the-art signal-processing methods is a research direction that the Analysis/Synthesis team has actively pursued for many years. The general idea is to extract signal models of the instrument characteristics from a sound database and then to use these models for sound synthesis that correctly reflects how the timbre of the instrument varies with pitch and intensity. One approach has been based on additive signal models [Hahn 2013, 2015]. More recently, motivated by the progress in large-scale training of artificial neural networks for signal synthesis [Oord 2016], we have investigated extending an approach based on learning the instrument dynamics from state space reconstructions [Roebel 1993, 2001]. Because of the computational complexity of training, the early phase of this research was limited to models representing the dynamics of individual notes. Recently, taking advantage of advances in training algorithms for artificial neural networks and of the increased computational capacity of CPUs and/or GPUs, we have begun investigating whether a single neural network can represent the dynamics of a musical instrument over the complete space of intensity and pitch variations and thereby enable perceptually coherent sound synthesis [Bru 2016].
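To make the notion of a state space reconstruction more concrete, the following is a minimal sketch of a delay embedding that turns a sampled signal into (state, next sample) training pairs and appends pitch and intensity values as extra inputs. It is an illustration under stated assumptions, not the method of the cited works: the function names `delay_embed` and `add_controls`, the embedding dimension, the lag and the control values are all hypothetical, and the toy sine stands in for a recorded note.

```python
import math

def delay_embed(signal, dim, lag):
    """Build delay vectors [x[t], x[t-lag], ..., x[t-(dim-1)*lag]]
    together with the next sample x[t+1] as prediction target."""
    start = (dim - 1) * lag  # first index with a complete delay vector
    states, targets = [], []
    for t in range(start, len(signal) - 1):
        states.append([signal[t - k * lag] for k in range(dim)])
        targets.append(signal[t + 1])
    return states, targets

def add_controls(states, pitch_hz, intensity):
    """Append constant control values (hypothetical encoding) to each state."""
    return [s + [pitch_hz, intensity] for s in states]

# Toy "note": a 440 Hz sine sampled at 16 kHz (stand-in for real data).
sr = 16000
note = [math.sin(2 * math.pi * 440 * n / sr) for n in range(200)]

states, targets = delay_embed(note, dim=4, lag=3)
inputs = add_controls(states, pitch_hz=440.0, intensity=0.8)
```

A predictor trained on such pairs, with `inputs` as network input and `targets` as output, could then be iterated on its own predictions to generate sound, with the two control values selecting the note to be produced.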


The internship aims to continue the research into modelling the dynamics of musical instruments over the complete space of intensity and pitch variations, and to establish instrument models for sound synthesis that offer pitch and intensity control inputs and produce perceptually valid timbre. The work further aims to investigate instrument models that represent multiple instruments of the same family (violin, cello, ...) using a single network model with additional control inputs. Interpolation of the instrument dynamics in the space of pitch, intensity and instrument type should be investigated experimentally.

This research will make use of the Python library Theano (http://deeplearning.net/software/theano/). The use of GPU cards should be investigated.

Bibliography:

[Bru 2016] M. Bru, Expressive synthesis of musical instrument sounds, Internship, IRCAM / Denmark Technical University, 2016

[Hahn 2013] H. Hahn and A. Roebel, Extended Source-Filter Model for Harmonic Instruments for Expressive Control of Sound Synthesis and Transformation, Proc. 16th Int. Conf. on Digital Audio Effects (DAFx), 2013, https://hal.archives-ouvertes.fr/hal-00865683v1.

[Hahn 2015] H. Hahn, Expressive Sampling Synthesis - Learning Extended Source–Filter Models from Instrument Sound Databases for Expressive Sample Manipulations, PhD thesis, University Paris 6 (UPMC), 2015, https://hal.archives-ouvertes.fr/tel-01263656v1.

[Oord 2016] A. van den Oord et al., WaveNet: A Generative Model for Raw Audio, https://arxiv.org/pdf/1609.03499.pdf, 2016.

[Roebel 1993] A. Roebel, Neural Models of Nonlinear Dynamical Systems and their Application to Musical Signals, PhD thesis, Technical University of Berlin, 1993.

[Roebel 2001] A. Röbel, Synthesizing natural sounds using dynamical models of sound attractors, Computer Music Journal, Vol 25 No 2, pp 46-61, 2001.

Remuneration: ~550 € / month + social benefits