Energy-based modeling and simulation for sound synthesis: application to a quasi-1D vocal apparatus - Questions from the jury

information

event: Thomas Risse's thesis defense
type: Thesis defense/HDR
performance location: Centre Georges Pompidou, Salle Triangle (Paris)
date: October 3, 2025

Thomas Risse's thesis defense

Thomas Risse, a doctoral student at the SMAER Doctoral School, conducted his research, entitled "Energy-based modeling and simulation for sound synthesis: application to a quasi-1D vocal apparatus", within the S3AM team at STMS (IRCAM, Sorbonne University, CNRS, Ministry of Culture) under the supervision of Thomas Hélie and the co-supervision of Fabrice Silva.

The jury is composed of:

Brad STORY, Reviewer
Paul KOTYCZKA, Reviewer
Michele DUCCESCHI, Examiner
Nathalie HENRICH BERNARDONI, Examiner
Yann LE GORREC, Examiner
David ROZE, Examiner

Summary:
This work addresses the energy-consistent modeling and simulation of the vocal apparatus. Voice production arises from the nonlinear interaction of multiple physical domains, including fluid dynamics, tissue mechanics, and acoustics. To capture this complexity while preserving energetic consistency, the port-Hamiltonian formalism is employed throughout the thesis. This framework allows the modular construction of individual components and their interconnection within a unified power-balanced system. Particular attention is given to the construction and selection of a suitable quasi-one-dimensional fluid model to represent airflow through the entire vocal apparatus, from the subglottal region to the lips.

Numerical simulations are conducted under various configurations, including isolated vocal tract and isolated auto-oscillating larynx setups, and the results are compared with reference data from the literature. The explicit power signals provided by the port-Hamiltonian formulation offer valuable insight into the energetic behavior of the voice production mechanism. Additionally, a linearized model of the vocal tract is developed as the basis for a real-time implementation. As a final demonstration, the full coupled model (larynx and vocal tract) is used to simulate the articulation of a diphthong under auto-oscillating conditions.

A second part of the work focuses on the development of efficient numerical methods for the stable simulation of port-Hamiltonian systems. In particular, a contribution is made to the scalar auxiliary variable (SAV) methods through the introduction of a novel stabilization term. The proposed approach mitigates the drift of the auxiliary variable, thereby improving the long-term stability of simulations. While direct application to the vocal apparatus model remains a goal for future work, the method is demonstrated on a nonlinear string model, showcasing its effectiveness.
A real-time implementation and a prototype control interface are also presented.
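
For readers unfamiliar with the term "power-balanced", the following minimal Python sketch illustrates the idea on a toy system. It is not the model or the numerical method of the thesis: it assumes a linear mass-spring-damper written in port-Hamiltonian form and integrates it with the implicit midpoint rule, a standard scheme that preserves the power balance exactly for quadratic energies. All function names and parameter values are hypothetical and chosen for illustration only.

```python
# Toy port-Hamiltonian system (illustrative assumption, not the thesis model):
# state (q, p), energy H = k*q^2/2 + p^2/(2*m)
#   dq/dt = p/m
#   dp/dt = -k*q - c*p/m + u      (u: input force, output y = p/m)
# Continuous power balance: dH/dt = -c*(p/m)^2 + (p/m)*u


def energy(q, p, m, k):
    """Stored energy H(q, p) of the mass-spring system."""
    return 0.5 * k * q * q + p * p / (2.0 * m)


def midpoint_step(q, p, u, dt, m, k, c):
    """One implicit-midpoint step, solved in closed form (2x2 linear system)."""
    a = dt / (2.0 * m)
    b = dt * k / 2.0
    d = dt * c / (2.0 * m)
    r1 = q + a * p
    r2 = (1.0 - d) * p - b * q + dt * u
    det = 1.0 + d + a * b
    q_next = (r1 * (1.0 + d) + a * r2) / det
    p_next = (r2 - b * r1) / det
    return q_next, p_next


def simulate(n_steps, dt, u_of_t, q0=1.0, p0=0.0, m=1.0, k=4.0, c=0.1):
    """Return the energy history and the discrete power-balance residuals."""
    q, p = q0, p0
    energies = [energy(q, p, m, k)]
    residuals = []
    for n in range(n_steps):
        u = u_of_t((n + 0.5) * dt)          # input sampled at the midpoint
        q_next, p_next = midpoint_step(q, p, u, dt, m, k, c)
        v_mid = (p + p_next) / (2.0 * m)    # midpoint velocity (output y)
        h0 = energy(q, p, m, k)
        h1 = energy(q_next, p_next, m, k)
        # stored power + dissipated power - supplied power (should vanish)
        residuals.append((h1 - h0) / dt + c * v_mid ** 2 - v_mid * u)
        energies.append(h1)
        q, p = q_next, p_next
    return energies, residuals


if __name__ == "__main__":
    energies, residuals = simulate(1000, 1e-3, lambda t: 0.0)
    print("max power-balance residual:", max(abs(r) for r in residuals))
```

With zero input, the stored energy can only decrease through the damper, and the residual stays at rounding-error level. For nonlinear energies such as those mentioned in the summary, the midpoint rule no longer guarantees this property, which is the kind of situation that dedicated schemes such as SAV methods are designed for.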