DIStrib: Disrupting Immersive Sound With Distributed FPGAs

DIStrib is a 42-month project funded by the Agence Nationale de la Recherche (ANR). It will officially begin in October 2025. For any questions regarding this project, feel free to reach out to Romain Michon (romain_dot_michon_at_inria_dot_fr).

Project Overview

Interest in spatial audio has been booming in recent years. A growing number of movie theaters, concert halls, Virtual Reality (VR) platforms in museums, amusement park attractions, etc. are equipped with advanced spatial audio systems involving a large number of speakers. The automotive industry has also recently shown interest in spatial audio for active noise cancellation.
Interactive applications (e.g., virtual acoustics, soundscape rendering in VR, noise canceling, speaker correction, speech intelligibility enhancement) involve real-time operations and the ability to reprogram or customize the system. In that context, managing a large number of individual audio channels requires a tremendous amount of computational power and very high bandwidth, which current systems fail to provide.
Such systems are normally implemented with a centralized software-based approach: a powerful computer connected to one or more audio interfaces providing a limited number of audio outputs. In that case, the bottleneck is the computer’s throughput, and hence its ability to manage a large number of audio streams in parallel while potentially applying computations to each of them.
The goal of DIStrib is to rethink the way we approach spatial audio systems by relying on a distributed computing approach leveraging the computational power of large Field-Programmable Gate Arrays (FPGAs). In this system, each FPGA is in charge of computing the sound of a limited set of speakers. In turn, the distributed approach allows for a very large number of speakers to be targeted. To reach this goal, multiple challenges must be tackled, ranging from transmitting audio streams to a large number of audio devices with perfect synchronicity to running complex audio Digital Signal Processing (DSP) algorithms on FPGAs. We believe that such a system has the potential to be highly disruptive in the field of spatial audio. DIStrib is also an opportunity to explore the potential of artificial intelligence in the context of immersive sound and virtual acoustics rendering by relying on emerging platforms such as embedded Neural Processing Units (NPUs).
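
To give a rough sense of the scale involved, the sketch below computes the raw (uncompressed) output bandwidth of a speaker array and splits the speakers into per-FPGA groups. It is only an illustration, not a description of the DIStrib design: the sample rate (48 kHz), the bit depth (24 bits), the speaker count (1024), the group size (32), and the function names are all assumptions chosen for the example. It also ignores protocol overhead and any input streams each node may need.

```python
# Illustrative arithmetic only: every numeric value below is an assumption,
# not a DIStrib specification.

SAMPLE_RATE_HZ = 48_000   # assumed sample rate
BITS_PER_SAMPLE = 24      # assumed bit depth


def raw_bandwidth_bps(channels, sample_rate_hz=SAMPLE_RATE_HZ,
                      bits_per_sample=BITS_PER_SAMPLE):
    """Raw PCM bandwidth in bits per second, without protocol overhead."""
    return channels * sample_rate_hz * bits_per_sample


def partition_speakers(num_speakers, speakers_per_node):
    """Split speaker indices into contiguous groups, one group per node."""
    if num_speakers < 0 or speakers_per_node <= 0:
        raise ValueError("need num_speakers >= 0 and speakers_per_node > 0")
    return [
        list(range(start, min(start + speakers_per_node, num_speakers)))
        for start in range(0, num_speakers, speakers_per_node)
    ]


if __name__ == "__main__":
    num_speakers = 1024      # hypothetical array size
    speakers_per_node = 32   # hypothetical number of speakers per FPGA

    groups = partition_speakers(num_speakers, speakers_per_node)
    total_mbps = raw_bandwidth_bps(num_speakers) / 1e6
    node_mbps = raw_bandwidth_bps(speakers_per_node) / 1e6

    print(f"{len(groups)} nodes")
    print(f"total output: {total_mbps:.3f} Mbit/s")
    print(f"per node:     {node_mbps:.3f} Mbit/s")
```

Under these assumptions, a centralized machine would have to produce about 1.18 Gbit/s of output audio on its own, while each node in the distributed layout would output about 36.9 Mbit/s for its own speakers.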