Spatialized Dynamic Audio

A well-implemented spatialized audio system contributes to a more immersive, natural-feeling experience. More than you might think!

Although a naive user most likely will not consciously notice a well-implemented spatialized audio system, it will very likely still contribute to a more immersive, natural-feeling experience. In VR, where users typically wear some sort of headphones, a good spatialized audio implementation is even more important: the binaural setup and the constantly changing individual viewpoint and listening direction of a VR user are a perfect fit for spatialized audio and the resulting immersive experience.

The audio system in Gooze is based on the ONSP framework (v1.29.0) from Oculus. Correctly configured, the framework already handles audio spatialization concerns such as head-related occlusion, direct sound, reflections, reverb and dynamic room modelling. To add spatialized sound effects, e.g. for object collisions, a custom dynamic audio system was implemented on top of it for Gooze. Building on a working ONSP setup, FTAudioManager creates a pool of reusable, automatically ad-hoc configured and positioned audio sources (see pink highlight in the hierarchy in the image). The initial pool size can be adjusted and, if needed, the pool can grow dynamically until an absolute limit is reached (see pink highlight in FTAudioManager). A pool of reusable audio sources, instead of constantly creating and destroying them, was chosen to minimize the performance cost of the system.
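The pooling idea can be sketched in a few lines. The actual FTAudioManager is Unity C#; the following is a language-agnostic Python illustration with hypothetical names and limits, not the real implementation:

```python
# Illustrative sketch of a grow-on-demand audio source pool.
# All names, sizes and limits here are assumptions for demonstration.

class AudioSourcePool:
    def __init__(self, initial_size=8, max_size=32):
        self.max_size = max_size
        # Pre-create reusable source slots up front, so playback does not
        # have to allocate (or instantiate GameObjects) in the common case.
        self.sources = [{"id": i, "busy": False} for i in range(initial_size)]

    def acquire(self):
        # Prefer an idle source from the pool.
        for src in self.sources:
            if not src["busy"]:
                src["busy"] = True
                return src
        # Grow the pool on demand, but only up to the absolute limit.
        if len(self.sources) < self.max_size:
            src = {"id": len(self.sources), "busy": True}
            self.sources.append(src)
            return src
        return None  # pool exhausted; caller must reuse an active source

    def release(self, src):
        # Once playback finishes, the source becomes available again.
        src["busy"] = False
```

The trade-off is typical for object pooling: a small amount of memory is held permanently in exchange for avoiding per-effect allocation and destruction costs at runtime.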

By invoking a specific method on the FTAudioManager singleton, the manager is instructed to automatically acquire an audio source GameObject from the pool: preferably one that is not in use, or otherwise the one that has been playing the longest among the currently active audio sources. The selected audio source is then configured with individual settings and positioned in three-dimensional space, and the respective audio clip starts to play. If it is not re-assigned prematurely, the source is treated as inactive and ready for re-use once playback has finished.
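The selection rule described above (prefer an idle source, otherwise steal the longest-playing one) can be expressed as a small function. Again, this is a hypothetical Python sketch of the logic, not Gooze's actual C# code:

```python
# Illustrative source-selection logic: idle sources win; if every source
# is busy, the one with the oldest playback start time is reclaimed.
# The "busy"/"started_at" fields are assumed names for this sketch.

def pick_source(sources):
    idle = [s for s in sources if not s["busy"]]
    if idle:
        return idle[0]
    # All busy: stealing the oldest effect is the least noticeable option,
    # since it is closest to finishing anyway.
    return min(sources, key=lambda s: s["started_at"])
```

Reclaiming the oldest active source keeps the pool bounded even under bursts of collisions, at the cost of occasionally cutting off an almost-finished effect.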

When a movable FTInteractiveObject collides with the level geometry or another object, the above process is invoked. The speed of the collision is used to calculate the audio effect's desired volume and pitch. This, together with the FTInteractiveObject's individual audio configuration and a randomly selected audio clip from a pool of pre-configured collision effects (see pink highlight), is sent to the FTAudioManager.
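One plausible way to map collision speed to volume and pitch, plus the random clip selection, is sketched below. The curve, the speed normalization and the pitch range are assumptions for illustration; Gooze's actual values and tuning may differ:

```python
import random

# Hypothetical mapping from collision speed to playback parameters.
# max_speed and pitch_range are illustrative tuning values.

def collision_effect(speed, clips, max_speed=5.0, pitch_range=(0.9, 1.1)):
    # Normalize the impact speed into [0, 1] and clamp it.
    t = max(0.0, min(speed / max_speed, 1.0))
    # Harder hits play louder and slightly higher-pitched.
    volume = t
    pitch = pitch_range[0] + t * (pitch_range[1] - pitch_range[0])
    # Pick one of the pre-configured collision clips at random
    # to avoid audible repetition.
    clip = random.choice(clips)
    return clip, volume, pitch
```

A linear mapping is the simplest choice; a perceptually tuned curve (e.g. squaring `t`) would emphasize the difference between light taps and hard impacts even more.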

The implemented audio system thus simulates spatialized audio in a performant way, including sound reflections (and optionally reverb). Additionally, e.g. in the Gooze demo, when moving your head from the main room into the corridor, the system dynamically adjusts its internal audio environment, which in turn adjusts the sound of audio effects accordingly, creating a more believable virtual environment. This is further enhanced by not only triggering audio effects precisely at object collision points, but also adjusting their volume and pitch in relation to the collision speed. In other words, you can actually hear where a sound is coming from, audio effects sound according to the room the player is in, and if you just slightly tap the metal door with another object, you hear a gentle bing, whereas if you pound it hard, you hear a loud BONG. Everything in between is possible without the need for multiple pre-rendered audio files.