The audio community witnessed a major shift from stereo to surround sound in the 1980s and, more recently, a transformation from surround sound to immersive sound over the past decade. Immersive sound, or “surround sound with height,” has introduced listeners to additional, angled height layers, offering extremely realistic reproductions of sound capable of enhancing listening experiences across all forms of entertainment. Now even consumers are familiar with formats like Dolby Atmos, DTS:X and Auro-3D.
There are two main schools of thought on how to most effectively reproduce 3D sound: Object-Based and Channel-Based encoding. Object-Based encoding is most frequently associated with the modern concept of immersive sound, where audio segments are encoded separately on a soundtrack and then rendered at playback according to a listener’s specific speaker configuration.
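To make the idea of rendering at playback concrete, here is a minimal Python sketch. It is an illustration under simplifying assumptions, not the renderer used by Dolby Atmos, DTS:X or any other format: the speakers are assumed to sit on a single horizontal ring, and the hypothetical function `render_object_gains` pans one audio object between the two speakers that bracket its direction using a constant-power law. The point is only that the same object produces different speaker feeds depending on the layout it meets at playback, whereas a Channel-Based mix already stores one finished signal per speaker.

```python
import math

def render_object_gains(object_azimuth_deg, speaker_azimuths_deg):
    """Return one gain per speaker for a single audio object.

    Hypothetical simplification: speakers sit on a horizontal ring, and the
    object is panned between the two speakers that bracket its azimuth
    using a constant-power (cosine/sine) law.
    """
    n = len(speaker_azimuths_deg)
    if n < 2:
        raise ValueError("need at least two speakers")

    # Visit speakers in angular order so neighbours in `order` are adjacent.
    order = sorted(range(n), key=lambda i: speaker_azimuths_deg[i] % 360)
    az = object_azimuth_deg % 360
    gains = [0.0] * n

    for k in range(n):
        a_idx, b_idx = order[k], order[(k + 1) % n]
        start = speaker_azimuths_deg[a_idx] % 360
        # Angular width of the gap between this speaker pair.
        span = (speaker_azimuths_deg[b_idx] - speaker_azimuths_deg[a_idx]) % 360 or 360.0
        offset = (az - start) % 360
        if offset <= span:
            frac = offset / span  # 0 at speaker a, 1 at speaker b
            gains[a_idx] = math.cos(frac * math.pi / 2)
            gains[b_idx] = math.sin(frac * math.pi / 2)
            return gains
    return gains

# The same object (straight ahead) rendered on two different layouts.
stereo = render_object_gains(0, [-30, 30])
five_speakers = render_object_gains(0, [0, 30, -30, 110, -110])
print(stereo)
print(five_speakers)
```

On the two-speaker layout the object is split equally between left and right, while on the five-speaker layout it lands entirely on the speaker at 0 degrees. The azimuth values used here are illustrative, not taken from any format specification.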
Both Dolby Atmos and DTS:X primarily use Object-Based audio to create that third dimension of audio, but that is not the only method of delivering dynamic sound to music, film or other media. In fact, many people don’t even realize that immersive sound can be produced using Channel-Based techniques only, which may in fact be more conducive to the production of music in 3D.
The Benefits of Channel-Based Encoding
Channel-Based encoding allows creators to produce music in the actual 3D space where listeners perceive sound, preserving its artistic intent. It achieves a natural spread of the sound energy without sacrificing the sound experience intended by creators – ultimately improving the emotional experience associated with listening.
Sound engineers are able to utilize each channel of the 3D speaker layout in a specific way that creates a stamp, unique to the sound at hand.
Three-dimensional audio formats using Object-Based techniques typically support speaker layouts with only two overhead speakers (like 5.1.2 or 7.1.2), and these layouts fail to recreate either a 3D space around the listener or sound in the way it’s experienced in reality.
The standard for all Auro-3D speaker layouts is a “vertical stereo field around the listener” permitting the reproduction of sound in a true 3D space.
Additionally, Channel-Based immersive sound maintains the “mastering process,” a key component in the workflow of music production. This is not possible with Object-Based formats, because the renderer at playback cannot reproduce that step.
Working Together for a Realistic Sound Experience
Most audio setups on the market include only two overhead speakers (including 5.1.2, 7.1.2 or 9.1.2 layouts), which cannot truly reproduce a 3D space. This limitation is to blame for the confusion surrounding Object-Based and Channel-Based audio.
Many incorrectly believe that Object-Based audio is the only approach to generating immersive sound experiences, despite the fact that Channel-Based methods are still widely used across the entertainment and music sectors. One example is the recent popularity of “High Resolution Audio” (e.g. music streaming services like Tidal, or Neil Young's portable PonoPlayer) at up to 96 kHz/24-bit, which is not possible with so-called “Object-Based” immersive sound formats.
While it’s important to know the differences between these two approaches, and the types of projects that each is best suited to produce, it’s equally imperative for the audio community to understand that the two are not mutually exclusive.
Believe it or not, all immersive sound formats on the market do incorporate both Object-Based and Channel-Based encoding in their own ways, making them “Hybrid” formats. Often the chosen approach is simply dependent on the medium at hand.
When it comes to digital cinema, Auro-3D uses Object-Based audio (up to 128 objects) based on the SMPTE standard, combined with a three-layer Auro 13.1 Channel-Based component. This immersive sound system is called AuroMax and is preferable when playback will occur across more than 20 individual speakers or amps, as is often the case in cinemas, which generally offer abundant, open spaces that are conducive to larger speaker layouts.
But while Object-Based encoding might be the preferred approach in digital cinema, the immersive experience is still impacted by the positioning of each sound layer, making it necessary to pay close attention to speaker setup.
Dolby Atmos, for example, places surround speakers in home theaters about 28 to 40 degrees above the audience. I would argue that these speakers should be placed around the commonly recommended 15-degree level, much closer to the audience’s ear level, where most natural sounds originate. Think about it: if the lowest speaker in a theater is placed far above head level, all sounds will be perceived from above, meaning sounds like a bike being pedaled or dogs barking will unrealistically appear to be flying over the audience.
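The difference between these angles is easier to picture as physical height. The short Python sketch below converts an elevation angle into how far above ear level a speaker sits, using basic trigonometry (height = horizontal distance × tan(elevation)). The function name `speaker_height_above_ear` and the 3 m listening distance are assumptions chosen for illustration; they do not come from any format's installation guidelines.

```python
import math

def speaker_height_above_ear(horizontal_distance_m, elevation_deg):
    """Height of a speaker above the listener's ear level, in metres.

    Assumes the distance is measured horizontally from the listener to the
    point directly below the speaker, and the elevation angle is measured
    at the listener's ear.
    """
    return horizontal_distance_m * math.tan(math.radians(elevation_deg))

# Hypothetical room: speaker 3 m away (horizontally) from the listener.
for angle in (15, 28, 40):
    height = speaker_height_above_ear(3.0, angle)
    print(f"{angle:>2} deg -> {height:.2f} m above ear level")
```

At this assumed distance, a 15-degree placement sits roughly 0.8 m above the ears, while 28 and 40 degrees correspond to roughly 1.6 m and 2.5 m, which illustrates why sounds from the higher placements are perceived as coming from overhead.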
Immersive sound has had a great influence on entertainment technology this past decade, and has likewise enhanced content to a level where it can be truly immersive. And while all advances in immersive technology are captivating, it’s important that we make efforts to understand how and when each type of sound experience can best be used. Being capable of employing both Object-Based and Channel-Based audio formats will ultimately be the key to captivating the audiences of the future.