What is the best emerging 3D audio technology format?
In recent years, new audio technologies have changed the way audio is mixed and processed. From the big screen to the home entertainment center, 3D audio is becoming more prevalent and accessible. Maybe you’re considering using immersive audio for your next project. However, which audio tech is right for your use case?
If you make the wrong investment, you could end up wasting valuable time, energy, and capital. Most growing app and game development teams don’t have resources to spare on ineffective software solutions. For this reason, investing in the right solution is critical.
In this article, we’ll cover the latest 3D audio technology solutions. After reading, you should have a clear understanding of the features and capabilities of the most advanced 3D audio technology currently available on the market.
An Introduction to 3D Audio Technology
The increased adoption of 3D audio technology comes as no surprise. In recent years, virtual reality and augmented reality have helped pave the way for 3D audio. As computer processing gets faster and cheaper, there is a growing emphasis on building immersive and realistic audio experiences.
3D audio isn’t just for hardcore gamers, app aficionados, or virtual reality enthusiasts. People can enjoy it with any wired headphones now. Plus, products like AirPods Pro and AirPods Max are taking 3D audio to a whole new level because they track head movement, so the sound stays anchored relative to the position of the listener’s head. For example, if I hear someone talking in my right ear and rotate my head 180 degrees, I will hear that person in my left ear.
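The head-rotation example can be sketched in a few lines of Python. This only illustrates the geometry, not how any particular headphone implements head tracking. The function names and the angle convention (0 degrees straight ahead, positive angles to the listener's right) are assumptions made for this sketch.

```python
def relative_azimuth(source_deg: float, head_yaw_deg: float) -> float:
    """Direction of the source relative to where the head is facing.

    Assumed convention: 0 = straight ahead, positive = to the right.
    The result is wrapped into the range (-180, 180].
    """
    rel = (source_deg - head_yaw_deg) % 360.0
    return rel - 360.0 if rel > 180.0 else rel


def louder_ear(rel_deg: float) -> str:
    """Which ear the source is closer to for a given relative direction."""
    if rel_deg % 180.0 == 0.0:  # directly ahead or directly behind
        return "both"
    return "right" if rel_deg > 0 else "left"


# A voice at the listener's right (90 degrees), before and after a half turn.
print(louder_ear(relative_azimuth(90.0, 0.0)))    # right
print(louder_ear(relative_azimuth(90.0, 180.0)))  # left
```

The voice itself never moves. Only the head yaw changes, which is why a head-tracked renderer has to recompute the relative direction whenever the listener turns.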
So, how do you create 3D audio? Before we begin, let’s discuss how 3D audio works.
3D Audio Technology Primer
3D audio software allows you to manipulate sounds anywhere in a three-dimensional environment, both horizontally and vertically. For example, if you want to place a chirping bird in a tree, you can.
Additionally, some technologies even let you simulate the unique acoustics of any space (indoor and outdoor), so sounds bounce off the walls, ceiling, and floor just as they would in the real world. These reflections reshape the waveform. On top of that, a sound rarely reaches both ears at the same instant: it hits the nearer ear slightly before the farther one.
The way our ears receive sound from a point in space is described by a head-related transfer function, or HRTF. An HRTF accounts for the shape and size of our ears and head, the distance to the sound source, and the direction the sound comes from.
The challenge has been to recreate these audio experiences accurately. The technology approximates a typical head and ears, so it cannot be truly accurate for everyone, since each person has a different body shape and head size. Even so, new advancements in technology have made spatial audio easier and easier to implement.
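To give a feel for the delay between the two ears, here is a minimal sketch based on Woodworth's spherical-head formula. It is a textbook approximation, not a full HRTF, and it is not taken from any of the products discussed in this article. The head radius is an assumed average, which is exactly the kind of "typical head" simplification described above.

```python
import math

HEAD_RADIUS_M = 0.0875      # assumed average head radius
SPEED_OF_SOUND_M_S = 343.0  # in air at roughly 20 degrees Celsius


def interaural_time_difference(azimuth_deg: float) -> float:
    """Approximate delay in seconds between the two ears.

    Uses Woodworth's spherical-head formula,
    ITD = (r / c) * (theta + sin(theta)),
    for sources in front of the listener (-90 to 90 degrees).
    Positive results mean the sound reaches the right ear first.
    """
    if not -90.0 <= azimuth_deg <= 90.0:
        raise ValueError("this sketch only covers -90 to 90 degrees")
    theta = math.radians(abs(azimuth_deg))
    itd = (HEAD_RADIUS_M / SPEED_OF_SOUND_M_S) * (theta + math.sin(theta))
    return math.copysign(itd, azimuth_deg)


for angle in (0, 30, 90):
    delay_us = interaural_time_difference(angle) * 1e6
    print(f"{angle:>3} degrees: {delay_us:.0f} microseconds")
```

With these assumed constants, a sound directly to one side arrives at the far ear well under a millisecond late. The gap is tiny, but the brain relies on it to judge direction.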
Dolby Atmos is a high-quality sound format initially designed for movie theaters. Surround sound, like 5.1 and 7.1, is channel-based and creates the illusion of 3D audio by sending audio to specific channels like left, right, center, etc.
Instead of using channels, Dolby Atmos is object-based, meaning it allows the engineer to send audio to a specific spot in a 3D space. The addition of height or overhead speakers makes it possible to position sounds vertically above a listener.
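The difference between channel-based and object-based audio is easier to see in code. The sketch below is not Dolby's renderer or API. It uses simple inverse-distance panning as a stand-in, and `AudioObject`, `speaker_gains`, and the speaker layout are hypothetical names invented for illustration. The point is that the object carries a position, and the gains for each speaker are worked out at playback time for whatever layout is present.

```python
import math
from dataclasses import dataclass
from typing import Dict, Tuple

Position = Tuple[float, float, float]  # x (right), y (front), z (up), in metres


@dataclass
class AudioObject:
    name: str
    position: Position  # metadata that travels with the sound


def speaker_gains(obj: AudioObject, speakers: Dict[str, Position]) -> Dict[str, float]:
    """Inverse-distance panning: closer speakers get more of the object."""
    weights = {
        label: 1.0 / max(math.dist(obj.position, pos), 0.1)
        for label, pos in speakers.items()
    }
    # Normalise so the total power (sum of squared gains) is 1.
    norm = math.sqrt(sum(w * w for w in weights.values()))
    return {label: w / norm for label, w in weights.items()}


# Hypothetical layout: two ear-level speakers plus one overhead.
layout = {
    "left": (-2.0, 2.0, 0.0),
    "right": (2.0, 2.0, 0.0),
    "overhead": (0.0, 0.0, 2.5),
}
helicopter = AudioObject("helicopter", position=(0.0, 0.5, 2.0))

for label, gain in speaker_gains(helicopter, layout).items():
    print(f"{label}: {gain:.2f}")
```

In a channel-based mix, the engineer would have baked those gains into fixed left, right, and overhead tracks. Here the same object could be re-rendered for a different room simply by passing a different `layout`.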
There’s a broad spectrum of Dolby Atmos-enabled home theater gear out there. Those not willing to install overhead speakers can purchase Dolby Atmos-enabled soundbars that bounce sound off the ceiling and reflect it back into the room to recreate a 3D soundscape. Dolby Atmos is even available on headphones, where the object metadata is used to position sounds in a 360-degree space on any pair of headphones.
Producing Dolby Atmos content requires a combined hardware and software setup. Depending on whether you’re mixing feature films, game projects, or home theatre projects for Netflix, you have a few options.
You’ll need a renderer and software that runs inside your digital audio workstation. Renderers come in three forms:
- Dolby Rendering and Mastering Unit (RMU)
- Dolby Atmos Mastering Suite
- Dolby Atmos Production Suite
Most people are familiar with the Dolby name, so Dolby Atmos is becoming the most popular of these formats. Best Buy even has demo rooms where you can experience a complete setup using the technology.
Auro-3D is a channel-based audio format that creates 3D audio using a three-layered approach to sounds. Its audio is lossless, carried as uncompressed PCM. It offers significantly better audio resolution for its height channels than Atmos, which uses a lossy format.
Think of it as an advanced surround sound format that adds height speakers to create a sphere of sound around a listener. Typical home theatre formats are Auro 9.1, 10.1, and 11.1. The cinema version of Auro-3D is AuroMax, which can also encode audio objects in a mix. The three layers are:
- Top layer - directly above a listener; can be a single speaker or several.
- Height layer - the dominant layer, placed about 40 degrees above the lower layer. It captures natural reflections and improves the spatialization of sounds (identifying where they're coming from). This layer helps the listener pinpoint the location of a sound, like a jet flying overhead.
- Lower layer - the ear-level layer, with speakers placed at about 0 to 20 degrees of elevation. It's the horizontal plane of sound, grounding the mix with essential sounds like dialogue.
There's an ongoing debate as to whether Auro-3D or Dolby Atmos is better. Auro-3D is certainly less popular than Dolby Atmos: as of this blog post, there are roughly 30 movie releases that use it.
Although Auro-3D is not fully three-dimensional in the way Atmos is, its additional height channels may make it a better option for music audiophiles or engineers looking to create the highest fidelity audio experience.
DTS:X is another object-based audio codec, like Atmos. It started in the home theatre space and made its way into movie theaters. The result is similar: realistic sounds that can move anywhere in a space.
DTS:X can work with existing surround sound stems and doesn’t require a specific setup. It’s also an open-source, multi-dimensional audio platform. Like Auro-3D, it supports a higher audio resolution. With DTS:X, you can have a more flexible speaker system: unlike Dolby Atmos, it doesn’t require a specific number of speakers, so you can arrange your system however you see fit.
From a mixing standpoint, it has the edge over Atmos. You can manually adjust each sound object, so if you wanted to boost the dialogue, you could raise it on its own instead of raising the entire center channel’s volume. The open system and flexible speaker setup make DTS:X a more compatible audio codec than Atmos, but the increase in quality is mostly imperceptible to the average listener.
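Per-object adjustment can be illustrated with a small sketch. This is not the DTS:X toolchain or any real codec API. The function names and sample data are made up, and the spatial rendering step is left out entirely so that only the idea of a per-object gain remains.

```python
from typing import Dict, List


def db_to_linear(gain_db: float) -> float:
    """Convert a gain in decibels to a linear amplitude factor."""
    return 10.0 ** (gain_db / 20.0)


def mix_objects(objects: Dict[str, List[float]], gains_db: Dict[str, float]) -> List[float]:
    """Sum equal-length sample lists, applying an optional per-object gain."""
    length = len(next(iter(objects.values())))
    mixed = [0.0] * length
    for name, samples in objects.items():
        gain = db_to_linear(gains_db.get(name, 0.0))  # 0 dB = unchanged
        for i, sample in enumerate(samples):
            mixed[i] += gain * sample
    return mixed


# Hypothetical scene: three objects, three samples each.
scene = {
    "dialogue": [0.10, 0.20, 0.10],
    "music": [0.30, 0.30, 0.30],
    "effects": [0.05, 0.00, 0.05],
}

flat = mix_objects(scene, {})
# Raise only the dialogue by 6 dB; music and effects are left alone.
boosted = mix_objects(scene, {"dialogue": 6.0})
print(flat)
print(boosted)
```

In a channel-based mix, the dialogue would already be summed with everything else in the center channel, so the only available control would be the channel's overall level.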
Sony’s 360 Reality Audio is also object-based, but its primary audience is music creators. It’s built on an open audio standard for music streaming. Sony is partnering with major record labels and streaming services to make the format more readily available to music lovers.
Perhaps the best thing about 360 Reality Audio is that you don’t need any additional hardware to make it work. You can listen with any pair of headphones.
There are hardware options out there if you prefer listening to music through speakers. To play 360 Reality Audio, a hardware unit needs Sony’s custom decoder.
Two Sony speaker models are currently available: the SRS-RA3000 and the SRS-RA5000. Among third-party devices, it’s available on the Amazon Echo Studio. The number of third-party devices should increase thanks to the open nature of the format.
Music producers can install the 360 Reality Audio Creative Suite in their digital audio workstation (DAW) to place and move sounds in a 360-degree sonic field.
The 360 Reality Audio encoder renders audio files for music streaming services that are compliant with MPEG-H 3D Audio. Tidal, Deezer, Amazon Music HD, and nugs.net currently support the format. The plan is to add video streaming capabilities, with the aim of recreating the feel of live performances through videos that use 360 Reality Audio.
Taking 3D Audio Technology Further
As you can see, there are multiple ways to create 3D audio. Across all of these technologies, the end result is broadly the same: sound that is positioned in three-dimensional space around the listener. In this scenario, asking which technology is better isn’t the right question. Instead, it all depends on what you’re trying to achieve and where the rendered audio will be used.
However, creating pre-recorded 3D audio is only one side of the story. Recently, there’s been a push for better audio quality in real time. It turns out there’s actually an audio solution for Zoom fatigue. It’s challenging to hear and process sounds on Zoom, or at the very least exhausting, right? What if Zoom supported spatial audio? 3D audio improves the intelligibility of each voice, making for better real-time communication because it’s similar to how we process voices when people are speaking in a room together.
Using High Fidelity’s real-time Spatial Audio API is a great way to bring high-quality 3D audio to your app, game, or streaming platform. How will you bring next-generation audio to your project?