Perceived Loudness

2018.08.30 by Gaudio Lab

One of the trade-offs of our content-on-any-device-from-any-source culture is that volume levels can be unpredictable.  Actually, volume isn’t quite the right word: a better way to talk about this problem is “perceived loudness”.


No matter what we call it, the issue is that many people feel they have to ride the volume controls, raising them for indistinct dialog scenes or engaging fast-twitch thumb muscles to lower them for loud commercials. We’re involuntarily recruited as sound mixers with just a two-button fader!

On a service like Pandora, the famous algorithm is pulling content produced from diverse sources with very different frequency and dynamic ranges. On Vimeo, a quiet narrative might be followed by an in-your-face music video.


On top of this, the devices we use to consume music and video have distinct speaker profiles that significantly influence perceived loudness. If you’ve ever gone from bass-heavy music that sounded amazing in the car to the same music on your phone’s built-in speakers, you know what I’m talking about.


But there’s good news: technologies are emerging that can smooth volume levels and compensate for speaker inadequacies while preserving the contours of the original sound mix. Those worn-out volume buttons may finally get a break.

The Spatial Audio Decode: Part 1

As immersive media and technologies continue to expand, there’s no doubt that spatial audio has gained popularity across a cross section of industries. Whether you’re a developer, a content producer, or someone looking to bring cutting-edge audio technology into an app or enterprise solution, you have probably spent a little time trying to figure out what spatial audio is exactly. If you’re new to this world, a few Google searches on spatial audio might land you in the trenches of dense research papers and overwhelming terminology. Fortunately, Gaudio Lab is dedicated to boiling down and simplifying this information to help us better understand 360 audio technology and production, as well as how it can enrich the quality of your projects.

Before digging in, let’s address the most important question: “What is spatial audio and what does it actually do?” Although the answer may seem apparent, it might still yield a different response from every person you speak to. A big reason for this is that spatial audio can be used across so many different mediums. Is it intended for virtual reality? Video games? Enhanced music listening? Hollywood movies? The list grows faster than our ability to catch up with spatial audio’s potential.

Regardless of its uses, all aspects of spatial audio converge on one basic idea: using technology to recreate our natural auditory perception in a way that can be heard on any listening device. Through some fascinating technology (which we will cover in this blog series), we can trick our senses into believing we are in the middle of any auditory world, whether it is real or designed. Although traditional stereo and multichannel audio recordings have aimed to do this over the last half century, spatial audio brings new perceptual dimensions to the party that have not been fully realized until recently.

While “immersing” yourself in this world, keep in mind that there are a few top-level concepts that are useful to know and don’t require a Ph.D. in applied physics to understand. That being said, the core of what makes spatial audio so intriguing is how it’s largely based on our perception.

The Listening Position

The most common question we get about spatial audio from newcomers is “Oh yeah, isn’t that like surround?” And in some ways, it is. Spatial audio is all about the ability to “localize” a sound: with your eyes closed, you could imagine certain sounds emanating from a particular direction and distance.

For years, mixing engineers have tried to create this depth of field in traditional stereo and surround recordings through different techniques. At its most basic, this can be achieved by panning, where a particular sound in a mix is made louder or quieter in a specific audio channel. For example, if a guitar in a stereo song was “panned” to the left, its amplitude would increase in the left channel and decrease in the right, making the guitar appear to be coming from the left speaker. In 5.1 surround, there are even more loudspeakers to distribute a sound to, enabling the listener to localize sounds coming from both the front and the rear on the horizontal plane. Although this gives the approximate location of a sound source in greater detail than stereo, it still doesn’t provide the very important vertical axis of height.

It’s good to think of sound at your listening position in terms of spatial axes like X, Y and Z. Typically speaking, X represents sounds coming from the front and back, Y represents the left and right, and Z represents sounds from above and below. Though it may look like a trip down a blurred memory lane of mostly forgotten math classes, visualizing the relationship between sound and the listener this way is very helpful for understanding how spatial audio works.
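The panning and axis ideas above can be put into a few lines of Python. This is only a rough sketch under stated assumptions, not a description of any particular mixer or of Gaudio Lab's technology: the function names `pan_gains` and `direction_to_xyz` are made up for illustration, the pan law used is the common constant-power (sine/cosine) law, and the sign convention (positive X in front, positive Y to the left, positive Z above) is an assumption, since the text only names the axes.

```python
import math


def pan_gains(pan):
    """Return (left_gain, right_gain) for a pan position in [-1.0, 1.0].

    -1.0 is fully left, 0.0 is center, +1.0 is fully right.
    Uses a constant-power (sine/cosine) pan law, chosen here as an
    assumption; other pan laws exist.
    """
    if not -1.0 <= pan <= 1.0:
        raise ValueError("pan must be between -1.0 and 1.0")
    # Map pan from [-1, 1] onto an angle from 0 to pi/2.
    angle = (pan + 1.0) * math.pi / 4.0
    return math.cos(angle), math.sin(angle)


def direction_to_xyz(azimuth_deg, elevation_deg, distance=1.0):
    """Convert a direction around the listener to X, Y, Z coordinates.

    Assumed convention (hypothetical, for illustration only):
    X = front(+)/back(-), Y = left(+)/right(-), Z = above(+)/below(-).
    Azimuth is measured from straight ahead, positive toward the left.
    """
    az = math.radians(azimuth_deg)
    el = math.radians(elevation_deg)
    x = distance * math.cos(el) * math.cos(az)
    y = distance * math.cos(el) * math.sin(az)
    z = distance * math.sin(el)
    return x, y, z


# A guitar panned halfway to the left is louder in the left channel.
left, right = pan_gains(-0.5)
guitar_sample = 0.8
stereo_frame = (guitar_sample * left, guitar_sample * right)
```

With this pan law, `pan_gains(0.0)` gives roughly 0.707 for both channels, so the total power stays constant as a sound moves across the stereo field. Note that nothing in `pan_gains` can express height: that only appears once a source is described with all three axes, as in `direction_to_xyz`.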
Another complication in using surround configurations as an accurate representation of 360 degrees of sound is that they depend on so many variables for the end user.

First of all, not everyone has surround in their home, and even for stereo, many people either don’t have the speakers placed correctly or aren’t listening in the coveted “sweet spot” (the optimal position to hear sound the way the audio engineer intended during content creation). The accuracy of audio through surround speakers is largely dependent on the position of the speakers in a room, as well as the layout of the room. The idea that more elaborate loudspeaker configurations will solve these problems certainly exists, although this isn’t always the most practical approach for the average consumer and still binds the listener to a particular room.

Applications of Spatial Audio

This leads us to another broad and basic question for spatial audio: “How do we consume it?” The potential applications are virtually limitless, but what has brought spatial audio into the spotlight is its use in virtual and augmented reality. Whether using a head-mounted display or just viewing 360 videos on your desktop, the most common way to experience spatial audio in VR is through headphones.

Although it seems impossible to hear sounds coming from behind, above or below you out of two tiny speakers wrapped around your ears, the core technologies in spatial audio make this a reality. Being able to have these immersive experiences over something as ubiquitous as headphones is great news for any consumer, as the vast majority of people already have access to them.

Spatial audio technology is, of course, not limited to headphones. Keeping in mind the issues with loudspeakers stated above, there are very effective applications for spatial audio in sound bars. Compact, affordable and effortless to position, sound bars provide a great way for spatial audio to enter the realm of Hollywood films, music and games.

What makes this such an exciting time for spatial audio is that its uses are growing across such a wide array of fields, even beyond the scope of entertainment. Teleconferencing, for instance, can be improved by having auditory directionality. Before a new music stage or recording studio is built, virtualized acoustic spaces can use spatial audio to let you hear the sound of an instrument in the room ahead of its construction. There is even research on the benefits of spatial audio as a navigation aid for pilots. As you can see, the possibilities for spatial audio are as endless as the real world that we are trying to emulate, and we at Gaudio Lab are excited to be at the forefront of the industry.

Stay tuned for Part 2 of my series, which takes a deeper look into how spatial audio works and the technology that makes it all possible.

NFL Season is Here: Fans Debate Video and Sound Quality on Streaming TV vs. Broadcast

For many football fans this NFL season, the decision to keep paying for cable instead of switching to streaming TV is as much about the quality of experience as it is about access to the games they want to watch. While streaming services offering live TV bundles, such as Sling TV, are less expensive and offer many of the same channels as cable, the quality of the video and audio may not be as good as cable.

With streaming, frame rates are lower than cable for sports and news channels, and audio consistency as well as surround sound are often nonexistent. Thus, the debate continues among fans this year whether to cut the cord or wait another season for streaming to catch up on sound and video quality.

A Focus on Improving Video Latency

Live streaming video gets chopped up into chunks that are delivered over the open internet. Larger chunks can cause latency and lower frame rates, but work is already underway to speed up video delivery. The content delivery network Akamai announced plans last year to process smaller chunks of video more quickly, and the real-time communication standard WebRTC could also improve latency as it becomes more widely adopted. We may even see video delivery improvements during the current NFL season, but next season will almost certainly bring higher quality.

Catching Up on Audio Quality

As latency and video quality improve, we should expect streaming platforms to begin prioritizing audio quality, specifically loudness management. The Gaudio Sol Loudness SDK can provide streaming TV with an improved audio experience for viewers by smoothing variations in loudness, providing continuity of the perceived sound level between streaming content programs.
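To make “smoothing variations in loudness” between programs a little more concrete, here is a minimal Python sketch of the general idea: measure how loud each program is, then apply a gain that brings it to a common target. It is an illustration only and does not represent the Sol Loudness SDK or its API. All function names are hypothetical, the measurement is a plain RMS level in dBFS rather than a perceptual loudness measure such as the one defined in ITU-R BS.1770, and the target of -24 dB is an arbitrary illustrative value.

```python
import math


def rms_dbfs(samples):
    """Return the RMS level of float samples (range -1.0..1.0) in dBFS.

    Plain RMS is a simplified stand-in for a perceptual loudness
    measurement; it is an assumption made to keep the sketch short.
    """
    if not samples:
        raise ValueError("samples must not be empty")
    mean_square = sum(s * s for s in samples) / len(samples)
    if mean_square == 0.0:
        return float("-inf")  # digital silence
    return 10.0 * math.log10(mean_square)


def gain_to_target(measured_db, target_db):
    """Linear gain that moves a measured level to the target level."""
    return 10.0 ** ((target_db - measured_db) / 20.0)


def normalize(samples, target_db=-24.0):
    """Return a gain-adjusted copy of samples; the input is left untouched.

    No limiting is applied, so a real system would also need to guard
    against clipping when the gain is greater than 1.
    """
    measured = rms_dbfs(samples)
    if measured == float("-inf"):
        return list(samples)  # nothing to scale
    gain = gain_to_target(measured, target_db)
    return [s * gain for s in samples]


# A quiet narrative followed by a loud commercial (toy square waves).
quiet_program = [0.05, -0.05] * 100
loud_program = [0.5, -0.5] * 100

leveled_quiet = normalize(quiet_program)
leveled_loud = normalize(loud_program)
```

After normalization, both toy programs measure the same level, so the viewer would not need to reach for the volume buttons between them. Because `normalize` returns a new list, the original samples are never modified.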
The Sol Loudness SDK uses a unique architecture in which the server side performs the loudness measurement and generates metadata that the client side uses to normalize content to a target loudness setting. Other loudness management software solutions employ a legacy “file-based” approach that destructively modifies the original content and typically produces distortion. Advantages of the server-client solution include the opportunity to set loudness targets per platform, end-user device, or even listening environment. Video streaming services offering live sports programming can leverage Gaudio’s loudness management software to provide a higher quality audio experience for viewers, leading to increased subscriber retention.

The OTTs are catching up to cable in quality and providing subscribers with award-winning, binge-worthy content together with live TV. As the OTT transmission and video technologies solidify and mature, attention will shift to providing higher quality audio experiences. The OTTs that focus on solving loudness management issues will have more sports fans cutting cords and making the jump to subscriptions.

Stay tuned for more updates on how audio technology can enhance the streaming media experience. To learn more about Gaudio’s Sol Loudness SDK, please contact us!