Gaudio Lab backend engineers: The people behind the service

2025.08.07ㆍ by Heidi

 

Gaudio Lab Backend Engineers

 

 

The team that brings AI Audio technology to users

 

Turning technology into a service, and delivering it to users—today, we’re sharing the story of two developers from Gaudio Lab’s SNA (Service and Apps) squad, Paul and Johnny. Having worked closely together for years, they continue to write a new chapter at Gaudio Lab, building meaningful services powered by AI audio technology. You might not see them on the surface, but these two are the hands behind the service—where the listening experience truly begins. This is the story of two developers quietly building behind the scenes.

 

Q: Hi! Could you briefly introduce yourselves and tell us about your roles on the SNA squad?

 

Paul: Hi, I’m Paul, a backend engineer on the SNA squad. I work on building web-based services that deliver Gaudio Lab’s AI audio technology to end users. Right now, I’m developing a product called Gaudio Developers, a developer-facing counterpart to Gaudio Studio that lets customers explore and apply our latest audio technology through OpenAPI.

 

Gaudio Lab Backend Engineer

Paul and his kids at the Hey MA&PA event

 

 

 

Johnny: I’m a DevOps and platform engineer. Simply put, I make sure the programs built by developers are delivered to users smoothly by automating the middle layers and improving efficiency. I also work on improving our internal development environment and culture so our team can work more effectively. These days, I’m working with Paul on Gaudio Developers, and I also help operate Gaudio Sing.

 

Gaudio Lab Backend Engineer Johnny

 

 

 

What makes working at Gaudio Lab special

 

Q: What was your first impression of Gaudio Lab? Compared to your previous workplaces, how is the team culture or way of working different?

 

Paul: The biggest difference is that we’re empowered to take ownership of our work. Within the resources we’re given, we have the flexibility to manage our schedules and participate directly in everything from system design to decision-making. Because of that, I feel more attached to the projects I work on and more driven to see them through to the end.

 

Johnny: In my previous companies, failure just wasn’t an option. The focus was so heavily on results that people became more conservative when choosing technologies or building new features. It was often like, “This method worked before—why change it? Let’s just do it the usual way.” But at Gaudio Lab, it’s different. If you have a solid reason, people are open to new ideas. Even if something fails, our leader will say, “That’s OK. The decision was mine—that’s not on you.” Instead of blaming, we reflect and learn together, and that opens the door to better results.

 

 

Q: SNA seems like one of the most communicative teams. How do you usually work through problems and collaborate?

 

Johnny: Communication is absolutely critical. If we misunderstand directions or lack clarity on goals, the entire development can veer off course. And once something goes live, fixing it takes even more time and effort. To avoid that, we actively ask questions and discuss openly—until we’re all sure we’re heading in the same direction. We often say things like, “I’m not sure this is the right direction.” “Can I ask why you think this approach works?” “Is there another way we could look at this?” These kinds of conversations help us share diverse perspectives and avoid being stuck in a single mindset.

 

Paul: When a problem arises, the person responsible will first take the lead in solving it. But if things get tricky, the whole team jumps in to help. Our team lead, Min, is always there to guide and support us when needed—which is really reassuring. We mostly use Slack for async communication, and we have short daily standups to share progress. It’s a very open environment where anyone can ask for help or suggest a new idea anytime. That mutual trust really allows us to work independently, yet stay connected as a team.

 

 

Q: With such smooth collaboration, it sounds like you have a lot of respect for your teammates. What are they like?

 

Johnny: Min is a fantastic team lead. He gives clear direction and removes obstacles so we can focus on building. Paul is someone I trust completely. He’s always approachable and reliable—someone I can count on without hesitation. 😎 And our juniors, Handy and Hazel, are growing incredibly fast. I’m genuinely excited for what’s ahead for them.


Paul: I totally agree with Johnny. Every person on our team carries a strong sense of ownership, a feeling of “if I don’t do this, no one else will.” That level of commitment is what makes the team work. Gaudins often say, “The best part of working here is the people,” and I think our team is a perfect example of that.

 

 

 

Grow without limits

 

Q: You’ve both been with Gaudio Lab for 4–5 years now. What’s one project that stands out most during your time here?

 

 

Paul: The most memorable for me is the GTS (AI Text Sync) project I worked on right after joining. Back then, we didn’t have any web-based services, and I was the only developer on it—so I had to build everything from scratch. I designed and implemented the entire system architecture based on Docker, including OpenAPI, licensing, and the admin web service. At my previous jobs, roles were more segmented, so I’d only contribute to one part. But here, I handled everything from planning and development to deployment and customer support. There was a lot I didn’t know, so I dove deep—watching tutorials, reading books, and just immersing myself. It was tough, but a deeply meaningful experience for me.

 

Johnny: For me, it’s Gaudio Studio. Before this, I mostly worked on B2B products targeting enterprise clients. Gaudio Studio was the first time I worked on a service designed for direct consumers. Consumer audiences are much more diverse—not just in scale but in behavior, expectations, and how they give feedback. That meant their needs were more varied, too. So I found myself thinking things like, “Would this make the user happier?” and taking a more proactive role in shaping the direction of the service. I really enjoyed that process—it challenged me and made the work exciting. 😊

 

 

Q: Those periods of deep focus clearly shaped your growth. Where do you see yourself heading as a developer?

 

Paul: Lately, I’ve been actively using AI tools to boost my productivity. I’ve automated repetitive tasks and found faster, more effective ways to troubleshoot issues. That’s helped me grow not just technically, but also in how I approach work overall. Looking ahead, I want to combine AI tools with backend systems to create more efficient architectures—ones that deliver maximum impact with minimal resources. Ultimately, I want to build systems that are not only technically solid, but also deliver great user experiences.


Johnny: I want to be what I call a “borderless engineer.” Instead of thinking, “I’m a developer, so I only write code,” I want to cross boundaries when it makes the product better. Whether it’s ops or planning, I want to be part of the problem-solving process—wherever development and real-world needs intersect. That’s how I want to grow: by contributing beyond my job title.

 

Gaudio Lab Backend

 

 

 

We value passion and collaboration

 

Q: Lastly, any words for future Gaudins?

 

Paul: If you’ve ever wanted to turn an idea or technology into a real service, this is the place to do it. At Gaudio Lab, you get to experience the full cycle, from concept to execution. If you’re looking to grow in a tight-knit, high-trust team, there’s a lot to gain here. I hope we get the chance to work together and create meaningful value through audio technology. If you’re someone who takes initiative and learns through action, you’ll be more than welcome here. And yes, it’s totally okay to fail! 🙌🏻

 

Johnny: For us, what really matters is sincerity. We actually read every blog post on your resume. We review the code. We care. Having lots of experience is great, but what’s more important is whether you put your heart into that experience. If you’re ready to grow together and care deeply about what you build, we’d love to have you on this journey with us.

 

 

 

If you found the story of the SNA squad intriguing,

Explore Life at Gaudio Lab:)

 

 

From LKFS to true peak, the complete guide to Loudness

What exactly is loudness?

Loudness refers to the perceived amplitude of a sound as interpreted by human hearing. Imagine you want to tell someone about the volume of the song you're currently listening to. If the sound is loud, you'd say its loudness is high; if it's quiet, its loudness is low. However, there's no guarantee that another person perceives loudness the same way you do, and that shared understanding only weakens as more people are involved. In such cases, the most efficient method is to convey the sound's intensity using an objective, numerical unit. Given the widespread need across various fields, research into loudness units has been very active. Here, we'll introduce a unit that is commonly used and highly practical in markets like broadcasting and streaming.

The unit we'll discuss is LKFS (Loudness K-Weighted relative to Full Scale), also known as LUFS (Loudness Unit relative to Full Scale). The parameters associated with this unit were developed by the ITU-R (International Telecommunication Union, Radiocommunication Sector) and the EBU (European Broadcasting Union).

What factors are primarily considered when measuring loudness?

When measuring loudness, several key parameters are commonly used. If you examine the loudness meters provided in various measurement tools or Digital Audio Workstations (DAWs), you'll generally find essential items such as Integrated, Short-Term, and Momentary loudness, True Peak, and Loudness Range. In this chapter, we will go through the meaning of each of these parameters.

2-1. Key keywords: LKFS, LU, Momentary Loudness, Short-term Loudness, Integrated Loudness, LRA, True-peak

LKFS (Loudness K-Weighted relative to Full Scale), a.k.a. LUFS (Loudness Unit relative to Full Scale)
This is one of the units for loudness. It represents the amplitude of an input signal that has passed through a K-weighting filter, which is designed to align with human hearing characteristics.
You can understand the K-weighting filter as one that boosts signals in frequency ranges humans hear relatively well and attenuates signals in ranges that are relatively less audible. Loudness is categorized into Momentary, Short-term, and Integrated Loudness based on the duration over which it is measured: Momentary Loudness covers a 0.4-second window, Short-term Loudness a 3-second window, and Integrated Loudness the overall level across the entire duration.

*What are LKFS and LUFS, and what is the difference between them? The unit was initially conceived by the ITU (International Telecommunication Union), which defined it as LKFS (Loudness K-Weighted relative to Full Scale). Subsequently, the EBU (European Broadcasting Union) devised the display methods, defined the terms Momentary, Short-term, and Integrated Loudness along with Loudness Range (LRA), and adopted the designation LUFS (Loudness Unit relative to Full Scale). Consequently, LKFS tends to be used in North America, while LUFS is more prevalent in Europe.

LU (Loudness Unit)
While LKFS is an absolute measured value, LU (Loudness Unit) is a relative one. It is used to express the difference from a reference level or to describe a range of loudness. For example, if Content A is at -12 LKFS and Content B is at -20 LKFS, one could say, "Content A sounds 8 LU louder than Content B."

Momentary Loudness
Momentary Loudness is the sound level of a 0.4-second segment of the signal after it has passed through a K-weighting filter. It is measured with 75% overlap (a 0.1-second hop), and can be understood as the instantaneous sound level. The measured values are accumulated into a histogram.
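Putting these definitions together, here is a minimal Python sketch of momentary loudness for a mono 48 kHz signal. The K-weighting biquad coefficients are the 48 kHz values published in ITU-R BS.1770 (quoted from memory, so verify them against the standard before relying on this); the function names are my own.

```python
import numpy as np

# Two-stage K-weighting filter, 48 kHz coefficients per ITU-R BS.1770.
# Stage 1 is a +4 dB high-frequency shelf; stage 2 is a high-pass that
# removes low-frequency energy humans barely perceive.
SHELF_B = [1.53512485958697, -2.69169618940638, 1.19839281085285]
SHELF_A = [1.0, -1.69065929318241, 0.73248077421585]
HPF_B = [1.0, -2.0, 1.0]
HPF_A = [1.0, -1.99004745483398, 0.99007225036621]

def biquad(x, b, a):
    """Direct-form biquad filter; a[0] is assumed to be 1."""
    y = np.zeros(len(x))
    x1 = x2 = y1 = y2 = 0.0
    for i, xn in enumerate(x):
        yn = b[0] * xn + b[1] * x1 + b[2] * x2 - a[1] * y1 - a[2] * y2
        x2, x1, y2, y1 = x1, xn, y1, yn
        y[i] = yn
    return y

def momentary_loudness(x, fs=48000):
    """K-weight a mono signal, then measure 0.4 s blocks with a 0.1 s
    hop (75% overlap). For one channel, BS.1770 defines block loudness
    as -0.691 + 10*log10(mean square)."""
    kx = biquad(biquad(np.asarray(x, float), SHELF_B, SHELF_A), HPF_B, HPF_A)
    win, hop = int(0.4 * fs), int(0.1 * fs)
    return np.array([
        -0.691 + 10 * np.log10(np.mean(kx[s:s + win] ** 2) + 1e-12)
        for s in range(0, len(kx) - win + 1, hop)
    ])

# A 997 Hz sine with a -20 dBFS peak (RMS about -23 dB): the -0.691
# offset compensates the K-weighting gain near 1 kHz, so each block
# reads roughly -23 LKFS.
fs = 48000
tone = 0.1 * np.sin(2 * np.pi * 997 * np.arange(2 * fs) / fs)
print(momentary_loudness(tone, fs).mean())
```

Short-term loudness is the same computation with a 3-second window, and the resulting block arrays feed the Integrated Loudness and LRA calculations.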
The histogram of Momentary Loudness is subsequently used in the calculation of Integrated Loudness.

Short-term Loudness
Short-term Loudness is the sound level of a 3-second segment of the signal after it has passed through a K-weighting filter. The EBU recommends that this value be updated at least every 0.1 seconds. The histogram of Short-term Loudness is subsequently used in the calculation of Loudness Range (LRA).

Integrated Loudness
Integrated Loudness is the average sound level perceived across the entire duration of a piece of content; it represents the content's overall loudness. The calculation proceeds as follows:

Step 1) Discard the momentary loudness values below -70 LKFS (the absolute threshold), then calculate the average of the remaining values.
Step 2) Define the relative threshold as 10 LU below the average from Step 1.
Step 3) The average of the values above the relative threshold is the Integrated Loudness.

LRA (Loudness Range)
LRA (Loudness Range) indicates how much the loudness varies over time within a single piece of content, i.e., how widely the sound levels are distributed. The calculation proceeds as follows:

Step 1) Discard the short-term loudness values below -70 LKFS, then define the relative threshold as 20 LU below the average of the remaining values.
Step 2) The Loudness Range is the difference between the top 5% point (95th percentile) and the bottom 10% point (10th percentile) of the values above the relative threshold.

*Both the Integrated Loudness (IL) and Loudness Range (LRA) calculations use the concept of a relative threshold, but their definitions differ.
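The two gating procedures above translate almost directly into code. Here is a hedged NumPy sketch (the function names are my own); one subtlety is that BS.1770 averages block energies rather than dB values, so the averaging is done in the power domain:

```python
import numpy as np

def _mean_lkfs(levels):
    """Average block loudness in the power domain: convert dB values
    to powers, take the mean, and convert back."""
    return 10 * np.log10(np.mean(10 ** (np.asarray(levels) / 10)))

def integrated_loudness(momentary):
    """Steps 1-3: absolute gate at -70 LKFS, relative gate 10 LU below
    the first-pass average, then average what remains."""
    m = np.asarray(momentary)
    m = m[m > -70.0]                   # Step 1: absolute gate
    rel = _mean_lkfs(m) - 10.0         # Step 2: relative threshold
    return _mean_lkfs(m[m > rel])      # Step 3: gated average

def loudness_range(short_term):
    """LRA: the spread between the 95th and 10th percentiles of the
    gated short-term distribution (relative gate 20 LU below average)."""
    s = np.asarray(short_term)
    s = s[s > -70.0]                   # absolute gate
    gated = s[s > _mean_lkfs(s) - 20.0]
    return np.percentile(gated, 95) - np.percentile(gated, 10)

# Silence is gated out, so quiet blocks do not drag the integrated
# value down:
print(integrated_loudness([-23.0] * 20 + [-80.0] * 5))   # -> -23.0
print(loudness_range([-30.0] * 50 + [-20.0] * 50))       # -> 10.0
```

Real meters compute the same thing from the momentary and short-term histograms accumulated during playback.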
For IL, momentary loudness is used and the relative threshold is the average of the values above the absolute threshold minus 10 LU; for LRA, short-term loudness is used and the relative threshold is that average minus 20 LU.

True-peak
True-peak is the peak value of a signal after conversion to a 192 kHz sampling frequency, expressed in dBTP. It can be understood as a safeguard against degradation in playback environments. Since commonly consumed audio usually has a sampling frequency of 44.1 kHz or 48 kHz, upsampling is a routine step in playback chains, and during upsampling the interpolated peak can exceed the original sample peak. In addition to the core upsampling step, the true-peak measurement attenuates the signal so it stays within its representable range, filters it to retain only the valid band after upsampling, and converts the result to the decibel scale.

2-2. Getting to know loudness

Why is LKFS (LUFS) used instead of the traditional RMS?

Previously, RMS (Root-Mean-Square) was used to measure loudness, but it did not align well with actual human auditory perception. The ITU and EBU therefore developed a more sophisticated calculation that incorporates a K-weighting filter to reflect human hearing and, as seen in the calculation steps above, excludes portions of the signal that have no influence on perceived level. This more refined approach is likely why it has been adopted so widely.
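Returning to true peak for a moment, the oversampling measurement described above can also be sketched. This simplified NumPy version uses a generic Hann-windowed-sinc interpolation filter instead of the exact filter BS.1770 specifies, and it omits the attenuation stage, so treat it as an illustration of the idea rather than a compliant meter.

```python
import numpy as np

def true_peak_dbtp(x, oversample=4, taps=96):
    """Approximate true peak: zero-stuff by `oversample` (e.g. 48 kHz
    -> 192 kHz), low-pass with a windowed-sinc interpolation filter,
    and return the largest absolute interpolated sample in dBTP."""
    up = np.zeros(len(x) * oversample)
    up[::oversample] = x                      # insert zeros between samples
    n = np.arange(-taps, taps + 1)
    h = np.sinc(n / oversample) * np.hanning(len(n))  # interpolation LPF
    y = np.convolve(up, h)[len(h):-len(h)]    # trim filter edge transients
    return 20 * np.log10(np.max(np.abs(y)) + 1e-12)

# Classic intersample-peak case: a full-scale sine at fs/4 whose phase
# puts every sample at +/-0.707. The sample peak reads about -3 dBFS,
# while the true peak sits near 0 dBTP.
fs = 48000
x = np.sin(2 * np.pi * (fs / 4) * np.arange(fs) / fs + np.pi / 4)
print(20 * np.log10(np.max(np.abs(x))))   # sample peak, about -3.0 dB
print(true_peak_dbtp(x))                  # true peak, about 0 dBTP
```

This is exactly why a track can show no clipped samples yet still distort after a digital-to-analog converter or lossy codec: the intersample peaks exceed full scale.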
What is the 'Loudness War' and why is it a problem?

I'd like to discuss an issue related to loudness that many of you might already be familiar with: the 'Loudness War.' To summarize it in my own words, it's when content creators produce content with the mindset of, "By making my creation louder than others, I will grab more attention from listeners. As a bonus, it might even trick listeners into thinking the sound quality is better." Alternatively, it could be the mindset of, "I won't make my creation significantly louder than others, but I'll make sure it isn't much quieter either." While a louder sound might lead one to believe the sound quality has improved, in reality the dynamic range narrows, reducing expressiveness, and clipping becomes more frequent, raising the probability of sound quality degradation. I hope that more consumers will recognize that excessively increasing loudness actually diminishes the quality of the audio itself, and that consumption patterns will change accordingly.

Loudness regulations and recommendations for streaming platforms (source: masteringthemix)

Hopefully, your questions about loudness have been answered!

This post is a translated version of 'Loudness 101 (KR)', originally published in 2019.

If you're looking for more in-depth or specialized information, we highly recommend the following international standard documents, which contain detailed technical information on loudness measurement and regulation:
ITU-R BS.1770-4: the standard for loudness measurement algorithms.
EBU Tech 3341 and 3342: detailed guidelines on loudness metering and measurement methods in broadcasting environments.

If you have any further questions about loudness or Gaudio Lab's loudness technologies, please feel free to contact us!

Would you like to know more about loudness?
Perceived loudness
Loudness Management: How audio technology will impact streaming video

2025.07.18
How can broadcasters localize content faster?

Gaudio Studio Pro (GSP) is here to help anyone dealing with the challenges of conventional localization processes

In today’s global media landscape, speed and scale matter more than ever. Broadcasters and distributors face mounting pressure to prepare films and shows for multiple regions at once, but traditional localization workflows are slow, fragmented, and often blocked by missing D/M/E tracks or music rights issues.

That’s why Gaudio Studio Pro (GSP) was built: a cloud-native, AI-powered localization platform that removes the barriers between great content and global audiences, with expert (human)-in-the-loop support for even better quality and reliability.

We’ve gathered the top questions broadcasters ask about GSP, along with answers that show how it can cut turnaround time by up to 90%, simplify copyright compliance, and unlock new revenue from both new and legacy titles.

Product Overview & Target Users

Q: What is Gaudio Studio Pro (GSP)?
GSP is a cloud-native content localization SaaS designed for the film and broadcast industries. It automates DME separation, dubbing, subtitle syncing, music replacement, and cue sheet generation, even when only the final master file is available (no D/M/E stems). With award-winning AI audio separation and copyright-cleared music replacement, GSP turns any film, old or new, into a distribution-ready asset for global markets.
We also offer expert services from our audio post-production team, WAVELAB, to ensure quality and adapt localization to your project’s needs.

Q: Who is GSP for?
GSP is designed for studios, filmmakers, broadcasters, OTT platforms, distributors, and post-production teams who need to prepare content for international release. It’s also useful for archives and rights holders who want to revive legacy titles for new markets.

Q: Can smaller studios or indie filmmakers use GSP?
Absolutely.
While GSP supports large broadcasters and distributors, it’s also designed to be accessible to everyone, from directors and engineers to translators and indie creators who want to prepare their content for global release. Its intuitive interface and automation reduce the need for technical expertise or large budgets.

Q: Has GSP been recognized in the industry?
Yes. GSP’s core technologies (AI Audio Separation, AI Music Recommendation, and loudness management) have won CES Innovation Awards (2023–2025). The loudness management technology is also an official ANSI/CTA standard, widely adopted across the industry.

Q: Is there a free trial version available?
Yes. We will soon open Gaudio Developers and provide access through an API. If you don’t have engineers on your team, please contact us and talk to our experts:)

Q: Have you worked with overseas clients?
Yes. We are running projects and PoCs with a number of Korean companies as well as several companies in the U.S., Japan, and Europe.

Key Features & Workflow

Q: What makes GSP different from traditional localization tools?
All-in-one workflow: DME separation, dubbing, subtitles, music replacement, and cue sheets in one platform.
Expert service from our audio post-production team, WAVELAB, for customers who want even better quality.
AI-powered automation: faster turnaround (up to a 90% reduction in localization time).
Legal readiness: music replacement without copyright issues.
Cloud-native collaboration (coming soon): real-time comments, multitrack editing, and version tracking.

Q: Can GSP be used for live broadcasting or only for pre-recorded content?
GSP is primarily designed for film, OTT, and broadcast post-production, so our current technology unfortunately cannot handle live broadcasting scenarios. In the future, however, processing speed may improve enough to make it feasible.

Q: Can GSP be integrated into existing production pipelines?
Yes.
GSP is built as a cloud-native SaaS, so teams can use it alongside existing video editing tools or dubbing studios. Its format-flexible export ensures compatibility with any distribution workflow.

Q: How does GSP support collaboration across teams?
GSP includes multitrack editing and version control, making it easier for translators, engineers, and producers to work together, even across time zones. Everyone works in the same project environment, eliminating file conflicts and delays.

Efficiency & Productivity

Q: How fast is localization with GSP?
By consolidating multiple fragmented tools into a single AI-powered tool, GSP can reduce localization time by up to 90%, cutting months of work down to days without compromising quality. To put it more specifically: without a human in the loop (i.e., AI-only processing), localizing one hour of content that once took a month now takes just one hour. When expert quality checks and edits are added, the same work can be completed in about three days.

Q: Is GSP scalable for large catalogs?
Yes. GSP was designed for enterprise-level scalability. Whether localizing a single short film or processing entire archives with hundreds of titles, it handles projects in parallel while keeping version control and collaboration centralized.

Q: How does GSP save costs compared to traditional localization?
No need to rebuild missing D/M/E stems from scratch.
Automated music-copyright clearance and cue sheet generation reduce manual labor.
Faster turnaround (up to 90% time savings) means lower studio and staffing costs.
Overall, GSP lets studios localize more titles with fewer resources.

Q: What kind of revenue opportunities does GSP unlock?
By restoring and clearing rights for films that were previously stuck in archives, GSP allows studios and distributors to monetize dormant catalogs.
It also shortens time-to-market for new releases, helping companies reach more global audiences faster.
In addition, it empowers music artists to earn from their original creations. Artists can easily register their music in the GSP library, while users discover fresh tracks through AI-driven recommendations, creating a new ecosystem that connects broadcasters and musicians.

Feature-Specific Questions

DME Separation

Q: How does GSP handle projects when only a master file is available?
Even without original dialogue, music, and effects (DME) stems, GSP’s AI audio separation engine extracts DME with studio-level fidelity. This makes it possible to restore, localize, and repackage films that were previously blocked from distribution.

Q: How accurate is GSP’s DME separation?
GSP uses Gaudio Lab’s proprietary AI stem separation model, one of the most advanced in the world. Its separation performance has been ranked No. 1 by Musicradar, MusicTech, and LANDR. Providing studio-level fidelity, GSP extracts dialogue, music, and sound effects cleanly even from legacy or compressed master files.

Music Replacement

Q: How does GSP's Music Replacement handle copyright issues?
Even a single copyrighted background track can block international release, and content distribution often stalls due to music rights issues. GSP alleviates them by recommending and replacing tracks from a globally licensed library of 110K+ human-made, high-quality tracks.

Q: Does GSP use AI-generated music for replacements?
No. GSP’s replacement library features 110K+ tracks composed by real artists, not generative AI. Its AI recommends and places the most fitting track, ensuring the emotional tone of the original BGM is preserved while maintaining full copyright clearance.

Q: How accurate is the AI in replacing the original music with suitable alternatives?
It depends on the genre you want to replace. For variety shows, replacement is essentially fully automated.
For dramas, the numerical accuracy is similar, but the acceptance level or creative bar is much higher. Still, only about 10% of tracks may need human review.

Q: Can the AI adapt to different types of video content, like documentaries and commercials?
Commercials can certainly be processed with GSP, but given their short format and the need for a single strong track, advertisers often prefer to handpick music from the recommended options. For documentaries, the results are even better than with variety shows: since there are fewer music mixes and the takes are longer (typically 2–3 minutes per track), the replacement process is smoother and more consistent.

Q: Can the AI handle various music genres and cultural nuances?
Yes. For example, if the original music is reggae, the recommended replacements will usually come from the same genre, since tempo, instrumentation, and melody align.

Dubbing & Subtitle Synchronization

Q: How does AI dubbing in GSP work?
GSP’s AI dubbing faithfully replicates the original actor’s voice, tone, and timing in the target language. Powered by Gaudio Spatial Audio, a CES Innovation Award-winning technology, it even matches scene-specific room tone and spatial effects, so a voice in a poolside scene, for example, sounds naturally wet and reverberant rather than like a dry studio recording.

Q: How does GSP automate subtitle synchronization?
GSP uses AI to automatically generate and synchronize subtitles with dialogue tracks. Even when only the master file is available, it aligns timing accurately, producing high-quality subtitles that comply with the strict timing guidelines of the largest video streaming platforms, reducing manual adjustment time, and ensuring subtitles match the original speech and scene context.

Q: Can subtitle translations be customized?
Yes. Editors and translators can directly review and edit subtitles.
Once the real-time collaboration feature ships, multiple team members will be able to refine translations simultaneously without version conflicts.

Q: Does GSP support multiple languages simultaneously?
Absolutely. GSP can generate and manage subtitles in multiple target languages at once, streamlining international releases.

Cue Sheet Generation

Q: Can cue sheets be exported in standard formats?
Yes. GSP exports cue sheets in industry-standard formats compatible with broadcasters, OTT platforms, and regulatory requirements worldwide.

Localize your content fast with Gaudio Studio Pro

GSP is redefining how broadcasters, studios, and distributors bring stories to global audiences. By combining AI-driven efficiency with industry-standard compliance, it doesn’t just solve the challenges of traditional localization; it opens up new opportunities for revenue, scale, and creativity.

Whether you’re preparing a single indie film or scaling an entire archive, GSP helps you move faster, stay compliant, and connect with audiences worldwide.

👉 Ready to see how GSP can transform your workflow? Let’s make your content truly global.

✉️ Contact us for more information
📖 Read more about Music Replacement
📖 Read more about how GSP handles music copyright issues

2025.08.28