
Gaudio Lab SDK Team, Bridging Audio AI and Products

2025.06.18 by Heidi Hwang

 

 

Gaudio Lab SDK Development Team Interview

 

 

🎧 This interview shares a behind-the-scenes look at Gaudio Lab’s GDK (Gaudio SDK) Development Team—their working style, technical challenges, team culture, and journey of growth. The GDK Development Team tackles the challenge of addressing diverse client needs with a single SDK, while actively experimenting with AI Agents to enhance development efficiency. In an environment where autonomy meets accountability and where learning is continuous, the team is constantly evolving.

 

If you’re curious about life on the GDK Development Team or thinking of joining Gaudio Lab, don’t miss this story! 😉

 

 

 

🧩 What does the SDK Development Team do?

 

Before we dive into the details, let’s briefly go over what an SDK is:)

 

SDK stands for Software Development Kit. It’s essentially a set of software tools that helps developers implement specific features more easily. For example, if you want to add a certain function to an app—like login, payment, or audio processing—you don’t have to build everything from scratch. Instead, you can use the tools provided in the SDK to build it faster and more efficiently.
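To make this concrete, an audio SDK typically exposes a small, stable surface to app developers: create an instance, feed it audio in blocks, and release it when done. The sketch below is a hypothetical, stubbed illustration of that pattern—the `NoiseReducer` class and its methods are invented for this post and are not Gaudio Lab’s actual API.

```python
# Hypothetical sketch of the kind of interface an audio SDK exposes.
# "NoiseReducer" and its methods are invented for illustration;
# they are not Gaudio Lab's actual SDK interface.

class NoiseReducer:
    """SDK-style audio processor: create, process blocks, release."""

    def __init__(self, sample_rate: int, block_size: int):
        self.sample_rate = sample_rate
        self.block_size = block_size
        self._open = True

    def process(self, samples: list) -> list:
        # A real SDK would run its DSP/AI pipeline here; this stub
        # simply attenuates the signal to keep the example runnable.
        if not self._open:
            raise RuntimeError("SDK instance already released")
        if len(samples) != self.block_size:
            raise ValueError("expected exactly one block of samples")
        return [s * 0.5 for s in samples]

    def release(self) -> None:
        self._open = False


# App code integrates the SDK instead of implementing audio processing itself:
sdk = NoiseReducer(sample_rate=48000, block_size=4)
out = sdk.process([0.2, -0.4, 0.8, 0.0])
sdk.release()
```

The point is the division of labor: the app supplies audio and configuration, while the SDK hides the hard part (here, a placeholder; in a real product, the signal-processing or AI pipeline) behind a few calls.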

 

What does the Gaudio Lab SDK Development Team do?

 

 

Bridging audio AI technology with real products

 

Q: Please introduce yourself and your role on the GDK Development Team.

 

Leo: We’re the SDK developers in the GDK squad. Simply put, we’re the key link that ensures our audio AI technology works seamlessly in real products. We define use cases, consider which technologies are suitable for which products, and identify the hardware (chipsets) and environments they’ll run on.

 

We then integrate the SDK into those environments, conduct continuous testing, and respond quickly when issues arise. Our ultimate goal is to deliver a stable, flexible SDK that clients can use independently.


In this process, I primarily focus on porting the SDK to real-world device environments. This involves identifying how and where to integrate the SDK for each system, and ensuring that it runs reliably within the constraints of each platform.

 

Testing functionality after SDK port and integration

 

 

William: While Leo mainly handles hardware porting, I primarily work on the mobile side and also take on DevOps responsibilities. I develop internal tools for build and deployment, and manage frameworks that support the entire development pipeline. I also serve as the first point of contact for Gaudio SDK products that are in the stabilization and maintenance phase, receiving issues from client companies, identifying the problems, and working to resolve them.

 

Our team isn’t rigidly divided by roles. While we each have our primary responsibilities, we help each other flexibly depending on the situation.

 

 

Q: Gaudio Lab has a number of products. Which products have been developed with your team’s involvement?

 

William: Just to name a few, LM1 (loudness normalization tech), GSA (a spatial audio solution), and Just Voice (an AI noise reduction solution) all involved the GDK team. Other products include ELEQ (Loudness EQ), Smart EQ, Binaural Speaker (3D audio renderer), and GFX (sound effects library). We’ve had a hand in all of them.

 

 

 

How we joined this dynamic team

 

Q: It sounds like a great environment to gain diverse experience. What led you to join the GDK Development Team at Gaudio Lab?

 

William: Honestly, when I first joined, I didn’t have much knowledge—or even interest—in audio (laughs). I started as an intern, and my first task was fixing a bug in a WAV parser. That led me to explore the metadata structure of WAV files, and I was fascinated to learn how analog signals are represented digitally. It felt like the same excitement I had when I first discovered computer science. That curiosity kept growing, and now I’ve been here for five years.

 

Leo: I was already familiar with Gaudio Lab through people I knew, so I naturally had an interest. In my previous company, I wanted to try backend development, but ended up mainly working on device-side tasks. I wanted to find work that was both engaging and something I could excel at.

 

The idea of SDKs—tools built by developers for developers—really intrigued me. Plus, the opportunity to develop across various device platforms was a big draw. Those two reasons alone made me want to take on the challenge, which led me to join Gaudio Lab.

 

 

Gaudio Lab SDK Development Team

 

 

 

Growing through diverse projects

 

Q: In what ways have you developed professionally since joining Gaudio Lab?

 

William: Working on the GDK Development Team has exposed me to a wide range of tasks and technologies. Over the past five years, I’ve worked on nearly ten projects across various programming languages and environments. I think of it as a journey from sand to stone—you don’t become solid with a single splash of water. Repeated exposure to projects helps you harden and grow. It’s been tough, but I’ve gained a lot from it.

 

Some might ask, “Can you go deep if you’re doing so many different things?” But in our team, staying shallow isn’t an option. Platform integration requires deep enough understanding to communicate effectively with clients. Working across so many different environments has helped me develop not just technical skills, but also a broader perspective and deeper insights—things that are often hard to gain in the early stages of a career.

 

Leo: Compared to previous companies I’ve worked for, Gaudio Lab is much freer and more flexible. The autonomous working style and active communication among team members gave me the impression that the company is “alive.” I’ve had many chances to apply my experience freely and, even as a senior hire, I’ve found plenty of opportunities to keep learning and challenging myself. Thanks to a culture that embraces experimentation, I’ve been exposed to a variety of technologies and environments—and recently, I’ve started exploring AI-related work as well.

 

 

 

Curious about the GDK Development Team’s work with AI?

STAY TUNED for Part 2 of our interview!

 

 

 

 

Explore Life at Gaudio Lab

 

Gaudio Lab Open Position

 

 

 

 
