Version: 1.2.0

Model Overview

Stem Separation

Model	API Reference	Description
gsep_music_hq_v1	View	Separates vocals, drums, bass, electric guitar, piano, and other instruments.
gsep_music_shq_v1	View	Higher-quality model that separates into vocals and accompaniment only.
gsep_speech_hq_v1	View	Isolates speech from background noise for a clearer voice track.

DME Separation

Model	API Reference	Description
gsep_dme_dtrack_v1	View	Outputs only the dialogue track, removing both music and effects.
gsep_dme_d2track_v1	View	Outputs dialogue and vocals embedded in background music, while removing instrumental music and effects.
gsep_dme_metrack_v1	View	Outputs the music and effects track that complements dtrack (dialogue removed).
gsep_dme_me2track_v1	View	Outputs the music and effects track that complements d2track (dialogue and vocals removed).
gsep_dme_me2track_v2	View	Outputs the music and effects track that complements d2track (dialogue and vocals removed), with significantly improved separation quality and fidelity over v1.
gsep_dme_mtrack_v1	View	Outputs only the music track, without dialogue or effects.
gsep_dme_etrack_v1	View	Outputs only the effects track, removing dialogue and music.
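
The dialogue and music + effects models in the table above come in complementary pairs: metrack pairs with dtrack, and both me2track versions pair with d2track. The sketch below records that pairing as a plain lookup so a caller can find the counterpart of a given model ID. The names `DME_COMPLEMENTS` and `complements` are hypothetical helpers invented for illustration and are not part of any API; only the model IDs come from the table.

```python
from typing import Dict, List

# Pairing read from the DME Separation table: each dialogue-style model
# maps to the music + effects model(s) described as "corresponding" to it.
# This dictionary is an illustrative assumption, not an official API object.
DME_COMPLEMENTS: Dict[str, List[str]] = {
    "gsep_dme_dtrack_v1": ["gsep_dme_metrack_v1"],
    "gsep_dme_d2track_v1": ["gsep_dme_me2track_v1", "gsep_dme_me2track_v2"],
}


def complements(model_id: str) -> List[str]:
    """Return the model IDs whose output complements the given model's output.

    Works in both directions; returns an empty list for models that the
    table does not pair with anything (for example mtrack or etrack).
    """
    if model_id in DME_COMPLEMENTS:
        return list(DME_COMPLEMENTS[model_id])
    return [
        dialogue_id
        for dialogue_id, me_ids in DME_COMPLEMENTS.items()
        if model_id in me_ids
    ]
```

For example, `complements("gsep_dme_me2track_v2")` yields the d2track model, while `complements("gsep_dme_etrack_v1")` yields an empty list because the table lists no counterpart for it.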

AI Text Sync

Model	API Reference	Description
gts_lyrics_line_v1	View	Line-level lyric alignment; supports English, Korean, Japanese, and Chinese (Simplified).
gts_lyrics_line_v3	–	Not yet available
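
Model IDs on this page end in a `_vN` version suffix, and a listed version is not necessarily usable yet (gts_lyrics_line_v3 is marked "Not yet available"). As a minimal sketch, assuming availability is tracked client-side in a hand-maintained dictionary, the following picks the highest available version for a given base name. `AVAILABLE` and `latest_available` are hypothetical names for illustration; treating every model with an API reference link as available is also an assumption.

```python
import re
from typing import Dict, Optional

# Hand-maintained availability flags for a subset of the models listed above.
# Assumption: a model with an API reference link is available.
AVAILABLE: Dict[str, bool] = {
    "gsep_dme_me2track_v1": True,
    "gsep_dme_me2track_v2": True,
    "gts_lyrics_line_v1": True,
    "gts_lyrics_line_v3": False,  # listed as "Not yet available"
}

# Splits "gsep_dme_me2track_v2" into base "gsep_dme_me2track" and version 2.
_VERSIONED = re.compile(r"^(?P<base>.+)_v(?P<version>\d+)$")


def latest_available(base: str) -> Optional[str]:
    """Return the highest-versioned available model ID for base, or None."""
    best_id: Optional[str] = None
    best_version = -1
    for model_id, available in AVAILABLE.items():
        match = _VERSIONED.match(model_id)
        if match is None or match.group("base") != base or not available:
            continue
        version = int(match.group("version"))
        if version > best_version:
            best_id, best_version = model_id, version
    return best_id
```

With the flags above, `latest_available("gts_lyrics_line")` falls back to the v1 model because v3 is flagged unavailable.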
