Version: 1.2.0

Model Overview

Stem Separation

| Model | API Reference | Description |
| --- | --- | --- |
| gsep_music_hq_v1 | View | Separates vocals, drums, bass, electric guitar, piano, and other instruments. |
| gsep_music_shq_v1 | View | Higher-quality model that separates into vocals and accompaniment only. |
| gsep_speech_hq_v1 | View | Isolates speech from background noise for a clearer voice. |
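
As a rough illustration of choosing between the stem separation models, the sketch below encodes the table's descriptions as a lookup and returns the models whose output covers a requested set of stems. The stem label strings (such as `"other"` and `"accompaniment"`) and the helper `models_for` are hypothetical names chosen for this example; they are not identifiers from the API.

```python
# Output stems per model, transcribed from the descriptions in the table.
# The label strings are illustrative assumptions, not API identifiers.
MUSIC_MODEL_STEMS = {
    "gsep_music_hq_v1": {"vocals", "drums", "bass", "electric guitar", "piano", "other"},
    "gsep_music_shq_v1": {"vocals", "accompaniment"},
    "gsep_speech_hq_v1": {"speech"},
}


def models_for(stems):
    """Return the names of models whose outputs include every requested stem."""
    wanted = set(stems)
    return sorted(
        name for name, outputs in MUSIC_MODEL_STEMS.items() if wanted <= outputs
    )


if __name__ == "__main__":
    print(models_for(["vocals"]))
    print(models_for(["drums", "bass"]))
```

When more than one model matches, as it does for vocals alone, the descriptions suggest weighing stem count against quality: gsep_music_shq_v1 is described as higher quality but yields only vocals and accompaniment.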

DME Separation

| Model | API Reference | Description |
| --- | --- | --- |
| gsep_dme_dtrack_v1 | View | Outputs only the dialogue track, removing both music and effects. |
| gsep_dme_d2track_v1 | View | Outputs dialogue and vocals embedded in background music, while removing instrumental music and effects. |
| gsep_dme_metrack_v1 | View | Music + effects track corresponding to dtrack (dialogue removed). |
| gsep_dme_me2track_v1 | View | Music + effects track corresponding to d2track (dialogue/vocals removed). |
| gsep_dme_me2track_v2 | View | Music + effects track corresponding to d2track (dialogue/vocals removed), with significantly improved separation quality and fidelity. |
| gsep_dme_mtrack_v1 | View | Outputs only the music track, without dialogue or effects. |
| gsep_dme_etrack_v1 | View | Outputs only the effects track, removing dialogue and music. |
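
The DME descriptions pair each dialogue model with a corresponding music + effects model. A minimal sketch of that pairing as a lookup follows. The helper `me_counterpart` and its `prefer_latest` flag are hypothetical, and the assumption that the last listed version is the preferred one rests only on the v2 description's mention of improved quality.

```python
# Dialogue model -> corresponding music + effects models, oldest version first.
# Pairings are transcribed from the table descriptions.
DME_COUNTERPARTS = {
    "gsep_dme_dtrack_v1": ["gsep_dme_metrack_v1"],
    "gsep_dme_d2track_v1": ["gsep_dme_me2track_v1", "gsep_dme_me2track_v2"],
}


def me_counterpart(dialogue_model, prefer_latest=True):
    """Return the music + effects model that corresponds to a dialogue model."""
    try:
        candidates = DME_COUNTERPARTS[dialogue_model]
    except KeyError:
        raise ValueError(f"no music + effects counterpart listed for {dialogue_model!r}")
    # Versions are stored oldest first, so the last entry is the newest.
    return candidates[-1] if prefer_latest else candidates[0]


if __name__ == "__main__":
    print(me_counterpart("gsep_dme_dtrack_v1"))
    print(me_counterpart("gsep_dme_d2track_v1"))
```

The mtrack and etrack models are left out of the mapping because the table does not pair them with a dialogue model.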

AI Text Sync

| Model | API Reference | Description |
| --- | --- | --- |
| gts_lyrics_line_v1 | View | Line-level lyric alignment; supports English, Korean, Japanese, and Chinese (Simplified). |
| gts_lyrics_line_v3 | Not yet available | |