Create a new translation project by uploading a video or audio file. The translation process runs asynchronously in the background. Use the status endpoint to track progress and retrieve results.
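Because translation runs in the background, a client typically polls the status endpoint until the project reaches a final state. The following is a minimal sketch of such a loop, not the API's documented client. The status values `completed` and `failed`, the `status` key, and the `fetch_status` callable (which would wrap the HTTP call to the status endpoint) are assumptions made for illustration.

```python
import time
from typing import Any, Callable, Dict

# Assumption: the status endpoint reports one of these values when the
# project is finished. The real status names may differ.
TERMINAL_STATUSES = frozenset({"completed", "failed"})


def wait_for_translation(
    fetch_status: Callable[[], Dict[str, Any]],
    poll_interval: float = 5.0,
    timeout: float = 3600.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> Dict[str, Any]:
    """Poll until the project reaches a terminal status or the timeout passes.

    fetch_status is a caller-supplied function that queries the status
    endpoint and returns the decoded response as a dict.
    """
    deadline = clock() + timeout
    while True:
        result = fetch_status()
        if result.get("status") in TERMINAL_STATUSES:
            return result
        if clock() >= deadline:
            raise TimeoutError("translation did not finish before the timeout")
        sleep(poll_interval)
```

Passing `sleep` and `clock` as parameters keeps the loop testable without real waiting or network access.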
The video or audio file to translate.
Supported video formats: video/mp4, video/quicktime, video/x-matroska, video/webm, video/mpeg
Supported audio formats: audio/mpeg, audio/wav, audio/mp4, audio/x-m4a, audio/flac, audio/ogg, audio/aac
Maximum file size: 20 GB
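A file can be checked against these limits locally before starting a large upload. This is a sketch under stated assumptions: the MIME types are copied from the lists above, while the function name `check_upload` is hypothetical and the 20 GB limit is read as decimal gigabytes (the stricter interpretation), which the documentation does not specify.

```python
from typing import List

SUPPORTED_VIDEO_TYPES = frozenset({
    "video/mp4", "video/quicktime", "video/x-matroska",
    "video/webm", "video/mpeg",
})
SUPPORTED_AUDIO_TYPES = frozenset({
    "audio/mpeg", "audio/wav", "audio/mp4", "audio/x-m4a",
    "audio/flac", "audio/ogg", "audio/aac",
})

# Assumption: "20 GB" means 20 * 10^9 bytes. If the service uses binary
# gigabytes, the true limit is slightly higher.
MAX_FILE_SIZE_BYTES = 20 * 1000**3


def check_upload(mime_type: str, size_bytes: int) -> List[str]:
    """Return a list of problems; an empty list means the file looks acceptable."""
    problems = []
    if mime_type not in SUPPORTED_VIDEO_TYPES | SUPPORTED_AUDIO_TYPES:
        problems.append(f"unsupported MIME type: {mime_type}")
    if size_bytes > MAX_FILE_SIZE_BYTES:
        problems.append(f"file is larger than {MAX_FILE_SIZE_BYTES} bytes")
    return problems
```

The check only mirrors the documented limits; the server remains the authority on what it accepts.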
The source language of the content using ISO language codes (e.g., en, es, fr, de, ja, zh).
Strongly recommended: Leave this empty for auto-detection. Only provide this parameter if you are 100% certain the language code is correct and in valid ISO format. Incorrect language codes will cause transcription failures. Our auto-detection supports 80+ languages and is highly accurate.
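One way to follow this recommendation in client code is to omit the field entirely unless a code is supplied, and to reject anything that is not shaped like the two-letter examples above. This is a sketch only: the field name `sourceLanguage` is a hypothetical placeholder, and the two-letter check is an assumption based on the examples (en, es, fr). It verifies the shape of the code, not that the language exists or matches the audio.

```python
import re
from typing import Dict, Optional

# Assumption: codes are two lowercase letters, like the documented examples.
_TWO_LETTER_CODE = re.compile(r"[a-z]{2}")


def source_language_field(
    code: Optional[str],
    field_name: str = "sourceLanguage",  # hypothetical field name
) -> Dict[str, str]:
    """Return form fields for the source language, or {} to use auto-detection."""
    if code is None or not code.strip():
        return {}  # leave the parameter out so the service auto-detects
    normalized = code.strip().lower()
    if not _TWO_LETTER_CODE.fullmatch(normalized):
        raise ValueError(f"not a two-letter language code: {code!r}")
    return {field_name: normalized}
```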
Whether to preserve background audio in the output. When enabled, keeps background music, ambience, laughs, claps, and crowd sounds while removing only the original voice (stem separation). Turn off if your source has no background audio.
Default: true
Voice isolation mode when keepBackgroundMusic is enabled. Controls the quality and characteristics of voice separation.
Studio (Recommended)
Our default voice processing, designed for professional audio quality:
Removes echoes and reverberations
Cleans technical imperfections
Produces a clear and crisp voice
Recommended for: Most projects where audio quality is paramount. Ideal for tutorials, educational content, marketing videos, and any content requiring optimal voice clarity.
Realistic
Preserves the natural characteristics of the recording environment:
Maintains a sound closer to the original recording
Preserves environmental characteristics
Recommended for: Content where the authenticity of the environment is important, such as outdoor vlogs, documentaries, or content where the sound ambiance is an integral part of the experience.
This option may create artifacts or unexpected effects in some cases due to the preservation of background elements.
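Since the isolation mode only has an effect while keepBackgroundMusic is enabled, a client can build both fields together. The sketch below assumes a mode field named `voiceIsolationMode` with the values `studio` and `realistic`, and assumes booleans are sent as the strings `true` and `false` in form-data; none of these are confirmed by this page. Only `keepBackgroundMusic` and its default of true come from the text above.

```python
from typing import Dict, Optional

MODE_FIELD = "voiceIsolationMode"  # hypothetical field name
ISOLATION_MODES = frozenset({"studio", "realistic"})  # hypothetical values


def background_audio_fields(
    keep_background_music: bool = True,
    isolation_mode: Optional[str] = None,
) -> Dict[str, str]:
    """Build the background-audio form fields for a translation request."""
    fields = {"keepBackgroundMusic": "true" if keep_background_music else "false"}
    if isolation_mode is not None:
        if isolation_mode not in ISOLATION_MODES:
            raise ValueError(f"unknown isolation mode: {isolation_mode!r}")
        # The mode is only meaningful when background audio is kept,
        # so it is left out of the request otherwise.
        if keep_background_music:
            fields[MODE_FIELD] = isolation_mode
    return fields
```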
Whether to generate subtitles for the translated video. When enabled, adds clean Netflix-style black and white subtitles in the target language. Subtitles are automatically synced to the translated speech for optimal readability.
Default: false
Fine-tune voice cloning parameters for advanced control over the generated voice. Pass as a JSON string when using form-data. All values must be between 0 and 1, in steps of 0.01.
Speaker Boost (0.00 - 1.00)
Boosts similarity to the original speaker. This is a subtle enhancement that increases resemblance to the source voice.
Note: Higher values increase compute time and latency.
Default: 0.60
Tip: To avoid reproducing the speaker's accent, use 0.00.
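The range and step rules can be enforced on the client before the settings are serialized to the JSON string that form-data requires. This is a minimal sketch: the function name `encode_voice_settings` and the key `speakerBoost` used in the usage line are hypothetical, since this page gives only the display name "Speaker Boost".

```python
import json
from typing import Mapping


def encode_voice_settings(settings: Mapping[str, float]) -> str:
    """Validate each value (0 to 1, in steps of 0.01) and return a JSON string."""
    cleaned = {}
    for name, value in settings.items():
        if isinstance(value, bool) or not isinstance(value, (int, float)):
            raise TypeError(f"{name} must be a number")
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be between 0 and 1")
        hundredths = round(value * 100)
        # Tolerate floating-point noise, but reject values such as 0.605
        # that do not sit on a 0.01 step.
        if abs(value * 100 - hundredths) > 1e-9:
            raise ValueError(f"{name} must be a multiple of 0.01")
        cleaned[name] = hundredths / 100
    return json.dumps(cleaned)


# Hypothetical key name; 0.0 follows the tip above on accent reproduction.
voice_settings_field = encode_voice_settings({"speakerBoost": 0.0})
```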