Cross-lingual text-to-speech (TTS) supporting 17 languages, with zero-shot voice cloning. A 6-second reference audio clip is enough to clone a speaker's voice.
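
Before passing a reference clip to the model, it can help to confirm that the clip meets the 6-second figure quoted above. The following is a minimal sketch, not part of the model's own API: the helper names `wav_duration_seconds` and `is_long_enough` are hypothetical, the input is assumed to be an uncompressed PCM WAV file, and the 6 seconds is treated as a minimum length, which is an interpretation of the description rather than a documented constraint.

```python
import wave

# Assumption: the 6 seconds quoted in the description is treated as a minimum.
MIN_REFERENCE_SECONDS = 6.0


def wav_duration_seconds(path: str) -> float:
    """Return the duration of a PCM WAV file in seconds."""
    with wave.open(path, "rb") as wav_file:
        # Duration = number of audio frames / frames per second.
        return wav_file.getnframes() / wav_file.getframerate()


def is_long_enough(path: str, min_seconds: float = MIN_REFERENCE_SECONDS) -> bool:
    """Return True if the clip at `path` lasts at least `min_seconds`."""
    return wav_duration_seconds(path) >= min_seconds
```

Calling `is_long_enough("reference.wav")` (a placeholder path) returns `False` for clips shorter than 6 seconds, so a short recording can be rejected or re-recorded before any synthesis is attempted. Only the standard library is used, so the check does not depend on which TTS package loads the model.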