VALL-E 2 performs zero-shot text-to-speech (TTS) synthesis: given text input, it generates speech in voices it has not been explicitly trained on. It uses a vast training library, in this ...