Hello Convai Team,
We are using Convai via Unity for a character that uses an ElevenLabs voice (custom female).
The audio generated when testing (in the app, but also on the web under convai/playground/character/language-and-voice) has many more artifacts and bugs in the speed and tone of the voice than audio created directly on the ElevenLabs website.
These bugs have been mentioned before in other forum posts (ElevenLabs voice returning “slow motion” audio - Core Features / Language and Speech - Convai Developer Forum). The audio is stretched, and the gaps between words and the overall speaking speed vary within a single clip: one part of a sentence is extremely fast and another part becomes extremely slow. These bugs appear very frequently when audio is generated through Convai but only very rarely when it is created directly via ElevenLabs, and it is very frustrating since we don't see an obvious reason. In another forum post, I already raised the question of whether the voice parameters and the voice AI model are set correctly; they were checked, and this seems to be correct now (as far as I can tell, since there is no user-facing configuration UI).
In addition to the speed problems, we sometimes get cut-off audio where only 20%-40% of the original text output is generated as audio, which is of course a black-box bug to us, too.
So overall, the voice audio of the Convai character is simply far too poor to be used in a professional project, and the only ones who can change this are your team.
How can I help to track down and fix these issues, or is this done on purpose for whatever business reason?
Cheers,
Markus