I created this mp3 at 11labs with pauses between the paragraphs. I want to play it and also show the transcript in the UI. How can I do this?
Is it possible to use EnqueueResponse in the ConvaiNPC.cs script? Or AddResponseAudio in ConvaiNPCAudioManager?
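Roughly what I have in mind is something like the sketch below. The AddResponseAudio parameters are my guess, I haven't verified the actual signature in ConvaiNPCAudioManager, so treat them as placeholders:

```csharp
// Rough sketch of the idea. The AddResponseAudio signature is my assumption,
// not verified against ConvaiNPCAudioManager.
using UnityEngine;

public class LocalNarration : MonoBehaviour
{
    [SerializeField] private ConvaiNPC npc;                 // the Convai character
    [SerializeField] private AudioClip narrationClip;       // my 11labs mp3 imported as an AudioClip
    [TextArea] [SerializeField] private string transcript;  // matching text to show in the UI

    public void PlayNarration()
    {
        // Assumption: the audio manager exposes something like
        // AddResponseAudio(AudioClip clip, string transcript) that queues the
        // clip the same way a normal server response would be queued.
        var audioManager = npc.GetComponent<ConvaiNPCAudioManager>();
        audioManager.AddResponseAudio(narrationClip, transcript);
    }
}
```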
Ok, I have it working like this. Please review it and tell me why I need to use SetWaitForCharacterLipSync(false), and whether that could cause any conflicts. Right now it seems to work.
In the PlayAudioInOrder coroutine, the while (_waitForCharacterLipSync) loop is used to wait for the lip sync data. SetWaitForCharacterLipSync(false) controls this loop, allowing the process to continue.
It didn’t continue if nothing happened beforehand, meaning no other user input. I had to add that SetWaitForCharacterLipSync(false) call to get it working.
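As far as I can tell the pattern is something like this (simplified sketch, not the actual SDK code):

```csharp
// Simplified sketch of the pattern as I understand it, not the actual SDK code.
using System.Collections;
using UnityEngine;

public class LipSyncWaitSketch : MonoBehaviour
{
    private bool _waitForCharacterLipSync = true;

    public void SetWaitForCharacterLipSync(bool wait) => _waitForCharacterLipSync = wait;

    private IEnumerator PlayAudioInOrder()
    {
        // The coroutine blocks here until lip sync data is ready. If no lip sync
        // data ever arrives (e.g. for my locally enqueued mp3), the loop never
        // exits, which is why I call SetWaitForCharacterLipSync(false) myself.
        while (_waitForCharacterLipSync)
            yield return null;

        // ...then the queued clips are played one after another (omitted here).
    }
}
```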
But lip sync is not working. I tried to dig through the code, but I’m not sure how to set it up, and I’m also not sure whether this can be done and cached somehow.
From what I can see, ProcessAudioResponse at line 580 of ConvaiGRPCAPI.cs contains the lipSyncBlendFrameQueue, which is the lip sync data, right? Can I create this data in my app by providing audio and/or text (a string)?
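For illustration, this is what I imagine the lip sync data to be: a queue with one entry of blendshape/viseme weights per animation frame. The type names here are my guesses, not the SDK’s actual types:

```csharp
// Purely illustrative: these type names are my guesses at what
// lipSyncBlendFrameQueue holds, not the SDK's actual types.
using System.Collections.Generic;

public struct BlendFrame
{
    public float[] BlendshapeWeights; // one weight per facial blendshape for this frame
    public float[] VisemeWeights;     // or viseme weights, depending on the face setup
}

public class LipSyncDataSketch
{
    // One entry per animation frame, consumed in sync with the audio playback.
    private readonly Queue<BlendFrame> lipSyncBlendFrameQueue = new Queue<BlendFrame>();
}
```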
I’ll dig deeper and see if I can use the “create lip sync” functionality locally. I guess the lip sync is created locally, right?
Can you point me in the right direction: is there somewhere I can input audio (an mp3) or text and get lip sync data back?
I mean, you get a response from the server with audio (mp3?) and text, right? That is then used to create the lip sync data, and once that is finished, all three pieces of content are used/played(?).
Hmm, ok. I saw that the data is processed in the ReceiveResultFromServer function in ConvaiGRPCAPI.cs. The ProcessAudioResponse function is called and handles the lip sync part. Indeed, it seems it is not possible to create it locally(?!).
Would it be possible to store/cache a response to use directly in the app, so that I could play a cached response immediately, without the delay of InvokeTrigger and waiting for the server response?
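Something like this is what I have in mind: cache the pieces of a response once and replay them directly later. CachedResponse is my own type, and the AddResponseAudio call at the end is a placeholder for whatever the SDK actually exposes (EnqueueResponse / AddResponseAudio), since I haven't confirmed the signature:

```csharp
// Sketch of the caching idea. CachedResponse is my own type; the AddResponseAudio
// call is a placeholder for the real SDK entry point, signature not verified.
using System.Collections.Generic;
using UnityEngine;

[System.Serializable]
public class CachedResponse
{
    public string triggerName;   // which InvokeTrigger result this corresponds to
    public AudioClip audio;      // the audio we got back (or a pre-rendered mp3)
    public string transcript;    // the text to show in the UI
}

public class ResponseCache : MonoBehaviour
{
    [SerializeField] private ConvaiNPC npc;
    [SerializeField] private List<CachedResponse> responses = new List<CachedResponse>();

    public bool TryPlayCached(string triggerName)
    {
        var cached = responses.Find(r => r.triggerName == triggerName);
        if (cached == null) return false;

        // Placeholder for the real SDK call, signature assumed, not verified.
        // (If lip sync is needed, those frames would have to be cached too, not shown.)
        npc.GetComponent<ConvaiNPCAudioManager>().AddResponseAudio(cached.audio, cached.transcript);
        return true;
    }
}
```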