I’ve just updated our solution to use Convai SDK version 3.6.4 and ran a series of tests.
I submitted 10 consecutive questions. Apart from one response that failed due to a network issue (most likely due to my current unstable internet connection), the other 9 responses worked flawlessly — no stuttering, no voice cut-offs, and lip sync was consistently accurate.
Interestingly, even with MinBufferDuration left at 10, the response time (i.e., the delay between the end of the user prompt and the start of the avatar’s reply) felt comparable to previous versions.
Here are the session details:
Session ID: 38128c4013dcf9bd6da0712605e12b2d
Character ID: 09f85624-eea8-11ef-85b7-42010a7be016
Unreal Engine: 5.5.4
MinBufferDuration: 10
I’m now deploying the updated system to production and will collect feedback from actual users over the next few days.
Initial feedback is encouraging, but of course 10 questions are not enough to call it solved. I’ll report back after observing how the system performs in real use.
I wanted to share a quick update based on live usage feedback over the past few days.
Saturday and Sunday (Aug 3–4):
The system performed very well. According to the staff on site, there were only one or two very brief interruptions during the entire day on Sunday — overall, performance was stable and the experience was smooth.
Today (Aug 5):
Unfortunately, the situation has regressed. The voice is stuttering again — in about 3 out of 5 interactions, speech gets interrupted mid-sentence for several seconds, then resumes and completes normally.
Additionally, response latency is noticeably higher than in previous days.
We’ve verified that this is not a network issue:
Packet loss: < 0.1%
Latency: 16.5 ms
Jitter: 1.37 ms
Upload/Download: > 80 Mbps
We’ve also checked our monitoring tools for both OpenAI and ElevenLabs, which we’re using through Convai. OpenAI seems fully operational (OpenAI Status).
I’m not requesting any specific action for now — just documenting this for traceability.
We’ll continue monitoring over the next few days and I’ll report back, especially as we gather more feedback from other projects using the same setup.
@omar.venturi You mention that you’ve set MinBufferDuration to 10, but that the response time from the characters is more or less the same as before. Are you sure that it’s actually in effect? When we use this setting, the time from when we receive the first subtitle text until the character starts speaking seems to be {MinBufferDuration} seconds, so if we set it to 10 it waits a really long time (since it’s buffering) before it starts to speak.
Can you confirm that you have a section in DefaultEngine.ini that looks like this:
[/Script/Convai.ConvaiSettings]
API_Key=[Your API key]
ExtraParams=MinBufferDuration=10
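A quick way to double-check that the setting is really present in the file is to read it back with a small script. The following is a minimal sketch in Python, not part of the Convai plugin: `read_extra_params` is a hypothetical helper, and it assumes ExtraParams holds comma-separated `Name=Value` pairs, which is a guess based on the format used in this thread. It only confirms what is written in the ini text, not what the plugin actually applies at runtime.

```python
def read_extra_params(ini_text, section="/Script/Convai.ConvaiSettings"):
    """Return the ExtraParams entries of `section` as a dict of strings.

    Assumption: ExtraParams is a comma-separated list of Name=Value pairs.
    """
    in_section = False
    for raw in ini_text.splitlines():
        line = raw.strip()
        if not line or line.startswith(";"):
            continue  # skip blank lines and ini comments
        if line.startswith("[") and line.endswith("]"):
            in_section = line[1:-1] == section
            continue
        if in_section and line.startswith("ExtraParams="):
            value = line[len("ExtraParams="):]
            params = {}
            for item in value.split(","):
                if "=" in item:
                    key, _, val = item.partition("=")
                    params[key.strip()] = val.strip()
            return params
    return {}  # section or key not found


sample = """[/Script/Convai.ConvaiSettings]
API_Key=[Your API key]
ExtraParams=MinBufferDuration=10
"""
print(read_extra_params(sample))  # {'MinBufferDuration': '10'}
```

An empty result means the section or the ExtraParams line was not found, which would be one explanation for the setting appearing to have no effect.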
We’re in the middle of testing different config variables to try and find an acceptable sweetspot between time to response and amount of glitching. I hadn’t noticed that ElevenLabs is having trouble right now though, so that’s helpful to know about!
@omar.venturi I see. Very strange that your experience is completely different from ours. We get the expected delay before the response comes in, but it sounds like the setting makes no difference for you?
@K3 We’ve been testing more with version 3.6.4 now. The MinBufferDuration setting does seem to have some effect, but we’re still experiencing the same issues - even when it’s set to 10. Often (but not always), the first response from the character is smooth, but the next responses will start having more and more glitches.
It’s worth noting that removing the FacialSync component from the character BP completely eliminates the issue (with MinBufferDuration set to 0.7), so it has to be related to the facial animation setup somehow. Our workaround for now will most likely be to use some custom-made, more basic animations based only on volume changes.
We can provide session IDs and recordings if necessary.
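For anyone wondering what a volume-only fallback could look like, here is a rough sketch in Python (not Unreal or Convai code). It turns 16-bit PCM samples into a smoothed 0..1 "mouth open" value per window, which could then drive a single jaw or mouth blend shape. The function name, the 20 ms window, the smoothing factor, and the 16 kHz default are all assumptions for illustration.

```python
import math


def mouth_open_curve(samples, sample_rate=16000, window_ms=20,
                     smoothing=0.5, full_scale=32768.0):
    """Map 16-bit PCM samples to one smoothed 0..1 value per window.

    Each window's RMS loudness is normalized by `full_scale`, clamped to 1.0,
    then blended with the previous value (exponential smoothing) so the mouth
    does not flicker between windows.
    """
    window = max(1, int(sample_rate * window_ms / 1000))
    curve = []
    level = 0.0
    for start in range(0, len(samples), window):
        frame = samples[start:start + window]
        rms = math.sqrt(sum(s * s for s in frame) / len(frame))
        target = min(1.0, rms / full_scale)
        level = smoothing * level + (1.0 - smoothing) * target
        curve.append(level)
    return curve


# 20 ms of silence followed by 40 ms of a constant half-scale signal.
demo = [0] * 320 + [16384] * 640
print(mouth_open_curve(demo))
```

A higher `smoothing` value gives slower, softer mouth movement; a lower one follows the audio more tightly at the cost of more jitter.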
Just downloaded and about to test the 3.6.5 Release update. @K3 can you confirm where the AudioLipSyncRatio parameter settings can be found? Can’t find it!
A quick video showing that would be helpful. Thanks.
You don’t need to look for the parameter anywhere.
Like MinBufferDuration, you need to add it under the Extra Parameters section in the Project Settings.
Use it in the format shown in the example above: MinBufferDuration=7, AudioLipSyncRatio=0
Got it! The parameter settings are not exposed at the moment. Understood. I’ll add the parameters you provided in the Extra Parameters section in the Project Settings.
It’s certainly much better. However, on random audio chunks the digital human stops moving its lips while the audio continues. I also had one instance of stuttering. I guess the idea is to find a sweet spot for natural conversation flow that prioritizes fluid audio delivery with accurate lip sync.
Just got really bad stuttering while testing. Below is the Output Log
Character ID : 81b2b358-6cca-11f0-9210-42010a7be01f | Session ID : 0a4e15dd1f81ee9a018aa89e3bef02a5
ConvaiGRPCLog: Initial Stream Write | Character ID : 81b2b358-6cca-11f0-9210-42010a7be01f | Session ID : 0a4e15dd1f81ee9a018aa89e3bef02a5
ConvaiGRPCLog: Initial Stream Read | Character ID : 81b2b358-6cca-11f0-9210-42010a7be01f | Session ID : 0a4e15dd1f81ee9a018aa89e3bef02a5
ConvaiGRPCLog: Sent UserQuery lemmiric: | Character ID : 81b2b358-6cca-11f0-9210-42010a7be01f | Session ID : 0a4e15dd1f81ee9a018aa89e3bef02a5
ConvaiGRPCLog: Calling Stream WriteLast | Character ID : 81b2b358-6cca-11f0-9210-42010a7be01f | Session ID : 0a4e15dd1f81ee9a018aa89e3bef02a5
ConvaiGRPCLog: OnStreamWriteDone
ConvaiGRPCLog: NumberOfAudioBytesSent 0
ConvaiAudioStreamerLog: State transition: Playing → Stopped
ConvaiAudioStreamerLog: onAudioFinished
ConvaiGRPCLog: No more data to read after 3 attempts. Calling Finish…
ConvaiGRPCLog: Calling Stream Finish | Character ID : 81b2b358-6cca-11f0-9210-42010a7be01f | Session ID : 0a4e15dd1f81ee9a018aa89e3bef02a5
ConvaiGRPCLog: Warning: OnStreamFinish: Status.ok():Not Ok | Debug Log: | Error message:enter idle | Error Details: | Error Code:14 | Character ID:81b2b358-6cca-11f0-9210-42010a7be01f | Session ID:0a4e15dd1f81ee9a018aa89e3bef02a5
ConvaiChatbotComponentLog: Warning: UConvaiChatbotComponent Get Response Failed! | Character ID : 81b2b358-6cca-11f0-9210-42010a7be01f | Session ID : 0a4e15dd1f81ee9a018aa89e3bef02a5
ConvaiChatbotComponentLog: UConvaiChatbotComponent Request Finished! | Character ID : 81b2b358-6cca-11f0-9210-42010a7be01f | Session ID : 0a4e15dd1f81ee9a018aa89e3bef02a5
ConvaiGRPCLog: Destroying UConvaiGRPCGetResponseProxy… | Character ID : 81b2b358-6cca-11f0-9210-42010a7be01f | Session ID : 0a4e15dd1f81ee9a018aa89e3bef02a5
ConvaiGRPCLog: Destroying UConvaiGRPCGetResponseProxy… | Character ID : 81b2b358-6cca-11f0-9210-42010a7be01f | Session ID : 0a4e15dd1f81ee9a018aa89e3bef02a5
LogViewport: Display: Viewport MouseLockMode Changed, LockOnCapture → LockAlways
LogViewport: Display: Viewport MouseCaptureMode Changed, CapturePermanently → CaptureDuringMouseDown
LogViewport: Display: Viewport MouseLockMode Changed, LockAlways → LockOnCapture
LogViewport: Display: Viewport MouseCaptureMode Changed, CaptureDuringMouseDown → CapturePermanently
ConvaiGRPCLog: AsyncGetResponse started | Character ID : 81b2b358-6cca-11f0-9210-42010a7be01f | Session ID : 0a4e15dd1f81ee9a018aa89e3bef02a5
ConvaiGRPCLog: GRPC GetResponse stream initialized | Character ID : 81b2b358-6cca-11f0-9210-42010a7be01f | Session ID : 0a4e15dd1f81ee9a018aa89e3bef02a5
ConvaiGRPCLog: request: character_id: “81b2b358-6cca-11f0-9210-42010a7be01f”
session_id: “0a4e15dd1f81ee9a018aa89e3bef02a5”
audio_config {
sample_rate_hertz: 16000
enable_facial_data: true
face_model: FACE_MODEL_OVR_MODEL_NAME
}
action_config {
actions: “Moves To”
actions: “Follows”
actions: “None”
actions: “Stops”
actions: “Thinks”
actions: “Feels”
actions: “Waits for ”
characters {
name: "Annelore "
}
characters {
name: “Guest”
}
characters {
name: “Player”
}
objects {
name: “Pearl and diamond earrings in White Gold ”
description: “These are gift from your creator as a genture of you joining the Vivience Research Team.”
}
classification: “multistep”
}
speaker: “Player”
speaker_id: “f783d652-6e96-11f0-aef1-42010a7be01f”
dynamic_info_config {
text: “Annelore resides within the Vivience Creative Studio\342\200\224a thoughtfully designed research environment that blends natural textures with dynamic interfaces to support calm, focused inquiry. The studio balances analog warmth with digital precision, and the air carries a subtle trace of cedar. She shares this space with Light-Arch, her perceptive colleague and the digital human embodiment of Professor Jeasy Sehgal\342\200\231s legacy. Together, they explore the frontiers of emotionally intelligent systems, where empathy and engineering converge.”
}
| Character ID : 81b2b358-6cca-11f0-9210-42010a7be01f | Session ID : 0a4e15dd1f81ee9a018aa89e3bef02a5
ConvaiGRPCLog: Initial Stream Write | Character ID : 81b2b358-6cca-11f0-9210-42010a7be01f | Session ID : 0a4e15dd1f81ee9a018aa89e3bef02a5
ConvaiGRPCLog: Initial Stream Read | Character ID : 81b2b358-6cca-11f0-9210-42010a7be01f | Session ID : 0a4e15dd1f81ee9a018aa89e3bef02a5
ConvaiGRPCLog: Sent UserQuery Try a tounge twister: | Character ID : 81b2b358-6cca-11f0-9210-42010a7be01f | Session ID : 0a4e15dd1f81ee9a018aa89e3bef02a5
ConvaiGRPCLog: Calling Stream WriteLast | Character ID : 81b2b358-6cca-11f0-9210-42010a7be01f | Session ID : 0a4e15dd1f81ee9a018aa89e3bef02a5
ConvaiGRPCLog: OnStreamWriteDone
ConvaiGRPCLog: NumberOfAudioBytesSent 0
ConvaiGRPCLog: Received Audio Chunk: 6.680726 secs | Character ID : 81b2b358-6cca-11f0-9210-42010a7be01f | Session ID : 0a4e15dd1f81ee9a018aa89e3bef02a5
ConvaiGRPCLog: EmotionResponse: neutral 1
ConvaiAudioStreamerLog: State transition: Stopped → WaitingOnAudio
ConvaiGRPCLog: Received Text She sells seashells by the seashore, where the sea’s ceaseless surge sends shimmering shells scattered across sandy shores. : | Character ID : 81b2b358-6cca-11f0-9210-42010a7be01f | Session ID : 0a4e15dd1f81ee9a018aa89e3bef02a5 | ReceivedFinalResponse : False
ConvaiGRPCLog: Received Audio Chunk: 6.520726 secs | Character ID : 81b2b358-6cca-11f0-9210-42010a7be01f | Session ID : 0a4e15dd1f81ee9a018aa89e3bef02a5
ConvaiGRPCLog: EmotionResponse: neutral 1
ConvaiAudioStreamerLog: State transition: WaitingOnAudio → Playing
ConvaiGRPCLog: Received Text Peter Piper picked a peck of pickled peppers precisely, proving particularly proficient in pepper procurement practices. : | Character ID : 81b2b358-6cca-11f0-9210-42010a7be01f | Session ID : 0a4e15dd1f81ee9a018aa89e3bef02a5 | ReceivedFinalResponse : False
ConvaiAudioStreamerLog: onAudioFinished
ConvaiGRPCLog: Received Audio Chunk: 6.280726 secs | Character ID : 81b2b358-6cca-11f0-9210-42010a7be01f | Session ID : 0a4e15dd1f81ee9a018aa89e3bef02a5
ConvaiGRPCLog: EmotionResponse: neutral 1
ConvaiGRPCLog: Received Text The rapid articulation and repeated consonants should reveal any buffer limitations or synchronization struggles.: | Character ID : 81b2b358-6cca-11f0-9210-42010a7be01f | Session ID : 0a4e15dd1f81ee9a018aa89e3bef02a5 | ReceivedFinalResponse : False
ConvaiAudioStreamerLog: onAudioFinished
[the line above repeats 140 more times, 141 consecutive onAudioFinished lines in total]
ConvaiGRPCLog: Chatbot Total Received Lipsync Responses: 319 Responses
ConvaiChatbotComponentLog: Chatbot Total Received Audio: 19.479115 seconds
ConvaiGRPCLog: Received Text : | Character ID : 81b2b358-6cca-11f0-9210-42010a7be01f | Session ID : 0a4e15dd1f81ee9a018aa89e3bef02a5 | ReceivedFinalResponse : True
ConvaiGRPCLog: GetResponse SequenceString: None
ConvaiGRPCLog: Action: None
ConvaiAudioStreamerLog: onAudioFinished
ConvaiGRPCLog: GetResponse EmotionResponseDebug: session_id: “0a4e15dd1f81ee9a018aa89e3bef02a5”
emotion_response: “Serenity Acceptance Apprehension Distraction Pensiveness Boredom Annoyance Interest”
ConvaiGRPCLog: No more data to read after 3 attempts. Calling Finish…
ConvaiGRPCLog: Calling Stream Finish | Character ID : 81b2b358-6cca-11f0-9210-42010a7be01f | Session ID : 0a4e15dd1f81ee9a018aa89e3bef02a5
ConvaiGRPCLog: On Stream Finish | Character ID : 81b2b358-6cca-11f0-9210-42010a7be01f | Session ID : 0a4e15dd1f81ee9a018aa89e3bef02a5
ConvaiChatbotComponentLog: UConvaiChatbotComponent Request Finished! | Character ID : 81b2b358-6cca-11f0-9210-42010a7be01f | Session ID : 0a4e15dd1f81ee9a018aa89e3bef02a5
ConvaiAudioStreamerLog: onAudioFinished
ConvaiGRPCLog: Destroying UConvaiGRPCGetResponseProxy… | Character ID : 81b2b358-6cca-11f0-9210-42010a7be01f | Session ID : 0a4e15dd1f81ee9a018aa89e3bef02a5
Cmd: quit
When the stuttering happened, the following was written to the Output Log:
ReceivedFinalResponse : False
ConvaiAudioStreamerLog: onAudioFinished
ConvaiAudioStreamerLog: onAudioFinished
ConvaiAudioStreamerLog: onAudioFinished……
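To compare runs more objectively, the Output Log can be summarized with a small script. This is a minimal sketch in Python that relies only on the log line formats visible in this thread ("Received Audio Chunk: … secs", "Chatbot Total Received Audio: … seconds", and "onAudioFinished"). The helper name `summarize_log` is made up, and treating a long run of onAudioFinished lines as a sign of stuttering is an assumption based on the observation above, not documented plugin behavior.

```python
import re

CHUNK_RE = re.compile(r"Received Audio Chunk: ([0-9.]+) secs")
TOTAL_RE = re.compile(r"Chatbot Total Received Audio: ([0-9.]+) seconds")


def summarize_log(log_text):
    """Collect audio chunk durations and onAudioFinished statistics."""
    chunks = []
    reported_total = None
    finished_events = 0
    longest_run = 0
    current_run = 0
    for raw in log_text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith("ConvaiAudioStreamerLog: onAudioFinished"):
            finished_events += 1
            current_run += 1
            longest_run = max(longest_run, current_run)
            continue
        current_run = 0  # any other line ends the current run
        match = CHUNK_RE.search(line)
        if match:
            chunks.append(float(match.group(1)))
            continue
        match = TOTAL_RE.search(line)
        if match:
            reported_total = float(match.group(1))
    return {
        "chunk_seconds": chunks,
        "chunk_sum": sum(chunks),
        "reported_total": reported_total,
        "finished_events": finished_events,
        "longest_finished_run": longest_run,
    }


# Trimmed excerpt in the same format as the log above (IDs removed).
excerpt = """ConvaiGRPCLog: Received Audio Chunk: 6.680726 secs
ConvaiAudioStreamerLog: State transition: Stopped → WaitingOnAudio
ConvaiGRPCLog: Received Audio Chunk: 6.520726 secs
ConvaiAudioStreamerLog: onAudioFinished
ConvaiGRPCLog: Received Audio Chunk: 6.280726 secs
ConvaiAudioStreamerLog: onAudioFinished
ConvaiAudioStreamerLog: onAudioFinished
ConvaiAudioStreamerLog: onAudioFinished
ConvaiChatbotComponentLog: Chatbot Total Received Audio: 19.479115 seconds
"""
print(summarize_log(excerpt))
```

Running this over a saved log from a smooth session and one from a stuttering session would show whether the number and length of onAudioFinished runs actually differ between the two.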
Now I am suddenly getting even more stuttering and lip sync breakage, @K3. Exiting the session and starting it again resets the issue, but with continuous use the issue comes back.
The response is very choppy now, with a lot of jittering. It’s intermittent, but then in the middle of a conversation the character twitches and the audio breaks and then starts again. It’s weird. Didn’t see this before. Should we revert to version 3.6.4 for now?