Voice drops out during the conversation

Meijestic · September 9, 2025, 4:12pm

Hello, I asked about this a few days ago. I’m currently having problems with audio playback when using the plugin. The audio is often very fragmented and the loading times are very long. This happens both in a project that has been online for a long time and that we have not changed, and in a new local project. Where do these interruptions in the voice come from? Sometimes the lip movements are also missing.
The character IDs are c687bef8-a5b8-11ef-92e6-42010a7be016 and 8ef490ca-78e8-11f0-9a52-42010a7be01f
Thanks

DoE · September 9, 2025, 7:19pm

Yes, same issue for us.

Ham1ton · September 10, 2025, 1:17am

Same issue here. Plus super long latency times for responses recently. When we first got our subscription the service was stellar. Over the last year it has declined and now we may have to cancel our service for next year on our deployed app due to latency unless something can change. We just can’t afford the $8,0000 USD per month cost for the enterprise level to hope we get a decent latency…

DoE · September 10, 2025, 1:50am

Hi @K3,

Could you please advise? This feels like a global outage! It is very serious and urgent, as we are losing clients.

K3 · September 10, 2025, 6:56am

Could you please share the session id?

Meijestic · September 10, 2025, 7:51am

LogDebuggerCommands: Repeating last play command: Neues Editorfenster (PIE)
LogPlayLevel: PlayLevel: No blueprints needed recompiling
LogPlayLevel: Creating play world package: /Game/Maps/MHC_LightingPresets_IBL/UEDPIE_0_Downtown_Night
LogPlayLevel: PIE: StaticDuplicateObject took: (0.005290s)
LogPlayLevel: PIE: Created PIE world by copying editor world from /Game/Maps/MHC_LightingPresets_IBL/Downtown_Night.Downtown_Night to /Game/Maps/MHC_LightingPresets_IBL/UEDPIE_0_Downtown_Night.Downtown_Night (0.005328s)
LogUObjectHash: Compacting FUObjectHashTables data took 0.47ms
ConvaiSubsystemLog: gRPC Creating Channel…
ConvaiSubsystemLog: UConvaiSubsystem Started
ConvaiSubsystemLog: Start Run
LogWorldMetrics: [UWorldMetricsSubsystem::Initialize]
LogPlayLevel: PIE: World Init took: (0.001084s)
LogAudio: Display: Creating Audio Device: Id: 5, Scope: Unique, Realtime: True
LogAudioMixer: Display: Audio Mixer Platform Settings:
LogAudioMixer: Display: Sample Rate: 48000
LogAudioMixer: Display: Callback Buffer Frame Size Requested: 1024
LogAudioMixer: Display: Callback Buffer Frame Size To Use: 1024
LogAudioMixer: Display: Number of buffers to queue: 2
LogAudioMixer: Display: Max Channels (voices): 32
LogAudioMixer: Display: Number of Async Source Workers: 0
LogAudio: Display: AudioDevice MaxSources: 32
LogAudio: Display: Audio Spatialization Plugin: None (built-in).
LogAudio: Display: Audio Reverb Plugin: None (built-in).
LogAudio: Display: Audio Occlusion Plugin: None (built-in).
LogAudioMixer: Display: Initializing audio mixer using platform API: ‘XAudio2’
LogAudioMixer: Display: Using Audio Hardware Device LG HDR 4K (NVIDIA High Definition Audio)
LogAudioMixer: Display: Initializing Sound Submixes…
LogAudioMixer: Display: Creating Master Submix ‘MasterSubmixDefault’
LogAudioMixer: Display: Creating Master Submix ‘MasterReverbSubmixDefault’
LogAudioMixer: FMixerPlatformXAudio2::StartAudioStream() called. InstanceID=5
LogAudioMixer: Display: Output buffers initialized: Frames=1024, Channels=2, Samples=2048, InstanceID=5
LogAudioMixer: Display: Starting AudioMixerPlatformInterface::RunInternal(), InstanceID=5
LogAudioMixer: Display: FMixerPlatformXAudio2::SubmitBuffer() called for the first time. InstanceID=5
LogInit: FAudioDevice initialized with ID 5.
LogAudio: Display: Audio Device (ID: 5) registered with world ‘Downtown_Night’.
LogAudioMixer: Initializing Audio Bus Subsystem for audio device with ID 5
LogAudioMixer: Display: Sending SubmixBufferListener ‘PixelStreamingEditorListener’ register command…
LogAudioMixer: Display: Submix buffer listener ‘PixelStreamingEditorListener’ registered with submix ‘MasterSubmixDefault’
LogSlate: Updating window title bar state: overlay mode, drag disabled, window buttons hidden, title bar hidden
LogLoad: Game class is ‘BP_FirstPersonGameMode_C’
LogWorld: Bringing World /Game/Maps/MHC_LightingPresets_IBL/UEDPIE_0_Downtown_Night.Downtown_Night up for play (max tick rate 0) at 2025.09.10-09.48.55
LogWorld: Bringing up level for play took: 0.023951
LogOnline: OSS: Created online subsystem instance for: :Context_4
ConvaiPlayerLog: UConvaiPlayerComponent: Found submix “AudioInput”
Cmd: voice.MicNoiseGateThreshold 0.01
voice.MicNoiseGateThreshold = “0.01”
Cmd: voice.SilenceDetectionThreshold 0.001
voice.SilenceDetectionThreshold = “0.001”
PIE: Server eingeloggt in
PIE: Wiedergabe im Editor Gesamtstartzeit 0,158 Sekunden.
LogAudioCaptureCore: Display: FAudioCaptureWasapiStream::GetCaptureDeviceInfo: no default capture device found
ConvaiPlayerLog: Started Talking
LogAudioCaptureCore: Display: FAudioCaptureWasapiStream::GetCaptureDeviceInfo: no default capture device found
ConvaiAudioLog: Using as Audio capture device with NumChannels:2033716240 and SampleRate:1819
ConvaiSubsystemLog: Warning: gRPC channel not ready yet.. Current State: GRPC_CHANNEL_IDLE
ConvaiGRPCLog: AsyncGetResponse started | Character ID : 8ef490ca-78e8-11f0-9a52-42010a7be01f | Session ID : -1
LogAudioCaptureCore: Display: FAudioCaptureWasapiStream::OpenAudioCaptureStream: no default capture device found
ConvaiGRPCLog: GRPC GetResponse stream initialized | Character ID : 8ef490ca-78e8-11f0-9a52-42010a7be01f | Session ID : -1
ConvaiGRPCLog: request: character_id: “8ef490ca-78e8-11f0-9a52-42010a7be01f”
api_key: “”
session_id: “-1”
audio_config {
sample_rate_hertz: 16000
enable_facial_data: true
face_model: FACE_MODEL_OVR_MODEL_NAME
}
action_config {
characters {
name: “Player”
}
classification: “multistep”
}
speaker: “Player”
dynamic_info_config {
}
| Character ID : 8ef490ca-78e8-11f0-9a52-42010a7be01f | Session ID : -1
ConvaiGRPCLog: Initial Stream Write | Character ID : 8ef490ca-78e8-11f0-9a52-42010a7be01f | Session ID : -1
ConvaiGRPCLog: Initial Stream Read | Character ID : 8ef490ca-78e8-11f0-9a52-42010a7be01f | Session ID : -1
ConvaiPlayerLog: FinishTalking calling FinishGetResponseStream
ConvaiGRPCLog: Finish Writing to audio data buffer
ConvaiGRPCLog: FinishWriting:: Informing On Data Received
ConvaiGRPCLog: Calling Stream WritesDone | LastWriteReceived : True | AudioBuffer.Num() : 0 | Character ID : 8ef490ca-78e8-11f0-9a52-42010a7be01f | Session ID : -1
ConvaiGRPCLog: On Stream Write Done Writing
ConvaiPlayerLog: Finished Talking
ConvaiGRPCLog: OnStreamWriteDone
ConvaiGRPCLog: NumberOfAudioBytesSent 51832
ConvaiGRPCLog: GetResponse EmotionResponseDebug: session_id: “1857483a2352e6bddb193dc3952678e2”
emotion_response: “Serenity”
ConvaiGRPCLog: Received Audio Chunk: 3.440726 secs | Character ID : 8ef490ca-78e8-11f0-9a52-42010a7be01f | Session ID : 1857483a2352e6bddb193dc3952678e2
ConvaiGRPCLog: EmotionResponse: joy 1
ConvaiAudioStreamerLog: State transition: Stopped → WaitingOnLipSync
ConvaiGRPCLog: Received Text Hey, willkommen im LWL-Museum Henrichshütte! : | Character ID : 8ef490ca-78e8-11f0-9a52-42010a7be01f | Session ID : 1857483a2352e6bddb193dc3952678e2 | ReceivedFinalResponse : False
ConvaiAudioStreamerLog: New SampleRate: 44100
ConvaiAudioStreamerLog: New Channels: 1
ConvaiAudioStreamerLog: State transition: WaitingOnLipSync → Playing
ConvaiGRPCLog: GetResponse EmotionResponseDebug: session_id: “1857483a2352e6bddb193dc3952678e2”
emotion_response: “Serenity”
ConvaiGRPCLog: Received Audio Chunk: 3.360726 secs | Character ID : 8ef490ca-78e8-11f0-9a52-42010a7be01f | Session ID : 1857483a2352e6bddb193dc3952678e2
ConvaiGRPCLog: EmotionResponse: joy 1
ConvaiAudioStreamerLog: State transition: Playing → WaitingOnLipSync
ConvaiGRPCLog: Received Text Ich bin Andi und freue mich total, dass du hier bist. : | Character ID : 8ef490ca-78e8-11f0-9a52-42010a7be01f | Session ID : 1857483a2352e6bddb193dc3952678e2 | ReceivedFinalResponse : False
ConvaiAudioStreamerLog: onAudioFinished
ConvaiAudioStreamerLog: State transition: WaitingOnLipSync → Playing
ConvaiGRPCLog: GetResponse EmotionResponseDebug: session_id: “1857483a2352e6bddb193dc3952678e2”
emotion_response: “Serenity”
ConvaiGRPCLog: Received Audio Chunk: 5.000726 secs | Character ID : 8ef490ca-78e8-11f0-9a52-42010a7be01f | Session ID : 1857483a2352e6bddb193dc3952678e2
ConvaiGRPCLog: EmotionResponse: joy 1
ConvaiAudioStreamerLog: State transition: Playing → WaitingOnLipSync
ConvaiGRPCLog: Received Text Was hat dich denn heute hierhergeführt - bist du zum ersten Mal bei uns?: | Character ID : 8ef490ca-78e8-11f0-9a52-42010a7be01f | Session ID : 1857483a2352e6bddb193dc3952678e2 | ReceivedFinalResponse : False
ConvaiAudioStreamerLog: onAudioFinished
ConvaiAudioStreamerLog: State transition: WaitingOnLipSync → Playing
ConvaiGRPCLog: Chatbot Total Received Lipsync Responses: 654 Responses
ConvaiChatbotComponentLog: Chatbot Total Received Audio: 11.799116 seconds
ConvaiGRPCLog: Received Text : | Character ID : 8ef490ca-78e8-11f0-9a52-42010a7be01f | Session ID : 1857483a2352e6bddb193dc3952678e2 | ReceivedFinalResponse : True
ConvaiGRPCLog: GetResponse SequenceString:
ConvaiGRPCLog: GetResponse EmotionResponseDebug: session_id: “1857483a2352e6bddb193dc3952678e2”
emotion_response: “Ecstasy Admiration Anticipation”
ConvaiAudioStreamerLog: onAudioFinished
ConvaiGRPCLog: No more data to read after 3 attempts. Calling Finish…
ConvaiGRPCLog: Calling Stream Finish | Character ID : 8ef490ca-78e8-11f0-9a52-42010a7be01f | Session ID : 1857483a2352e6bddb193dc3952678e2
ConvaiGRPCLog: On Stream Finish | Character ID : 8ef490ca-78e8-11f0-9a52-42010a7be01f | Session ID : 1857483a2352e6bddb193dc3952678e2
ConvaiChatbotComponentLog: UConvaiChatbotComponent Request Finished! | Character ID : 8ef490ca-78e8-11f0-9a52-42010a7be01f | Session ID : 1857483a2352e6bddb193dc3952678e2
ConvaiGRPCLog: Destroying UConvaiGRPCGetResponseProxy… | Character ID : 8ef490ca-78e8-11f0-9a52-42010a7be01f | Session ID : 1857483a2352e6bddb193dc3952678e2
ConvaiAudioStreamerLog: onAudioFinished
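
In case it helps anyone else comparing logs: below is a minimal sketch of a script that summarizes a pasted log like the one above. It is not part of the Convai plugin. The function name `summarize_log`, the regular expressions, and the sanity bounds for the capture format are my own assumptions, based only on the line formats visible in this log. It collects the session IDs, adds up the "Received Audio Chunk" durations, counts how often the streamer falls back to WaitingOnLipSync, and flags the capture-device lines (the NumChannels and SampleRate values in this log look implausible).

```python
import re

# Hypothetical helper, not part of the Convai plugin.
# The patterns are inferred from the log lines pasted in this thread.
CHUNK_RE = re.compile(r"Received Audio Chunk: ([0-9.]+) secs")
SESSION_RE = re.compile(r"Session ID : ([0-9a-f]{32})")
CAPTURE_RE = re.compile(r"NumChannels:(\d+) and SampleRate:(\d+)")
LIPSYNC_WAIT_RE = re.compile(r"State transition: \w+ \S+ WaitingOnLipSync")


def summarize_log(text):
    """Return a small summary dict for a pasted log."""
    chunks = [float(value) for value in CHUNK_RE.findall(text)]
    warnings = []
    if "no default capture device found" in text:
        warnings.append("no default capture device found")
    for channels, rate in CAPTURE_RE.findall(text):
        channels, rate = int(channels), int(rate)
        # Assumed sanity bounds, chosen for illustration only.
        if not 1 <= channels <= 8 or not 8000 <= rate <= 192000:
            warnings.append(
                f"implausible capture format: {channels} ch @ {rate} Hz"
            )
    return {
        "sessions": sorted(set(SESSION_RE.findall(text))),
        "chunk_count": len(chunks),
        "audio_seconds": sum(chunks),
        "lipsync_waits": len(LIPSYNC_WAIT_RE.findall(text)),
        "warnings": warnings,
    }


# A few lines copied from the log above, used as sample input.
SAMPLE = """\
LogAudioCaptureCore: Display: FAudioCaptureWasapiStream::GetCaptureDeviceInfo: no default capture device found
ConvaiAudioLog: Using as Audio capture device with NumChannels:2033716240 and SampleRate:1819
ConvaiGRPCLog: Received Audio Chunk: 3.440726 secs | Character ID : 8ef490ca-78e8-11f0-9a52-42010a7be01f | Session ID : 1857483a2352e6bddb193dc3952678e2
ConvaiAudioStreamerLog: State transition: Stopped → WaitingOnLipSync
ConvaiAudioStreamerLog: State transition: WaitingOnLipSync → Playing
ConvaiGRPCLog: Received Audio Chunk: 3.360726 secs | Character ID : 8ef490ca-78e8-11f0-9a52-42010a7be01f | Session ID : 1857483a2352e6bddb193dc3952678e2
"""

if __name__ == "__main__":
    for key, value in summarize_log(SAMPLE).items():
        print(f"{key}: {value}")
```

To run it on a full log, read the saved file as text and pass the contents to `summarize_log`. The summary only covers what the log itself shows. It cannot measure response latency, because the lines pasted here carry no timestamps.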

Meijestic · September 10, 2025, 7:52am

At the moment, response times are very long and lip movements only occur occasionally.

K3 · September 10, 2025, 11:44pm

Could you please try again?

DoE · September 11, 2025, 2:53am

Hi @K3,

For us the same delay is still there. Can you tell us what the main reason is?

Session ID: b89d73380f6bd824bc2edd1153381c95

ysinha · September 11, 2025, 3:05am

Could you please try selecting a different voice? That character is using an OpenAI voice which might have high latency at times. Generally, Azure/GCP voices have lower latencies.

DoE · September 11, 2025, 3:09am

Hi @ysinha,

We are now using an Azure voice only. Previously, we were on ElevenLabs, which had lag during speech. With Azure, there’s no lag while speaking, but the issue is with loading time, both at the initial start and during conversations.

ysinha · September 11, 2025, 3:40am

The session id you provided b89d73380f6bd824bc2edd1153381c95 is for this character db850382-abc2-11ef-b73c-42010a7be016 which is using an Onyx OpenAI voice. Could you please confirm the character is using an Azure voice by going to DOE AI Assistant?

DoE · September 11, 2025, 4:51am

Ours is currently using Azure voice for testing, but we also need Arabic voice support, so we may have to consider OpenAI or ElevenLabs.

@Prabhu

ysinha · September 11, 2025, 4:58am

Could you please share a session id and character ID which has high response time and is using Azure voice?

Prabhu · September 11, 2025, 5:16am

Dear Team,

Below are the IDs with a high response time while using an Azure voice.

Character id: db850382-abc2-11ef-b73c-42010a7be016

session id: 406a23205a9a1ddfb0c2c2bf86d0cfb2

K3 · September 11, 2025, 8:13am

I’m not sure I fully understand the issue. Could you clarify what you mean by “there’s no lag while speaking” and what exactly you mean by “loading time”? Please share a video so we can see the behavior. On our side, the character you shared is working normally.

DoE · September 11, 2025, 10:56am

Hi,

Please have a look: https://drive.google.com/file/d/143JpJAyPvojQPwtnf0DG867cIghHfkgl/view?usp=drive_link

0:17 - he gave the welcome message (the audio was not recorded, but he was speaking)

0:33 - I asked a question

After that there is no reply; it just keeps loading.

One week ago it was replying within 4-6 seconds at most.

K3 · September 11, 2025, 11:00am

Could you please grant me access?

DoE · September 11, 2025, 11:03am

Done, I have shared access. Let me know if you would like to have a call; I can share an invite as well.

K3 · September 11, 2025, 11:04am

Yes, please, because the video isn’t clear. I need to see the logs and hear the character’s voice.
