How to reduce Convai GRPC response latency

Stav_Cohen · April 7, 2026, 7:38am

I’m currently using Convai version 3.3.4 with GPT-5.3 Instant. Previously, I was working with Convai 3.2.3 and GPT-4.1, and I’ve observed that the response time has increased after upgrading.

In the earlier setup, the NPC would start speaking much faster. With the current configuration, there is a noticeable delay before the dialogue begins. Actions take about 3 seconds to trigger, and they fire before the response: the NPC’s speech starts 2–3 seconds after the action, which creates a visible gap between action and dialogue. Normal communication without an action also takes 5–7 seconds to get a response.

Ideally, with the newer Convai version and a faster model, the communication latency should have improved, but the current behavior suggests otherwise.

At this stage, upgrading to Convai 4.0.0 is not feasible for me, as my project depends on Embodied Actions.

Could you please suggest possible ways to reduce this communication delay? Specifically, I’d appreciate guidance on:

Optimizing response time with GPT-5.3 Instant
Any Convai configuration or settings that may affect latency
Best practices to synchronize actions and speech more tightly while using Embodied Actions

Looking forward to your recommendations.

I’d also like to add that I tested GPT-4.1 after upgrading to 3.3.4, and the result is the same high latency I described in the post above.
So the core AI model has no effect, which is why I believe the issue is related to the setup after the upgrade.

One of the character IDs: d6459aa8-dd8a-11f0-8ff7-42010a7be027.

Session ID: 38bf4065c52d002035cb5f85da2260f0.

Session ID: c99cf769c19156e2050a140b54ce90c1.

Manzer · April 9, 2026, 11:21am

Hi Stav,

Thanks for sharing your character ID.
The main delay is coming from LLM token generation on the provider side, which is outside our infrastructure. Could you try switching to Gemini 2.5 Flash Lite or Gemini 3.1 Flash Lite Preview?

Let us know how it goes.

Thanks

Stav_Cohen · April 10, 2026, 7:01am

Hey, after checking multiple times, Gemini 3.1 Flash Lite Preview was much worse. With Gemini 2.5 Flash Lite there is a slight improvement, but it is barely noticeable.

K3 · April 10, 2026, 9:53pm

How do you test it? With text input or speech?

Stav_Cohen · April 10, 2026, 10:05pm

Speech in VR

K3 · April 10, 2026, 10:10pm

No, could you please test it on the playground with text input first?
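
To compare text input against speech input, it may help to record when each stage of a request is first observed and look at the gaps between them. The following is only a minimal sketch of that idea in Python: `LatencyTrace` and the stage names (`request_sent`, `action_received`, `first_audio`) are hypothetical and not part of the Convai SDK, and the timestamps in the example are made up to mirror the delays described in the first post.

```python
import time
from dataclasses import dataclass, field
from typing import Callable, Dict


@dataclass
class LatencyTrace:
    """Hypothetical helper: records when each stage of one request is first seen."""
    clock: Callable[[], float] = time.perf_counter
    marks: Dict[str, float] = field(default_factory=dict)

    def mark(self, stage: str) -> None:
        # Only the first observation of a stage is kept.
        if stage not in self.marks:
            self.marks[stage] = self.clock()

    def gaps(self) -> Dict[str, float]:
        """Seconds elapsed between consecutive stages, ordered by timestamp."""
        ordered = sorted(self.marks.items(), key=lambda kv: kv[1])
        return {
            f"{a} -> {b}": tb - ta
            for (a, ta), (b, tb) in zip(ordered, ordered[1:])
        }


# Example with a fake clock (illustrative values, not measurements).
fake_times = iter([0.0, 3.0, 5.5])
trace = LatencyTrace(clock=lambda: next(fake_times))
trace.mark("request_sent")      # user finished speaking or submitted text
trace.mark("action_received")   # first action arrives
trace.mark("first_audio")       # NPC speech starts
print(trace.gaps())
# {'request_sent -> action_received': 3.0, 'action_received -> first_audio': 2.5}
```

If the gaps are similar for text and speech input, the delay is unlikely to come from speech recognition; if they differ a lot, the speech path is worth a closer look.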
