How to reduce Convai gRPC response latency

I’m currently using Convai version 3.3.4 with GPT-5.3 Instant. Previously, I was working with Convai 3.2.3 and GPT-4.1, and I’ve observed that the response time has increased after upgrading.

In the earlier setup, the NPC would start speaking much faster. With the current configuration, there is a noticeable delay before the dialogue begins. Actions take about 3 seconds to trigger, and they fire before the response: the NPC’s speech starts 2–3 seconds after the action, which creates a visible gap between action and dialogue. Normal communication without an action also takes 5–7 seconds to get a response.
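To make these gaps easier to compare between versions, the timings described above can be logged per turn and split into stages. The following is only a minimal sketch: the event names (`input_end`, `action_start`, `speech_start`) are hypothetical labels for timestamps recorded in the client, not part of any Convai API.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class TurnTimestamps:
    """Wall-clock times in seconds for one conversation turn (hypothetical event names)."""
    input_end: float               # user finished speaking or submitted text
    action_start: Optional[float]  # NPC action triggered; None if the turn had no action
    speech_start: float            # first NPC audio started playing


def latency_breakdown(t: TurnTimestamps) -> Dict[str, float]:
    """Split one turn into the gaps described in the post."""
    result = {"input_to_speech": t.speech_start - t.input_end}
    if t.action_start is not None:
        result["input_to_action"] = t.action_start - t.input_end
        result["action_to_speech"] = t.speech_start - t.action_start
    return result


# Example using the approximate numbers from the post:
# action after 3 s, speech 2.5 s after the action.
with_action = latency_breakdown(TurnTimestamps(0.0, 3.0, 5.5))
without_action = latency_breakdown(TurnTimestamps(0.0, None, 6.0))
```

Logging the same breakdown for text input and for speech input would show whether the extra time is spent before the action, between action and speech, or in both.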

Ideally, with the newer Convai version and a faster model, the communication latency should have improved, but the current behavior suggests otherwise.

At this stage, upgrading to Convai 4.0.0 is not feasible for me, as my project depends on Embodied Actions.

Could you please suggest possible ways to reduce this communication delay? Specifically, I’d appreciate guidance on:

  • Optimizing response time with GPT-5.3 Instant

  • Any Convai configuration or settings that may affect latency

  • Best practices to synchronize actions and speech more tightly while using Embodied Actions
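As an illustration of the last point, one possible client-side approach is to hold the action until the first audio chunk is ready and then start both together. This is a sketch under assumptions, not Convai's implementation: all four callbacks are hypothetical hooks, and whether the Convai plugin exposes equivalent events is not confirmed here.

```python
import asyncio
from typing import Awaitable, Callable, List


async def run_turn_synchronized(
    wait_for_action: Callable[[], Awaitable[str]],
    wait_for_first_audio: Callable[[], Awaitable[bytes]],
    start_action: Callable[[str], None],
    start_speech: Callable[[bytes], None],
) -> None:
    """Wait until both the action and the first audio chunk have arrived, then start both."""
    action, audio = await asyncio.gather(wait_for_action(), wait_for_first_audio())
    start_action(action)
    start_speech(audio)


def demo() -> List[str]:
    """Simulate an action that arrives before the audio, as described in the post."""
    events: List[str] = []

    async def fake_action() -> str:
        await asyncio.sleep(0.01)  # action arrives first
        events.append("action received")
        return "wave"

    async def fake_audio() -> bytes:
        await asyncio.sleep(0.05)  # audio arrives later
        events.append("audio received")
        return b"\x00"

    asyncio.run(
        run_turn_synchronized(
            fake_action,
            fake_audio,
            lambda a: events.append(f"start action: {a}"),
            lambda _: events.append("start speech"),
        )
    )
    return events


events = demo()
```

The trade-off is that the action is delayed by however long the speech takes to arrive, so this closes the visible gap without reducing the total time to first speech.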

Looking forward to your recommendations.

I’d also like to add that I tested GPT-4.1 after upgrading to 3.3.4, and the result is the same high latency I described in the post above.
So the core AI model has no effect, which is why I believe the issue is related to the setup after the upgrade.

One of the character IDs: d6459aa8-dd8a-11f0-8ff7-42010a7be027.

Session ID: 38bf4065c52d002035cb5f85da2260f0.

Session ID: c99cf769c19156e2050a140b54ce90c1.

Hi Stav,

Thanks for sharing your character ID.
The main delay is coming from LLM token generation on the provider side, which is outside our infrastructure. Could you try switching to Gemini 2.5 Flash Lite or Gemini 3.1 Flash Lite Preview?

Let us know how it goes

Thanks

Hey, after checking multiple times, Gemini 3.1 Flash Lite Preview was much worse. With Gemini 2.5 Flash Lite there is a slight improvement, but it is barely noticeable.

On Thu, Apr 9, 2026 at 14:26, Md Manzer Alam <notifications@convai.discoursemail.com> wrote:

How do you test it? With text input or speech?

Speech in VR

On Sat, Apr 11, 2026 at 0:59, Kaan <notifications@convai.discoursemail.com> wrote:

No, could you please test it on the playground with text input first?