Experiencing response latency

Hi,

I’m building a VR therapy application on Meta Quest 2 using the Convai SDK v4.1.0
with Unity 6 and URP. The app runs standalone on the Quest 2 (Android build).

I’m experiencing response latency of around 4-6 seconds from when the patient
speaks to when Delia (my AI character) starts responding. This is noticeable
and affects the therapy experience.

My current setup:

  • Model: gpt-4o-mini
  • Core Description: ~600 words
  • TTS: default Convai voice
  • Platform: Meta Quest 2 standalone, WiFi connection
  • SDK: Convai v4.1.0, Unity 6000.4.2f1

Questions:

  1. Which LLM gives the fastest response time for short conversational
    replies in Romanian?
  2. Which TTS voice/engine has the lowest latency?
  3. Does Core Description length significantly impact response time?
  4. Are there any SDK-level settings to reduce latency
    (streaming, buffer size, etc.)?
  5. Is there a recommended architecture for minimizing latency on
    standalone VR headsets?

Thank you!

Hello,

Could you please try using GPT-Realtime 1.5 and let us know if the issue still occurs?

Hi, I tried GPT-Realtime 1.5 and the latency is much better.