Troubleshooting latency issues across all characters

Today I am getting 8 to 10 second response times when speaking to characters. The issue seems to affect all of my characters across the board, and the STT service is also cutting out.
I’m using Unreal Engine, but talking to my agents on the website shows the same latency issues.
Here is one of my character IDs for testing:
f8155bf2-c0eb-11f0-a761-42010a7be025

Please try using a different LLM. You can try Gemini 2.5.

My latest tests on this character showed ~13-second delays, as well as longer timeout errors in the chat dialogue reading “Response take long to load. Plead hold on tight”.

The LLM change got it down to around 6 seconds, but that is still at the upper limit of what we need. Gemini 2.5 has also given us issues with simple actions, so now we are bouncing around trying to find an LLM that both responds with actions reliably and does not have a long latency delay.
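If it helps anyone doing the same search, here is a rough sketch of the kind of harness we could use to compare candidate models on both latency and action reliability. The `send_message` stub, the model names, and the `action_ok` field are all placeholders standing in for the real Convai call, not its actual API:

```python
import time
import random
import statistics

def send_message(model: str, text: str) -> dict:
    """Stand-in for a real chat call; simulates latency and occasional
    simple-action failures so the harness runs on its own."""
    time.sleep(random.uniform(0.01, 0.03))  # placeholder for network latency
    return {"text": "ok", "action_ok": random.random() > 0.2}

def benchmark(models, prompts):
    """Record per-model latency stats and simple-action success rate."""
    results = {}
    for model in models:
        latencies, successes = [], 0
        for prompt in prompts:
            start = time.perf_counter()
            reply = send_message(model, prompt)
            latencies.append(time.perf_counter() - start)
            successes += reply["action_ok"]  # bool counts as 0/1
        results[model] = {
            "median_s": statistics.median(latencies),
            "max_s": max(latencies),
            "action_rate": successes / len(prompts),
        }
    return results

if __name__ == "__main__":
    prompts = ["wave", "sit down", "follow me", "pick up the key"]
    for model, stats in benchmark(["model-a", "model-b"], prompts).items():
        print(f"{model}: median {stats['median_s']:.2f}s, "
              f"max {stats['max_s']:.2f}s, actions {stats['action_rate']:.0%}")
```

Swapping the stub for the real call would let us rank models on both axes at once instead of bouncing between them by feel.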

The major concern is the inconsistency of the response times. Yesterday we had ~5-6 second delays, while today they shot up to over twice that and often timed out. This was the case on both our business and test accounts.

Which version of the Convai plugin are you using?

I’m using 3.6.8-beta.3 but it’s the same latency on the website as it is in the app.

The free tier seems to get the exact same latency as the Business plan, is this the case?

No, the plan does not affect response time.

Latency is generally influenced by the selected LLM and voice. In some cases, the issue may also come from the LLM provider or voice provider. Your Knowledge Bank content and Narrative Design setup can also affect response time.

I’d recommend trying our new beta package. If you want to test on convai.com, you can also switch the LLM to Gemini 2.5 Live Beta, which offers newer features such as lower latency and hands-free support.

Could you please try again? We’ve updated your character’s LLM to Gemini 3.1 Flash Lite. It’s better and faster than 2.5 Flash.

At this moment our service is unusable. None of our characters (tested on the website and in our app) are responding, and we are getting timeouts on all our chats that say “error fetching prompts”. Changing LLMs does not seem to affect this. We launched a new update today and have only had 500 interactions on our Business-level account, so this seems to be an actual service issue. Can you help?

Could you share the character ID? I tested your character, and it works.

Thanks for the quick response. The service came back online a few minutes ago and seems okay now. We've had a lot of latency-related issues recently. We thought it could be due to concurrency, but we don't have metrics on that, and nothing has changed since upgrading to the Business account, so it must not be that. I see others are complaining about outages too. Do we know why this has been happening?

Could you please share the post links?

This is the character ID from the screenshot earlier. It is not having issues at the moment for me.
e2aaca18-fdc7-11f0-ac4f-42010a7be027

I’ll have to remove it from the post soon as we are trying to manage our traffic tightly. The one from earlier is a test account.

I see others are complaining about outages too.

Could you please share the relevant posts?

We’ve been experiencing periodic major slowdowns and short outages for almost a year. This post, which started 2 weeks ago, is still active because we are still experiencing major slowdowns and timeouts. We may have to take our own metrics to show what I’m talking about. We always go to the website when having issues to verify it is not our app, so all the tests mentioned here were done talking directly to our characters.

Another post we have followed in the past:

I don’t want to stick my nose in where it’s not wanted, and note I am using Unity, but Convai of late has been more reliable than it’s ever been (for me). Literally almost no failed responses nowadays, whereas 12 months ago it was about once every 10 responses, and responses with Eleven Labs voices and some script edits are 1 to 2 seconds, and that’s with some big prompts etc. I am building with Unity for Meta Quest headsets.


Thanks for your input, Tyke. I’m glad to hear you are having a good experience. I’m not sure why we would be experiencing more or less the opposite latency times. I’m curious: in what region/local time does most of your account activity occur? Are you on a self-serve tier or enterprise? Which LLMs are you using for your characters?

Most of our activity is in the US from around 11am to 3pm New York time. During these times and regions we have been experiencing very bad slowdowns several times a week.


This can happen for a number of different reasons. It may be related to the LLM, voice provider, Knowledge Bank, Narrative Design, or a specific feature being used on our side.

If you can note down and share the session IDs where the issue occurs, we can investigate those sessions more closely and check which part of the pipeline may be causing the slowdown or timeout.
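One lightweight way to collect those session IDs is a small helper that only keeps the slow ones. This assumes your client can read a session ID off each response (substitute the actual field name from your integration), and the 8-second threshold is just an example:

```python
slow_sessions = []  # (session_id, seconds) pairs worth reporting to support

SLOW_THRESHOLD_S = 8.0  # example cutoff; anything slower is worth flagging

def record_if_slow(session_id: str, seconds: float,
                   threshold: float = SLOW_THRESHOLD_S):
    """Keep the session IDs of slow responses so support can trace them."""
    if seconds >= threshold:
        slow_sessions.append((session_id, round(seconds, 2)))

# Usage, assuming the API exposes a session id on each response:
#   record_if_slow(response.session_id, elapsed_seconds)
# Then paste slow_sessions into the support thread.
```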


Hi 👋 thank you for taking my message in the way I intended. I too experienced slower and missed responses last year, and I can understand it must be frustrating for you to be hitting these issues. I hope they can be sorted ASAP.

In answer to your questions: most of the users of my app are from the US, so usage typically revolves around their waking hours. I am on the Business Tier x 2. I am mostly using the Grok 4 fast (beta) LLM and an Eleven Labs voice (not private, it’s a Convai-listed one, “Ophelia” I think), with about 2 Knowledge Bank text files per character of around 7 to 14 KB each. I have no Narrative Design setup on the Convai website, but I am using the Convai Dynamic Info System, which I believe uses Narrative Design in the background (because of this the prompt is very big, I think; I’m really pleased this hasn’t seemed to noticeably slow down character responses). I edited ConvaiNPCAudioManager to give slightly faster responses. I’d gladly share the script with you if you’d like, but since you are using Unreal it probably wouldn’t help.

Thanks for all the details. We also have a Business Tier account, as well as a free one we use for testing. The free account and Business Tier get the same latency issues on the website. When we have issues in the app, the first thing we do is test on the free-tier site, and sure enough we consistently get the same results there. The lighter-weight LLMs definitely make a difference; unfortunately we are extremely restricted in which model we can use, because we use simple actions on some of the calls and reliability on those varies wildly between models. Some of our characters have around a 4 KB Knowledge Bank file, and we use a wide range of voices, including Eleven Labs and some of the multilingual ones offered on the site. The number of tokens really doesn’t seem to affect the latency, and we see very little latency variation between small and large calls.

Most of our negative experience comes in the form of a phone call or message from tech support or QA, who are in different states, saying there is an issue. Then we check the site on several accounts to determine whether it is a problem on our end or a service problem. A big day for us is only around 500 interactions over ~6 hours at peak, and we have confirmed that the gap in concurrency between the Business and free accounts effectively has no effect on these periodic latency issues for our use case. The slowdowns can last from 20 minutes to 2 hours or more, which has put us in a really bad place.
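To put numbers on those slowdown windows before QA calls us, one option would be a small rolling-window check over recent response times. The window size and the 8-second threshold here are arbitrary picks for illustration, not anything official:

```python
from collections import deque
from statistics import median

class SlowdownDetector:
    """Flag a slowdown when the median of recent latencies crosses a threshold.

    window: number of recent samples to consider
    threshold_s: median latency (seconds) considered degraded
    """

    def __init__(self, window: int = 20, threshold_s: float = 8.0):
        self.samples = deque(maxlen=window)
        self.threshold_s = threshold_s

    def add(self, seconds: float) -> bool:
        """Record one response time; return True if we look degraded."""
        self.samples.append(seconds)
        return self.degraded()

    def degraded(self) -> bool:
        # Require a few samples before judging, to ignore one-off spikes.
        return len(self.samples) >= 5 and median(self.samples) >= self.threshold_s

if __name__ == "__main__":
    detector = SlowdownDetector(window=10, threshold_s=8.0)
    for s in [5, 6, 5, 6, 13, 12, 14, 13, 12]:
        if detector.add(s):
            print(f"degraded: median {median(detector.samples):.1f}s")
```

Using the median rather than the mean keeps a single outlier from tripping the alarm, which matters when the complaint is sustained 20-minute-plus windows rather than isolated slow responses.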
