Hi Convai team,
We’re currently integrating a custom avatar pipeline using Unreal Engine, and we plan to use Pixel Streaming to display our AI avatar in a frontend web application.
Here’s a quick summary of our setup and the questions we have:
Our Setup
- Avatar is built in Unreal Engine and deployed via Convai.
- We require audio-visual synchronization between the avatar’s animation and AI-generated speech.
- The frontend application needs to stream the avatar in real time, preferably via Unreal Pixel Streaming.
Key Questions
- What is the recommended method to stream a Convai-deployed avatar (with synced audio and animation) to a React or Vite frontend? (A rough sketch of how we imagine the frontend side is included after this list.)
  - Should we host the Unreal client locally and use Pixel Streaming directly?
  - Or should we rely entirely on Convai’s streaming endpoint if the avatar is hosted on your infrastructure?
- Can we upload our own custom avatars (e.g., .glb files or MetaHuman models) directly into Convai’s system for use with your AI models and animation pipeline?
  - If so, what is the best practice for deployment: via Convai or via Unreal Engine?
- What components or services are required to deploy Pixel Streaming in production from an Unreal project that uses Convai?
  - Signalling server setup, media servers, firewall rules, etc.
  - Is there a reference deployment architecture you recommend?
- For real-time syncing of voice and facial animation, does Convai handle both audio synthesis and animation control, or do we need to separately trigger animation sequences inside Unreal based on TTS input?
- What is the best way to integrate user input (e.g., text or microphone input from the browser) into the streaming avatar’s conversational loop? (See the browser-side capture sketch after this list.)
  - Are any sample integrations or frontend SDKs available?
- Are there any bandwidth, latency, or GPU requirements we should plan for when using Convai + Unreal Pixel Streaming at scale?
- Is there a roadmap for supporting other streaming technologies (e.g., WebRTC directly, without the Unreal client dependency) for rendering avatars?
- We need to send the user’s input (whatever the user says to the avatar) to our own backend server for further processing. Is this possible if the avatar is hosted on Convai’s infrastructure, or if the Unreal client is hosted in the cloud? If yes, please point us in the right direction. (See the short forwarding sketch after this list.)
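For context on the first question, here is a rough sketch of how we imagine embedding the stream in our React frontend. It is only a placeholder, not working code: `connectToStream` stands in for whatever Convai SDK call or Pixel Streaming frontend library you recommend.

```tsx
import { useEffect, useRef } from "react";

// Placeholder only: we assume some function that negotiates WebRTC with the
// signalling endpoint and resolves to a MediaStream carrying video + audio.
// This is NOT a real Convai or Epic API; it marks where your SDK would go.
declare function connectToStream(signallingUrl: string): Promise<MediaStream>;

export function AvatarView({ signallingUrl }: { signallingUrl: string }) {
  const videoRef = useRef<HTMLVideoElement>(null);

  useEffect(() => {
    let active = true;
    connectToStream(signallingUrl).then((stream) => {
      if (active && videoRef.current) {
        // Attach the remote stream so the avatar video and TTS audio play together.
        videoRef.current.srcObject = stream;
      }
    });
    return () => {
      active = false;
    };
  }, [signallingUrl]);

  return <video ref={videoRef} autoPlay playsInline />;
}
```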
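For the user-input question, this is roughly the browser-side microphone capture we have in mind. Only standard web APIs are used; `sendAudioChunk` is a placeholder for whatever transport you suggest (Pixel Streaming data channel, a Convai web SDK, or something else).

```ts
// Placeholder for the transport you recommend; not a real API.
declare function sendAudioChunk(chunk: Blob): void;

export async function startMicCapture(): Promise<MediaRecorder> {
  // Ask the browser for microphone access.
  const stream = await navigator.mediaDevices.getUserMedia({ audio: true });
  const recorder = new MediaRecorder(stream);

  // Forward each encoded audio chunk into the avatar’s conversational loop.
  recorder.ondataavailable = (event) => {
    if (event.data.size > 0) {
      sendAudioChunk(event.data);
    }
  };

  recorder.start(250); // emit a chunk roughly every 250 ms
  return recorder;
}
```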
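And for the last question, this is all we intend to do with each user utterance once we can access it: POST it to our own backend. The URL below is a placeholder for our server.

```ts
// Forward each final transcript (or typed text) to our own backend.
// The endpoint URL is a placeholder, not a real service.
export async function forwardUserInput(sessionId: string, text: string): Promise<void> {
  await fetch("https://our-backend.example.com/transcripts", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ sessionId, text, timestamp: Date.now() }),
  });
}
```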
Looking forward to your guidance or any relevant documentation or code samples.
Thanks in advance!
— Shubham