I want to convert speech to text (Start Talking) and then modify the text as in Start Typing

Original Discord Post by fdorike | 2023-12-23 23:06:22

Note:
I am Japanese, so I do not speak English fluently.

Prerequisite:
I have completed the setup described in the following guide:

Conversational AI in Unreal Engine Quick Setup Guide | MetaHumans, ReadyPlayerMe, and more | Convai

Contents:
As the title suggests, I want to convert speech to text (Start Talking) and then edit the text, as in Start Typing.
Speech is not always transcribed correctly.
If you already know how to do this, please share what you have found.
Blueprints would be especially helpful. <@&1163218672580575372>

Embedded Content:
Conversational AI in Unreal Engine Quick Setup Guide | MetaHumans, …
How to set up Convai’s Unreal Engine plugin and use the demo project.

Full tutorial playlist: https://www.youtube.com/playlist?list=PLn_7tCx0ChiogfggG1AVo6IkELQSLt6o3

Sign up at https://convai.com to create your own character.
Find the UE plugin here - https://www.unrealengine.com/marketplace/en-US/product/convai
Write to us at support@convai…
Link: https://www.youtube.com/watch?v=HHJvY9dmwwg

Reply by k3kalinix | 2023-12-23 23:11:12

Hello <@848877255198900269> , Welcome to Convai Community!
Let me tag the <@&1163218672580575372>. They will help you as soon as possible.

Reply by fdorike | 2023-12-23 23:16:22

I added the tag at the end of my message. Is that correct?

Reply by k3kalinix | 2023-12-23 23:20:54

Yes, but it is late in the day for them now.

Reply by k3kalinix | 2023-12-23 23:21:15

They will respond within the week.

Reply by freezfast | 2023-12-25 03:18:48

<@848877255198900269>, my apologies, I don't fully understand what you mean. Could you explain it another way?

Reply by fdorike | 2023-12-25 08:59:02

As I understand it, holding down the T key records what I say (the Start Talking function). When I release the T key, what I said appears as text in the chat widget and the conversation with the chatbot continues (the Finish Talking function). What I would like instead is for the conversation not to continue when I release the T key. I want to edit the text displayed in the chat widget and then press Enter to continue the conversation with the chatbot (like the Send Text function).
In other words, I want to make sure that the microphone picked up what I said correctly, and if it was wrong, I want to correct it before sending it to the chatbot.

Reply by freezfast | 2023-12-25 13:30:41

<@848877255198900269> thank you so much, I understand now. You can follow this guide to see how to transcribe the player's voice into text: Speech To Text Transcription | Documentation

Afterwards, you can correct the text and then use the Send Text function to send it to the chatbot. However, please note that the latency will be higher than when using the Start Talking function directly, so the characters will take longer to respond.
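
The flow described above (transcribe first, let the player edit, send only on Enter) can be sketched in plain Python. This is only an illustration of the logic, not the Convai plugin's API: the `transcribe` and `send_text` callables and the `ReviewBeforeSend` class are hypothetical stand-ins for the Speech To Text step, the Send Text function, and the Blueprint wiring around the chat widget.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class ReviewBeforeSend:
    """Holds a transcript as an editable draft until the player confirms it."""

    transcribe: Callable[[bytes], str]  # hypothetical stand-in for speech to text
    send_text: Callable[[str], None]    # hypothetical stand-in for Send Text
    draft: str = ""

    def on_talk_released(self, audio: bytes) -> str:
        # Transcribe only. Nothing is forwarded to the chatbot at this point.
        self.draft = self.transcribe(audio)
        return self.draft

    def on_text_edited(self, new_text: str) -> None:
        # The player corrects the transcript in the chat widget.
        self.draft = new_text

    def on_enter_pressed(self) -> bool:
        # Send the reviewed text; ignore empty drafts.
        text = self.draft.strip()
        if not text:
            return False
        self.send_text(text)
        self.draft = ""
        return True


# Usage with fake stand-ins: the transcript contains a typo that gets fixed.
sent: List[str] = []
flow = ReviewBeforeSend(transcribe=lambda audio: "helo there", send_text=sent.append)
flow.on_talk_released(b"fake-audio")
flow.on_text_edited("hello there")
flow.on_enter_pressed()
print(sent)  # ['hello there']
```

The extra step between transcription and sending is also where the added latency mentioned above comes from: the chatbot only receives the text after the player has reviewed it.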

Reply by qqqo_92491 | 2023-12-26 14:44:37

Is there any way to select the language in the guide? I’m working in Korean and it seems like it only recognizes English when I do what the guide says.

Reply by freezfast | 2023-12-26 18:01:48

We will add your request for more speech-to-text languages to the feature list for the next plugin update.

Reply by fdorike | 2023-12-28 21:28:05

Thank you very much. I have achieved what I wanted to do.
<@1138879756113293334> I have the same request as qqqo. Thank you, qqqo.

Reply by qqqo_92491 | 2023-12-30 20:25:49

I was wondering if the next TTS update will include facial expressions and mouth movements based on speech, and when it will be released?

Reply by freezfast | 2023-12-31 16:04:37

<@1138879756113293334>, adding facial expressions and lipsync to the Text To Speech function is not currently on our roadmap. We will take your request into consideration for future development, but I can't give a timeline at the moment. In the meantime, you can still use the Invoke Speech function, which already has lipsync, emotions, and animations integrated and working. Just let me know what you want the character to say and I can help you use the function.

Reply by qqqo_92491 | 2024-01-07 18:20:25

As always, thank you for your kind words. What I'm looking for is to implement speech and facial expressions in Korean, but since you probably don't know Korean very well, I'll keep it simple and use English. How do I get the character to say 'what are we doing today' verbatim? As in your screenshot, I put that phrase in the trigger message of the Invoke Speech node, but the character gives a different answer, as if in a conversation: it says 'What do you need' and makes facial expressions. I want the character to speak the phrase exactly as it is written in the trigger message.

Reply by freezfast | 2024-01-07 18:22:50

Try: Say hello to the player, use only Korean language and common Korean words
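
The suggestion above phrases the trigger message as an instruction to the character rather than as the literal line to speak. A minimal sketch of building such a message is shown below; `build_trigger_message` is a hypothetical helper and not part of the Convai plugin, and its result would simply be pasted into the trigger message of the Invoke Speech node.

```python
def build_trigger_message(action: str, language: str) -> str:
    """Phrase the trigger as an instruction plus a language constraint.

    Hypothetical helper: it only formats a string in the style suggested above.
    """
    return f"{action}, use only {language} language and common {language} words"


message = build_trigger_message("Say hello to the player", "Korean")
print(message)
```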

Reply by qqqo_92491 | 2024-01-07 19:08:46

Thank you. I did what you said and it worked.

Reply by freezfast | 2024-01-07 19:10:13

Awesome!

This conversation happened on the Convai Discord Server, so this post will be closed.