dimanche 15 octobre 2023

Methodology for real-time voice-animated assistants

Introduction: I am doing the project on AI voice assistance with character animation, it includes voice conversation, chat box and acting virtual character. It's welcome to provide any insight, architecture, project design.

Project: We have installed IOT sensing equipment to detect environmental factors such as temperature and humidity.

For our project, we need to get the IOT parameters and report the environmental conditions to the user through voice assistant and virtual character's dialog.

The key point is how to express the data in a humanized way after getting the data. For example Q: What is the temperature in the exhibition hall? Answer: There are about 30 people in the exhibition hall and the temperature is 23°, so please pay attention to keep warm.

This project emphasizes, real-time conversations and virtual character animation to make it feel like a video call. We hope that we can realize the technology of talking with virtual characters.

The process and logic I expect:

  1. Record the user's voice
  2. Voice to Text
  3. Read IOT environment parameters.
  4. Access to ChatGPT API
  5. Convert ChatGPT answer to voice.
  6. Synthesize the voice and virtual character picture into a video that matches the mouth shape.
  7. Broadcast the video.

Optional:
8. Ensure the speed of speech in the video is synchronized with the pop-up subtitles. 9. Handle streaming technology

Difficulty:

  • Lack of comprehensive API services for animation and voice.
  • The synthesis of voice and character animation takes a long time to compute, resulting in a round of dialog that can take up to several minutes.
  • Difficulty in finding as free (or low-cost) and natural voice generation libraries as possible (need to cover Asian languages such as Cantonese HK).
  • It may be necessary to explore the architecture and how to deploy it

Others
It will also help me to calculate the cost of the services involved according to the methodology provided by you.

Aucun commentaire:

Enregistrer un commentaire