Feature #274

open

W3: Integrate Speech-to-Text (Whisper / OpenAI)

Added by Anonymous 3 months ago. Updated 3 months ago.

Status: Resolved
Priority: Normal
Assignee: -
Start date: 11/20/2025
Due date: 12/04/2025 (about 3 months late)
% Done: 0%
Estimated time: 3:30 h

Description

Use a Speech-to-Text service to convert incoming audio into text.
Handle STT latency, errors, and return the recognized text to Unreal.
Verify accuracy with multiple speech samples.
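As an illustration of the latency, error-handling, and accuracy points above, here is a minimal Python sketch. It does not call Whisper/OpenAI itself: `transcribe` is a hypothetical callable standing in for whatever the backend uses, and the retry count, backoff, and the choice of word error rate as the accuracy metric are assumptions, not part of this ticket.

```python
import time
from typing import Callable, Tuple


def transcribe_with_retry(
    transcribe: Callable[[str], str],  # hypothetical STT call: audio path -> text
    audio_path: str,
    max_attempts: int = 3,   # assumed default
    backoff_s: float = 0.5,  # assumed default
) -> Tuple[str, float]:
    """Call `transcribe`, retrying on failure.

    Returns (text, latency in seconds of the successful attempt).
    """
    last_error = None
    for attempt in range(1, max_attempts + 1):
        start = time.perf_counter()
        try:
            text = transcribe(audio_path)
            return text, time.perf_counter() - start
        except Exception as exc:  # a real client would catch narrower error types
            last_error = exc
            if attempt < max_attempts:
                time.sleep(backoff_s * attempt)  # linear backoff between attempts
    raise RuntimeError(f"STT failed after {max_attempts} attempts") from last_error


def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        return 0.0 if not hyp else 1.0
    # prev[j] = edit distance between the reference words seen so far and hyp[:j]
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1] / len(ref)
```

Running each speech sample through `transcribe_with_retry` and comparing the result against a hand-written reference transcript with `word_error_rate` would give one latency figure and one accuracy figure per sample.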

#1

Updated by Anonymous 3 months ago

  • Status changed from Re-opened to New

#2

Updated by Anonymous 3 months ago

  • Status changed from New to In Progress

#3

Updated by Anonymous 3 months ago

  • Status changed from In Progress to Resolved

#4

Updated by Anonymous 3 months ago

Unreal Engine handled the microphone recording automatically through the built-in Audio Capture system. My work focused on ensuring that the recorded audio was correctly exported so it could be used by the backend Speech-to-Text (Whisper/OpenAI) service.

What was implemented:

  • Configured and tested UE5’s automatic microphone recording pipeline.
  • Verified that UE5 correctly exported .wav files suitable for STT processing.
  • Prepared the logic to send the recorded audio file to the backend, which handles transcription using Whisper/OpenAI.
  • Confirmed that Unreal can detect when recording stops and that the audio file is ready for backend processing.
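One way the exported .wav files could be sanity-checked before they are handed to the backend is sketched below in Python, using only the standard-library `wave` module. The function name `check_wav_for_stt` and the minimum-duration threshold are assumptions for illustration; this ticket does not state which checks were actually run or what format the STT service requires.

```python
import wave


def check_wav_for_stt(path: str, min_duration_s: float = 0.1) -> dict:
    """Read the WAV header and report basic properties plus any problems found.

    `min_duration_s` is an assumed threshold, not a requirement from the STT service.
    """
    # wave.open raises wave.Error if the file is not a readable PCM WAV file.
    with wave.open(path, "rb") as wav:
        channels = wav.getnchannels()
        sample_rate = wav.getframerate()
        sample_width_bytes = wav.getsampwidth()
        frames = wav.getnframes()

    duration_s = frames / sample_rate if sample_rate else 0.0

    problems = []
    if frames == 0:
        problems.append("file contains no audio frames")
    elif duration_s < min_duration_s:
        problems.append(f"recording shorter than {min_duration_s} s")

    return {
        "channels": channels,
        "sample_rate": sample_rate,
        "sample_width_bytes": sample_width_bytes,
        "duration_s": duration_s,
        "problems": problems,
    }
```

An empty `problems` list only means the file is a readable, non-empty PCM WAV; whether its sample rate and channel count suit the STT service still has to be checked against that service's requirements.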

What is not implemented yet:

  • Unreal Engine does not call Whisper/OpenAI directly.
  • STT transcription currently happens outside UE, on the backend.
  • JSON handling and display of the returned transcript in UE were not part of this step.

This completes the Unreal-side preparation required for the backend STT pipeline.
