Real-Time AI Voice with Azure OpenAI GPT-4o & Node.js (TypeScript)
Microsoft recently added support for real-time audio output in Azure OpenAI via the gpt-4o-mini-realtime-preview model.
In this tutorial, you'll learn how to call it from a Node.js TypeScript app, send a prompt, and play back the AI-generated speech in real time.
🛠️ Prerequisites
- Node.js v18+ (I'm using v22.15.0)
- TypeScript installed globally
- Azure OpenAI resource with gpt-4o-mini-realtime-preview deployed
- .env file with your Azure OpenAI connection settings (see Environment Config below)
📦 Setup
- Project Initialization
Your package.json file might look something like this:
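The listing that belonged here did not survive, so what follows is only a plausible minimal sketch, not the original file. The project name, the script names, the dist/index.js output path, and the version ranges are all assumptions; add whichever client and audio-output packages your own code imports.

```json
{
  "name": "realtime-voice-demo",
  "version": "1.0.0",
  "private": true,
  "scripts": {
    "build": "tsc",
    "start": "node dist/index.js"
  },
  "dependencies": {
    "dotenv": "^16.4.5"
  },
  "devDependencies": {
    "@types/node": "^22.0.0",
    "typescript": "^5.4.0"
  }
}
```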
- Environment Config
Create a .env file:
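The exact keys from the original .env are not shown here, so the names below are assumptions (they follow the naming Microsoft's quickstarts tend to use). The sketch shows one possible file layout in a comment, plus a small hypothetical `readConfig` helper that fails fast when a setting is missing instead of letting the connection fail later with a less helpful error.

```typescript
// Hypothetical .env contents (names and values are placeholders):
//   AZURE_OPENAI_ENDPOINT=https://<your-resource>.openai.azure.com
//   AZURE_OPENAI_API_KEY=<your-key>
//   AZURE_OPENAI_DEPLOYMENT_NAME=gpt-4o-mini-realtime-preview
//   OPENAI_API_VERSION=<a preview API version enabled for your resource>

type Env = Record<string, string | undefined>;

// Illustrative config shape; not taken from the original post.
interface RealtimeConfig {
  endpoint: string;
  apiKey: string;
  deployment: string;
  apiVersion: string;
}

// Read the four settings from an env-like object, throwing on the first
// one that is missing or blank.
function readConfig(env: Env): RealtimeConfig {
  const need = (name: string): string => {
    const value = env[name]?.trim();
    if (!value) throw new Error(`Missing ${name} in .env`);
    return value;
  };
  return {
    endpoint: need("AZURE_OPENAI_ENDPOINT"),
    apiKey: need("AZURE_OPENAI_API_KEY"),
    deployment: need("AZURE_OPENAI_DEPLOYMENT_NAME"),
    apiVersion: need("OPENAI_API_VERSION"),
  };
}
```

In the app you would call `readConfig(process.env)` after the file has been loaded, for instance with `import "dotenv/config"` or with Node's `--env-file=.env` flag (available from Node 20.6).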
🎧 TypeScript Code
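The original listing is not reproduced here, so treat the following as a minimal sketch of the flow rather than the exact code from this tutorial. It assumes the preview Realtime API conventions as I understand them: a WebSocket at /openai/realtime with api-version, deployment, and api-key query parameters, client events session.update, conversation.item.create, and response.create, and server events response.audio.delta (base64 PCM16 audio) and response.done. It also assumes Node 22+, where WebSocket is a global. `RealtimeConfig`, `AudioSink`, and every function name are hypothetical.

```typescript
import { Buffer } from "node:buffer";

// Illustrative config shape; field names are assumptions.
interface RealtimeConfig {
  endpoint: string; // e.g. "https://my-resource.openai.azure.com"
  apiKey: string;
  deployment: string; // e.g. "gpt-4o-mini-realtime-preview"
  apiVersion: string; // a preview API version enabled for your resource
}

// Anything with write(), such as a Node Writable or an audio-output library.
interface AudioSink {
  write(chunk: Buffer): unknown;
}

// Build the wss:// URL. The key travels in the query string because the
// global WebSocket cannot set custom headers.
function buildRealtimeUrl(cfg: RealtimeConfig): string {
  const host = new URL(cfg.endpoint).host;
  const url = new URL(`wss://${host}/openai/realtime`);
  url.searchParams.set("api-version", cfg.apiVersion);
  url.searchParams.set("deployment", cfg.deployment);
  url.searchParams.set("api-key", cfg.apiKey);
  return url.toString();
}

// Client events that add one user message and request a spoken response.
function promptEvents(prompt: string): object[] {
  return [
    { type: "session.update", session: { modalities: ["text", "audio"] } },
    {
      type: "conversation.item.create",
      item: { type: "message", role: "user", content: [{ type: "input_text", text: prompt }] },
    },
    { type: "response.create" },
  ];
}

// Handle one server event. Audio deltas are decoded and written to the sink;
// the return value is true once the response has finished.
function handleServerEvent(raw: string, sink: AudioSink): boolean {
  const event = JSON.parse(raw) as { type: string; delta?: string; error?: { message?: string } };
  if (event.type === "response.audio.delta" && event.delta) {
    sink.write(Buffer.from(event.delta, "base64")); // raw PCM16 chunk
    return false;
  }
  if (event.type === "error") throw new Error(event.error?.message ?? "realtime error");
  return event.type === "response.done";
}

// Wire the pieces to a live socket; resolves when the response is complete.
function speak(cfg: RealtimeConfig, prompt: string, sink: AudioSink): Promise<void> {
  return new Promise((resolve, reject) => {
    const ws = new WebSocket(buildRealtimeUrl(cfg));
    ws.addEventListener("open", () => {
      for (const event of promptEvents(prompt)) ws.send(JSON.stringify(event));
    });
    ws.addEventListener("message", (msg) => {
      try {
        if (handleServerEvent(String(msg.data), sink)) {
          ws.close();
          resolve();
        }
      } catch (err) {
        ws.close();
        reject(err);
      }
    });
    ws.addEventListener("error", () => reject(new Error("WebSocket connection failed")));
  });
}
```

A call such as `speak(config, "Tell me a short joke", sink)` streams audio into the sink as it arrives. As far as I know the default output is 16-bit mono PCM at 24 kHz, so a sink like the `speaker` npm package would be created with `{ channels: 1, bitDepth: 16, sampleRate: 24000 }`; a file write stream also works if you only want to capture the raw audio.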
▶️ Run Your App
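Compile the project and run the emitted JavaScript (for example, tsc followed by node on the compiled entry file). Your speaker will then play back the AI’s voice response in real time 🎧.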
📌 Final Thoughts
Once you've built this application, read through the Realtime API reference and try extending it with more features (maybe a 'Stop' button to stop the audio response?).
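As a starting point for the 'Stop' idea, here is a rough sketch of one possible design rather than a finished feature. It gates audio chunks locally so nothing more reaches the speaker after stop, and it sends a response.cancel client event, which to my knowledge is how the Realtime API is asked to stop generating. `StoppablePlayback` and `AudioSink` are hypothetical names, and `send` stands in for whatever function writes an event to your WebSocket.

```typescript
import { Buffer } from "node:buffer";

// Anything with write(), such as a Node Writable or an audio-output library.
interface AudioSink {
  write(chunk: Buffer): unknown;
}

// Hypothetical helper: forwards audio until stop() is called, then drops
// every later chunk and asks the service to cancel the response.
class StoppablePlayback {
  private stopped = false;
  private readonly sink: AudioSink;
  private readonly send: (event: object) => void;

  constructor(sink: AudioSink, send: (event: object) => void) {
    this.sink = sink;
    this.send = send;
  }

  // Forward a decoded chunk; returns false once playback has been stopped.
  push(chunk: Buffer): boolean {
    if (this.stopped) return false;
    this.sink.write(chunk);
    return true;
  }

  // Idempotent: only the first call sends the cancel event.
  stop(): void {
    if (this.stopped) return;
    this.stopped = true;
    this.send({ type: "response.cancel" });
  }
}
```

In a command-line app, stop() could be wired to a keypress or to SIGINT instead of a button. Audio the output device has already buffered may still play for a moment, depending on the library you use for playback.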