Real-Time AI Voice with Azure OpenAI GPT-4o & Node.js (TypeScript)
Microsoft recently added support for real-time audio output via the gpt-4o-mini-realtime-preview model. In this tutorial, you'll learn how to call it from a Node.js TypeScript app, send a prompt, and play back the AI-generated speech in real time.
Prerequisites
- Node.js v18+ (I'm using v22.15.0)
- TypeScript installed globally
- Azure OpenAI resource with gpt-4o-mini-realtime-preview deployed
- .env file with your Azure OpenAI settings (see Environment Config)
Setup
Project Initialization
Your package.json file might look something like this:
Environment Config
Create a .env file in the project root that holds your Azure OpenAI connection settings.
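One way to catch a missing setting early is to validate the environment at startup. The sketch below is a minimal example rather than the tutorial's original code: the variable names AZURE_OPENAI_ENDPOINT, AZURE_OPENAI_API_KEY, and AZURE_OPENAI_DEPLOYMENT and the loadConfig helper are assumptions for illustration, so rename them to match your own .env file.

```typescript
// Names of the settings this sketch expects. These variable names are
// assumptions for illustration; match them to your own .env file.
const REQUIRED_KEYS = [
  "AZURE_OPENAI_ENDPOINT",
  "AZURE_OPENAI_API_KEY",
  "AZURE_OPENAI_DEPLOYMENT",
] as const;

type RequiredKey = (typeof REQUIRED_KEYS)[number];
type RealtimeConfig = Record<RequiredKey, string>;

// Validate an environment-like object and return only the settings we need.
// Throws a single error that lists every missing or empty variable.
function loadConfig(env: Record<string, string | undefined>): RealtimeConfig {
  const missing = REQUIRED_KEYS.filter((key) => !env[key]?.trim());
  if (missing.length > 0) {
    throw new Error(`Missing environment variables: ${missing.join(", ")}`);
  }
  const config = {} as RealtimeConfig;
  for (const key of REQUIRED_KEYS) {
    config[key] = env[key]!.trim();
  }
  return config;
}
```

In the app you would call loadConfig(process.env) once the .env file has been loaded, for example with the dotenv package if that is what your project uses.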
TypeScript Code
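As a rough sketch of the pieces such a client needs (not the original listing), the helpers below build the WebSocket URL for a realtime deployment and the two client events that submit a text prompt and request a spoken reply. The helper names are hypothetical, and the default api-version value is an assumption; check the Azure OpenAI reference for the version your resource supports.

```typescript
// Build the WebSocket URL for an Azure OpenAI realtime deployment.
// The api-version default is an assumption; use the one your resource supports.
function buildRealtimeUrl(
  endpoint: string,
  deployment: string,
  apiVersion: string = "2024-10-01-preview",
): string {
  // Accept "https://name.openai.azure.com/" and keep only the host part.
  const host = endpoint.replace(/^https?:\/\//, "").replace(/\/+$/, "");
  return (
    `wss://${host}/openai/realtime` +
    `?api-version=${encodeURIComponent(apiVersion)}` +
    `&deployment=${encodeURIComponent(deployment)}`
  );
}

// Client events that add a user text message to the conversation
// and then ask the model to respond with audio (plus a text transcript).
function buildPromptEvents(prompt: string): object[] {
  return [
    {
      type: "conversation.item.create",
      item: {
        type: "message",
        role: "user",
        content: [{ type: "input_text", text: prompt }],
      },
    },
    { type: "response.create", response: { modalities: ["audio", "text"] } },
  ];
}
```

Each event would be sent as JSON.stringify(event) over the open socket. Keeping these helpers free of network code makes them easy to unit-test before wiring in a WebSocket client.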
Run Your App
Your speaker will now play back the AI’s voice response in real time.
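Real-time playback works because the service streams the reply as many small audio chunks rather than one finished file. The sketch below shows one way to handle that stream, assuming each response.audio.delta event carries base64-encoded PCM audio in its delta field. PcmSink is a hypothetical interface standing in for whatever audio output your app writes to, and handleServerEvent is an illustrative name.

```typescript
// Anything that accepts raw PCM bytes, such as a speaker stream.
// This interface is hypothetical and only exists for the sketch.
interface PcmSink {
  write(chunk: Uint8Array): void;
}

// Minimal shape of the server events this sketch cares about.
interface ServerEvent {
  type: string;
  delta?: string; // base64-encoded audio on "response.audio.delta"
}

// Forward one server event to the sink as soon as it arrives.
// Returns true once the service reports that the response is finished.
function handleServerEvent(event: ServerEvent, sink: PcmSink): boolean {
  if (event.type === "response.audio.delta" && event.delta) {
    // Decode the base64 payload into raw bytes and play it immediately.
    sink.write(Buffer.from(event.delta, "base64"));
  }
  return event.type === "response.done";
}
```

Writing each chunk as it arrives, instead of buffering the whole reply, is what keeps the delay between the prompt and the first audible word short.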
Final Thoughts
Once you've built this application, read through the API reference and try extending it with more features (maybe a 'Stop' button that cuts off the audio response?).
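A 'Stop' button needs to do two things: discard audio that has been received but not yet played, and tell the service to stop generating more. The sketch below is one possible starting point, assuming the realtime API's response.cancel client event. PlaybackController and its method names are hypothetical, and send stands for whatever function writes a string to your open socket.

```typescript
// Hypothetical helper that queues decoded audio and supports cancelling.
class PlaybackController {
  private queue: Uint8Array[] = [];
  private stopped = false;
  private readonly send: (message: string) => void;

  // `send` is whatever function writes a JSON string to the open socket.
  constructor(send: (message: string) => void) {
    this.send = send;
  }

  // Queue a decoded audio chunk unless the user has already pressed Stop.
  enqueue(chunk: Uint8Array): void {
    if (!this.stopped) this.queue.push(chunk);
  }

  // Return the next chunk to play, or undefined when nothing is queued.
  next(): Uint8Array | undefined {
    return this.queue.shift();
  }

  // Drop unplayed audio and ask the service to stop generating the response.
  stop(): void {
    this.stopped = true;
    this.queue = [];
    this.send(JSON.stringify({ type: "response.cancel" }));
  }
}
```

The stopped flag matters because a few audio chunks can still arrive after the cancel request is sent; ignoring them prevents a short burst of speech after the user presses Stop.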