
Real-Time AI Voice with Azure OpenAI GPT-4o & Node.js (TypeScript)

In this article, you'll build a real-time application using Microsoft Azure OpenAI's Realtime API for text and audio output.

Microsoft recently added support for real-time audio output via the gpt-4o-mini-realtime-preview model. In this tutorial, you'll learn how to use it from a Node.js TypeScript app, send a prompt, and play back the AI-generated speech in real-time.

Prerequisites

  • Node.js v18+ (I'm using v22.15.0)

  • TypeScript installed globally

  • Azure OpenAI resource with gpt-4o-mini-realtime-preview deployed

  • .env file with:

    AZURE_OPENAI_ENDPOINT=<Endpoint for your Azure OpenAI resource>
    AZURE_OPENAI_API_KEY=<Key to your API>
    AZURE_OPENAI_DEPLOYMENT_NAME=gpt-4o-mini-realtime-preview
    OPENAI_API_VERSION=<API version>

Setup

  1. Project Initialization

mkdir azure-gpt4o-audio && cd azure-gpt4o-audio
npm init -y
npm pkg set type=module
npm install openai @azure/identity dotenv speaker

Your package.json file might look something like this:
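
A sketch of the resulting package.json is below. The exact version numbers will depend on when you run npm install, and the start script is an assumption that matches the "Run Your App" section later on (it assumes a globally installed TypeScript compiler emitting to dist/):

```json
{
  "name": "azure-gpt4o-audio",
  "version": "1.0.0",
  "type": "module",
  "scripts": {
    "start": "tsc && node dist/index.js"
  },
  "dependencies": {
    "@azure/identity": "^4.0.0",
    "dotenv": "^16.0.0",
    "openai": "^4.0.0",
    "speaker": "^0.5.0"
  }
}
```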
  2. Environment Config

Create a .env file:

AZURE_OPENAI_ENDPOINT=<Endpoint for your Azure OpenAI resource>
AZURE_OPENAI_API_KEY=<Key to your API>
AZURE_OPENAI_DEPLOYMENT_NAME=gpt-4o-mini-realtime-preview
OPENAI_API_VERSION=<API version>

TypeScript Code


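The original code listing isn't included here, so the following is a minimal sketch of what index.ts could look like. It assumes the openai SDK's realtime beta surface (OpenAIRealtimeWS.azure plus session.update / conversation.item.create / response.create events) and the pcm16 output format played through the speaker package at 24 kHz, 16-bit mono; the prompt text and voice name are placeholders:

```typescript
// index.ts — minimal sketch: connect, send one prompt, stream the audio reply to the speaker.
import "dotenv/config";
import { AzureOpenAI } from "openai";
import { OpenAIRealtimeWS } from "openai/beta/realtime/ws";
import Speaker from "speaker";

// The realtime API's pcm16 format is 24 kHz, 16-bit, mono.
const speaker = new Speaker({ channels: 1, bitDepth: 16, sampleRate: 24000 });

const client = new AzureOpenAI({
  endpoint: process.env.AZURE_OPENAI_ENDPOINT,
  apiKey: process.env.AZURE_OPENAI_API_KEY,
  apiVersion: process.env.OPENAI_API_VERSION,
  deployment: process.env.AZURE_OPENAI_DEPLOYMENT_NAME,
});

// Open a realtime WebSocket session against the Azure deployment.
const rt = await OpenAIRealtimeWS.azure(client);

rt.socket.on("open", () => {
  // Ask for text + audio output, then send a prompt and request a response.
  rt.send({
    type: "session.update",
    session: { modalities: ["text", "audio"], voice: "alloy", output_audio_format: "pcm16" },
  });
  rt.send({
    type: "conversation.item.create",
    item: {
      type: "message",
      role: "user",
      content: [{ type: "input_text", text: "Tell me a short joke about TypeScript." }],
    },
  });
  rt.send({ type: "response.create" });
});

// Audio arrives as base64-encoded PCM chunks; decode and pipe them to the speaker.
rt.on("response.audio.delta", (event) => {
  speaker.write(Buffer.from(event.delta, "base64"));
});

rt.on("response.done", () => {
  speaker.end();
  rt.close();
});

rt.on("error", (err) => console.error(err));
```

The top-level await works because the project is configured as an ES module ("type": "module").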
Run Your App

You can run the application using the npm run start shorthand that I've got configured in my package.json. It transpiles the TypeScript into JavaScript and runs the output JS.

Your speaker will now play back the AI’s voice response in real time.

Final Thoughts

Once you've built this application, read through the API reference and try to build on this by adding more features (maybe a 'Stop' button to stop the audio response?).
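
As a starting point for that stop feature, the realtime protocol lets the client cancel an in-progress response. A hypothetical stop helper (the function name is mine; it assumes the rt session and speaker from your main code are passed in) might look like:

```typescript
import type { OpenAIRealtimeWS } from "openai/beta/realtime/ws";
import type Speaker from "speaker";

// Hypothetical helper: ask the server to stop generating and end local playback.
function stopResponse(rt: OpenAIRealtimeWS, speaker: Speaker): void {
  rt.send({ type: "response.cancel" }); // server stops generating the current response
  speaker.end(); // flush and stop local audio playback
}
```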
