OpenAI Realtime API

Overview

Introduction

The OpenAI Realtime API supports two connection methods:

  1. WebRTC - For real-time audio/video interaction in browsers and mobile clients

  2. WebSocket - For server-to-server application integration

Use Cases

  • Real-time voice conversations
  • Audio/video conferencing
  • Real-time translation
  • Speech transcription
  • Real-time code generation
  • Server-side real-time integration

Key Features

  • Bidirectional audio streaming
  • Mixed text and audio conversations
  • Function calling support
  • Automatic Voice Activity Detection (VAD)
  • Audio transcription capabilities
  • WebSocket server-side integration

Authentication & Security

Authentication Methods

  1. Standard API Key (server-side only)
  2. Ephemeral Token (client-side use)

Ephemeral Token

  • Validity: 1 minute
  • Usage limit: Single connection
  • Generation: Created via server-side API

POST https://88api.ai/v1/realtime/sessions
Content-Type: application/json
Authorization: Bearer $NEW_API_KEY

{
  "model": "gpt-4o-realtime-preview-2024-12-17",
  "voice": "verse"
}
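
The request above can also be assembled from server-side Python. The following is a minimal sketch using only the standard library; it builds the request without sending it. The helper names `build_session_request` and `extract_ephemeral_key` and the key `sk-example` are hypothetical, and the `client_secret.value` path is taken from the browser example later in this guide.

```python
import json
import urllib.request

# Endpoint taken from the request shown above.
SESSIONS_URL = "https://88api.ai/v1/realtime/sessions"


def build_session_request(api_key, model, voice):
    """Build (but do not send) the POST request that creates an ephemeral token."""
    body = json.dumps({"model": model, "voice": voice}).encode("utf-8")
    return urllib.request.Request(
        SESSIONS_URL,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )


def extract_ephemeral_key(session_json):
    """Read the token from a decoded response body.

    The client_secret.value path mirrors the browser example in this guide.
    """
    return session_json["client_secret"]["value"]


# "sk-example" is a placeholder, not a real key.
req = build_session_request(
    "sk-example", "gpt-4o-realtime-preview-2024-12-17", "verse"
)
print(req.get_method(), req.full_url)
```

To actually issue the call, pass `req` to `urllib.request.urlopen` on the server and forward the decoded JSON to the client. Because the token is valid for one minute and a single connection, request it immediately before the client connects.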

Security Recommendations

  • Never expose standard API keys on the client side
  • Use HTTPS/WSS for communication
  • Implement appropriate access controls
  • Monitor for unusual activity

Connection Establishment

WebRTC Connection

  • URL: https://88api.ai/v1/realtime
  • Query parameters: model
  • Headers:
    • Authorization: Bearer EPHEMERAL_KEY
    • Content-Type: application/sdp

WebSocket Connection

  • URL: wss://88api.ai/v1/realtime
  • Query parameters: model
  • Headers:
    • Authorization: Bearer YOUR_API_KEY
    • OpenAI-Beta: realtime=v1
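
As a rough illustration of the WebSocket parameters listed above, the sketch below assembles the connection URL and header lines in the form that the `websocket-client` library accepts. The function name `build_ws_connection` and the key `sk-example` are hypothetical.

```python
from urllib.parse import urlencode

# Base URL taken from the list above.
REALTIME_WS_URL = "wss://88api.ai/v1/realtime"


def build_ws_connection(api_key, model):
    """Return the (url, headers) pair for a server-side WebSocket connection."""
    # The model is passed as a query parameter.
    url = f"{REALTIME_WS_URL}?{urlencode({'model': model})}"
    headers = [
        f"Authorization: Bearer {api_key}",
        "OpenAI-Beta: realtime=v1",
    ]
    return url, headers


# "sk-example" is a placeholder, not a real key.
url, headers = build_ws_connection(
    "sk-example", "gpt-4o-realtime-preview-2024-12-17"
)
print(url)
```

Keeping URL and header construction in one function makes it harder to forget the `OpenAI-Beta` header when reconnecting.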

Connection Flow

sequenceDiagram
    participant Client
    participant Server
    participant OpenAI

    alt WebRTC Connection
        Client->>Server: Request ephemeral token
        Server->>OpenAI: Create session
        OpenAI-->>Server: Return ephemeral token
        Server-->>Client: Return ephemeral token

        Client->>OpenAI: Create WebRTC offer
        OpenAI-->>Client: Return answer

        Note over Client,OpenAI: Establish WebRTC connection

        Client->>OpenAI: Create data channel
        OpenAI-->>Client: Confirm data channel
    else WebSocket Connection
        Server->>OpenAI: Establish WebSocket connection
        OpenAI-->>Server: Confirm connection

        Note over Server,OpenAI: Begin real-time conversation
    end

Data Channel

  • Name: oai-events
  • Purpose: Event transmission
  • Format: JSON

Audio Stream

  • Input: addTrack()
  • Output: ontrack event

Conversation Interaction

Conversation Modes

  1. Text-only conversations
  2. Voice conversations
  3. Mixed conversations

Session Management

  • Create session
  • Update session
  • End session
  • Session configuration

Event Types

  • Text events
  • Audio events
  • Function calls
  • Status updates
  • Error events
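
Since every event is a JSON object with a `type` field, one common way to handle the categories above is a small dispatch table. The sketch below is one possible shape, not part of the API; the `EventRouter` class is hypothetical, and the event names are taken from the Event Reference in this guide.

```python
import json


class EventRouter:
    """Route decoded server events to handlers keyed by their `type` field."""

    def __init__(self):
        self._handlers = {}
        self.unhandled = []  # events with no registered handler

    def on(self, event_type, handler):
        """Register a handler for one event type."""
        self._handlers[event_type] = handler

    def dispatch(self, raw_message):
        """Parse a JSON message and call the matching handler, if any."""
        event = json.loads(raw_message)
        handler = self._handlers.get(event.get("type"))
        if handler is None:
            self.unhandled.append(event)
            return None
        return handler(event)


router = EventRouter()
router.on("response.text.delta", lambda e: e["delta"])
router.on("error", lambda e: f"error: {e['error']['message']}")

print(router.dispatch('{"type": "response.text.delta", "delta": "Hi"}'))
```

Collecting unhandled events rather than raising keeps the connection alive when the server sends event types the client does not yet know about.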

Configuration Options

Audio Configuration

  • Input formats
    • pcm16
    • g711_ulaw
    • g711_alaw
  • Output formats
    • pcm16
    • g711_ulaw
    • g711_alaw
  • Voice types
    • alloy
    • echo
    • shimmer

Model Configuration

  • Temperature
  • Maximum output length
  • System prompt
  • Tool configuration

VAD Configuration

  • Threshold
  • Silence duration
  • Prefix padding

Request Examples

WebRTC Connection

Client Implementation (Browser)

async function init() {
  // Get ephemeral key from server - see server code below
  const tokenResponse = await fetch('/session');
  const data = await tokenResponse.json();
  const EPHEMERAL_KEY = data.client_secret.value;

  // Create peer connection
  const pc = new RTCPeerConnection();

  // Set up remote audio from model playback
  const audioEl = document.createElement('audio');
  audioEl.autoplay = true;
  pc.ontrack = (e) => (audioEl.srcObject = e.streams[0]);

  // Add local audio track from browser microphone input
  const ms = await navigator.mediaDevices.getUserMedia({
    audio: true,
  });
  pc.addTrack(ms.getTracks()[0]);

  // Set up data channel for sending and receiving events
  const dc = pc.createDataChannel('oai-events');
  dc.addEventListener('message', (e) => {
    // Receive real-time server events here!
    console.log(e);
  });

  // Start session using Session Description Protocol (SDP)
  const offer = await pc.createOffer();
  await pc.setLocalDescription(offer);

  const baseUrl = 'https://88api.ai/v1/realtime';
  const model = 'gpt-4o-realtime-preview-2024-12-17';
  const sdpResponse = await fetch(`${baseUrl}?model=${model}`, {
    method: 'POST',
    body: offer.sdp,
    headers: {
      Authorization: `Bearer ${EPHEMERAL_KEY}`,
      'Content-Type': 'application/sdp',
    },
  });

  const answer = {
    type: 'answer',
    sdp: await sdpResponse.text(),
  };
  await pc.setRemoteDescription(answer);
}

init();

Server Implementation (Node.js)

import express from 'express';

const app = express();

// Create an endpoint for generating ephemeral tokens
// This endpoint works with the client code above
app.get('/session', async (req, res) => {
  const r = await fetch(
    'https://88api.ai/v1/realtime/sessions',
    {
      method: 'POST',
      headers: {
        Authorization: `Bearer ${process.env.NEW_API_KEY}`,
        'Content-Type': 'application/json',
      },
      body: JSON.stringify({
        model: 'gpt-4o-realtime-preview-2024-12-17',
        voice: 'verse',
      }),
    }
  );
  const data = await r.json();

  // Send the JSON received from OpenAI REST API back to client
  res.send(data);
});

app.listen(3000);

WebRTC Event Send/Receive Example

// Create data channel from peer connection
const dc = pc.createDataChannel('oai-events');

// Listen for server events on data channel
// Event data needs to be parsed from JSON string
dc.addEventListener('message', (e) => {
  const realtimeEvent = JSON.parse(e.data);
  console.log(realtimeEvent);
});

// Send client event: serialize valid client events to
// JSON and send via data channel
const responseCreate = {
  type: 'response.create',
  response: {
    modalities: ['text'],
    instructions: 'Write a haiku about code',
  },
};
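// Send only after the data channel has opened; calling send() earlier throws
dc.addEventListener('open', () => {
  dc.send(JSON.stringify(responseCreate));
});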

WebSocket Connection

Node.js (ws module)

import WebSocket from 'ws';

const url =
  'wss://88api.ai/v1/realtime?model=gpt-4o-realtime-preview-2024-12-17';
const ws = new WebSocket(url, {
  headers: {
    Authorization: 'Bearer ' + process.env.NEW_API_KEY,
    'OpenAI-Beta': 'realtime=v1',
  },
});

ws.on('open', function open() {
  console.log('Connected to server.');
});

ws.on('message', function incoming(message) {
  console.log(JSON.parse(message.toString()));
});

Python (websocket-client)

# Requires websocket-client library:
# pip install websocket-client

import os
import json
import websocket

# Fail fast with a KeyError if the key is not set
NEW_API_KEY = os.environ["NEW_API_KEY"]

url = "wss://88api.ai/v1/realtime?model=gpt-4o-realtime-preview-2024-12-17"
headers = [
    "Authorization: Bearer " + NEW_API_KEY,
    "OpenAI-Beta: realtime=v1"
]

def on_open(ws):
    print("Connected to server.")

def on_message(ws, message):
    data = json.loads(message)
    print("Received event:", json.dumps(data, indent=2))

ws = websocket.WebSocketApp(
    url,
    header=headers,
    on_open=on_open,
    on_message=on_message,
)

ws.run_forever()

Browser (Standard WebSocket)

/*
Note: In browsers and other client environments, WebRTC is recommended.
In browser-like runtimes such as Deno and Cloudflare Workers, you can
also use the standard WebSocket interface.
*/

const ws = new WebSocket(
  'wss://88api.ai/v1/realtime?model=gpt-4o-realtime-preview-2024-12-17',
  [
    'realtime',
    // Authentication
    'openai-insecure-api-key.' + NEW_API_KEY,
    // Optional
    'openai-organization.' + OPENAI_ORG_ID,
    'openai-project.' + OPENAI_PROJECT_ID,
    // Beta protocol, required
    'openai-beta.realtime-v1',
  ]
);

// The standard WebSocket interface has no ws.on(); use addEventListener
ws.addEventListener('open', () => {
  console.log('Connected to server.');
});

ws.addEventListener('message', (event) => {
  console.log(event.data);
});

Message Send/Receive Example

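Node.js/Browser

// Receive server events. addEventListener is available on both the
// browser WebSocket and the Node.js ws client, and exposes event.data.
ws.addEventListener('message', (event) => {
  // Event data arrives as a JSON string and must be parsed
  const serverEvent = JSON.parse(event.data);
  console.log(serverEvent);
});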

// Send events, create JSON data structure conforming to client event format
const event = {
  type: 'response.create',
  response: {
    modalities: ['audio', 'text'],
    instructions: 'Give me a haiku about code.',
  },
};
ws.send(JSON.stringify(event));

Python

# Send client events: serialize a dictionary to JSON
def on_open(ws):
    print("Connected to server.")

    event = {
        "type": "response.create",
        "response": {
            "modalities": ["text"],
            "instructions": "Please assist the user."
        }
    }
    ws.send(json.dumps(event))

# Receive messages: parse the message payload from JSON
def on_message(ws, message):
    data = json.loads(message)
    print("Received event:", json.dumps(data, indent=2))

Error Handling

Common Errors

  1. Connection errors
    • Network issues
    • Authentication failures
    • Configuration errors
  2. Audio errors
    • Device permissions
    • Unsupported formats
    • Codec issues
  3. Session errors
    • Token expiration
    • Session timeout
    • Concurrency limits

Error Recovery

  1. Automatic reconnection
  2. Session recovery
  3. Error retry
  4. Graceful degradation
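
Automatic reconnection and error retry are commonly implemented with capped exponential backoff and jitter. The sketch below shows that general technique; it is not something this API prescribes. The function names, the default delays, and the choice to retry only on `ConnectionError` are all assumptions.

```python
import random
import time


def backoff_delays(max_attempts, base=0.5, cap=30.0, rng=None):
    """Yield one delay in seconds per attempt.

    The ceiling doubles each attempt up to `cap`; the actual delay is drawn
    uniformly below the ceiling so many clients do not retry in lockstep.
    """
    rng = rng or random.Random()
    for attempt in range(max_attempts):
        ceiling = min(cap, base * (2 ** attempt))
        yield rng.uniform(0, ceiling)


def reconnect(connect, max_attempts=5, sleep=time.sleep, **backoff_kwargs):
    """Call `connect()` until it succeeds or the attempts run out."""
    last_error = None
    for delay in backoff_delays(max_attempts, **backoff_kwargs):
        try:
            return connect()
        except ConnectionError as exc:
            last_error = exc
            sleep(delay)  # wait before the next attempt
    raise last_error or ConnectionError("no connection attempts were made")


# Show a seeded delay schedule without sleeping.
print([round(d, 2) for d in backoff_delays(4, rng=random.Random(0))])
```

Note that an expired or already-used ephemeral token cannot be fixed by retrying: the client needs a fresh token from the server before it reconnects.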

Event Reference

Common Request Headers

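Headers are sent once, with the connection request, not with each event. The WebSocket connection request must include the following headers:

Header | Type | Description | Example Value
Authorization | String | Authentication token | Bearer $NEW_API_KEY
OpenAI-Beta | String | API version | realtime=v1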

Client Events

session.update

Update the default configuration for the session.

Parameter | Type | Required | Description | Example Value/Optional Values
event_id | String | No | Client-generated event identifier | event_123
type | String | No | Event type | session.update
modalities | String array | No | Modality types the model can respond with | ["text", "audio"]
instructions | String | No | System instructions prepended to model calls | "Your knowledge cutoff is 2023-10..."
voice | String | No | Voice type used by the model | alloy, echo, shimmer
input_audio_format | String | No | Input audio format | pcm16, g711_ulaw, g711_alaw
output_audio_format | String | No | Output audio format | pcm16, g711_ulaw, g711_alaw
input_audio_transcription.model | String | No | Model used for transcription | whisper-1
turn_detection.type | String | No | Voice detection type | server_vad
turn_detection.threshold | Number | No | VAD activation threshold (0.0-1.0) | 0.8
turn_detection.prefix_padding_ms | Integer | No | Audio duration included before speech starts | 500
turn_detection.silence_duration_ms | Integer | No | Silence duration to detect speech stop | 1000
tools | Array | No | List of tools available to the model | []
tool_choice | String | No | How the model chooses tools | auto/none/required
temperature | Number | No | Model sampling temperature | 0.8
max_output_tokens | String/Integer | No | Maximum tokens per response | "inf"/4096
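
To make the parameter list concrete, here is a sketch of a `session.update` payload built in Python. It rests on two assumptions that this page does not state: that dotted names such as `turn_detection.threshold` map to nested JSON objects, and that the configuration travels under a `session` key, as in OpenAI's published Realtime API. The helper name `build_session_update` is hypothetical.

```python
import json


def build_session_update(instructions, voice="alloy", threshold=0.5,
                         prefix_padding_ms=500, silence_duration_ms=1000):
    """Build a session.update event with server-side voice activity detection."""
    return {
        "type": "session.update",
        # Assumption: configuration fields are nested under "session".
        "session": {
            "modalities": ["text", "audio"],
            "instructions": instructions,
            "voice": voice,
            "input_audio_format": "pcm16",
            "output_audio_format": "pcm16",
            "input_audio_transcription": {"model": "whisper-1"},
            "turn_detection": {
                "type": "server_vad",
                "threshold": threshold,
                "prefix_padding_ms": prefix_padding_ms,
                "silence_duration_ms": silence_duration_ms,
            },
            "tool_choice": "auto",
            "temperature": 0.8,
        },
    }


event = build_session_update("Answer briefly.", threshold=0.8)
# The serialized string is what would be sent over the socket or data channel.
print(json.dumps(event, indent=2))
```

Send the serialized event with `ws.send(...)` or `dc.send(...)` and wait for the `session.updated` server event to confirm the new configuration.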

input_audio_buffer.append

Append audio data to the input audio buffer.

Parameter | Type | Required | Description | Example Value
event_id | String | No | Client-generated event identifier | event_456
type | String | No | Event type | input_audio_buffer.append
audio | String | No | Base64-encoded audio data | Base64EncodedAudioData
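
The `audio` field carries raw audio bytes encoded as Base64. The sketch below generates a short `pcm16` test tone and splits it into append events. The 24 kHz mono sample rate, the 4800-byte chunk size, and the helper names are assumptions for illustration; check the audio format your session actually uses.

```python
import base64
import math
import struct

SAMPLE_RATE = 24000  # assumed sample rate for pcm16, mono


def sine_pcm16(freq_hz, duration_s, sample_rate=SAMPLE_RATE):
    """Return a sine tone as little-endian 16-bit PCM bytes."""
    n = int(duration_s * sample_rate)
    samples = (
        int(32767 * 0.3 * math.sin(2 * math.pi * freq_hz * i / sample_rate))
        for i in range(n)
    )
    return struct.pack(f"<{n}h", *samples)


def append_events(pcm_bytes, chunk_bytes=4800):
    """Yield one input_audio_buffer.append event per chunk of audio."""
    for start in range(0, len(pcm_bytes), chunk_bytes):
        chunk = pcm_bytes[start:start + chunk_bytes]
        yield {
            "type": "input_audio_buffer.append",
            "audio": base64.b64encode(chunk).decode("ascii"),
        }


pcm = sine_pcm16(440, 0.25)
events = list(append_events(pcm))
print(len(pcm), "bytes in", len(events), "events")
```

At the assumed rate, 4800 bytes is 100 ms of audio. When server voice detection is not in use, follow the appended chunks with an `input_audio_buffer.commit` event.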

input_audio_buffer.commit

Commit the audio data in the buffer as a user message.

Parameter | Type | Required | Description | Example Value
event_id | String | No | Client-generated event identifier | event_789
type | String | No | Event type | input_audio_buffer.commit

input_audio_buffer.clear

Clear all audio data from the input audio buffer.

Parameter | Type | Required | Description | Example Value
event_id | String | No | Client-generated event identifier | event_012
type | String | No | Event type | input_audio_buffer.clear

conversation.item.create

Add a new conversation item to the conversation.

Parameter | Type | Required | Description | Example Value
event_id | String | No | Client-generated event identifier | event_345
type | String | No | Event type | conversation.item.create
previous_item_id | String | No | New item will be inserted after this ID | null
item.id | String | No | Unique identifier for the conversation item | msg_001
item.type | String | No | Type of conversation item | message/function_call/function_call_output
item.status | String | No | Status of conversation item | completed/in_progress/incomplete
item.role | String | No | Role of message sender | user/assistant/system
item.content | Array | No | Message content | [text/audio/transcript]
item.call_id | String | No | ID of function call | call_001
item.name | String | No | Name of called function | function_name
item.arguments | String | No | Arguments for function call | {"param": "value"}
item.output | String | No | Output result of function call | {"result": "value"}

conversation.item.truncate

Truncate audio content in assistant messages.

Parameter | Type | Required | Description | Example Value
event_id | String | No | Client-generated event identifier | event_678
type | String | No | Event type | conversation.item.truncate
item_id | String | No | ID of assistant message item to truncate | msg_002
content_index | Integer | No | Index of content part to truncate | 0
audio_end_ms | Integer | No | End time point for audio truncation | 1500
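
A typical use of truncation is an interruption: the user starts speaking while assistant audio is still playing, and the client reports how much audio was actually heard. The sketch below converts a count of played samples into `audio_end_ms`. The 24 kHz sample rate and the helper names are assumptions.

```python
def samples_to_ms(samples_played, sample_rate=24000):
    """Convert a count of played samples to whole milliseconds."""
    return samples_played * 1000 // sample_rate


def build_truncate_event(item_id, samples_played, content_index=0,
                         sample_rate=24000):
    """Build a conversation.item.truncate event for partially played audio."""
    return {
        "type": "conversation.item.truncate",
        "item_id": item_id,
        "content_index": content_index,
        "audio_end_ms": samples_to_ms(samples_played, sample_rate),
    }


# 36000 samples at 24 kHz is 1.5 seconds of audio.
event = build_truncate_event("msg_002", samples_played=36000)
print(event["audio_end_ms"])  # 1500
```

Integer division rounds down, so the reported position never exceeds what was actually played.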

conversation.item.delete

Delete the specified conversation item from conversation history.

Parameter | Type | Required | Description | Example Value
event_id | String | No | Client-generated event identifier | event_901
type | String | No | Event type | conversation.item.delete
item_id | String | No | ID of conversation item to delete | msg_003

response.create

Trigger response generation.

Parameter | Type | Required | Description | Example Value
event_id | String | No | Client-generated event identifier | event_234
type | String | No | Event type | response.create
response.modalities | String array | No | Modality types for response | ["text", "audio"]
response.instructions | String | No | Instructions for the model | "Please assist the user."
response.voice | String | No | Voice type used by the model | alloy/echo/shimmer
response.output_audio_format | String | No | Output audio format | pcm16
response.tools | Array | No | List of tools available to the model | ["type", "name", "description"]
response.tool_choice | String | No | How the model chooses tools | auto
response.temperature | Number | No | Sampling temperature | 0.7
response.max_output_tokens | Integer/String | No | Maximum output tokens | 150/"inf"

response.cancel

Cancel ongoing response generation.

Parameter | Type | Required | Description | Example Value
event_id | String | No | Client-generated event identifier | event_567
type | String | No | Event type | response.cancel

Server Events

error

Event returned when an error occurs.

Parameter | Type | Required | Description | Example Value
event_id | String | No | Unique identifier for server event | event_890
type | String | No | Event type | error
error.type | String | No | Error type | invalid_request_error/server_error
error.code | String | No | Error code | invalid_event
error.message | String | No | Human-readable error message | "The 'type' field is missing."
error.param | String | No | Parameter related to error | null
error.event_id | String | No | ID of related event | event_567

conversation.item.input_audio_transcription.completed

Returned when input audio transcription is enabled and transcription succeeds.

Parameter | Type | Required | Description | Example Value
event_id | String | No | Unique identifier for server event | event_2122
type | String | No | Event type | conversation.item.input_audio_transcription.completed
item_id | String | No | ID of user message item | msg_003
content_index | Integer | No | Index of content part containing audio | 0
transcript | String | No | Transcribed text content | "Hello, how are you?"

conversation.item.input_audio_transcription.failed

Returned when input audio transcription is configured but transcription request for user message fails.

Parameter | Type | Required | Description | Example Value
event_id | String | No | Unique identifier for server event | event_2324
type | String | No | Event type | conversation.item.input_audio_transcription.failed
item_id | String | No | ID of user message item | msg_003
content_index | Integer | No | Index of content part containing audio | 0
error.type | String | No | Error type | transcription_error
error.code | String | No | Error code | audio_unintelligible
error.message | String | No | Human-readable error message | "The audio could not be transcribed."
error.param | String | No | Parameter related to error | null

conversation.item.truncated

Returned when client truncates previous assistant audio message item.

Parameter | Type | Required | Description | Example Value
event_id | String | No | Unique identifier for server event | event_2526
type | String | No | Event type | conversation.item.truncated
item_id | String | No | ID of truncated assistant message item | msg_004
content_index | Integer | No | Index of truncated content part | 0
audio_end_ms | Integer | No | Time point when audio was truncated (milliseconds) | 1500

conversation.item.deleted

Returned when an item in the conversation is deleted.

Parameter | Type | Required | Description | Example Value
event_id | String | No | Unique identifier for server event | event_2728
type | String | No | Event type | conversation.item.deleted
item_id | String | No | ID of deleted conversation item | msg_005

input_audio_buffer.committed

Returned when audio buffer data is committed.

Parameter | Type | Required | Description | Example Value
event_id | String | No | Unique identifier for server event | event_1121
type | String | No | Event type | input_audio_buffer.committed
previous_item_id | String | No | New conversation item will be inserted after this ID | msg_001
item_id | String | No | ID of user message item to be created | msg_002

input_audio_buffer.cleared

Returned when client clears input audio buffer.

Parameter | Type | Required | Description | Example Value
event_id | String | No | Unique identifier for server event | event_1314
type | String | No | Event type | input_audio_buffer.cleared

input_audio_buffer.speech_started

In server voice detection mode, returned when voice input is detected.

Parameter | Type | Required | Description | Example Value
event_id | String | No | Unique identifier for server event | event_1516
type | String | No | Event type | input_audio_buffer.speech_started
audio_start_ms | Integer | No | Milliseconds from session start to voice detection | 1000
item_id | String | No | ID of user message item to be created when voice stops | msg_003

input_audio_buffer.speech_stopped

In server voice detection mode, returned when voice input stops.

Parameter | Type | Required | Description | Example Value
event_id | String | No | Unique identifier for server event | event_1718
type | String | No | Event type | input_audio_buffer.speech_stopped
audio_end_ms | Integer | No | Milliseconds from session start to voice stop detection | 2000
item_id | String | No | ID of user message item to be created | msg_003

response.created

Returned when a new response is created.

Parameter | Type | Required | Description | Example Value
event_id | String | No | Unique identifier for server event | event_2930
type | String | No | Event type | response.created
response.id | String | No | Unique identifier for response | resp_001
response.object | String | No | Object type | realtime.response
response.status | String | No | Status of response | in_progress
response.status_details | Object | No | Additional details about status | null
response.output | Array | No | List of output items generated by response | []
response.usage | Object | No | Usage statistics for response | null

response.done

Returned when response streaming is complete.

Parameter | Type | Required | Description | Example Value
event_id | String | No | Unique identifier for server event | event_3132
type | String | No | Event type | response.done
response.id | String | No | Unique identifier for response | resp_001
response.object | String | No | Object type | realtime.response
response.status | String | No | Final status of response | completed/cancelled/failed/incomplete
response.status_details | Object | No | Additional details about status | null
response.output | Array | No | List of output items generated by response | [...]
response.usage.total_tokens | Integer | No | Total tokens | 50
response.usage.input_tokens | Integer | No | Input tokens | 20
response.usage.output_tokens | Integer | No | Output tokens | 30

response.output_item.added

Returned when a new output item is created during response generation.

Parameter | Type | Required | Description | Example Value
event_id | String | No | Unique identifier for server event | event_3334
type | String | No | Event type | response.output_item.added
response_id | String | No | ID of response the output item belongs to | resp_001
output_index | Integer | No | Index of output item in response | 0
item.id | String | No | Unique identifier for output item | msg_007
item.object | String | No | Object type | realtime.item
item.type | String | No | Type of output item | message/function_call/function_call_output
item.status | String | No | Status of output item | in_progress/completed
item.role | String | No | Role associated with output item | assistant
item.content | Array | No | Content of output item | ["type", "text", "audio", "transcript"]

response.output_item.done

Returned when output item streaming is complete.

Parameter | Type | Required | Description | Example Value
event_id | String | No | Unique identifier for server event | event_3536
type | String | No | Event type | response.output_item.done
response_id | String | No | ID of response the output item belongs to | resp_001
output_index | Integer | No | Index of output item in response | 0
item.id | String | No | Unique identifier for output item | msg_007
item.object | String | No | Object type | realtime.item
item.type | String | No | Type of output item | message/function_call/function_call_output
item.status | String | No | Final status of output item | completed/incomplete
item.role | String | No | Role associated with output item | assistant
item.content | Array | No | Content of output item | ["type", "text", "audio", "transcript"]

response.content_part.added

Returned when a new content part is added to assistant message item during response generation.

Parameter | Type | Required | Description | Example Value
event_id | String | No | Unique identifier for server event | event_3738
type | String | No | Event type | response.content_part.added
response_id | String | No | ID of response | resp_001
item_id | String | No | ID of message item to add content part to | msg_007
output_index | Integer | No | Index of output item in response | 0
content_index | Integer | No | Index of content part in message item content array | 0
part.type | String | No | Content type | text/audio
part.text | String | No | Text content | "Hello"
part.audio | String | No | Base64-encoded audio data | "base64_encoded_audio_data"
part.transcript | String | No | Transcribed text of audio | "Hello"

response.content_part.done

Returned when content part in assistant message item streaming is complete.

Parameter | Type | Required | Description | Example Value
event_id | String | No | Unique identifier for server event | event_3940
type | String | No | Event type | response.content_part.done
response_id | String | No | ID of response | resp_001
item_id | String | No | ID of message item to add content part to | msg_007
output_index | Integer | No | Index of output item in response | 0
content_index | Integer | No | Index of content part in message item content array | 0
part.type | String | No | Content type | text/audio
part.text | String | No | Text content | "Hello"
part.audio | String | No | Base64-encoded audio data | "base64_encoded_audio_data"
part.transcript | String | No | Transcribed text of audio | "Hello"

response.text.delta

Returned when text value of "text" type content part is updated.

Parameter | Type | Required | Description | Example Value
event_id | String | No | Unique identifier for server event | event_4142
type | String | No | Event type | response.text.delta
response_id | String | No | ID of response | resp_001
item_id | String | No | ID of message item | msg_007
output_index | Integer | No | Index of output item in response | 0
content_index | Integer | No | Index of content part in message item content array | 0
delta | String | No | Text delta update content | "Sure, I can h"

response.text.done

Returned when "text" type content part text streaming is complete.

Parameter | Type | Required | Description | Example Value
event_id | String | No | Unique identifier for server event | event_4344
type | String | No | Event type | response.text.done
response_id | String | No | ID of response | resp_001
item_id | String | No | ID of message item | msg_007
output_index | Integer | No | Index of output item in response | 0
content_index | Integer | No | Index of content part in message item content array | 0
text | String | No | Final complete text content | "Sure, I can help with that."
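
Text arrives as a stream of deltas followed by one done event, so a client usually keeps a buffer per content part. The sketch below is one way to do that; the `TextAccumulator` class is hypothetical. It reads the final text from a `text` field on the done event, which follows OpenAI's Realtime API and is an assumption here, and falls back to the joined deltas if that field is absent.

```python
class TextAccumulator:
    """Collect response.text.delta events into complete strings."""

    def __init__(self):
        self._parts = {}     # (item_id, content_index) -> list of deltas
        self.completed = {}  # (item_id, content_index) -> final text

    def handle(self, event):
        """Process one decoded server event; other event types are ignored."""
        if event["type"] not in ("response.text.delta", "response.text.done"):
            return
        key = (event["item_id"], event["content_index"])
        if event["type"] == "response.text.delta":
            self._parts.setdefault(key, []).append(event["delta"])
        else:
            streamed = "".join(self._parts.pop(key, []))
            # Prefer the server's final text; fall back to what was streamed.
            self.completed[key] = event.get("text", streamed)


acc = TextAccumulator()
for delta in ["Sure, I can h", "elp with that."]:
    acc.handle({"type": "response.text.delta", "item_id": "msg_007",
                "content_index": 0, "delta": delta})
acc.handle({"type": "response.text.done", "item_id": "msg_007",
            "content_index": 0})
print(acc.completed[("msg_007", 0)])
```

Keying the buffer on both `item_id` and `content_index` keeps parts separate when a response contains more than one output item.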

response.audio_transcript.delta

Returned when transcription content of model-generated audio output is updated.

Parameter | Type | Required | Description | Example Value
event_id | String | No | Unique identifier for server event | event_4546
type | String | No | Event type | response.audio_transcript.delta
response_id | String | No | ID of response | resp_001
item_id | String | No | ID of message item | msg_008
output_index | Integer | No | Index of output item in response | 0
content_index | Integer | No | Index of content part in message item content array | 0
delta | String | No | Transcription text delta update content | "Hello, how can I a"

response.audio_transcript.done

Returned when transcription of model-generated audio output streaming is complete.

Parameter | Type | Required | Description | Example Value
event_id | String | No | Unique identifier for server event | event_4748
type | String | No | Event type | response.audio_transcript.done
response_id | String | No | ID of response | resp_001
item_id | String | No | ID of message item | msg_008
output_index | Integer | No | Index of output item in response | 0
content_index | Integer | No | Index of content part in message item content array | 0
transcript | String | No | Final complete transcribed text of audio | "Hello, how can I assist you today?"

response.audio.delta

Returned when model-generated audio content is updated.

Parameter | Type | Required | Description | Example Value
event_id | String | No | Unique identifier for server event | event_4950
type | String | No | Event type | response.audio.delta
response_id | String | No | ID of response | resp_001
item_id | String | No | ID of message item | msg_008
output_index | Integer | No | Index of output item in response | 0
content_index | Integer | No | Index of content part in message item content array | 0
delta | String | No | Base64-encoded audio data delta | "Base64EncodedAudioDelta"

response.audio.done

Returned when model-generated audio is complete.

Parameter | Type | Required | Description | Example Value
event_id | String | No | Unique identifier for server event | event_5152
type | String | No | Event type | response.audio.done
response_id | String | No | ID of response | resp_001
item_id | String | No | ID of message item | msg_008
output_index | Integer | No | Index of output item in response | 0
content_index | Integer | No | Index of content part in message item content array | 0
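
Over a WebSocket connection, model audio arrives as Base64 deltas that the client must decode and join (over WebRTC, audio is delivered through the `ontrack` event instead). The sketch below collects the deltas and wraps the result in a WAV container with the standard library. The `AudioCollector` class is hypothetical, and the output is assumed to be 16-bit mono `pcm16` at 24 kHz.

```python
import base64
import io
import wave


class AudioCollector:
    """Collect response.audio.delta events and export them as WAV bytes."""

    def __init__(self, sample_rate=24000):  # assumed output sample rate
        self.sample_rate = sample_rate
        self._chunks = []
        self.done = False

    def handle(self, event):
        """Process one decoded server event."""
        if event["type"] == "response.audio.delta":
            self._chunks.append(base64.b64decode(event["delta"]))
        elif event["type"] == "response.audio.done":
            self.done = True

    def pcm(self):
        """Return all audio received so far as raw PCM bytes."""
        return b"".join(self._chunks)

    def to_wav_bytes(self):
        """Wrap the PCM data in a WAV container (mono, 16-bit)."""
        buf = io.BytesIO()
        with wave.open(buf, "wb") as w:
            w.setnchannels(1)
            w.setsampwidth(2)
            w.setframerate(self.sample_rate)
            w.writeframes(self.pcm())
        return buf.getvalue()


collector = AudioCollector()
for raw in (b"\x01\x00\x02\x00", b"\x03\x00"):
    collector.handle({"type": "response.audio.delta",
                      "delta": base64.b64encode(raw).decode("ascii")})
collector.handle({"type": "response.audio.done"})
print(len(collector.to_wav_bytes()), "bytes of WAV data")
```

For low-latency playback, feed each decoded chunk to the audio device as it arrives rather than waiting for `response.audio.done`.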

Function Calling

response.function_call_arguments.delta

Returned when model-generated function call arguments are updated.

Parameter | Type | Required | Description | Example Value
event_id | String | No | Unique identifier for server event | event_5354
type | String | No | Event type | response.function_call_arguments.delta
response_id | String | No | ID of response | resp_002
item_id | String | No | ID of message item | fc_001
output_index | Integer | No | Index of output item in response | 0
call_id | String | No | ID of function call | call_001
delta | String | No | JSON format function call arguments delta | {"location": "San"}

response.function_call_arguments.done

Returned when model-generated function call arguments streaming is complete.

Parameter | Type | Required | Description | Example Value
event_id | String | No | Unique identifier for server event | event_5556
type | String | No | Event type | response.function_call_arguments.done
response_id | String | No | ID of response | resp_002
item_id | String | No | ID of message item | fc_001
output_index | Integer | No | Index of output item in response | 0
call_id | String | No | ID of function call | call_001
arguments | String | No | Final complete function call arguments (JSON format) | {"location": "San Francisco"}
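
Once the arguments are complete, the client runs the function itself and reports the result back as a `function_call_output` conversation item. The sketch below shows one possible shape for that round trip. The `get_weather` tool and its canned result are hypothetical, and the function name is assumed to have been read from the preceding function-call output item (`item.name`), since the table above lists only `call_id` and `arguments`.

```python
import json


def get_weather(location):
    """Hypothetical local tool that returns canned data for illustration."""
    return {"location": location, "forecast": "sunny"}


TOOLS = {"get_weather": get_weather}


def handle_arguments_done(event, function_name, tools=TOOLS):
    """Run the named tool and build the follow-up client events."""
    args = json.loads(event["arguments"])  # arguments arrive as a JSON string
    result = tools[function_name](**args)
    output_item = {
        "type": "conversation.item.create",
        "item": {
            "type": "function_call_output",
            "call_id": event["call_id"],
            "output": json.dumps(result),
        },
    }
    # Ask the model to continue now that the tool result is available.
    return [output_item, {"type": "response.create"}]


done_event = {
    "type": "response.function_call_arguments.done",
    "call_id": "call_001",
    "arguments": '{"location": "San Francisco"}',
}
for client_event in handle_arguments_done(done_event, "get_weather"):
    print(json.dumps(client_event))
```

Each returned event would be serialized and sent in order. The `call_id` ties the output back to the call that requested it.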

Other Status Updates

rate_limits.updated

Triggered after each "response.done" event to indicate updated rate limits.

Parameter | Type | Required | Description | Example Value
event_id | String | No | Unique identifier for server event | event_5758
type | String | No | Event type | rate_limits.updated
rate_limits | Object array | No | List of rate limit information | [{"name": "requests_per_min", "limit": 60, "remaining": 45, "reset_seconds": 35}]
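
A client can use this event to pace itself instead of waiting for a rejection. The sketch below computes how long to pause based on the fields shown in the example value; the function name and the throttling policy are assumptions.

```python
def seconds_until_allowed(rate_limits_event, needed=1):
    """Return how long to wait before sending, given the latest limits.

    Any limit with fewer than `needed` units remaining forces a wait until
    that limit resets; the longest such wait is returned.
    """
    wait = 0.0
    for limit in rate_limits_event["rate_limits"]:
        if limit["remaining"] < needed:
            wait = max(wait, float(limit["reset_seconds"]))
    return wait


event = {
    "type": "rate_limits.updated",
    "rate_limits": [
        {"name": "requests_per_min", "limit": 60,
         "remaining": 45, "reset_seconds": 35},
    ],
}
print(seconds_until_allowed(event))  # 0.0, capacity remains
```

With 45 requests remaining the function returns zero; once `remaining` drops to 0 it would return the 35-second reset time.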

conversation.created

Returned when conversation is created.

Parameter | Type | Required | Description | Example Value
event_id | String | No | Unique identifier for server event | event_9101
type | String | No | Event type | conversation.created
conversation | Object | No | Conversation resource object | {"id": "conv_001", "object": "realtime.conversation"}

conversation.item.created

Returned when conversation item is created.

Parameter | Type | Required | Description | Example Value
event_id | String | No | Unique identifier for server event | event_1920
type | String | No | Event type | conversation.item.created
previous_item_id | String | No | ID of previous conversation item | msg_002
item | Object | No | Conversation item object | {"id": "msg_003", "object": "realtime.item", "type": "message", "status": "completed", "role": "user", "content": [{"type": "text", "text": "Hello"}]}

session.created

Returned when session is created.

Parameter | Type | Required | Description | Example Value
event_id | String | No | Unique identifier for server event | event_1234
type | String | No | Event type | session.created
session | Object | No | Session object | {"id": "sess_001", "object": "realtime.session", "model": "gpt-4", "modalities": ["text", "audio"]}

session.updated

Returned when session is updated.

Parameter | Type | Required | Description | Example Value
event_id | String | No | Unique identifier for server event | event_5678
type | String | No | Event type | session.updated
session | Object | No | Updated session object | {"id": "sess_001", "object": "realtime.session", "model": "gpt-4", "modalities": ["text", "audio"]}

Rate Limit Event Parameter Table

Parameter | Type | Required | Description | Example Value
name | String | Yes | Limit name | requests_per_min
limit | Integer | Yes | Limit value | 60
remaining | Integer | Yes | Remaining available amount | 45
reset_seconds | Integer | Yes | Reset time (seconds) | 35

Function Call Parameter Table

Parameter | Type | Required | Description | Example Value
type | String | Yes | Function type | function
name | String | Yes | Function name | get_weather
description | String | No | Function description | Get the current weather
parameters | Object | Yes | Function parameter definition | {"type": "object", "properties": {...}}

Audio Format Parameter Table

Parameter | Type | Description | Optional Values
sample_rate | Integer | Sample rate | 8000, 16000, 24000, 44100, 48000
channels | Integer | Number of channels | 1 (mono), 2 (stereo)
bits_per_sample | Integer | Bits per sample | 16 (pcm16), 8 (g711)
encoding | String | Encoding method | pcm16, g711_ulaw, g711_alaw

Voice Detection Parameter Table

Parameter | Type | Description | Default Value | Range
threshold | Float | VAD activation threshold | 0.5 | 0.0-1.0
prefix_padding_ms | Integer | Voice prefix padding (milliseconds) | 500 | 0-5000
silence_duration_ms | Integer | Silence detection duration (milliseconds) | 1000 | 100-10000
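
Checking values on the client before sending them avoids a round trip that ends in an error event. The sketch below validates overrides against the table above, reading its rows as defaults of 0.5, 500, and 1000 with ranges 0.0-1.0, 0-5000, and 100-10000. That reading, and the helper name `make_turn_detection`, are assumptions.

```python
# Defaults and ranges as read from the table above (an interpretation).
VAD_DEFAULTS = {
    "threshold": 0.5,
    "prefix_padding_ms": 500,
    "silence_duration_ms": 1000,
}
VAD_RANGES = {
    "threshold": (0.0, 1.0),
    "prefix_padding_ms": (0, 5000),
    "silence_duration_ms": (100, 10000),
}


def make_turn_detection(**overrides):
    """Merge overrides into the defaults and reject out-of-range values."""
    unknown = set(overrides) - set(VAD_DEFAULTS)
    if unknown:
        raise KeyError(f"unknown VAD parameter(s): {sorted(unknown)}")
    config = {**VAD_DEFAULTS, **overrides}
    for name, value in config.items():
        low, high = VAD_RANGES[name]
        if not low <= value <= high:
            raise ValueError(f"{name}={value} is outside {low}-{high}")
    return {"type": "server_vad", **config}


print(make_turn_detection(threshold=0.8, silence_duration_ms=700))
```

A higher threshold makes detection less sensitive to background noise, while a shorter silence duration ends the user's turn sooner.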

Tool Selection Parameter Table

Parameter | Type | Description | Optional Values
tool_choice | String | Tool selection method | auto, none, required
tools | Array | Available tools list | [{type, name, description, parameters}]

Model Configuration Parameter Table

Parameter | Type | Description | Range/Optional Values | Default Value
temperature | Float | Sampling temperature | 0.0-2.0 | 1.0
max_output_tokens | Integer/String | Maximum output length | 1-4096/"inf" | "inf"
modalities | String array | Response modalities | ["text", "audio"] | ["text"]
voice | String | Voice type | alloy, echo, shimmer | alloy

Event Common Parameter Table

Parameter | Type | Required | Description | Example Value
event_id | String | Yes | Unique identifier for event | event_123
type | String | Yes | Event type | session.update
timestamp | Integer | No | Event timestamp (milliseconds) | 1677649363000

Session Status Parameter Table

Parameter | Type | Description | Optional Values
status | String | Session status | active, ended, error
error | Object | Error information | {"type": "error_type", "message": "error message"}
metadata | Object | Session metadata | {"client_id": "web", "session_type": "chat"}

Conversation Item Status Parameter Table

Parameter | Type | Description | Optional Values
status | String | Conversation item status | completed, in_progress, incomplete
role | String | Sender role | user, assistant, system
type | String | Conversation item type | message, function_call, function_call_output

Content Type Parameter Table

Parameter | Type | Description | Optional Values
type | String | Content type | text, audio, transcript
format | String | Content format | plain, markdown, html
encoding | String | Encoding method | utf-8, base64

Response Status Parameter Table

Parameter | Type | Description | Optional Values
status | String | Response status | completed, cancelled, failed, incomplete
status_details | Object | Status details | {"reason": "user_cancelled"}
usage | Object | Usage statistics | {"total_tokens": 50, "input_tokens": 20, "output_tokens": 30}

Audio Transcription Parameter Table

Parameter | Type | Description | Example Value
enabled | Boolean | Whether transcription is enabled | true
model | String | Transcription model | whisper-1
language | String | Transcription language | en, zh, auto
prompt | String | Transcription prompt | "Transcript of a conversation"

Audio Stream Parameter Table

Parameter | Type | Description | Optional Values
chunk_size | Integer | Audio chunk size (bytes) | 1024, 2048, 4096
latency | String | Latency mode | low, balanced, high
compression | String | Compression method | none, opus, mp3

WebRTC Configuration Parameter Table

Parameter | Type | Description | Default Value
ice_servers | Array | ICE server list | [{"urls": "stun:stun.l.google.com:19302"}]
audio_constraints | Object | Audio constraints | {"echoCancellation": true}
connection_timeout | Integer | Connection timeout (milliseconds) | 30000
