4.4 KiB
4.4 KiB
Real-Time Event Reference
Complete reference for events in real-time speech-to-text streaming.
Sent Events
input_audio_chunk
Send audio data for transcription.
{
"message_type": "input_audio_chunk",
"audio_base_64": "<base64-encoded-pcm-audio>",
"sample_rate": 16000
}
| Field | Type | Description |
|---|---|---|
message_type |
string | Always "input_audio_chunk" |
audio_base_64 |
string | Base64-encoded PCM audio data |
sample_rate |
number | Sample rate in Hz (8000-48000) |
previous_text |
string | Optional context (first chunk only, max 50 chars) |
commit
Finalize the current transcript segment.
{
"message_type": "commit"
}
Received Events
session_started
Connection established successfully.
{
"type": "session_started",
"session_id": "abc123",
"model_id": "scribe_v2_realtime"
}
partial_transcript
Interim transcription results, updates frequently as audio is processed.
{
"type": "partial_transcript",
"text": "Hello, how are"
}
| Field | Type | Description |
|---|---|---|
type |
string | "partial_transcript" |
text |
string | Current partial transcription |
committed_transcript
Final transcription after commit.
{
"type": "committed_transcript",
"text": "Hello, how are you today?"
}
| Field | Type | Description |
|---|---|---|
type |
string | "committed_transcript" |
text |
string | Finalized transcription |
committed_transcript_with_timestamps
Final transcription with word-level timing. Sent after committed_transcript when include_timestamps=true.
{
"type": "committed_transcript_with_timestamps",
"words": [
{"text": "Hello", "start": 0.0, "end": 0.32, "type": "word"},
{"text": ",", "start": 0.32, "end": 0.35, "type": "punctuation"},
{"text": " ", "start": 0.35, "end": 0.40, "type": "spacing"},
{"text": "how", "start": 0.40, "end": 0.55, "type": "word"}
]
}
| Field | Type | Description |
|---|---|---|
type |
string | "committed_transcript_with_timestamps" |
words |
array | Word-level timing data |
words[].text |
string | The word or token |
words[].start |
number | Start time in seconds |
words[].end |
number | End time in seconds |
words[].type |
string | "word", "spacing", "punctuation", "audio_event" |
Error Events
error
Sent when an error occurs.
{
"type": "error",
"code": "invalid_audio",
"message": "Audio format not supported"
}
Error Codes
| Code | Description |
|---|---|
authentication_failed |
Invalid API key or token |
quota_exceeded |
Usage limit reached |
invalid_audio |
Unsupported audio format |
rate_limited |
Too many requests |
session_time_limit_exceeded |
Session exceeded max duration |
unaccepted_terms |
Terms not accepted in dashboard |
resource_exhausted |
Server capacity reached |
transcription_error |
Internal processing error |
Connection Events
open
WebSocket connection established.
close
WebSocket connection closed.
{
"type": "close",
"code": 1000,
"reason": "Normal closure"
}
Event Handling Examples
Python
async for event in connection:
if event.type == "session_started":
print(f"Session: {event.session_id}")
elif event.type == "partial_transcript":
print(f"Partial: {event.text}")
elif event.type == "committed_transcript":
print(f"Final: {event.text}")
elif event.type == "committed_transcript_with_timestamps":
for word in event.words:
print(f" {word.text}: {word.start}s - {word.end}s")
elif event.type == "error":
print(f"Error: {event.code} - {event.message}")
JavaScript
connection.on("session_started", (data) => {
console.log("Session:", data.sessionId);
});
connection.on("partial_transcript", (data) => {
console.log("Partial:", data.text);
});
connection.on("committed_transcript", (data) => {
console.log("Final:", data.text);
});
connection.on("committed_transcript_with_timestamps", (data) => {
for (const word of data.words) {
console.log(` ${word.text}: ${word.start}s - ${word.end}s`);
}
});
connection.on("error", (error) => {
console.error("Error:", error.code, error.message);
});