Setup
Install the package:
Credentials
Get your Soniox API key from the Soniox Console and set it as the SONIOX_API_KEY environment variable.
Usage
Basic transcription
Transcribe an audio file with SonioxDocumentLoader, then generate a summary of the transcript with an LLM.
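A minimal sketch of this flow, written against duck-typed objects so that it runs without the package or network access. It assumes only that the loader exposes the standard LangChain load() method and that each returned document carries the transcript in page_content. The summarize callable is a hypothetical stand-in for your LLM call, and the prompt wording is illustrative.

```python
from typing import Any, Callable


def transcribe_and_summarize(loader: Any, summarize: Callable[[str], str]) -> str:
    """Load the transcript from `loader` and hand it to `summarize`.

    `loader` is expected to behave like a LangChain document loader:
    load() returns a list of documents with a `page_content` attribute.
    `summarize` is a placeholder for an LLM call (prompt in, text out).
    """
    docs = loader.load()
    if not docs:
        raise ValueError("loader returned no documents")
    # Join in case the loader ever returns more than one document.
    transcript = "\n".join(doc.page_content for doc in docs)
    prompt = f"Summarize the following transcript:\n\n{transcript}"
    return summarize(prompt)
```

With the real package installed, loader would be a SonioxDocumentLoader built with one of the documented source parameters (for example file_path), and summarize would wrap your chat model's invocation.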
Async transcription
For async operations, use aload() or alazy_load():
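As a hedged illustration, the helper below drains alazy_load() with async for. It assumes the loader follows the usual LangChain convention in which alazy_load() is an async iterator of documents; the helper names are hypothetical.

```python
import asyncio
from typing import Any, List


async def collect_documents(loader: Any) -> List[Any]:
    """Gather every document yielded by the loader's alazy_load()."""
    docs = []
    async for doc in loader.alazy_load():
        docs.append(doc)
    return docs


def collect_documents_sync(loader: Any) -> List[Any]:
    """Convenience wrapper for scripts that have no running event loop."""
    return asyncio.run(collect_documents(loader))
```

Inside an existing event loop (a web handler, a notebook), await collect_documents(loader) directly, since asyncio.run() cannot be called from a running loop.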
Advanced usage
Language hints
Soniox automatically detects and transcribes speech in 60+ languages. When you know which languages are likely to appear in your audio, provide language_hints to improve accuracy by biasing recognition toward those languages.
Language hints do not restrict recognition — they only bias the model toward the specified languages, while still allowing other languages to be detected if present.
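The sketch below prepares the two hint-related options as plain keyword arguments. The option names language_hints and language_hints_strict come from the transcription options table; the helper itself is hypothetical, and lower-casing the codes is an assumption based on the lowercase examples ("en", "fr") used elsewhere on this page.

```python
from typing import Dict, Iterable, List


def language_hint_options(hints: Iterable[str], strict: bool = False) -> Dict[str, object]:
    """Build keyword arguments for the language-hint options.

    Codes are stripped, lower-cased, and de-duplicated while keeping
    their original order.
    """
    cleaned: List[str] = []
    for code in hints:
        code = code.strip().lower()
        if code and code not in cleaned:
            cleaned.append(code)
    if not cleaned:
        raise ValueError("at least one language hint is required")
    return {"language_hints": cleaned, "language_hints_strict": strict}
```

The resulting dictionary could then be unpacked into the options object, assuming SonioxTranscriptionOptions accepts keyword arguments named after the documented parameters.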
Speaker diarization
Enable speaker identification with enable_speaker_diarization to distinguish between different speakers.
Language identification
Enable automatic language detection and identification with enable_language_identification.
Context for improved accuracy
Provide domain-specific context to improve transcription accuracy. Context helps the model understand your domain, recognize important terms, and apply custom vocabulary. The context object supports four optional sections:
Translation
Translate from any detected language to a target language.
Two-way translation is available through the two_way translation type. See the Soniox translation documentation for details.
API reference
Constructor parameters
| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| file_path | str | No* | None | Path to local audio file to transcribe |
| file_data | bytes | No* | None | Binary data of audio file to transcribe |
| file_url | str | No* | None | URL of audio file to transcribe |
| api_key | str | No | SONIOX_API_KEY env var | Soniox API key |
| base_url | str | No | https://api.soniox.com/v1 | API base URL (see regional endpoints) |
| options | SonioxTranscriptionOptions | No | SonioxTranscriptionOptions() | Transcription options |
| polling_interval_seconds | float | No | 1.0 | Time between status polls (seconds) |
| timeout_seconds | float | No | 300.0 (5 minutes) | Maximum time to wait for transcription |
| http_request_timeout_seconds | float | No | 60.0 | Timeout for individual HTTP requests |
*One of file_path, file_data, or file_url is required.
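A small pre-flight check can catch a missing or ambiguous audio source before any request is made. The sketch below assumes that exactly one of the three source parameters should be set; the page only marks them as conditionally required, so treat the "exactly one" rule and the helper name as assumptions.

```python
from typing import Optional, Tuple, Union


def select_audio_source(
    file_path: Optional[str] = None,
    file_data: Optional[bytes] = None,
    file_url: Optional[str] = None,
) -> Tuple[str, Union[str, bytes]]:
    """Return (parameter_name, value) for the single source that was supplied."""
    candidates = {"file_path": file_path, "file_data": file_data, "file_url": file_url}
    provided = [(name, value) for name, value in candidates.items() if value is not None]
    if len(provided) != 1:
        raise ValueError(
            f"expected exactly one of file_path, file_data, file_url; got {len(provided)}"
        )
    return provided[0]
```

The returned pair can be turned back into a keyword argument, for example dict([select_audio_source(file_url=url)]), before constructing the loader.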
Transcription options
The SonioxTranscriptionOptions class supports these parameters:
| Parameter | Type | Description |
|---|---|---|
| model | str | Async model to use (see available models) |
| language_hints | list[str] | Language hints for transcription (ISO language codes) |
| language_hints_strict | bool | Enforce strict language hints |
| enable_speaker_diarization | bool | Enable speaker identification |
| enable_language_identification | bool | Enable language detection |
| translation | TranslationConfig | Translation configuration |
| context | StructuredContext | Context for improved accuracy |
| client_reference_id | str | Custom reference ID for your records |
| webhook_url | str | Webhook URL for completion notifications |
| webhook_auth_header_name | str | Custom auth header name for webhook |
| webhook_auth_header_value | str | Custom auth header value for webhook |
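When options are assembled from configuration files or user input, a typo in an option name is easy to miss. The hypothetical helper below checks a plain dictionary against the parameter names listed in the table; it does not reproduce the library's own validation, which may be stricter.

```python
from typing import Any, Dict

# Parameter names copied from the transcription options table.
DOCUMENTED_OPTIONS = frozenset({
    "model",
    "language_hints",
    "language_hints_strict",
    "enable_speaker_diarization",
    "enable_language_identification",
    "translation",
    "context",
    "client_reference_id",
    "webhook_url",
    "webhook_auth_header_name",
    "webhook_auth_header_value",
})


def check_option_names(options: Dict[str, Any]) -> Dict[str, Any]:
    """Return `options` unchanged, or raise if a key is not a documented option."""
    unknown = sorted(set(options) - DOCUMENTED_OPTIONS)
    if unknown:
        raise KeyError(f"unknown transcription option(s): {', '.join(unknown)}")
    return options
```

For example, check_option_names({"enable_diarization": True}) raises because the documented name is enable_speaker_diarization.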
Return value
The lazy_load() and alazy_load() methods yield a single Document object:
The tokens array in metadata includes detailed information for each transcribed word:
- text: The transcribed text
- start_ms: Start time in milliseconds
- end_ms: End time in milliseconds
- speaker: Speaker ID (if diarization enabled), for example "1", "2", etc.
- language: Detected language (if identification enabled), for example "en", "fr", etc.
- translation_status: Translation status ("original", "translated", or "none")
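One way to use these per-token fields is to merge consecutive tokens from the same speaker into speaker turns. The sketch below assumes that each token is a dictionary with the keys listed above and that token text carries its own leading whitespace, so pieces can be concatenated directly; adjust both assumptions to the actual token shape you receive.

```python
from typing import Any, Dict, List


def group_by_speaker(tokens: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Merge consecutive tokens from the same speaker into turns."""
    turns: List[Dict[str, Any]] = []
    for token in tokens:
        speaker = token.get("speaker")  # None when diarization is disabled
        if turns and turns[-1]["speaker"] == speaker:
            # Same speaker keeps talking: extend the current turn.
            turns[-1]["text"] += token["text"]
            turns[-1]["end_ms"] = token["end_ms"]
        else:
            # Speaker changed (or first token): start a new turn.
            turns.append({
                "speaker": speaker,
                "text": token["text"],
                "start_ms": token["start_ms"],
                "end_ms": token["end_ms"],
            })
    return turns
```

Given a loaded document doc, the call would look like group_by_speaker(doc.metadata["tokens"]), with each resulting turn holding the speaker ID, the joined text, and the start and end times in milliseconds.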