GPT-4o Transcribe
openai/gpt-4o-transcribe
OpenAI, Speech-to-Text, Transcription, Multilingual
A speech-to-text model that uses GPT-4o to transcribe audio, with a lower word error rate and better language recognition than the original Whisper models.
Quick start
# Inspect the price — a plain request returns the 402 challenge:
curl -i https://api.glianalabs.com/v1/infer \
-H "content-type: application/json" \
-d '{
"model": "openai/gpt-4o-transcribe",
"file": "<string>",
"prompt": "<string>"
}'
# Pay + run in one step with the mppx CLI (create a wallet: npx mppx account create):
npx mppx https://api.glianalabs.com/v1/infer \
-J '{"model": "openai/gpt-4o-transcribe", "file": "<string>", "prompt": "<string>"}'
Parameters
Input
file string required
The audio file as a data URI (data:audio/...;base64,...) or HTTPS URL. Supported formats: flac, mp3, mp4, mpeg, mpga, m4a, ogg, wav, webm.
language string
The language of the input audio, as an ISO-639-1 code (for example, en). Supplying it improves accuracy and latency.
prompt string
Optional text to guide the model's style or to continue a previous audio segment. The prompt should be in the same language as the audio.
temperature number
The sampling temperature, between 0 and 1. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic. Defaults to 0 if omitted.
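The file parameter accepts a data URI, so a local recording has to be base64-encoded first. The following is a minimal sketch of one way to do that in the shell. The helper name audio_data_uri and the extension-to-MIME-type mapping are assumptions made for illustration; they are not part of the API or the mppx CLI.

```shell
# Hypothetical helper: print a data URI (data:audio/...;base64,...) for a local audio file.
# The extension-to-MIME mapping below is an assumption, not something the API documents.
audio_data_uri() {
  file_path="$1"
  case "${file_path##*.}" in
    mp3|mpga|mpeg) mime="audio/mpeg" ;;
    wav)           mime="audio/wav" ;;
    flac)          mime="audio/flac" ;;
    ogg)           mime="audio/ogg" ;;
    m4a|mp4)       mime="audio/mp4" ;;
    webm)          mime="audio/webm" ;;
    *)
      echo "unsupported extension: ${file_path##*.}" >&2
      return 1
      ;;
  esac
  # Read the file on stdin and strip line breaks so the payload is one line.
  printf 'data:%s;base64,%s' "$mime" "$(base64 < "$file_path" | tr -d '\n')"
}
```

The printed string can be used as the value of "file" in the request body. For large recordings, passing an HTTPS URL instead avoids a very long command line.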
Output
text: The transcribed text.
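The optional inputs described above can be combined into a single request body. The sketch below assembles that JSON in plain shell and checks the documented constraints (a two-letter language code, a temperature between 0 and 1) before anything is sent. The function names json_escape and build_transcribe_body are hypothetical, and the escaping only handles backslashes and double quotes, so it assumes single-line values.

```shell
# Hypothetical helper: escape backslashes and double quotes for use inside a JSON string.
# Assumes the value contains no newlines or other control characters.
json_escape() {
  printf '%s' "$1" | sed -e 's/\\/\\\\/g' -e 's/"/\\"/g'
}

# Hypothetical helper: print a request body for openai/gpt-4o-transcribe.
# Usage: build_transcribe_body FILE [LANGUAGE] [PROMPT] [TEMPERATURE]
# Empty optional arguments are omitted from the body.
build_transcribe_body() {
  file="$1"; language="$2"; prompt="$3"; temperature="$4"
  body="{\"model\": \"openai/gpt-4o-transcribe\", \"file\": \"$(json_escape "$file")\""

  if [ -n "$language" ]; then
    case "$language" in
      [a-z][a-z]) body="$body, \"language\": \"$language\"" ;;
      *) echo "language must be a two-letter ISO-639-1 code" >&2; return 1 ;;
    esac
  fi

  if [ -n "$prompt" ]; then
    body="$body, \"prompt\": \"$(json_escape "$prompt")\""
  fi

  if [ -n "$temperature" ]; then
    # Accept only JSON numbers in the documented range 0 to 1.
    if printf '%s' "$temperature" | grep -Eq '^(0(\.[0-9]+)?|1(\.0+)?)$'; then
      body="$body, \"temperature\": $temperature"
    else
      echo "temperature must be a number between 0 and 1" >&2
      return 1
    fi
  fi

  printf '%s}\n' "$body"
}
```

For example, build_transcribe_body "https://example.com/clip.mp3" en "" 0.2 prints a body with language and temperature set and no prompt. The result can be passed to curl -d or to npx mppx -J as in the quick start.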