GPT-4o Transcribe

Brand: OpenAI
Price: 0.0108 USD

openai/gpt-4o-transcribe

OpenAI Speech-to-Text, Transcription, Multilingual

A speech-to-text model that uses GPT-4o to transcribe audio, with a lower word error rate and better language recognition than the original Whisper models.

Quick start

# Inspect the price: a plain request (no payment attached) returns the HTTP 402 challenge.
curl -i https://api.glianalabs.com/v1/infer \
  -H "content-type: application/json" \
  -d '{
    "model": "openai/gpt-4o-transcribe",
    "file": "https://example.com/input"
  }'

# Pay + run in one step with the mppx CLI (create a wallet: npx mppx account create):
npx mppx https://api.glianalabs.com/v1/infer \
  -J '{"model": "openai/gpt-4o-transcribe", "file": "https://example.com/input"}'

Parameters

Input

file string required

The audio file as a data URI (data:audio/...;base64,...) or HTTPS URL. Supported formats: flac, mp3, mp4, mpeg, mpga, m4a, ogg, wav, webm.
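When the audio is a local file rather than something hosted at an HTTPS URL, it has to be encoded as a data URI first. The following is a minimal sketch of one way to do that in the shell. The function name audio_data_uri is hypothetical and not part of the API, and the MIME type (for example audio/wav) is supplied by the caller rather than detected.

```shell
# Hypothetical helper (not part of the API): print a data URI for a local audio file.
# Usage: audio_data_uri <path> <mime-type>   e.g. audio_data_uri clip.wav audio/wav
audio_data_uri() {
  local path="$1" mime="$2"
  # Some base64 implementations wrap long output, so strip newlines
  # to keep the whole URI on a single line.
  printf 'data:%s;base64,%s' "$mime" "$(base64 < "$path" | tr -d '\n')"
}
```

The printed string can be used as the value of file in the request body. For long recordings the encoded body may exceed the shell's argument length limit; writing the JSON to a file and sending it with curl's -d @body.json form is one way around that.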

language string optional

The language of the input audio. Supplying it as an ISO 639-1 code (for example, en or de) improves accuracy and latency.

prompt string optional

Optional text that guides the model's style or continues a previous audio segment. The prompt should be in the same language as the audio.

temperature number optional

The sampling temperature, between 0 and 1. Higher values such as 0.8 make the output more random; lower values such as 0.2 make it more focused and deterministic. Defaults to 0 if omitted.
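To show how the optional parameters fit into a request body, here is a sketch of a shell function that assembles the JSON and rejects a temperature outside the documented 0 to 1 range. The name build_payload is hypothetical. The sketch does no JSON escaping, so it assumes the values contain no double quotes or backslashes; for that reason it leaves out prompt, which is free text and would need proper escaping.

```shell
# Hypothetical helper (not part of the API): print a JSON request body.
# Usage: build_payload <file> [language] [temperature]
# Assumption: values contain no double quotes or backslashes (no escaping is done).
build_payload() {
  local file="$1" language="${2:-}" temperature="${3:-}"
  local body
  body="$(printf '{"model": "openai/gpt-4o-transcribe", "file": "%s"' "$file")"
  if [ -n "$language" ]; then
    body="$body$(printf ', "language": "%s"' "$language")"
  fi
  if [ -n "$temperature" ]; then
    # Accept only a plain decimal within the documented range 0..1.
    if ! awk -v t="$temperature" \
        'BEGIN { exit !(t ~ /^(0|1)(\.[0-9]+)?$/ && t + 0 <= 1) }'; then
      echo "temperature must be a number between 0 and 1" >&2
      return 1
    fi
    body="$body$(printf ', "temperature": %s' "$temperature")"
  fi
  printf '%s}\n' "$body"
}
```

For example, build_payload https://example.com/input en 0.2 prints a body with language and temperature set, which could then be passed to the mppx CLI's -J option from the quick start as -J "$(build_payload https://example.com/input en 0.2)".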

Output

text: The transcribed text.