Cloud STT API

Base path: /api/stt

Cloud speech-to-text transcription using Groq Whisper or OpenAI Whisper. The API key stays server-side — the browser never contacts the STT provider directly.


POST /api/stt/transcribe

Transcribe an audio recording. Accepts a multipart upload and forwards it to the configured cloud STT provider.

Request

POST /api/stt/transcribe
Content-Type: multipart/form-data

| Field | Type | Description |
|-------|------|-------------|
| audio | File | Audio recording (webm, ogg, wav, mp3, etc.) |

Constraints:

  • Maximum file size: 25 MB (Groq/OpenAI hard limit)
  • Minimum size: 100 bytes (rejects empty/noise-only recordings)
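
A client can mirror these limits before uploading to save a round trip. The following is a minimal sketch, not the server's implementation: the name `check_audio_size` is hypothetical, and "25 MB" is assumed to mean 25 × 1024 × 1024 bytes, which this page does not specify.

```python
from typing import Optional

MAX_BYTES = 25 * 1024 * 1024  # assumption: "25 MB" read as 25 MiB
MIN_BYTES = 100               # documented minimum size


def check_audio_size(num_bytes: int) -> Optional[str]:
    """Return an error message if the size is out of range, else None."""
    if num_bytes < MIN_BYTES:
        return "Audio too short (< 100 bytes)"
    if num_bytes > MAX_BYTES:
        return "File too large (> 25 MB)"
    return None
```

Both bounds are treated as inclusive here, so a file of exactly 100 bytes or exactly the maximum size passes the check.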

Response

{
  "ok": true,
  "data": {
    "text": "The transcribed text from the audio recording."
  }
}
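
As an illustration of consuming this envelope, the sketch below pulls `data.text` out of a response body. The function name `extract_transcript` is hypothetical. The shape of error bodies is not documented on this page, so anything other than `"ok": true` is simply treated as a failure.

```python
import json


def extract_transcript(body: str) -> str:
    """Return data.text from a success envelope, or raise ValueError."""
    payload = json.loads(body)  # malformed JSON raises a ValueError subclass
    if not isinstance(payload, dict) or payload.get("ok") is not True:
        raise ValueError("response is not a success envelope")
    data = payload.get("data")
    if not isinstance(data, dict) or not isinstance(data.get("text"), str):
        raise ValueError("response has no data.text string")
    return data["text"]
```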

Error Responses

| Status | Error |
|--------|-------|
| 400 | Cloud STT not configured — set provider in Settings → Voice |
| 400 | No API key configured for the provider |
| 400 | No audio file provided |
| 400 | File too large (> 25 MB) |
| 400 | Audio too short (< 100 bytes) |
| 502 | Network error calling STT provider |
| 502 | STT provider returned an error |

Example (curl)

curl -X POST http://localhost:3000/api/stt/transcribe \
-F "audio=@recording.webm"
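
The same request can be sketched in Python using only the standard library. The helper names `build_multipart` and `transcribe` are hypothetical, and the `audio/webm` default MIME type is an assumption. The URL and the `audio` field name come from the curl example above.

```python
import os
import urllib.request
import uuid


def build_multipart(field: str, filename: str, data: bytes, mime: str):
    """Encode one file field as multipart/form-data; return (body, content_type)."""
    boundary = uuid.uuid4().hex
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field}"; filename="{filename}"\r\n'
        f"Content-Type: {mime}\r\n\r\n"
    ).encode("utf-8")
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")
    return head + data + tail, f"multipart/form-data; boundary={boundary}"


def transcribe(path: str,
               url: str = "http://localhost:3000/api/stt/transcribe",
               mime: str = "audio/webm") -> bytes:
    """POST the file at `path` as the `audio` field; return the raw response body."""
    with open(path, "rb") as f:
        audio = f.read()
    body, content_type = build_multipart("audio", os.path.basename(path), audio, mime)
    req = urllib.request.Request(
        url, data=body, headers={"Content-Type": content_type}, method="POST"
    )
    # urlopen raises urllib.error.HTTPError for 4xx/5xx statuses.
    with urllib.request.urlopen(req) as resp:
        return resp.read()
```

Calling `transcribe("recording.webm")` against a running server would return the JSON response body as bytes.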

Configuration

Configure the cloud STT provider in ~/.getthatquick/config/settings.json (or via Settings → Voice in the UI):

{
  "stt": {
    "provider": "groq",
    "cloudApiKey": "gsk_...",
    "cloudModel": "whisper-large-v3-turbo"
  }
}

| Field | Values | Description |
|-------|--------|-------------|
| provider | "local", "groq", "openai-whisper" | STT engine to use |
| cloudApiKey | string | API key for the cloud provider |
| cloudModel | string | Model identifier (see below) |
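
A settings file can be sanity-checked before starting the server. This is a rough sketch under stated assumptions: `stt_config_problems` is a hypothetical helper, and it only checks what this page documents (the three provider values, and that a cloud provider needs an API key). Whether `cloudModel` is required or has a default is not stated here, so it is not checked.

```python
CLOUD_PROVIDERS = {"groq", "openai-whisper"}
ALL_PROVIDERS = CLOUD_PROVIDERS | {"local"}


def stt_config_problems(settings: dict) -> list:
    """Return a list of human-readable problems with the "stt" settings block."""
    stt = settings.get("stt") or {}
    provider = stt.get("provider")
    problems = []
    if provider not in ALL_PROVIDERS:
        problems.append(f"unknown provider: {provider!r}")
    elif provider in CLOUD_PROVIDERS and not stt.get("cloudApiKey"):
        # Cloud providers need a key; "local" does not.
        problems.append("cloudApiKey is missing")
    return problems
```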

Available Models

Groq (recommended — free tier):

| Model | Speed | Accuracy |
|-------|-------|----------|
| whisper-large-v3-turbo | Fastest | Good |
| whisper-large-v3 | Slower | Best |

OpenAI Whisper:

| Model | Notes |
|-------|-------|
| whisper-1 | Classic model |
| gpt-4o-transcribe | Best accuracy |
| gpt-4o-mini-transcribe | Faster, cheaper |
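
The model lists above can be kept as data to catch a mismatched `provider` and `cloudModel` pair. This is only an illustrative sketch: `LISTED_MODELS` and `is_listed_model` are hypothetical names, and a provider may accept models that this page does not list.

```python
# Models named on this page, keyed by the `provider` setting value.
LISTED_MODELS = {
    "groq": ["whisper-large-v3-turbo", "whisper-large-v3"],
    "openai-whisper": ["whisper-1", "gpt-4o-transcribe", "gpt-4o-mini-transcribe"],
}


def is_listed_model(provider: str, model: str) -> bool:
    """Return True if `model` appears in this page's list for `provider`."""
    return model in LISTED_MODELS.get(provider, [])
```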