This site is not affiliated with or endorsed by Cloudflare, Inc. It simply showcases experiments built using Cloudflare services.
Cloudflare Experiments

Speech to Text Transcriber

Transcribe uploaded audio with Workers AI Whisper at the edge

Upload an audio file and receive a transcript from Workers AI Whisper (@cf/openai/whisper-large-v3-turbo). The Worker validates file size and content type before calling the model.

Features

  • POST /transcribe - Multipart upload with field audio
  • Whisper model - @cf/openai/whisper-large-v3-turbo
  • Validation - Max 2 MB, audio/* types only
  • Timing - Returns durationMs for the transcription call
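The validation step listed above could look roughly like the following. This is a minimal sketch, not the experiment's actual code: the helper name, the messages, and the reading of "2 MB" as 2 MiB are all assumptions.

```typescript
// Hypothetical validation helper; names and messages are illustrative only.
const MAX_AUDIO_BYTES = 2 * 1024 * 1024; // assumption: "2 MB" means 2 MiB

interface UploadLike {
  size: number; // size in bytes
  type: string; // MIME type reported for the upload, e.g. "audio/mpeg"
}

type ValidationResult =
  | { ok: true }
  | { ok: false; status: 400; code: "INVALID_AUDIO"; message: string };

function validateAudio(file: UploadLike | null): ValidationResult {
  const reject = (message: string): ValidationResult => ({
    ok: false,
    status: 400,
    code: "INVALID_AUDIO",
    message,
  });

  if (file === null) return reject("Missing multipart field 'audio'");
  if (!file.type.toLowerCase().startsWith("audio/")) {
    return reject(`Unsupported content type: ${file.type || "(none)"}`);
  }
  if (file.size === 0) return reject("Audio file is empty");
  if (file.size > MAX_AUDIO_BYTES) return reject("Audio file exceeds 2 MB");
  return { ok: true };
}
```

Running the cheap checks before the model call means a rejected upload never reaches Workers AI.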

API Reference

POST /transcribe

Transcribe an uploaded audio file.

audio (file, required, multipart)

Audio file (max 2 MB). Accepted types include audio/mpeg, audio/wav, audio/webm, audio/ogg, and other audio/* MIME types.

Example Request

curl -X POST "https://your-worker.workers.dev/transcribe" \
  -F "audio=@sample.mp3"
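The same request can be made from TypeScript with fetch and FormData. This is a sketch under assumptions: the function names are hypothetical, and the endpoint URL is the placeholder from the curl example.

```typescript
// Builds the multipart body the endpoint expects: one field named "audio".
function buildTranscribeForm(audio: Blob, filename: string): FormData {
  const form = new FormData();
  form.append("audio", audio, filename);
  return form;
}

// Hypothetical client helper; pass your own Worker URL as `endpoint`.
async function postTranscribe(endpoint: string, form: FormData): Promise<unknown> {
  // Do not set Content-Type by hand: fetch adds the multipart boundary itself.
  const res = await fetch(endpoint, { method: "POST", body: form });
  if (!res.ok) {
    throw new Error(`Transcription request failed with HTTP ${res.status}`);
  }
  return res.json();
}
```

In a browser, the Blob can be the File from an input element, for example postTranscribe("https://your-worker.workers.dev/transcribe", buildTranscribeForm(file, file.name)).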

Success Response

{
  "text": "Hello, this is a test recording.",
  "language": "en",
  "durationMs": 842
}
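A client may want to check the response shape before using it. The sketch below assumes all three fields from the sample are always present on success, which the page does not state explicitly; the interface and function names are hypothetical.

```typescript
// Shape of the success body, taken from the sample response above.
interface TranscribeSuccess {
  text: string;
  language: string;
  durationMs: number;
}

// Runtime type guard: narrows an unknown JSON value to TranscribeSuccess.
function isTranscribeSuccess(value: unknown): value is TranscribeSuccess {
  if (typeof value !== "object" || value === null) return false;
  const v = value as Record<string, unknown>;
  return (
    typeof v.text === "string" &&
    typeof v.language === "string" &&
    typeof v.durationMs === "number"
  );
}
```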

Error Codes

  • 400 - Missing or invalid audio (INVALID_AUDIO)
  • 502 - Transcription failed or empty result (TRANSCRIBE_ERROR)
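One way a client might act on these codes is sketched below. The page does not document the error body, so the sketch keys off the HTTP status only; the retry policy is a suggestion of this example, not documented behavior.

```typescript
type ErrorCode = "INVALID_AUDIO" | "TRANSCRIBE_ERROR" | "UNKNOWN";

interface ErrorPlan {
  code: ErrorCode;
  retryable: boolean; // assumption: only upstream failures are worth retrying
}

// Hypothetical mapping from HTTP status to the documented error codes.
function planForStatus(status: number): ErrorPlan {
  switch (status) {
    case 400:
      // The upload itself is wrong; resending the same file will not help.
      return { code: "INVALID_AUDIO", retryable: false };
    case 502:
      // The model call failed or returned nothing; a single retry may succeed.
      return { code: "TRANSCRIBE_ERROR", retryable: true };
    default:
      return { code: "UNKNOWN", retryable: false };
  }
}
```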

Use Cases

  • Add speech-to-text to edge apps without external API keys
  • Prototype voice note or meeting transcription workflows
  • Learn Workers AI audio model integration patterns

Limitations

  • Max upload size 2 MB per request (Whisper model limits apply)
  • No chunking for long recordings; split large files client-side
  • Requires Workers AI enabled on your Cloudflare account
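To plan client-side splitting, it helps to estimate how much audio fits under the limit. The sketch below covers uncompressed PCM WAV only and assumes a 44-byte canonical header and a 2 MiB limit; compressed formats such as MP3 or Ogg need a bitrate-based estimate instead. Function names are hypothetical.

```typescript
const WAV_HEADER_BYTES = 44; // assumption: canonical PCM WAV header, no extra chunks

// Seconds of PCM audio that fit in one upload of `limitBytes`.
function maxPcmWavSeconds(
  limitBytes: number,
  sampleRateHz: number,
  channels: number,
  bytesPerSample: number,
): number {
  const bytesPerSecond = sampleRateHz * channels * bytesPerSample;
  return Math.max(0, (limitBytes - WAV_HEADER_BYTES) / bytesPerSecond);
}

// Number of uploads needed for a recording of `totalSeconds`.
function chunkCount(totalSeconds: number, maxSecondsPerChunk: number): number {
  return Math.ceil(totalSeconds / maxSecondsPerChunk);
}

// 16 kHz, mono, 16-bit PCM: 32,000 bytes per second of audio.
const perChunk = maxPcmWavSeconds(2 * 1024 * 1024, 16000, 1, 2);
const uploadsForTenMinutes = chunkCount(600, perChunk);
```

Under these assumptions each upload holds about 65 seconds of 16 kHz mono audio, so a ten-minute recording needs 10 uploads.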

Deployment

Deploy

Enable Workers AI. The AI binding is declared in wrangler.json.
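Inside the Worker, the AI binding is used to call the model. The following is a minimal sketch rather than the experiment's actual handler: the AiBinding interface is a simplified stand-in for the real binding type, and passing the audio as a base64 string in an audio field is an assumption about the model's input schema that should be checked against the Workers AI model page.

```typescript
// Simplified stand-in for the Workers AI binding (assumption, not the official type).
interface AiBinding {
  run(model: string, input: { audio: string }): Promise<{ text?: string }>;
}

const WHISPER_MODEL = "@cf/openai/whisper-large-v3-turbo";

// Encodes raw bytes as base64 using btoa, which Workers and modern Node both provide.
function toBase64(bytes: Uint8Array): string {
  let binary = "";
  bytes.forEach((b) => {
    binary += String.fromCharCode(b);
  });
  return btoa(binary);
}

// Calls the model and measures only the transcription call, as durationMs does.
async function transcribe(
  ai: AiBinding,
  bytes: Uint8Array,
): Promise<{ text: string; durationMs: number }> {
  const started = Date.now();
  const result = await ai.run(WHISPER_MODEL, { audio: toBase64(bytes) });
  const durationMs = Date.now() - started;

  const text = (result.text ?? "").trim();
  if (text === "") {
    // Corresponds to the documented 502 TRANSCRIBE_ERROR case.
    throw new Error("TRANSCRIBE_ERROR: empty transcription result");
  }
  return { text, durationMs };
}
```

Taking the binding as a parameter keeps the function testable with a stub in place of env.AI.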

Test your deployment

curl -X POST "https://your-worker.workers.dev/transcribe" -F "audio=@sample.mp3"

Local Development

cd apps/experiments/speech-to-text-transcriber
npm install
npm run dev

Configuration

Binding   Purpose
AI        Workers AI Whisper model

