This site is not affiliated with or endorsed by Cloudflare, Inc. It simply showcases experiments built using Cloudflare services.
Cloudflare Experiments

AI Gateway Dashboard

Route Workers AI through AI Gateway and surface cache and latency metadata

This experiment calls Workers AI through AI Gateway instead of running inference directly. It returns the generated text plus gateway metadata (cache status, latency), and can optionally compare a cached request against a fresh one for the same prompt.

Features

  • POST /generate - text generation via AI Gateway
  • Returns latency and cache hit/miss metadata
  • Optional compareCache mode runs the same prompt once with caching allowed and once with skipCache, then compares the two
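
The compareCache behavior described above could be sketched roughly as follows. This is not the experiment's actual source: the helper names (timedRun, compareCache), the result shape, and the locally declared AiBinding interface are assumptions made for illustration. The only parts taken from this page are the default model and the gateway ID "default". The gateway option with id and skipCache follows the Workers AI binding's documented third argument.

```typescript
// Minimal local shape of the Workers AI binding (assumption: in a real Worker
// the type would come from @cloudflare/workers-types instead).
interface GatewayOptions {
  id: string;
  skipCache?: boolean;
}

interface AiBinding {
  run(
    model: string,
    inputs: { prompt: string },
    options?: { gateway?: GatewayOptions },
  ): Promise<unknown>;
}

interface TimedResult {
  output: unknown;
  latencyMs: number;
}

const MODEL = "@cf/meta/llama-3.1-8b-instruct-fast";

// Run one request through the gateway and measure wall-clock latency.
async function timedRun(
  ai: AiBinding,
  prompt: string,
  gatewayId: string,
  skipCache: boolean,
): Promise<TimedResult> {
  const started = Date.now();
  const output = await ai.run(MODEL, { prompt }, { gateway: { id: gatewayId, skipCache } });
  return { output, latencyMs: Date.now() - started };
}

// Hypothetical helper: issue the same prompt twice, first with caching
// allowed, then with skipCache set, and report the latency difference.
async function compareCache(ai: AiBinding, prompt: string, gatewayId = "default") {
  const cached = await timedRun(ai, prompt, gatewayId, false);
  const fresh = await timedRun(ai, prompt, gatewayId, true);
  return { cached, fresh, deltaMs: fresh.latencyMs - cached.latencyMs };
}
```

Note that the first request is only served from cache if the gateway has caching enabled and has already seen the same prompt, so a single comparison on a cold cache may show little difference.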

API Reference

POST /generate

prompt string (required) - Text prompt for the model.

compareCache boolean (optional) - Run cached and fresh requests and compare latency.

Default model: @cf/meta/llama-3.1-8b-instruct-fast.

Example Request

curl -X POST "https://your-worker.workers.dev/generate" \
  -H "Content-Type: application/json" \
  -d '{"prompt":"Explain Workers AI in one sentence","compareCache":true}'
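
For callers that prefer TypeScript over curl, the same request could be built like this. The helper name buildGenerateRequest and the GenerateBody type are hypothetical. The response is deliberately not parsed, because this page does not document its exact shape.

```typescript
// Request body fields as documented in the API reference above.
interface GenerateBody {
  prompt: string;
  compareCache?: boolean;
}

interface BuiltRequest {
  url: string;
  init: { method: string; headers: Record<string, string>; body: string };
}

// Hypothetical helper: build the URL and fetch options for POST /generate.
function buildGenerateRequest(baseUrl: string, body: GenerateBody): BuiltRequest {
  return {
    // Strip trailing slashes so the path is not doubled.
    url: baseUrl.replace(/\/+$/, "") + "/generate",
    init: {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify(body),
    },
  };
}

// Usage (performs a real network call, so it is left commented out):
// const { url, init } = buildGenerateRequest("https://your-worker.workers.dev", {
//   prompt: "Explain Workers AI in one sentence",
//   compareCache: true,
// });
// const res = await fetch(url, init);
// console.log(res.status, await res.text());
```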

Error Codes

  • 400 - INVALID_BODY, MISSING_PROMPT
  • 502 - AI_ERROR
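
One plausible way the two 400 codes could be produced is sketched below. The mapping is an assumption, not the experiment's confirmed logic: unparseable or non-object JSON is treated as INVALID_BODY, and a missing, non-string, or empty prompt as MISSING_PROMPT. The function name validateBody and the result type are hypothetical. A failure of the model call itself would presumably be reported separately as 502 AI_ERROR and is not covered here.

```typescript
type ValidationResult =
  | { ok: true; prompt: string; compareCache: boolean }
  | { ok: false; status: 400; code: "INVALID_BODY" | "MISSING_PROMPT" };

// Hypothetical validator for the raw POST /generate request body.
function validateBody(raw: string): ValidationResult {
  let body: unknown;
  try {
    body = JSON.parse(raw);
  } catch {
    // Body is not valid JSON at all.
    return { ok: false, status: 400, code: "INVALID_BODY" };
  }
  // JSON parsed, but it is not a plain object (e.g. a number, null, or an array).
  if (typeof body !== "object" || body === null || Array.isArray(body)) {
    return { ok: false, status: 400, code: "INVALID_BODY" };
  }
  const { prompt, compareCache } = body as Record<string, unknown>;
  // prompt is required and must be a non-empty string.
  if (typeof prompt !== "string" || prompt.trim() === "") {
    return { ok: false, status: 400, code: "MISSING_PROMPT" };
  }
  // compareCache is optional; anything other than literal true counts as false.
  return { ok: true, prompt, compareCache: compareCache === true };
}
```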

Use Cases

  • Learn AI Gateway cache behavior vs direct Workers AI calls
  • Compare latency for repeated prompts in prototyping
  • Reference pattern for gateway options in production Workers

Limitations

  • Requires Workers AI and AI Gateway enabled on your account
  • Cache metadata depends on gateway configuration
  • Text generation only; fixed default model

Deployment

Configure bindings

Declare the AI binding in wrangler.json. In code, pass the gateway ID default or the ID of your own gateway.
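
As a rough illustration of how the binding and gateway ID fit together in Worker code, consider the sketch below. It assumes the binding is named AI in wrangler.json. The Env and AiBinding interfaces are declared locally for self-containment, and the generate helper and GATEWAY_ID constant are hypothetical names.

```typescript
// Local stand-in for the Workers AI binding type (assumption: a real project
// would use the Ai type from @cloudflare/workers-types).
interface AiBinding {
  run(
    model: string,
    inputs: { prompt: string },
    options?: { gateway?: { id: string; skipCache?: boolean } },
  ): Promise<unknown>;
}

// Assumes the binding is exposed as env.AI, matching the name in wrangler.json.
interface Env {
  AI: AiBinding;
}

// Replace with your own gateway ID if you are not using the default gateway.
const GATEWAY_ID = "default";
const DEFAULT_MODEL = "@cf/meta/llama-3.1-8b-instruct-fast";

// Hypothetical helper: route a single prompt through AI Gateway via the binding.
async function generate(env: Env, prompt: string): Promise<unknown> {
  return env.AI.run(DEFAULT_MODEL, { prompt }, { gateway: { id: GATEWAY_ID } });
}
```

Passing the gateway option is what routes the call through AI Gateway. Without it, the binding calls Workers AI directly.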

Test your deployment

See the experiment README for curl examples.

Local Development

cd apps/experiments/ai-gateway-dashboard
npm install
npm run dev

Configuration

The only configuration is the AI binding in wrangler.json and the gateway ID set in code (default, or the ID of your own gateway).

Cloudflare Features Used
