AI Gateway Dashboard

This Worker calls Workers AI through AI Gateway rather than invoking inference directly. It returns the generated text along with gateway metadata (cache status and latency), and can optionally compare a cached request against a fresh one for the same prompt.

Features

POST /generate - text generation via AI Gateway
Returns latency and cache hit/miss metadata
Optional compareCache mode runs cached vs skipCache requests

API Reference

POST /generate

prompt string (required) - Text prompt for the model.

compareCache boolean (optional) - When true, run the prompt once with the gateway cache and once with skipCache, then compare the latency of the two requests.

Default model: @cf/meta/llama-3.1-8b-instruct-fast.
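
The compareCache behaviour described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the Worker's actual code: the Runner type, the timed and compareCache helpers, and the result field names are all hypothetical, and latency is measured with Date.now() around each call rather than read from gateway metadata.

```typescript
// One timed model call. Field names are illustrative (assumption),
// not the Worker's actual response shape.
interface TimedResult {
  text: string;
  latencyMs: number;
  skipCache: boolean;
}

interface Comparison {
  cached: TimedResult;
  fresh: TimedResult;
  deltaMs: number; // fresh latency minus cached latency
}

// Hypothetical abstraction: performs one inference call.
// skipCache = true asks the gateway to bypass its cache.
type Runner = (prompt: string, skipCache: boolean) => Promise<string>;

// Wrap a single call with wall-clock timing.
async function timed(run: Runner, prompt: string, skipCache: boolean): Promise<TimedResult> {
  const start = Date.now();
  const text = await run(prompt, skipCache);
  return { text, latencyMs: Date.now() - start, skipCache };
}

// Run the cache-eligible request first, then the fresh one.
// The calls are sequential so they do not compete with each other.
async function compareCache(run: Runner, prompt: string): Promise<Comparison> {
  const cached = await timed(run, prompt, false);
  const fresh = await timed(run, prompt, true);
  return { cached, fresh, deltaMs: fresh.latencyMs - cached.latencyMs };
}
```

In a Worker, the runner would wrap the AI binding call and forward skipCache to the gateway options. The first request is likely to be a cache hit only if caching is enabled on the gateway and it has already seen the same prompt.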

Example Request

curl -X POST "https://your-worker.workers.dev/generate" \
  -H "Content-Type: application/json" \
  -d '{"prompt":"Explain Workers AI in one sentence","compareCache":true}'

Error Codes

400 - INVALID_BODY, MISSING_PROMPT
502 - AI_ERROR
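
As a rough sketch of how the 400-level codes above might be produced, the following validates a raw request body. The parseGenerateBody and fail helpers and the ParseResult shape are hypothetical. The rule that a non-string or empty prompt maps to MISSING_PROMPT, and that a missing compareCache is treated as false, are assumptions rather than documented behaviour.

```typescript
// Error codes and HTTP statuses as listed in the Error Codes section.
const ERROR_STATUS = {
  INVALID_BODY: 400,
  MISSING_PROMPT: 400,
  AI_ERROR: 502,
} as const;

type ErrorCode = keyof typeof ERROR_STATUS;

interface GenerateRequest {
  prompt: string;
  compareCache: boolean;
}

// Hypothetical result type: either a validated request or an error code.
type ParseResult =
  | { ok: true; value: GenerateRequest }
  | { ok: false; code: ErrorCode; status: number };

function fail(code: ErrorCode): ParseResult {
  return { ok: false, code, status: ERROR_STATUS[code] };
}

// Validate the raw request text of POST /generate.
function parseGenerateBody(raw: string): ParseResult {
  let body: unknown;
  try {
    body = JSON.parse(raw);
  } catch {
    return fail("INVALID_BODY"); // not valid JSON
  }
  if (typeof body !== "object" || body === null || Array.isArray(body)) {
    return fail("INVALID_BODY"); // valid JSON, but not an object
  }
  const { prompt, compareCache } = body as Record<string, unknown>;
  if (typeof prompt !== "string" || prompt.trim() === "") {
    return fail("MISSING_PROMPT"); // assumption: empty or non-string counts as missing
  }
  // Assumption: anything other than the literal true disables comparison.
  return { ok: true, value: { prompt, compareCache: compareCache === true } };
}
```

AI_ERROR (502) would be returned later, when the model call itself fails, so it does not appear in the validation path.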

Use Cases

Learn how AI Gateway caching behaves compared with direct Workers AI calls
Compare latency for repeated prompts while prototyping
Reference pattern for gateway options in production Workers

Limitations

Requires Workers AI and AI Gateway enabled on your account
Cache metadata depends on gateway configuration (for example, whether caching is enabled on the gateway)
Text generation only; fixed default model

Deployment

Click the deploy button

Configure bindings

Declare an AI binding in wrangler.json. In code, pass the gateway ID default, or the ID of your own gateway.

Test your deployment

See the experiment README for curl examples.

Local Development

cd apps/experiments/ai-gateway-dashboard
npm install
npm run dev

Configuration

Declare an AI binding in wrangler.json. In code, pass the gateway ID default, or the ID of your own gateway.
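
A minimal sketch of passing the gateway ID through the AI binding might look like the following. It assumes the binding is named AI in wrangler.json and uses small local interfaces in place of the official Workers type definitions so that it stands alone. The generate helper and the GATEWAY_ID constant are hypothetical names.

```typescript
// Minimal structural types so the sketch compiles without
// @cloudflare/workers-types; the real binding type is richer.
interface GatewayOptions {
  id: string;
  skipCache?: boolean;
}

interface AiBinding {
  run(
    model: string,
    inputs: { prompt: string },
    options?: { gateway?: GatewayOptions },
  ): Promise<{ response?: string }>;
}

interface Env {
  AI: AiBinding; // assumption: the binding is named "AI" in wrangler.json
}

const MODEL = "@cf/meta/llama-3.1-8b-instruct-fast";
const GATEWAY_ID = "default"; // replace with your own gateway ID if you have one

// Route one text-generation call through AI Gateway.
// skipCache = true asks the gateway to bypass its cache for this request.
async function generate(env: Env, prompt: string, skipCache = false): Promise<string> {
  const result = await env.AI.run(
    MODEL,
    { prompt },
    { gateway: { id: GATEWAY_ID, skipCache } },
  );
  return result.response ?? "";
}
```

Keeping the gateway ID in a single constant makes it easy to switch from the default gateway to a dedicated one without touching the call sites.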

Cloudflare Features Used

Workers AI
AI Gateway
