The endpoint is public by design. For production use, add rate limiting or API key authentication.
Features
- Call any Workers AI text-generation model via one endpoint
- Support for single prompt or chat-style messages
- Optional max_tokens control
- POST (JSON) and GET (query params) interfaces
- No external API keys; uses your Cloudflare account’s Workers AI
API Reference
POST /chat
Send a JSON body with model and either prompt or messages.

| Field | Required | Description |
|---|---|---|
| model | Yes | Workers AI model ID (e.g. @cf/meta/llama-3.1-8b-instruct-fast) |
| prompt | No* | Single prompt string |
| messages | No* | Array of { role: string, content: string } for chat-style input |
| max_tokens | No | Max tokens to generate (1–4096) |

\* Either prompt or messages is required.
The response contains the AI-generated text.
Example Request
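As an illustration only, the following TypeScript sketch builds a POST /chat request from the fields listed in the table. The base URL, `BASE_URL`, `buildChatPost`, and the interface names are hypothetical and not part of this project; substitute the address of your own deployed Worker.

```typescript
// Shape of the JSON body described in the field table.
interface ChatMessage {
  role: string;
  content: string;
}

interface ChatRequestBody {
  model: string;
  prompt?: string;
  messages?: ChatMessage[];
  max_tokens?: number;
}

// Hypothetical base URL; replace with your own deployed Worker's address.
const BASE_URL = "https://my-worker.example.workers.dev";

// Builds the URL and fetch options for POST /chat without sending anything.
function buildChatPost(body: ChatRequestBody) {
  return {
    url: `${BASE_URL}/chat`,
    init: {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify(body),
    },
  };
}

const chatPost = buildChatPost({
  model: "@cf/meta/llama-3.1-8b-instruct-fast",
  messages: [{ role: "user", content: "Say hello in one sentence." }],
  max_tokens: 64,
});

// To send it: const res = await fetch(chatPost.url, chatPost.init);
```

Keeping the request construction separate from the network call makes the payload easy to inspect or log before it is sent.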
Example Response
GET /chat
Query params: model (required), prompt (required), optional max_tokens.
- model: Workers AI model ID (e.g. @cf/meta/llama-3.1-8b-instruct-fast).
- prompt: The prompt text, URL-encoded.
- max_tokens: Max tokens to generate (1–4096). Optional.
Example Request
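A minimal sketch of assembling the GET URL, assuming a hypothetical base URL and a hypothetical helper named `buildChatGetUrl`. It uses the standard `encodeURIComponent` so that characters such as `@`, `/`, and spaces in the model ID and prompt are percent-encoded.

```typescript
// Hypothetical base URL; replace with your own deployed Worker's address.
const BASE_URL = "https://my-worker.example.workers.dev";

// Builds a GET /chat URL, percent-encoding each query value.
function buildChatGetUrl(model: string, prompt: string, maxTokens?: number): string {
  const params = [
    `model=${encodeURIComponent(model)}`,
    `prompt=${encodeURIComponent(prompt)}`,
  ];
  // max_tokens is optional, so it is appended only when provided.
  if (maxTokens !== undefined) {
    params.push(`max_tokens=${encodeURIComponent(String(maxTokens))}`);
  }
  return `${BASE_URL}/chat?${params.join("&")}`;
}

const chatGetUrl = buildChatGetUrl(
  "@cf/meta/llama-3.1-8b-instruct-fast",
  "Hello world",
  64
);
// chatGetUrl is:
// https://my-worker.example.workers.dev/chat?model=%40cf%2Fmeta%2Fllama-3.1-8b-instruct-fast&prompt=Hello%20world&max_tokens=64
```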
Error Responses
- 400 Bad Request: INVALID_BODY, MISSING_MODEL, MISSING_PROMPT, or INVALID_MAX_TOKENS
- 502 Bad Gateway: AI_ERROR (model run failed)
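To show how the 400 codes could relate to the documented fields, here is a hedged validation sketch. The function name `validateChatBody`, the order of the checks, and the exact conditions (for example, requiring max_tokens to be an integer) are assumptions made for illustration; the Worker's actual validation logic is not shown in this README and may differ.

```typescript
type ValidationError =
  | "INVALID_BODY"
  | "MISSING_MODEL"
  | "MISSING_PROMPT"
  | "INVALID_MAX_TOKENS";

// Returns the first matching error code, or null when the body looks valid.
// Check order and conditions are assumptions, not the Worker's real code.
function validateChatBody(body: unknown): ValidationError | null {
  // The body must be a plain JSON object.
  if (typeof body !== "object" || body === null || Array.isArray(body)) {
    return "INVALID_BODY";
  }
  const fields = body as Record<string, unknown>;

  const model = fields.model;
  if (typeof model !== "string" || model.length === 0) {
    return "MISSING_MODEL";
  }

  // Either a non-empty prompt or a non-empty messages array is required.
  const prompt = fields.prompt;
  const messages = fields.messages;
  const hasPrompt = typeof prompt === "string" && prompt.length > 0;
  const hasMessages = Array.isArray(messages) && messages.length > 0;
  if (!hasPrompt && !hasMessages) {
    return "MISSING_PROMPT";
  }

  // max_tokens is optional; when present it must be an integer in 1..4096.
  const maxTokens = fields.max_tokens;
  if (maxTokens !== undefined) {
    if (
      typeof maxTokens !== "number" ||
      Math.floor(maxTokens) !== maxTokens ||
      maxTokens < 1 ||
      maxTokens > 4096
    ) {
      return "INVALID_MAX_TOKENS";
    }
  }
  return null;
}
```

A client can run a check like this before sending a request to avoid a round trip that would end in a 400.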
Run locally
Requires a Cloudflare account with Workers AI enabled. See Workers AI for setup.
Deploy
Cloudflare features used
- Workers: Edge runtime
- Workers AI: AI binding for text generation
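The sketch below illustrates how a handler might pass a request to the AI binding and map a failed model run to the 502 AI_ERROR case. `AiBinding` is a minimal local stand-in written for this example rather than the official binding type, and `runChat`, `ChatInput`, and `ChatResult` are hypothetical names. It assumes the binding exposes `run(model, inputs)` and that text-generation models return an object with a `response` string; the shape of the Worker's actual error payload is not documented here.

```typescript
// Minimal local stand-in for the Workers AI binding; not the official type.
interface AiBinding {
  run(model: string, inputs: Record<string, unknown>): Promise<{ response?: string }>;
}

interface ChatInput {
  model: string;
  prompt?: string;
  messages?: { role: string; content: string }[];
  max_tokens?: number;
}

type ChatResult =
  | { ok: true; text: string }
  | { ok: false; status: number; code: string };

// Forwards either messages or a single prompt to the model and
// converts any thrown error into a 502 AI_ERROR result.
async function runChat(ai: AiBinding, input: ChatInput): Promise<ChatResult> {
  const inputs: Record<string, unknown> = input.messages
    ? { messages: input.messages }
    : { prompt: input.prompt };
  if (input.max_tokens !== undefined) {
    inputs.max_tokens = input.max_tokens;
  }
  try {
    const result = await ai.run(input.model, inputs);
    return { ok: true, text: result.response ?? "" };
  } catch (err) {
    // Assumed mapping: any model failure becomes 502 AI_ERROR.
    return { ok: false, status: 502, code: "AI_ERROR" };
  }
}
```

Inside a Worker, the first argument would typically be `env.AI`, assuming the binding is named AI in the Wrangler configuration. Accepting the binding as a parameter also lets the function be exercised with a fake in tests.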