HTML Rewriter
Extract HTML stats and transform pages with HTMLRewriter at the edge
Fetch remote HTML pages and analyze or transform them using the HTMLRewriter API. Stream-parse documents at the edge to count elements or inject content without loading a full DOM parser.
API Reference
GET /stats
Fetches a page and returns HTML statistics collected via HTMLRewriter handlers.
Prop
Type
Example Request
curl "https://your-worker.workers.dev/stats?url=https://example.com"Success Response
url string
The requested page URL
title string | null
Text content of the <title> element, if present
linkCount number
Number of <a> elements on the page
imageCount number
Number of <img> elements on the page
headingCounts object
Count of each heading tag (e.g. h1, h2) found on the page
{
"url": "https://example.com",
"title": "Example Domain",
"linkCount": 1,
"imageCount": 0,
"headingCounts": { "h1": 1 }
}Error Response
{
"error": "Missing or invalid query parameter: url",
"code": "INVALID_URL"
}Error Codes
400- Missing or invalidurl(INVALID_URL)502- Failed to fetch or analyze the page (FETCH_ERROR)
GET /transform
Fetches a page and injects a banner <div> at the top of <body> using HTMLRewriter.
Prop
Type
Example Request
curl "https://your-worker.workers.dev/transform?url=https://example.com&banner=Hello%20from%20the%20edge"Success Response
url string
The requested page URL
banner string
The banner text that was injected
html string
The transformed HTML document
{
"url": "https://example.com",
"banner": "Transformed by Cloudflare HTMLRewriter at the edge",
"html": "<html>...</html>"
}Error Response
{
"error": "Missing or invalid query parameter: url",
"code": "INVALID_URL"
}Error Codes
400- Missing or invalidurl(INVALID_URL)502- Failed to fetch or transform the page (FETCH_ERROR)
Use Cases
- Audit page structure (links, images, headings) without a headless browser
- Inject edge banners or notices into proxied HTML responses
- Learn streaming HTML transformation with HTMLRewriter handlers
- Build lightweight HTML analysis tools at the edge
Limitations
- Static HTML fetch only; JavaScript-rendered content is not executed
- HTML body size is capped before streaming through HTMLRewriter
- Single URL per request; no multi-page crawling
- Fetch timeout on slow origins
Deployment
Deploy
No additional configuration required.
Test your deployment
curl "https://your-worker.workers.dev/stats?url=https://example.com"Local Development
cd apps/experiments/html-rewriter
npm install
npm run devTest locally:
curl "http://localhost:8787/stats?url=https://example.com"
curl "http://localhost:8787/transform?url=https://example.com"Cloudflare Features Used
- Workers - Edge compute runtime
- HTMLRewriter - Streaming HTML parsing and transformation
- Fetch API - Remote HTML retrieval