# HTML Rewriter (/docs/experiments/html-rewriter)



Fetch remote HTML pages and analyze or transform them with the [HTMLRewriter](https://developers.cloudflare.com/workers/runtime-apis/html-rewriter/) API. Documents are stream-parsed at the edge, so elements can be counted or content injected without building a full DOM.

## API Reference [#api-reference]

### GET /stats [#get-stats]

Fetches a page and returns HTML statistics collected via HTMLRewriter handlers.

<TypeTable
  type="{
  url: {
    description: &#x22;Page URL to analyze (`http://` or `https://` only)&#x22;,
    type: &#x22;string&#x22;,
    required: true,
  },
}"
/>

#### Example Request [#example-request]

```bash
curl "https://your-worker.workers.dev/stats?url=https://example.com"
```

#### Success Response [#success-response]

**`url`** `string`

The requested page URL

**`title`** `string | null`

Text content of the `<title>` element, or `null` if the page has none

**`linkCount`** `number`

Number of `<a>` elements on the page

**`imageCount`** `number`

Number of `<img>` elements on the page

**`headingCounts`** `object`

Number of occurrences of each heading tag, keyed by tag name (e.g. `h1`, `h2`)

```json
{
  "url": "https://example.com",
  "title": "Example Domain",
  "linkCount": 1,
  "imageCount": 0,
  "headingCounts": { "h1": 1 }
}
```
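The fields above map naturally onto HTMLRewriter element and text handlers. The following is a minimal sketch of how such a collector could look, not the experiment's actual source: `StatsCollector`, `ElementLike`, and `TextLike` are hypothetical names, and the two interfaces are structural subsets of the runtime's `Element` and `Text` types so the handlers can be exercised outside the Workers runtime.

```typescript
// Structural subsets of the HTMLRewriter `Element` and `Text` types (assumption:
// only `tagName`, `text`, and `lastInTextNode` are needed for these statistics).
interface ElementLike { tagName: string }
interface TextLike { text: string; lastInTextNode: boolean }

// Hypothetical collector; its output fields mirror the documented response.
class StatsCollector {
  title: string | null = null;
  linkCount = 0;
  imageCount = 0;
  headingCounts: Record<string, number> = {};
  private titleBuffer = "";

  // One handler object per selector.
  readonly links = { element: (_el: ElementLike) => { this.linkCount++; } };
  readonly images = { element: (_el: ElementLike) => { this.imageCount++; } };
  readonly headings = {
    element: (el: ElementLike) => {
      const tag = el.tagName.toLowerCase();
      this.headingCounts[tag] = (this.headingCounts[tag] ?? 0) + 1;
    },
  };
  readonly titleText = {
    // Text can arrive in several chunks; commit once the text node ends.
    text: (chunk: TextLike) => {
      if (this.title !== null) return; // keep the first <title> only
      this.titleBuffer += chunk.text;
      if (chunk.lastInTextNode) this.title = this.titleBuffer.trim();
    },
  };

  toJSON(url: string) {
    return {
      url,
      title: this.title,
      linkCount: this.linkCount,
      imageCount: this.imageCount,
      headingCounts: this.headingCounts,
    };
  }
}

// In a Worker, the wiring could look roughly like this (not executed here):
//   const stats = new StatsCollector();
//   const rewriter = new HTMLRewriter()
//     .on("title", stats.titleText)
//     .on("a", stats.links)
//     .on("img", stats.images);
//   for (const tag of ["h1", "h2", "h3", "h4", "h5", "h6"]) {
//     rewriter.on(tag, stats.headings);
//   }
//   await rewriter.transform(upstream).arrayBuffer(); // drain so handlers run
//   return Response.json(stats.toJSON(targetUrl));
```

Handlers only run while the transformed body is being read, which is why the sketch drains the response before serializing the counts.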

#### Error Response [#error-response]

```json
{
  "error": "Missing or invalid query parameter: url",
  "code": "INVALID_URL"
}
```

#### Error Codes [#error-codes]

* `400` - Missing or invalid `url` (`INVALID_URL`)
* `502` - Failed to fetch or analyze the page (`FETCH_ERROR`)
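The `400` case can be illustrated with a small validation helper. This is a sketch under assumptions: `parseTargetUrl` and `UrlCheck` are hypothetical names, and the error body simply reuses the documented message and `INVALID_URL` code.

```typescript
// Result of validating the `url` query parameter (hypothetical shape).
type UrlCheck =
  | { ok: true; url: URL }
  | { ok: false; status: number; body: { error: string; code: string } };

// Accepts only absolute http:// or https:// URLs, as the parameter table requires.
function parseTargetUrl(raw: string | null): UrlCheck {
  const invalid: UrlCheck = {
    ok: false,
    status: 400,
    body: { error: "Missing or invalid query parameter: url", code: "INVALID_URL" },
  };
  if (!raw) return invalid;

  let url: URL;
  try {
    url = new URL(raw); // throws on relative or malformed input
  } catch {
    return invalid;
  }
  if (url.protocol !== "http:" && url.protocol !== "https:") return invalid;
  return { ok: true, url };
}
```

In a Worker, the raw value would typically come from `new URL(request.url).searchParams.get("url")`, and a failed check would be returned as a JSON response with the given status.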

### GET /transform [#get-transform]

Fetches a page and injects a banner `<div>` at the top of `<body>` using HTMLRewriter.

<TypeTable
  type="{
  url: {
    description: &#x22;Page URL to transform (`http://` or `https://` only)&#x22;,
    type: &#x22;string&#x22;,
    required: true,
  },
  banner: {
    description:
      &#x22;Banner text to inject (default: `Transformed by Cloudflare HTMLRewriter at the edge`)&#x22;,
    type: &#x22;string&#x22;,
  },
}"
/>

#### Example Request [#example-request-1]

```bash
curl "https://your-worker.workers.dev/transform?url=https://example.com&banner=Hello%20from%20the%20edge"
```

#### Success Response [#success-response-1]

**`url`** `string`

The requested page URL

**`banner`** `string`

The banner text that was injected

**`html`** `string`

The transformed HTML document

```json
{
  "url": "https://example.com",
  "banner": "Hello from the edge",
  "html": "<html>...</html>"
}
```
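Injecting the banner comes down to a single element handler on `body`. The sketch below is an illustration rather than the experiment's actual code: `bannerHandler`, `escapeHtml`, and `PrependTarget` are hypothetical names, and escaping the banner text is an assumption made here because the value comes from a query parameter.

```typescript
// Structural subset of the HTMLRewriter `Element` type: only `prepend` is used.
interface PrependTarget {
  prepend(content: string, options?: { html?: boolean }): unknown;
}

// Default taken from the parameter table above.
const DEFAULT_BANNER = "Transformed by Cloudflare HTMLRewriter at the edge";

// Escape user-supplied text before embedding it in markup.
function escapeHtml(text: string): string {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&#39;");
}

// Builds an element handler that inserts the banner as the first child of <body>.
function bannerHandler(banner: string = DEFAULT_BANNER) {
  const markup = `<div>${escapeHtml(banner)}</div>`;
  return {
    element(body: PrependTarget) {
      // `html: true` makes HTMLRewriter treat the string as markup, not text.
      body.prepend(markup, { html: true });
    },
  };
}

// In a Worker, the wiring could look roughly like this (not executed here):
//   const transformed = new HTMLRewriter()
//     .on("body", bannerHandler(bannerText))
//     .transform(upstream);
//   const html = await transformed.text();
```

Without `html: true`, `prepend` would escape the whole string and the `<div>` tags would show up as literal text on the page.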

#### Error Response [#error-response-1]

```json
{
  "error": "Missing or invalid query parameter: url",
  "code": "INVALID_URL"
}
```

#### Error Codes [#error-codes-1]

* `400` - Missing or invalid `url` (`INVALID_URL`)
* `502` - Failed to fetch or transform the page (`FETCH_ERROR`)

## Use Cases [#use-cases]

* Audit page structure (links, images, headings) without a headless browser
* Inject edge banners or notices into proxied HTML responses
* Learn streaming HTML transformation with HTMLRewriter handlers
* Build lightweight HTML analysis tools at the edge

## Limitations [#limitations]

* Static HTML only; JavaScript is not executed, so client-rendered content is not seen
* The fetched HTML body is capped in size before it is streamed through HTMLRewriter
* Single URL per request; no multi-page crawling
* Fetches are subject to a timeout, so slow origins may fail
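The size cap and fetch timeout can be sketched with standard Web APIs. Everything specific here is an assumption: the limit values, the names `capBytes` and `fetchWithLimits`, and the choice to truncate an oversized body (the experiment may reject such pages instead).

```typescript
// Hypothetical limits; the actual cap and timeout are not stated in this page.
const MAX_BODY_BYTES = 1_000_000;
const FETCH_TIMEOUT_MS = 10_000;

// Passes through at most `maxBytes` bytes, then ends the stream.
function capBytes(maxBytes: number): TransformStream<Uint8Array, Uint8Array> {
  let seen = 0;
  return new TransformStream<Uint8Array, Uint8Array>({
    transform(chunk, controller) {
      const remaining = maxBytes - seen;
      if (remaining <= 0) {
        controller.terminate();
        return;
      }
      // Trim the chunk that crosses the limit.
      const slice = chunk.byteLength > remaining ? chunk.subarray(0, remaining) : chunk;
      seen += slice.byteLength;
      controller.enqueue(slice);
      if (seen >= maxBytes) controller.terminate();
    },
  });
}

// Fetches a page with a timeout and returns its size-capped body stream.
async function fetchWithLimits(url: string): Promise<ReadableStream<Uint8Array> | null> {
  // AbortSignal.timeout() aborts the fetch once the deadline passes.
  const res = await fetch(url, { signal: AbortSignal.timeout(FETCH_TIMEOUT_MS) });
  return res.body ? res.body.pipeThrough(capBytes(MAX_BODY_BYTES)) : null;
}
```

A caller could catch the rejection from a timed-out fetch and map it to the documented `502` (`FETCH_ERROR`) response; whether the experiment does exactly that is not confirmed here.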

## Deployment [#deployment]

<Steps>
  <Step>
    ### Click the deploy button [#click-the-deploy-button]

    [![Deploy to Cloudflare Workers](https://deploy.workers.cloudflare.com/button)](https://deploy.workers.cloudflare.com/?url=https://github.com/shrinathsnayak/cloudflare-experiments/tree/main/apps/experiments/html-rewriter)
  </Step>

  <Step>
    ### Deploy [#deploy]

    No additional configuration required.
  </Step>

  <Step>
    ### Test your deployment [#test-your-deployment]

    ```bash
    curl "https://your-worker.workers.dev/stats?url=https://example.com"
    ```
  </Step>
</Steps>

## Local Development [#local-development]

```bash
cd apps/experiments/html-rewriter
npm install
npm run dev
```

Test locally:

```bash
curl "http://localhost:8787/stats?url=https://example.com"
curl "http://localhost:8787/transform?url=https://example.com"
```

## Cloudflare Features Used [#cloudflare-features-used]

* **[Workers](https://developers.cloudflare.com/workers/)** - Edge compute runtime
* **[HTMLRewriter](https://developers.cloudflare.com/workers/runtime-apis/html-rewriter/)** - Streaming HTML parsing and transformation
* **[Fetch API](https://developers.cloudflare.com/workers/runtime-apis/fetch/)** - Remote HTML retrieval
