# Readability Extractor (/docs/experiments/readability-extractor)



Load a fully rendered page with **Browser Rendering**, then strip navigation, ads, and sidebars using readability-style heuristics. Returns title, author (when detectable), body text, word count, and estimated read time.

## Features [#features]

* GET /extract - rendered DOM extraction with readability heuristics
* Returns title, author, body, wordCount, readTimeMinutes
* Uses @cloudflare/puppeteer like other Browser Rendering experiments

## API Reference [#api-reference]

### GET /extract [#get-extract]

**`url`** `string` (required) - Article URL (http or https).

#### Example Request [#example-request]

```bash
curl "https://your-worker.workers.dev/extract?url=https://example.com/article"
```

#### Error Codes [#error-codes]

* `400` - `INVALID_URL`
* `502` - `EXTRACT_ERROR`

## Use Cases [#use-cases]

* Build reading-mode or newsletter digest pipelines
* Extract main content from JavaScript-heavy news sites
* Prototype RAG document ingestion from article URLs

## Limitations [#limitations]

* Requires Browser Rendering on your account
* Heuristic extraction; not identical to Mozilla Readability
* Local dev may need `wrangler dev --remote` for browser binding

## Deployment [#deployment]

<Steps>
  <Step>
    ### Click the deploy button [#click-the-deploy-button]

    [![Deploy to Cloudflare Workers](https://deploy.workers.cloudflare.com/button)](https://deploy.workers.cloudflare.com/?url=https://github.com/shrinathsnayak/cloudflare-experiments/tree/main/apps/experiments/readability-extractor)
  </Step>

  <Step>
    ### Configure bindings [#configure-bindings]

    Browser binding `BROWSER` and `nodejs_compat_v2` in `wrangler.json`.
  </Step>

  <Step>
    ### Test your deployment [#test-your-deployment]

    See the experiment README for curl examples.
  </Step>
</Steps>

## Local Development [#local-development]

```bash
cd apps/experiments/readability-extractor
npm install
npm run dev
```

## Configuration [#configuration]

Browser binding `BROWSER` and `nodejs_compat_v2` in `wrangler.json`.

## Cloudflare Features Used [#cloudflare-features-used]

* **[Browser Rendering](https://developers.cloudflare.com/browser-rendering/)**
* **[Workers](https://developers.cloudflare.com/workers/)**
