# Extract API: Turn web pages into AI-friendly context

## Extract full or excerpted text contents from any public page with ease

**Web content, optimized for agents.** Declare an objective, get focused excerpts from a target URL. Need everything? Get the full page converted to markdown.

**The perfect complement to Parallel Search.** Use Search to find the best pages, and Extract to get fresh and full content.
## Features

- **Objective-driven excerpts:** Describe what you're looking for in plain language, and the API returns excerpts aligned to your specific goal.
- **Priced for scale.**
# From web page to context window with a single tool call
Directly integrate the Extract API, or use the Web_Fetch Tool in the [Parallel Search MCP Server](https://docs.parallel.ai/integrations/mcp/search-mcp).
```python
import os

from parallel import Parallel

client = Parallel(api_key=os.environ["PARALLEL_API_KEY"])

extract = client.beta.extract(
    urls=["https://www.un.org/en/about-us/history-of-the-un"],
    objective="When was the United Nations established?",
    excerpts=True,
    full_content=False,
)

print(extract.results)
```

- Get structured results: The API returns clean markdown with excerpts focused on your objective, plus metadata like publish dates and page titles.
- Feed your pipeline: Output is optimized for LLM consumption: no HTML cleanup, boilerplate removal, or post-processing required.
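As a sketch of the "feed your pipeline" step, the excerpts from each result can be stitched into a single prompt-ready context block. The `url`, `title`, and `excerpts` field names below are assumptions for illustration, not confirmed SDK attributes; adapt them to the actual response shape:

```python
def build_context(results: list[dict]) -> str:
    """Join per-URL excerpts into one prompt-ready context block.

    Assumes each result is a dict with 'url', 'title', and 'excerpts'
    keys (an assumption, not the confirmed SDK schema).
    """
    sections = []
    for result in results:
        header = f"Source: {result['title']} ({result['url']})"
        body = "\n".join(result["excerpts"])
        sections.append(f"{header}\n{body}")
    return "\n\n".join(sections)


# Hypothetical result shaped like the example request above.
sample = [
    {
        "url": "https://www.un.org/en/about-us/history-of-the-un",
        "title": "History of the UN",
        "excerpts": ["The United Nations officially came into existence in 1945."],
    }
]
print(build_context(sample))
```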

## FAQ
**What kinds of pages can it handle?** Any public URL, including JavaScript-rendered single-page apps, dynamic content, and PDFs.

**How do objectives work?** Describe what you need in plain language. Extract returns only the relevant portions, not the entire page.

**Can I get the full page instead of excerpts?** Yes. Set `full_content: true` for complete page markdown. You can enable both excerpts and full content in the same request.

**How does caching work?** Dynamic caching is applied by default, based on content type and objective. Override with `fetch_policy` to force live fetches or to accept older cached content.
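As an illustration of overriding the cache, a request could carry a `fetch_policy` value alongside the usual arguments. The parameter name comes from the answer above, but the `"always_live"` value used here is a hypothetical placeholder, not a documented enum:

```python
def extract_kwargs(urls: list[str], objective: str, *, force_live: bool = False) -> dict:
    """Build keyword arguments for an Extract call.

    'fetch_policy' is named in the docs; the 'always_live' value is
    illustrative only — check the API reference for accepted values.
    """
    kwargs = {
        "urls": urls,
        "objective": objective,
        "excerpts": True,
    }
    if force_live:
        kwargs["fetch_policy"] = "always_live"  # hypothetical value
    return kwargs


kwargs = extract_kwargs(
    ["https://www.un.org/en/about-us/history-of-the-un"],
    "When was the United Nations established?",
    force_live=True,
)
```

These keyword arguments would then be passed to `client.beta.extract(**kwargs)`.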
**Is this just a web scraper?** Similar, but smarter. Traditional scrapers return raw HTML that requires custom parsing per site. Extract handles the complexities of modern websites end to end and returns content optimized for agent consumption.

**Is Extract a crawler?** No. Crawling discovers pages by following links across sites; Extract retrieves content from URLs you specify. Parallel maintains a crawl index (enabling fast cached responses), but Extract itself is for targeted extraction, not discovery. Use the Search API to find pages first.

**What's the difference between crawling and scraping?** Crawling = discovery (what pages exist). Scraping = extraction (what's on this page). Most pipelines need both: the Search API handles discovery, and Extract handles extraction.

**How does Extract compare to Puppeteer or Playwright?** Those are browser automation libraries that require you to manage infrastructure and parsing. Extract is a managed API: send URLs, get markdown. Use Puppeteer or Playwright when you need interaction beyond extraction (forms, screenshots, testing).

**Can I extract multiple URLs at once?** Yes: 10 URLs per request and 600 requests per minute in beta, with higher limits available for production. Batching is particularly efficient for AI workloads because Extract can be instructed to return only relevant content, not entire pages.
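To stay within the 10-URLs-per-request beta limit, a client can batch a larger URL list before calling Extract. A minimal sketch (the `example.com` URLs are placeholders):

```python
MAX_URLS_PER_REQUEST = 10  # beta limit from the docs


def chunk_urls(urls: list[str], size: int = MAX_URLS_PER_REQUEST) -> list[list[str]]:
    """Split a URL list into request-sized batches."""
    return [urls[i : i + size] for i in range(0, len(urls), size)]


batches = chunk_urls([f"https://example.com/page/{n}" for n in range(25)])
# 25 URLs -> batches of 10, 10, and 5
```

Each batch would then be sent as its own `client.beta.extract(urls=batch, ...)` call, subject to the 600 requests/minute rate limit.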
**Does it work on PDFs?** Yes. Same objective-driven extraction, same markdown output.