# Extract API: Turn web pages into AI-friendly context

## Extract full or excerpted text contents from any public page with ease

**Web content, optimized for agents.** Declare an objective, get focused excerpts from a target URL. Need everything? Get the full page converted to markdown.

**The perfect complement to Parallel Search.** Use Search to find the best pages, and Extract to get fresh and full content.
## Features

- **Objective-driven excerpts:** Describe what you're looking for in plain language, and the API returns excerpts aligned to your specific goal.
- **Priced for scale.**
# From web page to context window with a single tool call
Directly integrate the Extract API, or use the Web_Fetch Tool in the [Parallel Search MCP Server](https://docs.parallel.ai/integrations/mcp/search-mcp).
```python
import os

from parallel import Parallel

client = Parallel(api_key=os.environ["PARALLEL_API_KEY"])

extract = client.beta.extract(
    urls=["https://www.un.org/en/about-us/history-of-the-un"],
    objective="When was the United Nations established?",
    excerpts=True,
    full_content=False,
)

print(extract.results)
```

- Get structured results: The API returns clean markdown with excerpts focused on your objective, plus metadata like publish dates and page titles.
- Feed your pipeline: Output is optimized for LLM consumption: no HTML cleanup, boilerplate removal, or post-processing required.
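As a sketch of the "feed your pipeline" step, the excerpts from each result can be stitched into a single prompt-ready context block. The `url`, `title`, and `excerpts` field names below are assumptions for illustration, not confirmed SDK attributes; adapt them to the actual response shape:

```python
def build_context(results: list[dict]) -> str:
    """Join per-URL excerpts into one prompt-ready context block.

    Assumes each result is a dict with 'url', 'title', and 'excerpts'
    keys (an assumption, not the confirmed SDK schema).
    """
    sections = []
    for result in results:
        header = f"Source: {result['title']} ({result['url']})"
        body = "\n".join(result["excerpts"])
        sections.append(f"{header}\n{body}")
    return "\n\n".join(sections)


# Hypothetical result shaped like the example request above.
sample = [
    {
        "url": "https://www.un.org/en/about-us/history-of-the-un",
        "title": "History of the UN",
        "excerpts": ["The United Nations officially came into existence in 1945."],
    }
]
print(build_context(sample))
```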

## FAQ
**What kinds of pages can it handle?** Any public URL, including JavaScript-rendered single-page apps, dynamic content, and PDFs.

**How do objectives work?** Describe what you need in plain language. Extract returns only the relevant portions, not the entire page.

**Can I get the full page instead of excerpts?** Yes. Set `full_content: true` for complete page markdown. You can enable both excerpts and full content in the same request.

**How does caching work?** Dynamic caching is applied by default, based on content type and objective. Override with `fetch_policy` to force live fetches or to accept older cached content.
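As an illustration of overriding the cache, a request could carry a `fetch_policy` value alongside the usual arguments. The parameter name comes from the answer above, but the `"always_live"` value used here is a hypothetical placeholder, not a documented enum:

```python
def extract_kwargs(urls: list[str], objective: str, *, force_live: bool = False) -> dict:
    """Build keyword arguments for an Extract call.

    'fetch_policy' is named in the docs; the 'always_live' value is
    illustrative only — check the API reference for accepted values.
    """
    kwargs = {
        "urls": urls,
        "objective": objective,
        "excerpts": True,
    }
    if force_live:
        kwargs["fetch_policy"] = "always_live"  # hypothetical value
    return kwargs


kwargs = extract_kwargs(
    ["https://www.un.org/en/about-us/history-of-the-un"],
    "When was the United Nations established?",
    force_live=True,
)
```

These keyword arguments would then be passed to `client.beta.extract(**kwargs)`.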
**Is this just a web scraper?** Similar, but smarter. Traditional scrapers return raw HTML that requires custom parsing per site. Extract handles the complexities of modern websites end to end and returns content optimized for agent consumption.

**Is Extract a crawler?** No. Crawling discovers pages by following links across sites; Extract retrieves content from URLs you specify. Parallel maintains a crawl index (enabling fast cached responses), but Extract itself is for targeted extraction, not discovery. Use the Search API to find pages first.

**What's the difference between crawling and scraping?** Crawling = discovery (what pages exist). Scraping = extraction (what's on this page). Most pipelines need both: the Search API handles discovery, and Extract handles extraction.

**How does Extract compare to Puppeteer or Playwright?** Those are browser automation libraries that require you to manage infrastructure and parsing. Extract is a managed API: send URLs, get markdown. Use Puppeteer or Playwright when you need interaction beyond extraction (forms, screenshots, testing).

**Can I extract multiple URLs at once?** Yes: 10 URLs per request and 600 requests per minute in beta, with higher limits available for production. Batching is particularly efficient for AI workloads because Extract can be instructed to return only relevant content, not entire pages.
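To stay within the 10-URLs-per-request beta limit, a client can batch a larger URL list before calling Extract. A minimal sketch (the `example.com` URLs are placeholders):

```python
MAX_URLS_PER_REQUEST = 10  # beta limit from the docs


def chunk_urls(urls: list[str], size: int = MAX_URLS_PER_REQUEST) -> list[list[str]]:
    """Split a URL list into request-sized batches."""
    return [urls[i : i + size] for i in range(0, len(urls), size)]


batches = chunk_urls([f"https://example.com/page/{n}" for n in range(25)])
# 25 URLs -> batches of 10, 10, and 5
```

Each batch would then be sent as its own `client.beta.extract(urls=batch, ...)` call, subject to the 600 requests/minute rate limit.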
**Does it work on PDFs?** Yes. Same objective-driven extraction, same markdown output.