
# Introducing LLMTEXT, an open source toolkit for the llms.txt standard

TL;DR: We're launching LLMTEXT, an open source toolkit that helps developers create, validate, and use llms.txt files—making any website instantly accessible to AI agents through standardized markdown documentation and MCP servers.

Tags: Product Release
Reading time: 7 min

## The web wasn't built for AI agents.

Wikipedia [recently reported](https://flowingdata.com/2025/10/27/wikipedia-losing-human-views-to-ai-summaries/#:~:text=Topic,and%20a%20lot%20of%20bots) an 8% decline in human visitors, and [AI has overtaken humans](https://cpl.thalesgroup.com/blog/access-management/ai-bots-internet-traffic-imperva-2025-report) as the primary user of the web this year. As LLMs increasingly become the primary way people interact with online information, websites face a critical challenge: how do you serve both human visitors and AI agents effectively?

Today, we're proud to support the launch of [LLMTEXT](http://llmtext.com), an open source toolkit by [Jan Wilmake](https://x.com/janwilmake) to help grow the llms.txt standard. With these tools, developers can more easily create llms.txt files for their websites, check their website's existing llms.txt for validity, or turn any existing llms.txt into a dedicated MCP (Model Context Protocol) server.

## What is LLMTEXT, and why did we make it?

[Jeremy Howard](https://x.com/jeremyphoward) introduced the [llms.txt standard](https://llmstxt.org) to make websites friendlier to large language models (LLMs) by giving them Markdown files that contain the site's most important text, explicitly excluding the distracting elements that would otherwise fill up their context windows. The spec has already been adopted by companies like Anthropic, Cloudflare, Docker, HubSpot, and many others.
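For readers new to the format, a minimal llms.txt looks something like this (a hypothetical example; the titles and URLs are illustrative, but the shape follows the spec: an H1 name, a blockquote summary, and H2 sections of annotated links):

```markdown
# Example Project

> One-sentence summary of what the project does and who it is for.

## Docs

- [Quickstart](https://example.com/quickstart.md): install and run in five minutes
- [API Reference](https://example.com/api.md): endpoints, parameters, and errors

## Optional

- [Changelog](https://example.com/changelog.md): release history
```

Each link points to a plain markdown rendition of the page, so an LLM can read the overview cheaply and fetch only the documents it needs.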

At Parallel, we believe that AIs will soon be the primary users of the web, which is why we support initiatives like llms.txt that introduce new standards and frameworks to embrace that future.

## What tools are available on LLMTEXT?

The three LLMTEXT tools we're releasing today serve two purposes. First, the **llms.txt MCP** helps developers use projects without hallucination by providing a dedicated MCP server for every library or API they use that supports llms.txt. Second, the **Check tool** and **Create tool** help websites serve their users the best possible experience. Let's dive into each tool.

## llms.txt MCP tool

The **llms.txt MCP** turns any public llms.txt into a dedicated MCP server. You can think of this like Context7, but instead of one MCP server for all docs, it's a narrowly focused MCP server for a single website, making it easier to get the right context for products you use often. It also works fundamentally differently: where Context7 uses vector search to determine what's relevant, the **llms.txt MCP** leverages the reasoning of the LLM itself, which reads the llms.txt overview and decides which documents to ingest into the context window.

An LLMTEXT MCP being used in Cursor

Build a terminal-style chat app using Parallel's llms.txt
![Build a terminal-style chat app using Parallel's llms.txt](https://cdn.sanity.io/images/5hzduz3y/production/9963e0b2f7bfa3c72c4e5a9a2a2593029cc41e52-1276x1428.png)

Many developers have already been using llms.txt, or the markdown files it links to, by manually copying them into their context window. The llms.txt MCP smooths this process with one-click installation and clear instructions to the LLM on how to ingest the right context when needed. The MCP exposes two tools:

  1. **get** - explains what this llms.txt is about; it first retrieves the llms.txt itself, and can then be called again to retrieve multiple relevant documents for any task
  2. **leaderboard** - shows the most active users of the MCP and other insights

The **llms.txt MCP** can be installed for any llms.txt that follows [the standard](https://llmstxt.org).
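As an illustration, wiring such a server into an MCP-capable client usually amounts to a small config entry. The exact shape varies by client, and the server name below is hypothetical; the URL follows the pattern used by Parallel's own server:

```json
{
  "mcpServers": {
    "parallel-llmtext": {
      "url": "https://mcp.llmtext.com/parallel.ai/mcp"
    }
  }
}
```

From there, the client's LLM can call **get** whenever it needs docs context for that site.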

## Check tool

While building the llms.txt MCP and trying it out on [some of the available llms.txt files](https://github.com/thedaviddias/llms-txt-hub), we found that many of them were invalid according to the [prescribed llms.txt format](https://llmstxt.org/) for various reasons.

An example of validation for Docker's llms.txt

![](https://cdn.sanity.io/images/5hzduz3y/production/862b6db697a44d5b5007850367bf503d74393f79-2098x1658.png)

The goal of your llms.txt should be to give LLMs the best possible overview: a table of contents that tells them where to look for the right information. Or, as our Co-founder and Head of Product, [Travers](https://x.com/travers00/status/1975947045497344162), puts it, the goal is to retrieve the tokens that agents need to answer or make the next best decision in a loop. This means you should have clear, distinct titles and descriptions for pages, and the individual pages shouldn't be too long.

To incentivize companies to fix their llms.txt and to provide only the best quality to our users, [LLMTEXT](http://llmtext.com) only allows installing MCP servers that adhere fully to the spec. Here are the most common mistakes we found in llms.txt files, with examples from popular websites:

### Document Size

To get the most out of llms.txt, documents should be token-efficient. For example, [https://developers.cloudflare.com/llms.txt](https://developers.cloudflare.com/llms.txt) is 36,000 tokens for just the table of contents, setting a very large floor on context usage.

Another example is [https://docs.cursor.com/llms.txt](https://docs.cursor.com/llms.txt), which links to versions in several languages. This isn't succinct and creates unnecessary overhead for an LLM that already knows most languages.

To make token usage efficient when wading through context, the llms.txt itself should ideally be no bigger than the pages it links to. If it is, it becomes a significant addition to the context window every time you want to retrieve a piece of information.

Another example is [https://supabase.com/llms.txt](https://supabase.com/llms.txt), where the first linked document contains approximately 800,000 tokens, far too large for most LLMs to process. As a rule of thumb, we recommend keeping both the llms.txt and all linked documents under 10,000 tokens each.
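That budget is easy to check mechanically. A minimal sketch, using the common ~4 characters-per-token heuristic (an assumption; real tokenizers vary, so treat the result as an estimate):

```python
# Rough token budget check for llms.txt documents, using the common
# ~4 characters-per-token heuristic as an approximation.
TOKEN_BUDGET = 10_000

def estimate_tokens(text: str) -> int:
    """Approximate token count; real tokenizers will differ somewhat."""
    return len(text) // 4

def within_budget(text: str, budget: int = TOKEN_BUDGET) -> bool:
    """True if the document fits the recommended per-document budget."""
    return estimate_tokens(text) <= budget

# Example: a ~36,000-token table of contents blows the budget.
too_big = "x" * (36_000 * 4)
print(within_budget(too_big))  # False
```

Running this over the llms.txt and every linked document gives a quick first-pass health check before publishing.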

### Incorrect content-type

The llms.txt itself, as well as the links it refers to, must lead to a text/markdown or text/plain response. This is the most common mistake in llms.txt files today.

For example, [https://www.bitcoin.com/llms.txt](https://www.bitcoin.com/llms.txt) and [https://docs.docker.com/llms.txt](https://docs.docker.com/llms.txt) both return HTML for every document they link to, and while listed in some registries, [https://elevenlabs.io/llms.txt](https://elevenlabs.io/llms.txt) itself responds with an HTML document.

In other cases, the content-type is text/plain or text/markdown, yet the file can't be parsed according to [the spec](https://llmstxt.org/). For example, [https://cursor.com/llms.txt](https://cursor.com/llms.txt) lists raw URLs without markdown link formatting, [https://console.groq.com/llms.txt](https://console.groq.com/llms.txt) does not group its links under h2 markdown sections (##), and [https://lmstudio.ai/llms.txt](https://lmstudio.ai/llms.txt) returns all documents concatenated directly.
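These format rules are also straightforward to lint. Below is a minimal sketch of the two checks just mentioned (markdown link syntax, and links grouped under `##` sections); it is an illustrative validator, not the Check tool itself, and it covers only a fraction of the spec:

```python
import re

# A spec-shaped link line: "- [title](url)"
LINK_RE = re.compile(r"^- \[[^\]]+\]\(https?://[^)]+\)")

def lint_llms_txt(text: str) -> list[str]:
    """Flag bare URLs and link lines that appear outside an H2 (##) section."""
    problems = []
    in_section = False
    for n, line in enumerate(text.splitlines(), 1):
        if line.startswith("## "):
            in_section = True
        elif line.startswith("http"):
            problems.append(f"line {n}: bare URL, expected '- [title](url)'")
        elif LINK_RE.match(line) and not in_section:
            problems.append(f"line {n}: link outside an H2 section")
    return problems

good = "# Site\n\n## Docs\n\n- [Guide](https://example.com/guide.md)\n"
bad = "# Site\n\nhttps://example.com/guide\n"
print(lint_llms_txt(good))  # []
print(lint_llms_txt(bad))   # ["line 3: bare URL, expected '- [title](url)'"]
```

A full validator would also fetch each linked URL and verify its content-type header, which requires network access and is omitted here.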

### Not served at the root

Many companies also don't serve their llms.txt at the root of their domain. For example, [https://www.mintlify.com/docs/llms.txt](https://www.mintlify.com/docs/llms.txt) and [https://nextjs.org/docs/llms.txt](https://nextjs.org/docs/llms.txt) are hosted under a subpath, making them hard to find programmatically.

## Create tool

Most websites aren't adapted to the AI internet yet, and instead serve HTML content intended for humans. Most CMS systems don't support creating Markdown versions either. There are several llms.txt generators (hosted services as well as libraries) on the internet, but many are specific to a certain framework, and many don't actually follow the llms.txt spec.

For example, some tools create the llms.txt file itself but don't link to plain text or markdown variants of the pages.

Parallel's llms.txt, created with the create tool from LLMTEXT

![](https://cdn.sanity.io/images/5hzduz3y/production/22495cdfdcdd5ed45f81cd2c7e48f32236888d62-3850x2218.png)

The [extract-from-sitemap](https://github.com/janwilmake/llmtext-mcp/tree) tool is a framework-agnostic way to generate an llms.txt from multiple sources. It scrapes all the needed pages and turns them into markdown, powered by the new [Parallel Extract API](https://docs.parallel.ai/api-reference/search-and-extract-api-beta/extract) (beta). We used this library to create [our own llms.txt](https://parallel.ai/llms.txt), which is also available through [this repo](https://github.com/parallel-web/parallel-llmtext) as a reference, and [installable as an MCP](https://installthismcp.com/parallel-llmtext-mcp?url=https://mcp.llmtext.com/parallel.ai/mcp) for those building with Parallel's APIs.
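The general shape of that pipeline can be sketched in a few lines: parse the sitemap, then emit an llms.txt skeleton whose entries still need real titles, descriptions, and markdown renditions of each page. This is a self-contained sketch over an inline sitemap, not the actual tool, which handles the scraping and markdown conversion for you:

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "{http://www.sitemaps.org/schemas/sitemap/0.9}"

def sitemap_urls(sitemap_xml: str) -> list[str]:
    """Extract <loc> URLs from a standard sitemap document."""
    root = ET.fromstring(sitemap_xml)
    return [loc.text for loc in root.iter(f"{SITEMAP_NS}loc")]

def llms_txt_skeleton(site_name: str, urls: list[str]) -> str:
    """Emit a spec-shaped llms.txt; titles and descriptions still need filling in."""
    lines = [f"# {site_name}", "", "## Pages", ""]
    # Use the last path segment as a placeholder title for each page.
    lines += [f"- [{url.rstrip('/').rsplit('/', 1)[-1]}]({url})" for url in urls]
    return "\n".join(lines) + "\n"

sitemap = """<?xml version="1.0"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://example.com/docs/quickstart</loc></url>
  <url><loc>https://example.com/docs/api</loc></url>
</urlset>"""
print(llms_txt_skeleton("Example", sitemap_urls(sitemap)))
```

The real tool replaces the placeholder titles with scraped page titles and points each link at a markdown variant of the page.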

## Plans for LLMTEXT

This is just the beginning. We hope that the llms.txt standard thrives and evolves into an even more valuable standard with many use cases. We've already started improving the tooling and adding utilities, and hope to see the open source community contribute as well.

## About Jan Wilmake

Jan has been an active OSS developer building dev tools in the AI context management space. His work includes [uithub.com](http://uithub.com), a context ingestion tool for GitHub, and [openapi-mcp-server](https://github.com/janwilmake/openapi-mcp-server), which lets you ingest just the parts of an API specification covering the operations you're interested in, following a pattern very similar to how the llms.txt MCP works.

## About Parallel Web Systems

Parallel develops critical web search infrastructure for AI. Our suite of web search and agent APIs is built on a rapidly growing proprietary index of the global internet. These solutions transform human tasks that previously took weeks into agentic tasks that now take just minutes.

Fortune 100 companies use Parallel's search and agent APIs in insurance, finance, and retail, as do AI-first businesses like Clay, Starbridge, and Sourcegraph.

By Parallel

October 30, 2025
