
# A new Pareto frontier for Deep Research price-performance

Expanded results that demonstrate Parallel's complete price-performance advantage in Deep Research.

Tags: Benchmarks
Reading time: 4 min

We [previously released benchmarks](https://parallel.ai/blog/introducing-parallel) for Parallel Deep Research that demonstrated superior accuracy and win rates against leading AI models. Today, we're publishing expanded results that showcase our complete price-performance advantage: the highest accuracy at every price point.

## **Parallel leads in accuracy at every price point**

We evaluated Parallel against all available deep research APIs on two industry-standard benchmarks. Our processors consistently deliver the highest accuracy at each price tier.

### **BrowseComp Benchmark**

OpenAI's BrowseComp tests deep research capabilities through 1,266 complex questions requiring multi-hop reasoning, creative search strategies, and synthesis across scattered sources.

*Chart: BrowseComp accuracy (%) vs. cost (CPM) for Parallel processors and competing deep research APIs. CPM: USD per 1,000 requests; cost is shown on a log scale. The underlying data is in the table below.*

### About the benchmark

This [benchmark](https://openai.com/index/browsecomp/), created by OpenAI, contains 1,266 questions requiring multi-hop reasoning, creative search formulation, and synthesis of contextual clues across time periods. Results are reported on a random sample of 100 questions from this benchmark.

### Methodology

- Dates: All measurements were made between 08/11/2025 and 08/29/2025.
- Configurations: For all competitors, we report the highest numbers we were able to achieve across multiple configurations of their APIs. The exact configurations are below.
  - GPT-5: high reasoning, high search context, default verbosity
  - Exa: Exa Research Pro
  - Anthropic: Claude Opus 4.1
  - Perplexity: Sonar Deep Research, high reasoning effort

### BrowseComp results

| Series    | Model      | Cost (CPM) | Accuracy  (%) |
| --------- | ---------- | ---------- | ------------- |
| Parallel  | Pro        | 100        | 34            |
| Parallel  | Ultra      | 300        | 45            |
| Parallel  | Ultra2x    | 600        | 51            |
| Parallel  | Ultra4x    | 1200       | 56            |
| Parallel  | Ultra8x    | 2400       | 58            |
| Others    | GPT-5      | 488        | 38            |
| Others    | Anthropic  | 5194       | 7             |
| Others    | Exa        | 402        | 14            |
| Others    | Perplexity | 709        | 6             |

CPM: USD per 1,000 requests.


Our results demonstrate clear price-performance leadership: our Ultra processor achieves 45% accuracy at $300 CPM, up to 17X lower cost than alternatives. Our newly available high-compute processors push accuracy even further for critical research tasks, with Ultra8x reaching 58%.


### **DeepResearch Bench**

DeepResearch Bench evaluates the quality of long-form deep research reports across 22 fields including Business & Finance, Science & Technology, and Software Development. The benchmark consists of 100 PhD-level tasks and assesses the multistep web exploration, targeted retrieval, and higher-order synthesis capabilities of deep research agents.

*Chart: DeepResearch Bench win rate vs. reference (%) against cost (CPM) for Parallel processors and competing deep research APIs. CPM: USD per 1,000 requests; cost is shown on a log scale. The underlying data is in the table below.*

### About the benchmark

This [benchmark](https://github.com/Ayanami0730/deep_research_bench) contains 100 expert-level research tasks designed by domain specialists across 22 fields, primarily Science & Technology, Business & Finance, and Software Development. It evaluates AI systems' ability to produce rigorous, long-form research reports on complex topics requiring cross-disciplinary synthesis. Results are reported from the subset of 50 English-language tasks in the benchmark.

### Methodology

- Dates: All measurements were made between 08/11/2025 and 08/29/2025.
- Win Rate: Calculated by comparing [RACE](https://github.com/Ayanami0730/deep_research_bench) scores in direct head-to-head evaluations against reference reports.
- Configurations: For all competitors, we report the highest numbers we were able to achieve across multiple configurations of their APIs. The exact GPT-5 configuration is high reasoning, high search context, and high verbosity.
- Excluded API Results: Exa Research Pro (0% win rate), Claude Opus 4.1 (0% win rate).

### DeepResearch Bench results

| Series   | Model      | Cost (CPM) | Win Rate vs Reference (%) |
| -------- | ---------- | ---------- | ------------------------- |
| Parallel | Ultra      | 300        | 82                        |
| Parallel | Ultra2x    | 600        | 86                        |
| Parallel | Ultra4x    | 1200       | 92                        |
| Parallel | Ultra8x    | 2400       | 96                        |
| Others   | GPT-5      | 628        | 66                        |
| Others   | O3 Pro     | 4331       | 30                        |
| Others   | O3         | 605        | 26                        |
| Others   | Perplexity | 538        | 6                         |

CPM: USD per 1,000 requests.


Parallel Ultra achieves an 82% win rate against reference reports at $300 CPM, compared to GPT-5's 66% win rate at $628 CPM, delivering superior quality at half the cost. Our highest compute processor, Ultra8x, reaches a 96% win rate, a significant improvement over our previously published 82% result.

We also measured win rate against GPT-5 directly by comparing the RACE scores of Parallel processors vs GPT-5. The results demonstrate that Ultra8x achieves an 88% win rate against GPT-5.


### DeepResearch Bench against GPT-5

| Category | Win Rate (%) |
| -------- | ------------ |
| Ultra8x  | 88           |
| Ultra4x  | 84           |
| Ultra2x  | 80           |
| Ultra    | 74           |

## **Beyond benchmarks: Flexible outputs, fully verifiable**

These benchmark results translate directly to production value. Parallel Deep Research delivers the same high accuracy in whichever format you need - human-readable reports for strategic analysis or structured JSON for machine consumption and database ingestion.

Every output, regardless of format, includes our comprehensive Basis framework:

- **Citations**: Direct links to source materials
- **Reasoning**: Explanations for each finding
- **Confidence**: Calibrated scores (low/medium/high) for intelligent routing
- **Excerpts**: Relevant text snippets from cited sources
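To make these fields concrete, here is a sketch of what a verified result might look like and how the calibrated confidence score can drive routing. The field names and shape below are illustrative assumptions for this post, not the exact API schema; consult the Task API docs for the real one.

```python
# Illustrative shape of a Deep Research result with its Basis fields.
# Field names are assumptions for illustration, not the exact API schema.
result = {
    "output": {"ceo_name": "Jane Doe"},
    "basis": [
        {
            "field": "ceo_name",
            "citations": [
                {"url": "https://example.com/leadership", "excerpts": ["Jane Doe, CEO"]}
            ],
            "reasoning": "Named as CEO on the company's leadership page.",
            "confidence": "high",
        }
    ],
}

def route(field_basis):
    """Send low-confidence findings to human review; pass the rest through."""
    return "auto" if field_basis["confidence"] in ("medium", "high") else "review"

print(route(result["basis"][0]))  # -> auto
```

Intelligent routing of this kind is what the calibrated low/medium/high scores are for: findings below your confidence threshold can be escalated rather than silently ingested.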

This complete verification layer means the accuracy demonstrated in our benchmarks comes with the auditability and transparency required for production workflows where every detail matters.

## **Built for scale: 1000x more research, predictably priced**

Our price-performance advantage unlocks new possibilities. At these price points, you can run 1000x the number of queries compared to token-based alternatives - transforming deep research from an occasional tool to core infrastructure.

Consider the possibilities:

- **Build research databases**: Run thousands of queries, store structured results, and query them downstream
- **Continuous intelligence**: Monitor competitors, [markets](https://github.com/parallel-web/parallel-cookbook/blob/main/python-recipes/Deep_Research_Recipe.ipynb), and trends with daily deep research updates
- **Pipeline integration**: Use research outputs as inputs for downstream analysis, decision-making, or automation
- **Parallel processing**: Research hundreds of entities simultaneously for large-scale enrichment

Our per-query pricing model ensures complete cost predictability. Unlike token-based systems where a single complex query can unexpectedly consume your budget, every Parallel query costs exactly what you expect. This predictability enables confident scaling - whether you're running 10 queries or 10,000.
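Under per-query pricing, the cost of a batch is a single multiplication. The sketch below uses the CPM figures from the benchmark tables above:

```python
# CPM figures (USD per 1,000 requests) from the benchmark tables above.
CPM = {"pro": 100, "ultra": 300, "ultra2x": 600, "ultra4x": 1200, "ultra8x": 2400}

def batch_cost_usd(processor: str, n_queries: int) -> float:
    """Exact spend for a batch: cost scales linearly with query count,
    independent of how many tokens each individual query happens to consume."""
    return CPM[processor] / 1000 * n_queries

print(batch_cost_usd("ultra", 10))      # -> 3.0
print(batch_cost_usd("ultra", 10_000))  # -> 3000.0
```

There is no per-query variance to budget for: 10,000 Ultra queries cost exactly 1,000 times what 10 cost.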

## **Start building with Deep Research**

Parallel Deep Research is available today through our Task API. Choose the processor that matches your accuracy and budget requirements, from Pro for simpler deep research to Ultra8x for the most demanding deep research tasks.

Get started in our [Developer Platform](https://platform.parallel.ai/) or explore our [documentation](https://docs.parallel.ai/task-api/features/task-deep-research).
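As a minimal sketch, a deep research run amounts to choosing a processor and describing the output you want. The request shape below is an assumption based on the docs linked above, not a verified schema, so check the Task API documentation for exact field names:

```python
# Hypothetical Task API request body; field names are illustrative
# assumptions, not a verified schema -- consult the Task API docs.
task_request = {
    "processor": "ultra",  # pro / ultra / ultra2x / ultra4x / ultra8x
    "input": "Map the competitive landscape for solid-state batteries.",
    "task_spec": {
        # Ask for structured JSON or a long-form report, as needed.
        "output_schema": "A markdown report with cited sources",
    },
}
```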

## **Notes on Methodology**

_Benchmark Dates_: Benchmarks were run from Aug 11 to Aug 29, 2025.

_DeepResearch Bench Evaluation_: We evaluated all available deep research API solutions on the 50 English-language tasks in the benchmark, measuring both RACE and FACT scores for generated reports. Given that RACE is a relative scoring metric benchmarked against reference materials, we calculated win rates by comparing each vendor's performance to the human reference reports included in the dataset. A candidate report achieves a "win" when its RACE score exceeds that of the corresponding human reference report.
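The win-rate computation described above reduces to a per-task comparison of RACE scores; a minimal sketch with toy numbers:

```python
def win_rate(candidate_scores, reference_scores):
    """Percentage of tasks where the candidate report's RACE score
    exceeds the human reference report's score (a 'win')."""
    pairs = list(zip(candidate_scores, reference_scores))
    wins = sum(c > r for c, r in pairs)
    return 100 * wins / len(pairs)

# Toy scores: the candidate beats the reference on 3 of 4 tasks.
print(win_rate([0.9, 0.4, 0.8, 0.7], [0.5, 0.6, 0.5, 0.5]))  # -> 75.0
```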

_BrowseComp Evaluation_: For the BrowseComp benchmark, we tested our processors alongside other APIs on a random 100-question subset of the original 1,266-question dataset. All systems were evaluated using the same standard LLM evaluator with consistent evaluation criteria, comparing agent responses against verified ground truth answers.

_Cost Calculation_: Token-based pricing is normalized to cost per thousand queries (CPM) based on actual usage in benchmarks.
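Normalizing token-based pricing to CPM amounts to totaling the observed token spend for a run and scaling it to 1,000 requests; a sketch with illustrative, assumed prices (not any vendor's real rates):

```python
def tokens_to_cpm(input_tokens, output_tokens, usd_per_m_in, usd_per_m_out, n_requests):
    """Convert observed token usage over a benchmark run into
    USD per 1,000 requests (CPM)."""
    total_usd = (input_tokens * usd_per_m_in + output_tokens * usd_per_m_out) / 1_000_000
    return total_usd / n_requests * 1000

# Illustrative numbers only: 10M input + 2M output tokens over
# 100 requests, at assumed prices of $2 / $8 per million tokens.
print(tokens_to_cpm(10_000_000, 2_000_000, 2.0, 8.0, 100))  # -> 360.0
```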



By Parallel

September 9, 2025
