GPT-5 Released: OpenAI's New Model in 2025—Marginal Gains, Major Scale
Explore the key highlights of OpenAI's GPT-5 launch in August 2025: reduced hallucinations, strategic optimizations, benchmark scores, parameter and dataset estimates, and how it compares to Gemini 2.5 and Claude Opus. See what the new system card reveals, and what's next in the AI race.
OpenAI has launched GPT-5, three years after GPT-4 first reached labs in August 2022. The new model arrives with high expectations, but the debut is defined by marginal architectural gains and a strategic focus on serving scale and optimizing inference cost. Here’s a breakdown of what matters from the latest info, documentation, and early signals.
GPT-5: What’s New?
Despite the hype, GPT-5 brings incremental improvements rather than a revolution:
Reduced hallucinations: Up to 90% lower than o3 and 26–65% lower than GPT-4o, depending on the “thinking” mode.
Better instruction following and minimized sycophancy, according to OpenAI’s release notes.
Automatic system integration: GPT-5 is not a single fixed model but acts as an integrated system, switching between reasoning modes and submodels depending on the task.
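To make that last point concrete, here is a minimal, purely illustrative sketch of router-style dispatch between a fast “main” path and a slower “thinking” path. The heuristic, the `route_request` function, and the use of the system-card variant names `gpt-5-main` and `gpt-5-thinking` are assumptions for illustration, not OpenAI’s actual implementation.

```python
from typing import Optional

# Illustrative-only sketch of router-style dispatch: a dispatcher in front of
# the submodels decides whether a request takes the fast "main" path or the
# slower "thinking" path. The heuristic below is a toy stand-in for whatever
# learned signal the real system uses.

def route_request(prompt: str, needs_deep_reasoning: Optional[bool] = None) -> str:
    """Pick a GPT-5 variant (system-card naming) for a request."""
    if needs_deep_reasoning is None:
        # Toy proxy: long or explicitly multi-step prompts escalate to reasoning.
        needs_deep_reasoning = (
            len(prompt.split()) > 200 or "step by step" in prompt.lower()
        )
    return "gpt-5-thinking" if needs_deep_reasoning else "gpt-5-main"

print(route_request("Summarize this announcement in two sentences."))  # gpt-5-main
print(route_request("Work through the proof step by step."))           # gpt-5-thinking
```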
Knowledge Cutoff & Comparisons
- GPT-5: Knowledge cutoff Sep 2024 (11 months before release).
- Gemini 2.5 Pro: Jan 2025.
- Claude Opus 4.1: Mar 2025.
Marginal but Meaningful Improvements: Compared to prior versions, GPT-5 delivers better reliability—44% fewer major factual errors in “main” mode, and up to 78% fewer in “thinking” mode versus OpenAI o3.
System, Architecture, and Dataset
- Model system: Acts like a “composite” platform—integrating o3 and GPT-4o, reasoning when needed.
- Parameter count: Estimated ~300B parameters (Mixture-of-Experts), lower than earlier speculative estimates and closer in scale to Gemini 2.5 Pro. Anthropic’s latest models are still estimated to be larger, but at much greater cost.
- Training data: Estimated ~114T tokens seen during training, consistent with last year’s projections (70T tokens, 281TB, ~1.625 epochs over GPT-4’s dataset), since 70T × 1.625 ≈ 114T (see the quick check after this list).
- Users served: Now reaching around 800 million users globally, forcing further cost and inference optimizations.
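As a quick sanity check on those dataset figures (my arithmetic, not OpenAI’s), the ~114T tokens seen follows directly from ~70T unique tokens at ~1.625 epochs:

```python
# Back-of-the-envelope check of the dataset estimates cited above
# (70T unique tokens, 281TB of text, ~1.625 epochs, ~114T tokens seen).

unique_tokens = 70e12    # estimated unique training tokens
epochs = 1.625           # estimated passes over the dataset
dataset_bytes = 281e12   # estimated raw dataset size in bytes

tokens_seen = unique_tokens * epochs
bytes_per_token = dataset_bytes / unique_tokens

print(f"Tokens seen during training: {tokens_seen / 1e12:.1f}T")  # ~113.8T, i.e. ~114T
print(f"Implied bytes per token:     {bytes_per_token:.1f}")      # ~4.0, plausible for English text
```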
The system card maps previous models to GPT-5 variants as follows:

Previous Model | GPT-5 Model |
---|---|
GPT-4o | gpt-5-main |
GPT-4o-mini | gpt-5-main-mini |
OpenAI o3 | gpt-5-thinking |
OpenAI o4-mini | gpt-5-thinking-mini |
GPT-4.1-nano | gpt-5-thinking-nano |
OpenAI o3 Pro | gpt-5-thinking-pro |
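For teams planning a migration, the table above can be captured as a simple lookup. This is only a bookkeeping sketch based on the system-card names; whether these exact identifiers are what the public API exposes is not confirmed here.

```python
# Mapping taken directly from the table above (system-card naming).
# Illustrative bookkeeping only; not a confirmed list of public API model IDs.

GPT5_EQUIVALENTS = {
    "gpt-4o": "gpt-5-main",
    "gpt-4o-mini": "gpt-5-main-mini",
    "o3": "gpt-5-thinking",
    "o4-mini": "gpt-5-thinking-mini",
    "gpt-4.1-nano": "gpt-5-thinking-nano",
    "o3-pro": "gpt-5-thinking-pro",
}

def gpt5_equivalent(model: str) -> str:
    """Return the GPT-5 variant that replaces a previous-generation model."""
    return GPT5_EQUIVALENTS.get(model.lower(), model)  # fall back to the original name

print(gpt5_equivalent("GPT-4o"))  # gpt-5-main
```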
HLE Scores by Model Variant
Model Variant | HLE Score |
---|---|
gpt-5-thinking-pro | 42% |
gpt-5-main | 24.8% |
gpt-5-thinking | 20.2% |
gpt-5-main-mini | 16.7% |
gpt-5-thinking-mini | 14.7% |
gpt-5-thinking-nano | 8.7% |
⚠️ Note: The highest HLE scores for SOTA models are typically achieved by sophisticated, expensive multi-agent “thinking” configurations, which require significantly more computational resources and cost.
Source: GPT-5 system card, p. 4
Benchmarks and Scores
Metric | GPT-5 Main | GPT-5 Thinking | GPT-4o | o3 | Gemini 2.5 Pro | Claude Opus 4.1 |
---|---|---|---|---|---|---|
Hallucination Rate | — | -65% vs o3 | — | 86.7% | — | — |
GPQA | 89.4 | — | — | — | ~80 | ~75 |
HLE (Pro Mode) | 42 | — | — | — | ~21 | 10.7 |
GPT-5’s pro mode with tool use scored well on GPQA and Humanity’s Last Exam compared to competitors.
Size and Technical Guesswork
- Parameters: The central estimate is about 300B parameters in a Mixture-of-Experts configuration.
- Dataset size: Large-scale text plus multimodal data, with some sources noting images, video, and audio, as well as additional specialized datasets.
“Parameter count and general model size are no longer an indicator of performance, and the complexity of ‘thinking mode’ makes estimates speculative.”
Product, Community & What’s Next
- Official interfaces: Rolled out first to chatgpt.com (not yet globally), with Poe.com as a secondary option for early access.
- System Card & Docs: OpenAI published a new GPT-5 system card and livestreamed developer intros.
- Public benchmarks: Analysts note the marginal gains and call attention to OpenAI’s shift toward cost and inference efficiency at massive scale, rather than sheer capability leaps.
- Comparisons: Gemini 2.5 Pro and Claude Opus 4.1 remain competitive at the top, but GPT-5 shows reliability gains.
Links & Further Reading
- Official OpenAI GPT-5 announcement
- LifeArchitect.ai GPT-5 Analysis & Models Table
- ChatGPT Official Interface
- Poe.com GPT-5
GPT-5 looks set to power a new era of reliable, cost-efficient AI at scale, with fewer hallucinations and smarter instruction following, but few headline-grabbing breakthroughs over GPT-4o. As the global rollout continues and community analysis deepens, the focus will be on how much further reliability can be pushed, and how quickly competitors answer.