GPT-5 Released: OpenAI's New Model in 2025—Marginal Gains, Major Scale
Explore the key highlights of OpenAI's GPT-5 launch in August 2025: reduced hallucinations, strategic optimizations, benchmark scores, parameter and dataset estimates, and how it compares to Gemini 2.5 and Claude Opus. See what the new system card reveals, and what's next in the AI race.
OpenAI has launched GPT-5, three years after GPT-4 first reached labs in August 2022. The new model arrives with high expectations, but the debut is defined by marginal architectural gains and a strategic focus on serving scale and optimizing inference cost. Here’s a breakdown of what matters from the latest info, documentation, and early signals.
GPT-5: What’s New?
Despite the hype, GPT-5 brings incremental improvements rather than a revolution:
Reduced hallucinations: Up to 90% lower than o3 and 26–65% lower than GPT-4o, depending on the “thinking” mode.
Better instruction following and minimized sycophancy, according to OpenAI’s release notes.
Automatic system integration: GPT-5 is not a single fixed model but acts as an integrated system, switching between reasoning modes and submodels depending on the task.
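To make that last point concrete, here is a minimal, purely illustrative sketch of router-style dispatch between a fast “main” path and a slower “thinking” path. The heuristic, the `route_request` function, and the use of the system-card variant names `gpt-5-main` and `gpt-5-thinking` are assumptions for illustration, not OpenAI’s actual implementation.

```python
from typing import Optional

# Illustrative-only sketch of router-style dispatch: a dispatcher in front of
# the submodels decides whether a request takes the fast "main" path or the
# slower "thinking" path. The heuristic below is a toy stand-in for whatever
# learned signal the real system uses.

def route_request(prompt: str, needs_deep_reasoning: Optional[bool] = None) -> str:
    """Pick a GPT-5 variant (system-card naming) for a request."""
    if needs_deep_reasoning is None:
        # Toy proxy: long or explicitly multi-step prompts escalate to reasoning.
        needs_deep_reasoning = (
            len(prompt.split()) > 200 or "step by step" in prompt.lower()
        )
    return "gpt-5-thinking" if needs_deep_reasoning else "gpt-5-main"

print(route_request("Summarize this announcement in two sentences."))  # gpt-5-main
print(route_request("Work through the proof step by step."))           # gpt-5-thinking
```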
Knowledge Cutoff & Comparisons
- GPT-5: Knowledge cutoff Sep 2024 (11 months before release).
- Gemini 2.5 Pro: Jan 2025.
- Claude Opus 4.1: Mar 2025.
Marginal but Meaningful Improvements: Compared to prior versions, GPT-5 delivers better reliability—44% fewer major factual errors in “main” mode, and up to 78% fewer in “thinking” mode versus OpenAI o3.
System, Architecture, and Dataset
- Model system: Acts like a “composite” platform—integrating o3 and GPT-4o, reasoning when needed.
- Parameter count: Estimated ~300B parameters (Mixture-of-Experts), lower than earlier speculative estimates and closer in scale to Gemini 2.5 Pro. Anthropic’s latest models are still estimated to be larger, but at much greater cost.
- Training data: Estimated ~114T tokens seen during training, consistent with last year’s projections (70T tokens, 281TB, ~1.625 epochs over GPT-4’s dataset), since 70T × 1.625 ≈ 114T (see the quick check after this list).
- Users served: Now reaching around 800 million users globally, forcing further cost and inference optimizations.
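As a quick sanity check on those dataset figures (my arithmetic, not OpenAI’s), the ~114T tokens seen follows directly from ~70T unique tokens at ~1.625 epochs:

```python
# Back-of-the-envelope check of the dataset estimates cited above
# (70T unique tokens, 281TB of text, ~1.625 epochs, ~114T tokens seen).

unique_tokens = 70e12    # estimated unique training tokens
epochs = 1.625           # estimated passes over the dataset
dataset_bytes = 281e12   # estimated raw dataset size in bytes

tokens_seen = unique_tokens * epochs
bytes_per_token = dataset_bytes / unique_tokens

print(f"Tokens seen during training: {tokens_seen / 1e12:.1f}T")  # ~113.8T, i.e. ~114T
print(f"Implied bytes per token:     {bytes_per_token:.1f}")      # ~4.0, plausible for English text
```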
The system card maps previous models to GPT-5 variants as follows:

Previous Model | GPT-5 Model |
---|---|
GPT-4o | gpt-5-main |
GPT-4o-mini | gpt-5-main-mini |
OpenAI o3 | gpt-5-thinking |
OpenAI o4-mini | gpt-5-thinking-mini |
GPT-4.1-nano | gpt-5-thinking-nano |
OpenAI o3 Pro | gpt-5-thinking-pro |
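For teams planning a migration, the table above can be captured as a simple lookup. This is only a bookkeeping sketch based on the system-card names; whether these exact identifiers are what the public API exposes is not confirmed here.

```python
# Mapping taken directly from the table above (system-card naming).
# Illustrative bookkeeping only; not a confirmed list of public API model IDs.

GPT5_EQUIVALENTS = {
    "gpt-4o": "gpt-5-main",
    "gpt-4o-mini": "gpt-5-main-mini",
    "o3": "gpt-5-thinking",
    "o4-mini": "gpt-5-thinking-mini",
    "gpt-4.1-nano": "gpt-5-thinking-nano",
    "o3-pro": "gpt-5-thinking-pro",
}

def gpt5_equivalent(model: str) -> str:
    """Return the GPT-5 variant that replaces a previous-generation model."""
    return GPT5_EQUIVALENTS.get(model.lower(), model)  # fall back to the original name

print(gpt5_equivalent("GPT-4o"))  # gpt-5-main
```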
HLE Scores by Model Variant
Model Variant | HLE Score |
---|---|
gpt-5-thinking-pro | 42% |
gpt-5-main | 24.8% |
gpt-5-thinking | 20.2% |
gpt-5-main-mini | 16.7% |
gpt-5-thinking-mini | 14.7% |
gpt-5-thinking-nano | 8.7% |
⚠️ Note: The highest HLE scores for SOTA models are typically achieved by sophisticated, expensive multi-agent “thinking” configurations, which require significantly more computational resources and cost.
Source: GPT-5 system card, p. 4
Benchmarks and Scores
Metric | GPT-5 Main | GPT-5 Thinking | GPT-4o | o3 | Gemini 2.5 Pro | Claude Opus 4.1 |
---|---|---|---|---|---|---|
Hallucination Rate | — | -65% vs o3 | — | 86.7% | — | — |
GPQA | 89.4 | — | — | — | ~80 | ~75 |
HLE (Pro Mode) | 42 | — | — | — | ~21 | 10.7 |
GPT-5’s pro mode with tool use scored well on GPQA and Humanity’s Last Exam compared to competitors.
Size and Technical Guesswork
- Parameters: The central estimate is about 300B parameters in a Mixture-of-Experts configuration.
- Dataset size: Large-scale text plus multimodal data, with some sources noting images, video, and audio, as well as additional specialized datasets.
“Parameter count and general model size are no longer an indicator of performance, and the complexity of ‘thinking mode’ makes estimates speculative.”
Product, Community & What’s Next
- Official interfaces: Rolled out first to chatgpt.com (not yet globally), with Poe.com as a secondary option for early access.
- System Card & Docs: OpenAI published a new GPT-5 system card and livestreamed developer intros.
- Public benchmarks: Analysts note the marginal gains and call attention to OpenAI’s shift toward cost and inference efficiency at massive scale, rather than sheer capability leaps.
- Comparisons: Gemini 2.5 Pro and Claude Opus 4.1 remain competitive at the top, but GPT-5 shows reliability gains.
Links & Further Reading
- Official OpenAI GPT-5 announcement
- LifeArchitect.ai GPT-5 Analysis & Models Table
- ChatGPT Official Interface
- Poe.com GPT-5
GPT-5 looks set to power a new era of reliable, cost-efficient AI at scale, with fewer hallucinations and smarter instruction following, but few headline-grabbing breakthroughs over GPT-4o. As the global rollout continues and community analysis deepens, the focus will be on how much further reliability can be pushed, and how quickly competitors answer.