GPT-5 Released: OpenAI's New Model in 2025—Marginal Gains, Major Scale

August 7, 2025 · 6 min read

Explore the key highlights of OpenAI's GPT-5 launch in August 2025: reduced hallucinations, strategic optimizations, benchmark scores, parameter and dataset estimates, and how it compares to Gemini 2.5 and Claude Opus. See what the new system card reveals, and what's next in the AI race.

AI Software Engineering Agents

May 19, 2025 · 10 min read

An overview of SWE-agent, an open-source AI agent that autonomously fixes issues in GitHub repositories, and its place among other AI coding agents.

Next-Gen AI: Cognitive Primitives

April 20, 2025 · 6 min read

Discover the essential skills (reasoning, planning, and tool use) that drive advanced AI development across major labs and enable agentic systems.

LLM Agents Managing a Virtual Vending Machine: A Benchmark Study

February 27, 2025 · 5 min read

A study of LLMs managing a virtual vending machine business. Claude 3.5 Sonnet turned $500 into $2,217 on average, yet all models eventually failed through mismanaged inventory, confused scheduling, or complete behavioral breakdowns, highlighting key limitations in AI's long-term reliability.

LLM Systems Architecture 2025

January 20, 2025 · 5 min read

Technical overview of modern LLM system architectures, focusing on inference, fine-tuning, and system integration.

Why Is My LLM Getting Dumber? (Cost-Cutting Reality)

December 2, 2024 · 3 min read

Analysis of how large language models like ChatGPT are being optimized for cost efficiency, sometimes at the expense of capability, through techniques like pruning and quantization.

AI Agents: Autonomous Task Performers

September 9, 2024 · 1 min read

An exploration of AI agents: autonomous software entities that perform complex, human-like tasks. Covers key features, applications, current challenges, and future impact on industries and daily life.