OpenAI Codex Bug Burns Millions in SSD Wear

OpenAI's Codex coding agent has been quietly wearing down users' solid-state drives through an overly verbose logging system. The bug, which on one reported machine wrote data at a rate of roughly 640 TB per year, has cost users an estimated low-single-digit millions of dollars in accelerated SSD wear between March and June 2026. For enterprise customers and individual developers relying on Codex, this is more than a nuisance: it is a direct hit to hardware budgets and operational reliability.

According to a bug report filed by developer Rui Fan, a project management committee member of Apache Flink, Codex's SQLite feedback logs are the primary culprit. After 21 days of uptime, Fan's main SSD had absorbed 37 TB of writes. Extrapolated to a full year, that is roughly 640 TB. On a 1 TB Samsung 9100 PRO SSD with a 600 TBW rating, the drive's warranted endurance would be consumed in less than a year. Fan estimated his personal cost at $12.33 for 37 TB of writes, or about $0.33 per TB. Another developer reported a $38.64 loss on a 2 TB Samsung 990 NVMe drive. OpenAI's own economic impact assessment, using a cost of $0.13 per TB written, pegs the total damage in the low single-digit millions, though actual costs may be higher given that consumer SSDs often cost $0.25–$0.33 per TBW.
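The figures above can be checked with a few lines of arithmetic. The sketch below uses only numbers quoted in the bug report and the article. The variable names are illustrative, and the extrapolation assumes the write rate stays constant over the year.

```python
# Figures quoted in the article; names are illustrative.
DAYS_OBSERVED = 21        # uptime covered by Fan's measurement
TB_WRITTEN = 37.0         # TB written to the SSD in that window
RATED_TBW = 600.0         # endurance rating of the 1 TB Samsung 9100 PRO
FAN_COST_USD = 12.33      # Fan's own estimate for the 37 TB
LOW_COST_PER_TB = 0.13    # per-TB cost used in OpenAI's assessment

tb_per_day = TB_WRITTEN / DAYS_OBSERVED
tb_per_year = tb_per_day * 365             # assumes a constant write rate
days_to_rated_tbw = RATED_TBW / tb_per_day
fan_cost_per_tb = FAN_COST_USD / TB_WRITTEN
low_estimate_usd = TB_WRITTEN * LOW_COST_PER_TB

print(f"Annualized writes:      {tb_per_year:.0f} TB")
print(f"Days to reach 600 TBW:  {days_to_rated_tbw:.0f}")
print(f"Fan's implied cost/TB:  ${fan_cost_per_tb:.2f}")
print(f"Same 37 TB at $0.13/TB: ${low_estimate_usd:.2f}")
```

At the measured rate the rated endurance would be reached in roughly 340 days, and the gap between $0.13 and $0.33 per TB shows how sensitive any total damage estimate is to the assumed drive price.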

The Root Cause: TRACE-Level Logging Run Amok

The issue traces back to February 2026, when Codex engineers introduced app-server SQLite logs at TRACE level, the most verbose logging tier. Unlike ERROR or WARN logs, TRACE logs emit a firehose of data; they are intended for deep debugging, not continuous production use. The logs are stored locally on the user's device and are not uploaded unless the user explicitly includes them in a feedback report. However, the sheer volume of writes has turned what should be a diagnostic tool into a steady source of drive wear.
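To illustrate why the log threshold matters so much, here is a minimal sketch of a SQLite-backed log handler. It is not Codex's implementation. The handler class, the table layout, the custom TRACE level and the 1,000-to-1 workload are all made up for illustration, and Python's standard logging module has no built-in TRACE level.

```python
import logging
import sqlite3

TRACE = 5  # hypothetical level below DEBUG (10); not built into Python's logging
logging.addLevelName(TRACE, "TRACE")


class SQLiteHandler(logging.Handler):
    """Write each accepted log record as one row in a SQLite table."""

    def __init__(self, conn):
        super().__init__()
        self.conn = conn
        self.conn.execute("CREATE TABLE IF NOT EXISTS logs (level TEXT, message TEXT)")

    def emit(self, record):
        self.conn.execute(
            "INSERT INTO logs VALUES (?, ?)",
            (record.levelname, record.getMessage()),
        )
        self.conn.commit()  # one commit per record, so every record costs a write


def rows_written(threshold, events):
    """Log (level, message) pairs at the given threshold; return rows stored."""
    conn = sqlite3.connect(":memory:")
    logger = logging.getLogger(f"sketch.{threshold}")
    logger.setLevel(threshold)
    logger.propagate = False
    logger.handlers.clear()
    logger.addHandler(SQLiteHandler(conn))
    for level, message in events:
        logger.log(level, message)
    count = conn.execute("SELECT COUNT(*) FROM logs").fetchone()[0]
    conn.close()
    return count


# Made-up workload: 1,000 trace events for every warning.
events = [(TRACE, f"step {i}") for i in range(1000)]
events.append((logging.WARNING, "slow query"))

trace_rows = rows_written(TRACE, events)
warn_rows = rows_written(logging.WARNING, events)
print(f"TRACE threshold: {trace_rows} rows; WARNING threshold: {warn_rows} row(s)")
```

With the same event stream, the TRACE threshold stores every event while the WARNING threshold stores only the one warning. On a real disk each stored row also carries journal and filesystem overhead, so the gap in bytes written can be larger than the gap in row counts.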

OpenAI confirmed that engineers are aware of the problem and have been working on fixes, with several pull requests already merged. But the damage is done. The verbose logging has been in place since February, and users have been complaining in GitHub issues for months. The fact that Codex itself, presumably running GPT-5.3, reviewed the offending commits makes the oversight even more glaring. It raises questions about the reliability of AI-assisted code review when the reviewer fails to catch a bug that imposes direct hardware costs on its own users.

Strategic Consequences: Who Gains, Who Loses?

Winners: SSD manufacturers like Samsung stand to benefit from accelerated replacement cycles. Enterprise customers may also shift toward higher-endurance SSDs (e.g., 2 TB models with 1200 TBW) or cloud-based logging solutions, benefiting cloud storage providers. Additionally, competitors to OpenAI—such as Anthropic's Claude or Google's Gemini—could exploit this incident to position their own coding agents as more hardware-conscious.

Losers: OpenAI faces potential financial liability (an estimated low-single-digit millions of dollars in extra SSD wear borne by its users), reputational harm, and possible customer churn. Enterprise clients who deployed Codex at scale may now face unplanned hardware refresh costs. Individual developers on consumer-grade SSDs are the most exposed, as their drives may exhaust their warranted endurance years earlier than expected.

Market Impact: This incident will accelerate a shift toward more transparent and efficient telemetry designs. We expect to see AI coding agents move from local, verbose logging to cloud-based or aggregated models that minimize write amplification. It may also prompt regulatory scrutiny, particularly in regions with strong consumer protection laws, if users were not adequately informed about the logging's impact on hardware lifespan.

Outlook & Next Steps

OpenAI has already begun deploying fixes, but the full remediation will take time. Users should monitor their SSD wear indicators (e.g., SMART data) and consider limiting Codex usage until a permanent fix is confirmed. Enterprise customers should audit their Codex deployments and negotiate compensation for accelerated hardware depreciation. In the longer term, expect industry-wide best practices for AI telemetry to emerge, likely favoring write-efficient designs and user-transparent logging policies.
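As one way to check wear on an NVMe drive, the sketch below parses the JSON report produced by smartmontools (`smartctl --json -a <device>`, available in version 7.0 and later). It assumes smartctl is installed and that the report contains the `nvme_smart_health_information_log` section. The helper names and the sample report are made up for illustration, and SATA drives expose different attributes that this sketch does not handle.

```python
import json
import subprocess

# Per the NVMe specification, one "data unit" is 1,000 blocks of 512 bytes.
NVME_DATA_UNIT_BYTES = 512_000


def summarize_nvme_wear(report):
    """Extract lifetime writes (TB) and wear percentage from a smartctl JSON report."""
    log = report.get("nvme_smart_health_information_log")
    if log is None:
        return None  # not an NVMe report, or the health log is unavailable
    return {
        "tb_written": log["data_units_written"] * NVME_DATA_UNIT_BYTES / 1e12,
        "percentage_used": log["percentage_used"],
    }


def read_smart_report(device):
    """Run smartctl on a device and return its parsed JSON output.

    Usually needs root. smartctl uses non-zero exit codes as status flags,
    so the return code is deliberately not treated as an error here.
    """
    result = subprocess.run(
        ["smartctl", "--json", "-a", device],
        capture_output=True,
        text=True,
    )
    return json.loads(result.stdout)


# Made-up sample report, sized to match the 37 TB figure from the bug report.
sample_report = {
    "nvme_smart_health_information_log": {
        "data_units_written": 72_265_625,
        "percentage_used": 6,
    }
}
summary = summarize_nvme_wear(sample_report)
print(f"{summary['tb_written']:.1f} TB written, {summary['percentage_used']}% of rated life used")
```

On a real machine, `summarize_nvme_wear(read_smart_report("/dev/nvme0"))` would read the live values; the device path is an example and varies by system. Sampling the figure a day apart gives a write rate to compare against the drive's TBW rating.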

For now, the key takeaway is clear: AI assistants come with hidden operational costs. The next time your coding agent suggests a fix, it might be worth checking whether it's also silently killing your drive.

Source: The Register

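Intelligence FAQ

How much data does the Codex logging bug write?

Roughly 640 TB per year on the machine described in the bug report, enough to exhaust the 600 TBW warranted endurance of a 1 TB consumer SSD in under 12 months.

What should affected users do?

Check your SSD's SMART data for wear, limit Codex usage, and apply any available patches from OpenAI.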