AI Outperforms ER Doctors in Diagnosis: 2026 Study Reveals Risk

Direct answer: A Harvard-led study published in Science found that OpenAI's o1 model produced more accurate diagnoses than two attending emergency room physicians in a head-to-head comparison using real patient data.

Key statistic: At triage, o1 reached an exact or close diagnosis in 67% of cases, versus 55% and 50% for the two physicians.

Why it matters: This was not a laboratory simulation. The AI worked from the same unprocessed electronic medical record text the doctors saw, which makes the result a direct challenge to the current human-centric triage model.

Context: What Happened

Researchers at Harvard Medical School and Beth Israel Deaconess Medical Center tested OpenAI's o1 and GPT-4o models against two attending physicians on 76 emergency room cases. The models received the same text-based information from electronic medical records that was available at the time of diagnosis. Two other attending physicians, blinded to whether each diagnosis came from a model or a physician, evaluated the diagnoses. At every diagnostic touchpoint, o1 performed nominally better than or on par with the physicians, with the largest gap at initial triage, the moment of highest uncertainty and urgency.
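
To see why "nominally" is the right word for a 76-case sample, a rough back-of-the-envelope check helps. The Python sketch below converts the reported triage rates into approximate case counts and computes a Wilson score interval for each. It assumes that each percentage is a share of all 76 cases, which the article does not state. The counts are reconstructed by rounding and are not figures from the study, and the labels physician_a and physician_b are placeholders.

```python
import math


def wilson_interval(successes: int, n: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval for a binomial proportion (about 95% for z = 1.96)."""
    p = successes / n
    denom = 1 + z**2 / n
    center = (p + z**2 / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return center - half, center + half


N_CASES = 76  # number of ER cases reported in the article

# Reported triage rates. ASSUMPTION: each is a fraction of all 76 cases.
reported = {"o1": 0.67, "physician_a": 0.55, "physician_b": 0.50}

intervals = {}
for name, rate in reported.items():
    k = round(rate * N_CASES)  # approximate count, reconstructed by rounding
    intervals[name] = wilson_interval(k, N_CASES)
    lo, hi = intervals[name]
    print(f"{name}: ~{k}/{N_CASES} cases, 95% interval {lo:.2f} to {hi:.2f}")
```

Under these assumptions the three intervals overlap, which is consistent with a nominal lead rather than a decisive one. Because all three readers saw the same cases, a paired analysis would be the more appropriate comparison; this sketch only illustrates how much uncertainty a sample of 76 leaves.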