At its recent livestream unveiling of GPT-5, OpenAI set out to dazzle the audience with several charts showcasing the model’s strengths. But a closer look revealed puzzling mismatches between the numbers printed on the charts and the bars drawn beneath them. One chart comparing deception evaluations across models listed GPT-5 at 50.0% and o3 at 47.4%, yet the bar representing o3 appeared larger. In another, GPT-5’s lower score was drawn as a bigger bar than its counterpart’s, and a third graph showed o3 and GPT-4o with identical bars even though their scores differ.
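Mismatches like these are hard to produce when bar heights are generated directly from the underlying values rather than styled by hand. As a minimal sketch of that idea, using matplotlib and the two deception figures quoted above (the labels and styling here are illustrative, not OpenAI’s actual chart):

```python
import matplotlib.pyplot as plt

# Deception-eval rates cited in the presentation (percent).
models = ["GPT-5", "o3"]
rates = [50.0, 47.4]

fig, ax = plt.subplots()
# Binding bar heights directly to the data keeps the bars
# proportional to the values they represent.
bars = ax.bar(models, rates)
ax.bar_label(bars, fmt="%.1f%%")  # print the value on each bar
ax.set_ylabel("Deception eval rate (%)")
ax.set_ylim(0, 100)
plt.show()
```

Because the geometry is computed from the data, the 47.4% bar cannot come out taller than the 50.0% one; a discrepancy like the one on stage implies the bars were drawn or edited independently of the values.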
OpenAI’s CEO Sam Altman called it a “mega chart screwup,” while a marketing representative apologised for an “unintentional chart crime.” The misstep is particularly striking given OpenAI’s recent claims of significant advances in reducing hallucinations with GPT-5. If you’ve ever wrestled with a confusing data visualisation, you can appreciate how such errors undermine trust. It’s a useful reminder to look closely not just at what data says, but at how it is presented.