Humanity's Last Exam
Humanity's Last Exam is a benchmark of 2,500 extremely difficult, expert-level questions across subjects such as advanced math, physics, and biology. It was created to test the limits of AI capabilities after older benchmarks became too easy for AI models.