ARC-AGI-3 dropped the same week Jensen Huang declared AGI achieved. Gemini scored 0.37%. GPT-5.4 got 0.26%. Humans hit 100%.
In the GenAI era, code is a commodity, but alignment is not. Traditional review boards can't scale with AI-generated output.