Elon Musk has described Grok 4, an AI chatbot from xAI, as the smartest AI in the world. However, rankings from the UC Berkeley-developed LMArena place Grok 4 third overall, behind Google's Gemini 2.5 and OpenAI's reasoning models. Although Grok 4 performs well in specific areas, Musk's claims lack substantiation. Experts have also criticized the leaderboard itself for systematic issues, such as undisclosed private testing and retractable scores, which could distort AI model rankings and undermine the credibility of reported intelligence capabilities.
Grok 4 ranked third both overall and in text generation, behind Google's Gemini 2.5 and OpenAI's o3 and 4o models, indicating it is not the 'smartest AI in the world.'
Musk also claimed that Grok 4 is smarter than almost all graduate students, one of several bold assertions about its capabilities made despite evidence to the contrary.
The UC Berkeley-developed LMArena leaderboard crowdsources rankings of AI models, but expert criticism suggests it may have systematic issues, creating a distorted playing field.
Concerns have been raised about the leaderboard's methodology, suggesting that undisclosed private testing and the ability to retract scores might compromise the integrity of its rankings.