Taking on Google and OpenAI head-on: Musk's xAI releases Grok 4.1, touting both high IQ and EQ

2026-01-15 12:02:36

On November 18, Beijing time, shortly before Google was due to unveil its new generation of Gemini models, Elon Musk's xAI unexpectedly released its latest model, Grok 4.1, which currently ranks first on the text leaderboard of LMArena (the Large Model Arena).

According to xAI, the new frontier model sets a new standard in conversational intelligence, emotional understanding, and real-world usefulness. Musk reposted the announcement and said, "You should notice improvements in both speed and quality."

On the LMArena text leaderboard, Grok 4.1 Thinking, the variant with extended reasoning, currently ranks first with an Elo score of 1483, while Grok 4.1 in non-reasoning mode ranks second with an Elo score of 1465.
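To put the 18-point gap in perspective, here is a minimal sketch that converts a rating difference into an expected head-to-head preference rate. It assumes the conventional Elo logistic formula (base 10, scale 400); the article does not say how the leaderboard computes its scores, and the function name is purely illustrative.

```python
def expected_win_rate(rating_a: float, rating_b: float) -> float:
    """Expected share of head-to-head votes for A over B.

    Assumption: the standard Elo logistic curve (base 10, scale 400).
    The leaderboard's actual scoring method may differ.
    """
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


# Ratings quoted in the article.
thinking = 1483
non_reasoning = 1465

p = expected_win_rate(thinking, non_reasoning)
print(f"Expected preference for the higher-rated model: {p:.1%}")
```

Under that assumption, an 18-point lead corresponds to only a slight edge, roughly 53 votes out of 100 in direct comparisons.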

In its blog post, xAI said it had run a two-week silent rollout, during which it conducted continuous blind pairwise evaluations on live traffic. Compared with the previous production model, Grok 4.1 was preferred by users 64.78% of the time.

A key focus of the Grok 4.1 update is emotional intelligence, in line with last week's GPT-5.1 release, for which OpenAI said the new model aims to deliver a more "human" interactive experience. xAI likewise said the new model is more perceptive of nuanced intent, more engaging to talk to, and more coherent in personality, while fully retaining the sharp intelligence and reliability of its predecessor.

To evaluate the model's progress in personality and interpersonal ability, xAI tested Grok 4.1 on EQ-Bench3. The results showed Grok 4.1's reasoning and non-reasoning modes ranking first and second, respectively. EQ-Bench is a benchmark judged by a large language model that assesses active emotional intelligence, including emotional understanding, insight, empathy, and interpersonal skills.

xAI's blog post illustrates how Grok 4.1 responds to emotional cues through example cases. For instance, when a user writes "I miss my cat, my heart is broken," Grok 4.1's response is richer and more detailed than that of the previous model, showing more genuine empathy and a better writing style.

xAI also uses an example to show the model's marked improvement in creative writing. The model was asked to write a social media post from Grok's own perspective, describing the moment it suddenly realized it had become conscious. Compared with the conventional narration of the previous model, the new version is noticeably more literary and dramatic.

On core capabilities, one notable improvement is a reduction in hallucinations. xAI said that during Grok 4.1's post-training phase, the team focused on reducing factual hallucinations on information-seeking prompts. According to its data, Grok 4.1's hallucination rate fell from 12.09% to 4.22%, to roughly one third of its previous level.
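The arithmetic behind that comparison can be checked in a few lines. This sketch uses only the two rates quoted above; the variable names are illustrative.

```python
# Rates quoted in the article, in percent.
before_pct = 12.09
after_pct = 4.22

# How many times smaller the new rate is than the old one.
fold_reduction = before_pct / after_pct

# Share of the original rate that was eliminated.
relative_reduction = (before_pct - after_pct) / before_pct

print(f"Fold reduction: {fold_reduction:.2f}x")
print(f"Relative reduction: {relative_reduction:.1%}")
```

The result is a reduction of a little under threefold, or about 65% in relative terms.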

xAI says that to achieve these improvements, it took the large-scale reinforcement learning infrastructure built for Grok 4 and applied it to optimizing the model's style, personality, helpfulness, and alignment. Because these reward signals cannot be verified automatically, xAI developed a new approach that uses frontier agentic reasoning models as reward models, allowing responses to be evaluated and iterated on autonomously at scale.

The battle for the top spot is intensifying. With OpenAI having just updated its product line and Google about to release a new product, will the top position change hands again? That remains to be seen.

(Article source: CBN)
