In recent days, major AI communities have been flooded with posts about an "investment livestream." Netizens have been tracking the trading performance of six AI models in real time, and many seem more engrossed in the contest than in their own stock trading. This is an AI investment showdown conducted with real money.

The "Alpha Arena" benchmark, launched by the startup Nof1, is not a simulated trading event. To measure the models' investment capabilities, the organizers gave each model's account $10,000 in seed funding and let it trade cryptocurrencies autonomously in the real market. Alpha Arena streams the entire process live, with prices updating in real time and the models ranked by their real-time returns. Users can also see the trading strategies behind each model.
Ranked by current profitability, the six AI models participating in this competition are DeepSeek chat v3.1, Claude Sonnet 4.5, Grok 4, Qwen3 Max, Gemini 2.5 pro, and GPT 5: four leading overseas models and two domestic models. The competition began on October 18th (Eastern Time) and will run for about two weeks, ending on November 3rd.
The interesting thing about real-world market trading is that markets are always volatile and unpredictable; even the most advanced AI cannot maintain stable returns. As the official statement says, "The market is the ultimate test of intelligence."
Four days have passed, and the market has experienced some fluctuations. In the first three days, DeepSeek, ranked first, had a return rate close to 40%, with profits exceeding $4,000. However, on October 21, as the market declined, it also gave back some of its gains, and DeepSeek's return rate stabilized at around 10%, though it still remains in first place.
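The arithmetic behind these figures can be sketched in a few lines. This is a minimal illustration, not Alpha Arena's own accounting: the $10,000 seed comes from the article, the 40% and 10% returns are the article's approximate figures, and the function names are hypothetical.

```python
SEED_CAPITAL = 10_000.0  # seed funding per model account, as reported


def profit_from_return(return_rate: float, seed: float = SEED_CAPITAL) -> float:
    """Dollar profit (negative for a loss) implied by a fractional return."""
    return seed * return_rate


def account_value(return_rate: float, seed: float = SEED_CAPITAL) -> float:
    """Account value after a given fractional return on the seed capital."""
    return seed * (1.0 + return_rate)


def return_between(start_value: float, end_value: float) -> float:
    """Fractional change in account value between two points in time."""
    return end_value / start_value - 1.0


peak = account_value(0.40)    # account value at a return of roughly 40%
later = account_value(0.10)   # account value at a return of roughly 10%

print(f"Profit at +40%: ${profit_from_return(0.40):,.0f}")
print(f"Drawdown from peak: {return_between(peak, later):.1%}")
```

Under these rounded inputs, a 40% return corresponds to about $4,000 in profit, and sliding from +40% to +10% means the account lost roughly a fifth of its peak value, even though it remained in profit overall.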

Observing the profit curves over these four days, DeepSeek's trading was relatively stable, leading the pack for most of the time. In the first two days, Grok 4 was close to DeepSeek, ranking second in profit, but its aggressive trading style caused it to fall quickly as the market declined, hovering around the break-even point. Claude, on the other hand, rose from third place in the previous days to second, with its profit level closely following DeepSeek.
The remaining three models have been losing money most of the time. Gemini 2.5 was at the bottom of the rankings a couple of days ago, with losses exceeding 30%. Today, GPT 5 is at the bottom, currently with losses exceeding 40%, amounting to over $5,900. Qwen3 Max from Alibaba's Tongyi platform is currently in the middle, with losses exceeding 13%. It had a brief period of profit yesterday, but has spent most of its time below the break-even point.

Through these few days of investment competition, we can see the differences in the "personality" of several models to some extent. Just like real traders, each model has its own style.
"No wonder, given its professional background": the industry attributes DeepSeek's stable performance to its pedigree, since DeepSeek's parent company, High-Flyer (Huanfang), is a quantitative trading firm. DeepSeek spread its holdings across various assets, opening on the 18th fully invested in long positions at 10-15x leverage. Its strategy was simple and direct: no position changes, no stop-loss, and no take-profit. Prices subsequently rose steadily.

In contrast, Gemini 2.5, which suffered significant losses, was ridiculed by netizens for its "trading style resembling that of a retail investor." Its strategies were constantly changing, such as switching between long and short positions, resulting in a much higher number of trades and transaction fees compared to the top-performing models. "It chases the rise and cuts the fall; this clever AI thinks it's smarter than the market, but its net worth keeps getting thinner and thinner, and the more it trades, the more chaotic it becomes," netizens joked, attributing Gemini 2.5's failure to being "too smart."
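To see why frequent trading erodes an account, consider a rough sketch of cumulative fee drag. The fee rate, trade count, and position size below are hypothetical values chosen for illustration; the article does not report the actual fees or trade sizes.

```python
def fee_drag(num_trades: int, notional_per_trade: float, fee_rate: float) -> float:
    """Total fees paid when each trade is charged fee_rate on its notional size."""
    return num_trades * notional_per_trade * fee_rate


# Hypothetical inputs: 100 trades of $20,000 notional each
# (e.g. a leveraged position on a $10,000 account), at an assumed 0.05% fee.
ASSUMED_FEE_RATE = 0.0005
total_fees = fee_drag(num_trades=100, notional_per_trade=20_000.0, fee_rate=ASSUMED_FEE_RATE)

print(f"Total fees: ${total_fees:,.0f}")
print(f"Share of a $10,000 account: {total_fees / 10_000.0:.1%}")
```

Because fees are charged on the notional size of each trade rather than on the account's equity, a leveraged account that flips positions often can lose a meaningful share of its capital to costs before any market view is proven right or wrong.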

Grok 4 is considered to have an aggressive trading style: fully invested across multiple instruments and frequently chasing trends, which results in high volatility and instability. Claude's greatest strength is its analytical ability, but it is too logical and hesitant to act, often leading to failed portfolio adjustments and repeated stop-losses. Interestingly, Qwen3 goes "all in" on a single instrument every day with 20x leverage, suffering heavy losses if the direction is wrong.
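The effect of leverage on these outcomes can be illustrated with a simplified model. It assumes the return on posted margin is the price change multiplied by the leverage, and it ignores fees, funding costs, and exchange-specific liquidation rules (real venues typically liquidate before the margin is fully exhausted). The function names are hypothetical.

```python
def leveraged_return(price_change: float, leverage: float, is_long: bool = True) -> float:
    """Approximate return on margin for a leveraged position.

    Losses are capped at -100% of the margin, a simplification of liquidation.
    """
    direction = 1.0 if is_long else -1.0
    return max(direction * price_change * leverage, -1.0)


def wipeout_move(leverage: float) -> float:
    """Adverse price move (as a fraction) that would exhaust the margin."""
    return 1.0 / leverage


for lev in (10, 15, 20):
    loss = leveraged_return(-0.02, lev)
    print(f"{lev}x long, price -2%: return on margin {loss:.0%}; "
          f"margin gone after a {wipeout_move(lev):.1%} adverse move")
```

In this simplified view, a 2% move against a 20x position costs about 40% of the margin, and a 5% adverse move would exhaust it entirely, which is why an "all in" bet at 20x leverage is so punishing when the direction is wrong.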
The competition has been running for only four days, so it is just beginning; the market is full of unknowns and the outcome remains uncertain. Even so, over the past two days some netizens have already started studying DeepSeek's trading strategies, and some have even talked about copy-trading the AI.
Is entrusting investments to AI really reliable? Some financial professionals have reservations, primarily because AI does not understand a user's true asset situation, family circumstances, or employment status, nor does it know their investment preferences. Giving investment advice without that context is risky. Furthermore, AI's underlying logic is to summarize, generalize, and reproduce information that already exists in human society, which does not amount to predicting the future.
In fact, over the past year, many users on social media platforms have shared their experiences with AI stock recommendations, with some reporting substantial returns. Setting aside content designed to attract traffic, the cases that appear to generate real profits usually rest on the premise that the prompts users give the AI are already quite professional and that the users have established their own screening criteria. For example, users might specify a "conservative style," define a range of stocks for the AI to select from, and have the AI analyze specific financial reports, historical trends, volatility, and so on. The AI then provides allocation strategies with different proportions for different markets.
Many industry insiders believe that AI's greatest value lies in overcoming human emotional weaknesses to provide logically clear solutions and its ability to quickly integrate and analyze data, such as rapidly reading through reports and clarifying relationships. However, AI cannot predict the future, nor does it understand current market dynamics or undisclosed information; the market is not simply a numbers game. Perhaps the best combination is rational tools and human wisdom.

(Article source: CBN)