Google launches several attacks on OpenAI within a month.

2026-01-15 12:15:43 · #1

"Cutting-edge intelligence born for speed," Google announced in a blog post in the early hours of December 18th, officially unveiling another game-changer: the Gemini 3 Flash. This is the fastest and most cost-effective model in the Gemini 3 series. However, what has garnered industry attention this time is that while achieving both speed and affordability, this Flash model can even outperform flagship models in some aspects.

It's worth noting that this is Google's fourth update in the field of large models within a month.

Google CEO Sundar Pichai posted that Gemini 3 Flash pushes out the Pareto frontier of performance and efficiency, beating the previous flagship model, 2.5 Pro, on both quality and speed while being far cheaper.

"Gemini 3 Flash proves that speed and scale don't have to come at the expense of intelligence," the official blog post boasted. And the benchmark data certainly confirms this.

On SWE-bench Verified, the benchmark used to evaluate programming capabilities, Gemini 3 Flash scored as high as 78%, surpassing Google's own flagship Gemini 3 Pro as well as Anthropic's Claude Sonnet 4.5. On the multimodal understanding benchmark MMMU-Pro, the Flash scored 81.2%, not only exceeding GPT-5.2 (79.5%) but also leaving Claude Sonnet 4.5 behind by more than ten percentage points.

All of this data shows that the new Flash model has made significant progress, breaking the long-held impression of lightweight models: fast and cheap, but with compromised performance. Gemini 3 Flash delivers near-flagship capability while remaining both efficient and cost-effective.

According to data from the large-model arena LMArena (lmarena.ai), Gemini 3 Flash currently ranks in the top 5 in the text, image, and programming categories, and 2nd in the math and creative writing categories. It is the most cost-effective cutting-edge model, with input priced at only $0.50 per million tokens and output at $3 per million tokens.

By comparison, Claude Sonnet 4.5 charges $15 per million output tokens and GPT-5.2 charges $14, roughly four to five times the price of Gemini 3 Flash.
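As a quick sanity check on those figures, the short sketch below redoes the output-price arithmetic using the numbers quoted in this article; the model labels are purely illustrative, not official API names or a current price list.

# Illustrative arithmetic only, based on the per-million-token output prices quoted above.
OUTPUT_PRICE_PER_M = {            # USD per 1M output tokens
    "gemini-3-flash": 3.00,
    "claude-sonnet-4.5": 15.00,
    "gpt-5.2": 14.00,
}

flash_price = OUTPUT_PRICE_PER_M["gemini-3-flash"]
for model, price in OUTPUT_PRICE_PER_M.items():
    ratio = price / flash_price
    print(f"{model}: ${price:.2f}/M output tokens ({ratio:.1f}x Gemini 3 Flash)")
# Claude Sonnet 4.5 works out to 5.0x, GPT-5.2 to about 4.7x.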

Google states that Gemini 3 Flash can flexibly adjust its thinking time even when running at its highest thinking level. More complex tasks may call for longer thinking time, but in tests on typical traffic it used an average of 30% fewer tokens than the previous-generation 2.5 Pro, completing everyday tasks faster and more accurately.

The Gemini 3 Flash retains the groundbreaking performance of the Gemini 3 in complex inference, multimodal, agent, and programming tasks, while combining the latency, efficiency, and cost advantages of Flash-level performance. "This is the best model to date for agent workflows," Google stated.

A developer ran a Python comparison test pitting Gemini 3 Flash against two "kings of cost-effectiveness": OpenAI's budget model GPT-5 Mini and DeepSeek-V3.2, the standout of China's open-source scene.

The results show that the three models cost roughly the same, but Gemini 3 Flash finished in only 9 seconds, versus 35 seconds for GPT-5 Mini and 41 seconds for DeepSeek-V3.2. Gemini 3 Flash also came out ahead on output quality, making it a model that balances speed and performance.
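The article does not publish the test script, but a minimal sketch of this kind of head-to-head latency check could look like the following; the model identifiers and the call_model() helper are hypothetical placeholders that would need to be wired to each provider's own SDK.

import time

MODELS = ["gemini-3-flash", "gpt-5-mini", "deepseek-v3.2"]   # assumed labels, not official API names
PROMPT = "Write a Python function that parses a CSV file and returns per-column sums."

def call_model(model: str, prompt: str) -> str:
    # Placeholder: send `prompt` to `model` through that provider's API and return the reply.
    raise NotImplementedError("plug in the provider SDK of your choice here")

def time_model(model: str, prompt: str) -> float:
    # Wall-clock seconds for a single completion from `model`.
    start = time.perf_counter()
    call_model(model, prompt)
    return time.perf_counter() - start

for model in MODELS:
    print(f"{model}: {time_model(model, PROMPT):.1f}s")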

"Faster, cheaper, and free to use – that's what independent developers and small teams really need," an independent developer wrote. He added that if they were previously running applications on GPT-4o or Gemini 3 Pro, switching to Gemini 3 Flash could reduce costs by 50%-70%.

Starting today, Gemini 3 Flash will be available to all users, including free users. In the Gemini App, Gemini 3 Flash will replace 2.5 Flash as the new default model, while Gemini 3 Pro will be an option for users to handle more complex math and coding problems.

Last month, Google launched Gemini 3 Pro and Gemini 3 Deep Think, which won broad market recognition and overtook OpenAI to make Google the leader in the field of large models. The blog mentions that since the release, its API has processed over one trillion tokens daily, with users frequently turning to Gemini 3 for code simulation, learning complex topics, building and designing interactive games, and understanding various types of multimodal content.

With its combination of price and performance, the newly launched Flash is expected to be even more popular. Google says Flash has always been the most popular version in its lineup, with earlier versions such as 2.0 Flash and 2.5 Flash handling trillions of tokens across hundreds of thousands of applications built by millions of developers.

"The Flash model is truly tailored for developers, while Flash 3 eliminates the need for them to compromise between speed and intelligence." Google has revealed another trump card, leaving OpenAI with little time to respond.

(Article source: CBN)
