DeepSeek says its newest AI model is as good as those of its American rivals, was cheaper to build and it’s available for free. What does that mean for US AI supremacy?

By Rashi Shrivastava and Richard Nieva, Forbes Staff

A Chinese company called DeepSeek, which recently open-sourced a large language model it claims performs as well as OpenAI’s most capable AI systems, is now the white-hot center of attention for the AI community. Its tech is being lauded as one of the best open-source challengers to top American AI models, stoking anxieties about China’s formidability in the intensifying international AI race and spurring U.S. startups to re-examine their own work after a foreign rival seemingly did so much more with so many fewer resources.

In late December, the small Chinese lab, based in Hangzhou, released V3, a language model with 671 billion parameters, which was reportedly trained in two months for just $5.58 million. That’s a small fraction of the cost of OpenAI’s GPT-4, a larger model at an estimated 1.8 trillion parameters that was built with a $100 million price tag. Last week, DeepSeek threw down another gauntlet, releasing a model called R1, which it claims rivals OpenAI’s o1 model on what are called “reasoning tasks,” like coding and solving complex math and science problems. OpenAI charges users $200 per month for such models; DeepSeek offers its own for free.

The power of DeepSeek’s model and its pricing are already shifting the way American AI startups run their businesses. It’s a cheap, compelling alternative to offerings from incumbents like OpenAI, Jesse Zhang, CEO of Decagon, which builds AI agents for customer service, told Forbes. DeepSeek’s new model will likely force American AI giants like OpenAI and Anthropic to reevaluate their own prices.

Eiso Kant, CTO and co-founder of Poolside AI, a unicorn that builds AI for software engineering, told Forbes that DeepSeek’s strength is in its engineering ability to do more with less.

“What DeepSeek is showing the world is that when you put a strong emphasis on making your training compute-efficient, you can do a lot,” he said. “There’s incredible things that you can continue to squeeze out of these Nvidia chips to make them incredibly more efficient.”

With OpenAI’s o1 model allegedly bested on certain benchmarks, some startups have already begun acquiring data to train more advanced systems, Manu Sharma, CEO of data labeling company Labelbox, told Forbes. “I think the AGI race is kind of reset in many ways,” he said. “We are going to just see much more competitiveness across the board.”

Alexandr Wang, the billionaire CEO of training data behemoth Scale AI, recently called the model “earth shattering.” And Aravind Srinivas, CEO of $9 billion-valued AI search startup Perplexity, has integrated the model into the main search product. (In June, Forbes sent Perplexity a cease and desist after accusing the startup of using its reporting without permission.) AI chip company Groq has already added DeepSeek’s R1 model to its language processing units.

Others are less impressed. Writer CEO May Habib told Forbes she’s not surprised that DeepSeek’s models, trained on a significantly smaller budget, are able to match the most intelligent models in the U.S. In October, Writer launched a model that was trained for just $700,000, compared with the $4.6 million it cost OpenAI to build a model with similar capabilities. The company used synthetic data to lower its training costs.

“Even before DeepSeek’s model exploded on the scene, we have been saying that these models are commoditizing. They’re getting more and more distributed,” Habib said.

Over the weekend, as buzz about the company grew, DeepSeek surpassed ChatGPT on Apple’s app store, ranking No. 1 for free app downloads in the United States. Then, on Monday, several U.S. tech stocks nosedived as panic around DeepSeek’s successful model launch spread. By day’s end, AI chip behemoth Nvidia’s market cap had been shaved down nearly $600 billion.

It was a staggering upending of the AI world order. “It’s kind of wild that somebody can go in and spend hundreds of millions of dollars for a closed source model,” Greg Kamradt, president of ARC Prize, a nonprofit that benchmarks AI models, told Forbes. “And then all of a sudden you get an open-source one that’s just out there for free.”

For weeks, DeepSeek’s models have been lauded by some of the most prominent names in the AI world, including Meta’s chief AI scientist Yann LeCun, OpenAI cofounder Andrej Karpathy and Nvidia’s senior research scientist Jim Fan. But news of the company’s latest achievement has sent America’s AI heavyweights scrambling to figure out just how the Chinese company is getting such impressive results while spending a lot less money.

“Deepseek R1 is AI’s Sputnik moment,” investor-billionaire Marc Andreessen wrote on X.

Despite the pomp and bombast of the Trump administration’s recent AI announcements, DeepSeek has heightened fears that the U.S. could be losing its AI edge, particularly because the company has been so successful despite the tight U.S. export controls that prevent it from using Nvidia’s state-of-the-art AI chips. The company’s latest achievement is a sobering counterpoint to Project Stargate, a joint venture between OpenAI, Oracle and Japanese tech conglomerate Softbank to invest $500 billion in AI infrastructure.

Ahead of a meeting with House Republicans in Florida on Monday, Trump acknowledged the threat. “The release of DeepSeek, AI from a Chinese company, should be a wakeup call for our industries that we need to be laser-focused on competing to win,” he said.

There are caveats to DeepSeek’s latest achievement. Researchers have found its AI models tend to self-censor on topics that are sensitive to the Chinese Communist Party (CCP). Security researcher Jane Manchun Wong told Forbes DeepSeek’s models do not respond to questions about Chinese President Xi Jinping and the 1989 Tiananmen Square protests. Beyond this, there are privacy concerns. Data entered into DeepSeek’s models is stored in servers located in China, according to its policies.

Divyansh Kaushik, a vice president at national security advisory firm Beacon Global Strategies, warned against people using DeepSeek without thorough vetting. “Unless we can have clear national security and free speech evaluations of Chinese models, they should be treated like propaganda arms of the CCP,” he told Forbes. “They should be treated as Huawei on steroids.”

The problem is DeepSeek’s value proposition: a state-of-the-art AI reasoning model that’s free to use and open, in contrast to the closed, fee-based AI world being built by companies like OpenAI and Anthropic. “It’s much better to have a Chinese model that is open source versus an American model that is closed source,” said Labelbox’s Sharma.
