The Chinese AI service has Wall Street worried that developing AI models will be cheaper than expected. But as chip stocks sink, some analysts see a silver lining.
What if companies don't need to spend nearly as much as expected to develop artificial-intelligence models?
That's the big question on the minds of investors Monday, given newfound attention on DeepSeek, a Chinese AI app that has climbed to the top of the U.S. App Store. The company reportedly was able to build a model that functions like OpenAI's ChatGPT while spending far less to do so.
Wall Street is nervous about what DeepSeek's success means for companies like Nvidia Corp. (NVDA), Broadcom Inc. (AVGO), Marvell Technology Inc. (MRVL) and others that have seen their stocks run up on expectations their businesses would benefit from lofty, AI-fueled capital-expenditure budgets in the years to come.
"If DeepSeek's innovations are adopted broadly, an argument can be made that model training costs could come down significantly even at U.S. hyperscalers, potentially raising questions about the need for 1-million XPU/GPU clusters as projected by some," Raymond James analyst Srini Pajjuri wrote in a note to clients over the weekend.
In a post titled "The Short Case for Nvidia Stock," former quant investor and current Web3 entrepreneur Jeffrey Emanuel said DeepSeek's success "suggests the entire industry has been massively over-provisioning compute resources."
He added that "markets eventually find a way around artificial bottlenecks that generate super-normal profits," meaning that Nvidia may face "a much rockier path to maintaining its current growth trajectory and margins than its valuation implies.."
But it's also worth digging into the numbers that have Wall Street so worried. Specifically, there's consternation about a paper that suggested DeepSeek's creator needed to spend $5.6 million to build the model. By contrast, large technology companies in the U.S. are shelling out tens of billions a year on capital expenditures and earmarking much of that for AI infrastructure.
The $5 million number, though, is highly misleading, according to Bernstein analyst Stacy Rasgon. "Did DeepSeek really 'build OpenAI for $5M?' Of course not," he wrote in a note to clients over the weekend.
That number corresponds to DeepSeek-V3, a "mixture-of-experts" model that "through a number of optimizations and clever techniques can provide similar or better performance vs other large foundational models but requires a small fraction of the compute resources to train," according to Rasgon.
But the $5 million figure "does not include all the other costs associated with prior research and experiments on architectures, algorithms, or data," he continued. And this type of model is designed "to significantly reduce cost to train and run, given that only a portion of the parameter set is active at any one time."
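In rough terms, a mixture-of-experts layer routes each input to only a handful of its many expert subnetworks, so most of the model's parameters sit idle on any given token. A toy sketch of that routing, with entirely made-up sizes and a random, untrained router (this illustrates the general technique, not DeepSeek's actual architecture), might look like this:

```python
# Toy mixture-of-experts routing. All names and sizes are hypothetical.
import numpy as np

rng = np.random.default_rng(0)

NUM_EXPERTS = 8   # total expert subnetworks in the layer
TOP_K = 2         # experts activated per token
DIM = 16          # hidden dimension

# Each "expert" here is just a weight matrix; a real model uses small MLPs.
experts = [rng.standard_normal((DIM, DIM)) for _ in range(NUM_EXPERTS)]
gate = rng.standard_normal((DIM, NUM_EXPERTS))  # a learned router in practice

def moe_layer(x: np.ndarray) -> np.ndarray:
    """Route a token vector to its top-k experts and mix their outputs."""
    scores = x @ gate                      # router score per expert
    top = np.argsort(scores)[-TOP_K:]      # indices of the chosen experts
    weights = np.exp(scores[top]) / np.exp(scores[top]).sum()  # softmax mix
    # Only TOP_K of NUM_EXPERTS matrices do any work for this token:
    # 2 of 8 here, i.e. a quarter of the layer's parameters are active,
    # which is where the compute saving comes from.
    return sum(w * (x @ experts[i]) for w, i in zip(weights, top))

token = rng.standard_normal(DIM)
print(moe_layer(token).shape)  # (16,)
```

The payoff is exactly what Rasgon describes: capacity scales with the total number of experts, while per-token compute scales only with the few that are activated.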
Meanwhile, DeepSeek also has an R1 model that "seems to be causing most of the angst" given its comparisons to OpenAI's o1 model, according to Rasgon. "DeepSeek's R1 paper did not quantify the additional resources that were required to develop the R1 model (presumably they were substantial as well)," Rasgon wrote.
That said, he thinks it's "absolutely true that DeepSeek's pricing blows away anything from the competition, with the company pricing their models anywhere from 20-40x cheaper than equivalent models from OpenAI."
But he doesn't buy that this is a "doomsday" situation for semiconductor companies: "We are still going to need, and get, a lot of chips."
Cantor Fitzgerald's C.J. Muse also saw a silver lining. "Innovation is driving down cost of adoption and making AI ubiquitous," he wrote. "We see this progress as positive in the need for more and more compute over time (not less)."
Raymond James' Pajjuri made a similar point. "A more logical implication is that DeepSeek will drive even more urgency among U.S. hyperscalers to leverage their key advantage (access to GPUs) to distance themselves from cheaper alternatives," he wrote.
Further, while the DeepSeek fears are centered on training costs, he thinks investors should also think about inferencing. Training is the process of showing a model data that will teach it to draw conclusions, and inferencing is the process of putting that model to work based on new data.
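To make that split concrete: training is the heavy, up-front phase where a model's parameters are fit to data, while inferencing is the comparatively cheap, per-query phase that scales with usage. A toy sketch with made-up data (ordinary linear regression, nothing like a production AI model) shows the two phases:

```python
# Hypothetical illustration of the training/inferencing split.
import numpy as np

rng = np.random.default_rng(1)

# --- Training: repeatedly show the model data and adjust its weights ---
X = rng.standard_normal((100, 3))            # training inputs
y = X @ np.array([2.0, -1.0, 0.5]) + 0.1 * rng.standard_normal(100)
w = np.zeros(3)
for _ in range(500):                         # gradient-descent steps
    grad = 2 * X.T @ (X @ w - y) / len(y)
    w -= 0.05 * grad                         # the expensive, one-time phase

# --- Inferencing: apply the trained weights to unseen data ---
x_new = rng.standard_normal(3)
print(x_new @ w)                             # cheap per query; cost scales with usage
```

The training loop runs once; the inferencing line runs every time the model is used, which is why analysts expect inferencing demand to grow as AI adoption spreads.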
Pajjuri argued that "as training costs decline, more AI use cases could emerge, driving significant growth in inferencing," including for models like DeepSeek's R1 and OpenAI's o1.
Emanuel, though, wrote that DeepSeek is said to be "nearly 50x more compute efficient" than popular U.S. models on the training side, and perhaps even more so when it comes to inference.