
Bernstein Report On DeepSeek: Impressive AI But Not A $5M Miracle

DeepSeek, the AI company currently generating buzz on social media and in stock markets, has drawn attention for claims that it built a competitor to OpenAI for just $5 million. However, a recent report by Bernstein pushes back on this notion, arguing that while DeepSeek's models are impressive, the $5 million figure is misleading, as reported by ANI.


Bernstein Report On DeepSeek: Not a Miracle, But Still Impressive

The Bernstein report notes that DeepSeek has developed two major AI models: DeepSeek-V3 and DeepSeek-R1. V3 is a large language model built on a Mixture-of-Experts (MoE) architecture, in which the network is divided into many smaller expert sub-models and only a few are activated for each input. This keeps performance high while cutting computational cost: the model has 671 billion parameters in total, but only 37 billion are active at any moment, as reported by ANI.
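The routing idea behind MoE can be illustrated with a toy sketch (this is a generic top-k gating example, not DeepSeek's implementation; the expert count, dimensions, and linear "experts" are arbitrary assumptions for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)

def moe_forward(x, experts, gate_w, k=2):
    """Toy Mixture-of-Experts layer: route an input to its top-k experts.

    Only k experts run per input, which is how a model with a huge total
    parameter count can keep the active parameter count much smaller.
    """
    logits = x @ gate_w                      # one gating score per expert
    topk = np.argsort(logits)[-k:]           # indices of the k best experts
    weights = np.exp(logits[topk])
    weights /= weights.sum()                 # softmax over the chosen experts
    # combine only the selected experts' outputs, weighted by the gate
    return sum(w * experts[i](x) for w, i in zip(weights, topk))

d, n_experts = 8, 4
expert_mats = [rng.standard_normal((d, d)) for _ in range(n_experts)]
experts = [lambda x, m=m: x @ m for m in expert_mats]  # each "expert" is a linear map
gate_w = rng.standard_normal((d, n_experts))

y = moe_forward(rng.standard_normal(d), experts, gate_w, k=2)
print(y.shape)
```

Only 2 of the 4 experts execute per call; scaling the same pattern up is what lets V3 hold 671B parameters while activating only 37B.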

Innovations such as Multi-head Latent Attention (MLA), which compresses the attention cache to reduce memory usage, and mixed-precision FP8 training, which improves compute efficiency, further strengthen the model.

Bernstein Report On DeepSeek: The Reality of DeepSeek's Costs

Training the V3 model required 2,048 NVIDIA H800 GPUs for about two months, roughly 2.7 million GPU-hours for pre-training and 2.8 million GPU-hours including post-training. While some have pegged the cost at $5 million, Bernstein argues that this figure covers only the final training run and overlooks the additional research, experimentation, and infrastructure expenses behind the model.
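The reported figures can be sanity-checked with back-of-envelope arithmetic (the 55-day run length and the implied per-hour rate below are our assumptions for illustration, not numbers from the report):

```python
# Rough check of the reported compute figures
gpus = 2048
days = 55                                   # assumed "about two months" of training
gpu_hours = gpus * days * 24
print(f"{gpu_hours / 1e6:.2f}M GPU-hours")  # close to the reported 2.7M pre-training figure

# What the $5M claim implies per GPU-hour, if it covered all training compute
claimed_cost = 5_000_000
total_gpu_hours = 2_800_000                 # including post-training
rate = claimed_cost / total_gpu_hours
print(f"${rate:.2f} per GPU-hour implied")
```

The implied rate of under $2 per H800 GPU-hour is plausible for rented compute alone, which is consistent with Bernstein's point: the $5M figure can only describe the final training run, not the full cost of developing the model.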

DeepSeek's second model, DeepSeek-R1, builds on V3 but adds Reinforcement Learning (RL) to strengthen its reasoning capabilities. R1 has performed competitively against OpenAI's models, though the report notes that its development required substantial, albeit unquantified, resources.

Efficient Yet Not an OpenAI Challenger

Despite the exaggerated claims, Bernstein acknowledges that DeepSeek's models are highly efficient. For instance, training V3 used only 9% of the computing resources needed for some other leading AI models, as per ANI reports. Its performance across language, coding, and math benchmarks is on par with or better than many existing models.

Final Verdict: Hype vs. Reality

While DeepSeek has made notable AI advancements, Bernstein asserts that the panic surrounding its impact on the AI landscape is overblown. The claim of creating an OpenAI-level competitor for just $5M does not hold up under scrutiny. Nevertheless, DeepSeek's efficient approach to AI model training remains a significant development in the industry, as per media reports.
