DeepSeek, a Chinese AI startup, is fast-tracking the launch of its R2 model, the successor to R1, ahead of its originally expected May release. The move is generating considerable excitement and concern within the AI industry. DeepSeek's ability to build cost-effective AI models has already disrupted the market, and R2 promises to further challenge the dominance of established players.

What is DeepSeek R2?
While DeepSeek has remained tight-lipped about R2, some information has emerged. R2 is expected to have:
- Improved coding capabilities
- Enhanced multilingual reasoning
- Multimodal functionality
- Vast improvement in reasoning ability with expanded reinforcement learning (RL) training datasets
According to the company's technical report on R1, that leap in reasoning is expected to come from substantially expanded reinforcement learning (RL) training. DeepSeek aims to build on the success of R1, which was developed on less powerful Nvidia chips and at a fraction of the cost of AI models from U.S. tech giants.
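The R1 report describes reinforcement learning in which the model samples several candidate answers per prompt and rule-based checkers score them. As a rough illustration only (not DeepSeek's actual code), a minimal sketch of the group-relative advantage used in GRPO-style training could look like the following, with the group size and reward values invented for the example:

```python
import numpy as np

def group_relative_advantages(rewards):
    """Normalize each sampled answer's reward against its own group (GRPO-style)."""
    rewards = np.asarray(rewards, dtype=float)
    return (rewards - rewards.mean()) / (rewards.std() + 1e-8)

# Hypothetical example: the model samples six candidate answers for one prompt,
# and a rule-based checker marks each one correct (1.0) or incorrect (0.0).
rewards = [1.0, 0.0, 0.0, 1.0, 1.0, 0.0]
print(group_relative_advantages(rewards))
# Correct answers receive positive advantages and incorrect ones negative,
# and that signal is what drives the policy update.
```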
DeepSeek’s Rise and Impact
DeepSeek gained recognition for its cost-effective AI models. Its R1 and V3 models have prompted pricing changes across the AI industry. The company used techniques such as Mixture-of-Experts (MoE) and multi-head latent attention (MLA) to achieve high performance at a lower cost: MoE activates only the parts of the model relevant to a given input, while MLA compresses the attention mechanism's keys and values into a compact latent representation to reduce memory use. Reportedly, DeepSeek's models were 20 to 40 times cheaper than OpenAI's equivalents, prompting rivals to cut prices and adjust strategies.
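To make the MoE idea concrete, here is a minimal, illustrative top-k routing sketch (not DeepSeek's implementation; the sizes and weights are arbitrary). Only the experts selected by the gate run for a given token, which is why most of the model's parameters stay idle on any single forward pass:

```python
import numpy as np

rng = np.random.default_rng(0)
d_model, n_experts, top_k = 16, 8, 2

# Each "expert" stands in for a feed-forward sub-network: here, a single matrix.
experts = [rng.normal(size=(d_model, d_model)) for _ in range(n_experts)]
gate_w = rng.normal(size=(d_model, n_experts))  # router / gating weights

def moe_forward(x):
    """Route one token vector through its top-k experts only."""
    scores = x @ gate_w                    # one routing score per expert
    chosen = np.argsort(scores)[-top_k:]   # indices of the k highest-scoring experts
    weights = np.exp(scores[chosen])
    weights /= weights.sum()               # softmax over the selected experts only
    # Unselected experts are never evaluated, which is where the compute savings come from.
    return sum(w * (x @ experts[i]) for w, i in zip(weights, chosen))

token = rng.normal(size=d_model)
print(moe_forward(token).shape)  # (16,): same dimensionality as the input
```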
DeepSeek's popularity rose rapidly, with its AI chatbot app briefly surpassing OpenAI's ChatGPT as the most downloaded app on the Apple App Store in the U.S. The surge triggered a sell-off in global equity markets, with Nvidia stock dropping more than 15% in a single trading day.
Challenges and Concerns
Despite its achievements, DeepSeek faces several challenges with the rollout of R2:
- Potential for further U.S. export restrictions: The release of R2 could prompt the U.S. government to impose stricter restrictions on the export of GPUs to China.
- Privacy concerns: Governments from South Korea to Italy have already removed DeepSeek from app stores due to privacy concerns.
- Need for advanced GPUs: DeepSeek developed R1 on limited computing capacity by leaning on techniques like MoE and MLA, but further gains in post-training and in serving inference at scale will still require advanced GPUs.
State Support and Geopolitical Implications
DeepSeek has gained favor with the Chinese government. Liang Wenfeng, the billionaire founder of DeepSeek, met with Chinese Premier Li Qiang as the AI sector's representative. The success of DeepSeek's cost-effective models has reinforced Beijing's confidence in China's ability to out-innovate the U.S. Chinese companies and government bodies have been quick to adopt DeepSeek's models. Multiple city governments and state-owned energy companies have deployed DeepSeek into their systems, while tech giants like Lenovo, Baidu, and Tencent have integrated DeepSeek's models into their products.
However, Western regulators may view the widespread adoption of DeepSeek across Chinese state entities as grounds to escalate restrictions on AI chips or software collaborations. Access to advanced AI chips already remains a challenge for DeepSeek because of U.S. export controls.
Impact on the AI Industry
The launch of DeepSeek’s R2 model could have a transformative impact on the AI industry. Its cost-effectiveness could drive companies worldwide to accelerate their AI efforts and challenge the dominance of a few major players. If DeepSeek can deliver a model with comparable quality to those from leading Western AI companies at a fraction of the cost, it could significantly alter the competitive landscape. This could lead to increased competition, faster innovation, and greater accessibility to AI technologies.