I'm Ylli Bajraktari, CEO of the Special Competitive Studies Project. With the news of PRC AI firm DeepSeek's headline-generating new large language model, R-1, I sat down with SCSP staff PJ Maykish, David Lin, Channing Lee, Libby Lange, Brady Helwig, and Addis Goldman to share our perspectives in an urgent addition to our #MemostothePresident Series.
From the J-20 to the Mate 60, and now to R-1
In January 2011, just hours before then-U.S. Defense Secretary Robert Gates met with then-PRC President Hu Jintao, the People's Liberation Army (PLA) test-flew its J-20 fifth-generation stealth fighter, a clear display of its military modernization. In September 2023, Huawei launched its Mate 60 Pro smartphone during then-U.S. Commerce Secretary Gina Raimondo's visit, defying U.S. export controls targeting the company's access to 5G technology.
Last week, during President Trump's second inauguration, PRC AI firm DeepSeek unveiled its R-1 large language model (LLM), an open-source model that reportedly matches the performance of many leading U.S. closed models at much lower compute cost. The timing was striking: leading U.S. tech companies had just announced a $500 billion investment in AI infrastructure, and the Biden Administration had proposed a sweeping regulatory framework days before leaving office, both announcements underscoring the centrality of compute to the future of AI development. DeepSeek's accomplishment appeared to flip that script. The announcement, which only reached mainstream U.S. media outlets this week, has since sent shockwaves through the U.S. tech sector and financial markets. From military jets to smartphones, and now AI, DeepSeek's milestone is probably as much a point of pride for Chinese engineers as it is the latest sign of China's unrelenting technological advancement in the face of growing geopolitical pressure.
Below, we break down some of the implications DeepSeek’s R-1 has for different aspects of the technology competition:
What Are the Implications of DeepSeek’s R-1 for…
…the Way AI Models are Trained?
Early analysis indicates that DeepSeek's models achieve better-than-Llama performance through a combination of innovations, most of which were already present in the global ecosystem but were combined in creative ways. In short, the models succeed because they carefully organize memory, divide tasks among specialized experts, and use CPU and GPU resources efficiently.
Specifically, DeepSeek's standout performance comes from combining seven main innovations (most of which are listed in SCSP's AGI trends paper): Mixture of Experts (MoE), data quantization, smart GPU usage, clever CPU coordination, advanced model designs, high-quality data curation, and parallel processing. For example, Mixture of Experts can cut training costs by up to 4× by letting different parts of the model specialize in specific tasks, and quantization can shrink the memory footprint of model parameters by more than 70% while keeping accuracy high. DeepSeek also uses GPU partitioning to speed up computations by splitting heavy jobs across multiple processors, and CPU coordination (creating "Lite-GPUs") to handle big data flows efficiently. Its model structure borrows from the "transformer architecture" used by other LLMs but adds unique tweaks for better learning, such as relying on reinforcement learning rather than supervised fine-tuning, while careful data curation ensures the training material is clean and diverse. Finally, parallel processing helps DeepSeek handle large-scale tasks faster, keeping it at the cutting edge of modern AI performance.
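To make the first of these innovations concrete, here is a generic, minimal Mixture of Experts layer in PyTorch. It is an illustrative sketch of the technique, not DeepSeek's actual implementation: a learned router sends each token to its top-k experts, so only a fraction of the model's parameters is active per token, which is where the training-cost savings come from.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MoELayer(nn.Module):
    def __init__(self, d_model=512, d_ff=2048, n_experts=8, top_k=2):
        super().__init__()
        self.top_k = top_k
        # One small feed-forward network per expert.
        self.experts = nn.ModuleList([
            nn.Sequential(nn.Linear(d_model, d_ff), nn.GELU(), nn.Linear(d_ff, d_model))
            for _ in range(n_experts)
        ])
        # The router scores every expert for every token.
        self.router = nn.Linear(d_model, n_experts)

    def forward(self, x):                               # x: (n_tokens, d_model)
        scores = self.router(x)                         # (n_tokens, n_experts)
        weights, expert_idx = scores.topk(self.top_k, dim=-1)
        weights = F.softmax(weights, dim=-1)            # normalize over chosen experts
        out = torch.zeros_like(x)
        # Each token is processed only by its top-k experts.
        for k in range(self.top_k):
            for e, expert in enumerate(self.experts):
                mask = expert_idx[:, k] == e
                if mask.any():
                    out[mask] += weights[mask, k:k+1] * expert(x[mask])
        return out

layer = MoELayer()
tokens = torch.randn(16, 512)                           # a batch of 16 token embeddings
print(layer(tokens).shape)                              # torch.Size([16, 512])
```

DeepSeek's published designs reportedly use far more experts plus additional load-balancing techniques; the sketch above shows only the core routing idea.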
…the Future of AI Infrastructure?
The release of DeepSeek's model has also amplified existing concerns from investors about whether the rapid expansion of AI infrastructure can be sustained. U.S. hyperscalers are set to spend well over $200 billion on AI infrastructure in 2025, drawing in a wide range of suppliers, from chipmakers to fiber optic providers and energy companies. Now that enthusiasts are successfully running distilled versions of DeepSeek's open-sourced models on Google Pixel smartphones and gaming laptops, are hundreds of billions of dollars in annual capital investment justified?
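For illustration, running one of the small distilled R-1 variants on consumer hardware can be as simple as the following sketch using Hugging Face transformers. The model id below is the publicly listed 1.5B-parameter distilled checkpoint; exact names and memory requirements may vary.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Smallest publicly listed distilled R-1 checkpoint: small enough for a laptop.
model_id = "deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B"
tok = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype="auto")

prompt = "How many prime numbers are there below 30?"
inputs = tok(prompt, return_tensors="pt")
output = model.generate(**inputs, max_new_tokens=256)
print(tok.decode(output[0], skip_special_tokens=True))
```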
Similarly, DeepSeek has challenged assumptions about the unit economics of generative AI-enabled business models. Because it significantly reduces the cost-to-performance ratio traditionally associated with delivering AI services, some have argued that DeepSeek's release has punctured the "AI bubble." Its rise as a commercial competitor with global reach has called into question whether consumer AI companies like OpenAI and Anthropic have durable advantages, and whether silicon providers like Nvidia can continue to expect seemingly limitless demand for their products.
To be sure, these claims have strong counterpoints. Jevons Paradox suggests that improvements in compute efficiency will lead to greater diffusion of AI across the economy, and thus to greater total demand for compute. The shift toward "test-time compute" as the new paradigm for AI scaling means that building massive data center clusters is still justified. But in the wake of DeepSeek's emergence, expect the investors writing billion-dollar checks to ask tougher questions. Moreover, algorithmic improvements and efficiency gains in AI development were inevitable. DeepSeek's release has simply reshaped the competitive landscape, and because its models are open source, U.S. competitors can rapidly adopt, adapt, and deploy its innovations.
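As a concrete illustration of test-time compute, the sketch below implements best-of-n sampling, one of its simplest forms: spending more inference-time compute on a query by drawing several candidate answers and keeping the best one. The model id and the scoring function are placeholders; production systems typically use learned verifiers or reward models.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Qwen/Qwen2.5-0.5B"   # hypothetical stand-in: any small open model works
tok = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

def score(answer: str) -> float:
    # Placeholder scorer; a real system would use a verifier or reward model.
    return float(len(answer))

def best_of_n(prompt: str, n: int = 8) -> str:
    inputs = tok(prompt, return_tensors="pt")
    # Spending more test-time compute means sampling more candidates.
    outputs = model.generate(**inputs, do_sample=True, temperature=0.8,
                             max_new_tokens=128, num_return_sequences=n)
    candidates = [tok.decode(o, skip_special_tokens=True) for o in outputs]
    return max(candidates, key=score)

print(best_of_n("Q: What is 12 * 13?\nA:"))
```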
…for the Open Source vs. Closed Source Model Weight Debate?
Perhaps the most striking feature of the DeepSeek story is that its model offerings are entirely open source, in the sense that the model weights underpinning its LLM offerings have been publicly released. Given the Chinese Communist Party's (CCP) preference for total control, its tolerance of the norms and practices associated with open source technology development has come as a shock to many. On closer inspection, however, Beijing's position is unsurprising: PRC officials clearly see the encouragement of open source approaches as part of a competitive strategy to offset U.S. advantages, particularly in access to advanced AI chips.
DeepSeek has adopted a familiar playbook. Meta, for example, has long touted the competitive advantages of open source model development as a driver of algorithmic improvements at the software layer of the AI stack. Indeed, DeepSeek has demonstrated that the reasoning capabilities developed in its reinforcement learning-trained R-1 can be transferred, via distillation, to open source models like Meta's Llama 3. DeepSeek's rise to prominence challenges the assumption that closed-source, proprietary approaches hold a decisive edge. Instead, the firm's achievements suggest that open-source strategies can rival, and perhaps even surpass, proprietary development models on the global stage.
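For readers who want the mechanics, below is a minimal, hypothetical sketch of the distillation idea: supervised fine-tuning of a small open-weights "student" on reasoning traces produced by a stronger "teacher." The model id and the one-example dataset are placeholders, not DeepSeek's actual pipeline, which reportedly used on the order of 800,000 curated teacher-generated samples.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

student_id = "Qwen/Qwen2.5-0.5B"   # hypothetical small open-weights student
tok = AutoTokenizer.from_pretrained(student_id)
student = AutoModelForCausalLM.from_pretrained(student_id)
optimizer = torch.optim.AdamW(student.parameters(), lr=1e-5)

# A single teacher-generated reasoning trace (placeholder). Real distillation
# runs reportedly use hundreds of thousands of such samples.
trace = ("Q: What is 17 * 24?\n"
         "<think>17 * 24 = 17 * 20 + 17 * 4 = 340 + 68 = 408</think>\n"
         "A: 408")
batch = tok(trace, return_tensors="pt")

# Standard supervised fine-tuning step: the student learns to reproduce the
# teacher's reasoning trace token by token (causal language-modeling loss).
student.train()
loss = student(**batch, labels=batch["input_ids"]).loss
loss.backward()
optimizer.step()
print(f"distillation loss: {loss.item():.3f}")
```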
…for U.S. Export Controls?
The emergence of DeepSeek has raised questions about the effectiveness of U.S. export controls on advanced AI chips and whether AI developers even need the leading-edge chips to make breakthroughs. Many of these criticisms are warranted: significant shortcomings in the implementation and scope of U.S. controls have limited their effectiveness. The U.S. Government was slow to update controls, allowing companies like DeepSeek to amass large quantities of chips before stricter measures took effect (details are murky, but reports indicate DeepSeek has access to approximately 50,000 H800 GPUs). DeepSeek continues to have access to substantial compute resources through both Chinese and foreign cloud services, and even after the December 2024 update to the controls, Nvidia’s H20 GPU—a cut-down, but still powerful inference chip—remains unrestricted. The need for multiple rounds of updates to close gaps suggests the initial restrictions were not comprehensive enough, and that, as the Permanent Subcommittee on Investigations has determined, Congress has not provided adequate resources to enforce them.
U.S. export controls on advanced AI chips have also achieved some meaningful successes. The measures have created genuine constraints on Chinese AI companies, as demonstrated by DeepSeek's founder explicitly citing "bans on advanced chips" as the company's biggest constraint. The controls have also successfully limited access to the most cutting-edge AI chips, forcing companies to rely on less capable versions. The Department of Commerce's AI Diffusion Framework, launched on January 13, aims to close additional loopholes that allowed foreign adversaries to access cloud computing resources and to smuggle GPUs via third countries, further tightening China's access to U.S. compute offerings.
…for PRC Competitiveness in Mobile Applications?
As with TikTok, there are legitimate, and arguably more concerning, national security threats posed by the use of a PRC-built LLM: namely, potential PRC government access to U.S. citizens' data. People share copious amounts of sensitive and personal data with LLMs, perhaps because the conversation-like user interface leads users to let their guard down. This data could be used to track individuals, create detailed profiles of them, and gradually influence their beliefs. DeepSeek's developers do not try to hide their expansive data collection and storage practices: the app automatically collects information such as "device model, operating system, keystroke patterns or rhythms, IP addresses, and system language." They also collect information on "the features you use and the actions you take" and partner with advertisers to collect data on user activity outside the app itself. All of this data is stored on servers located in the People's Republic of China.
The ascent of DeepSeek should serve as a stark reminder of the national security dimensions of using PRC-based platforms. Notwithstanding concerns about Chinese government censorship, such as whether a PRC-made product will answer questions about Tiananmen Square, average users select their LLMs based on performance: which model is likely to complete a request most accurately and quickly. Our government and industry leaders should take the lead in educating the public about these risks before the app becomes entrenched.
…for the Future of the U.S.-China Tech Competition?
If nothing else, DeepSeek's arrival further underscores that the United States and the People's Republic of China, along with their technological innovation ecosystems, are engaged in close competition. U.S. companies made the first moves with the release of ChatGPT, Claude, Gemini, and the like; PRC companies have made moves of their own, beginning a couple of years ago with Baidu's less successful ErnieBot.
At the end of the day, however, global leadership in AI will be defined by more than one LLM. DeepSeek’s R-1 has merely forced tech executives and policymakers alike to revisit some longstanding assumptions about AI development, the resources required to train LLMs, and China’s ability to push through technological barriers. Like the J-20 stealth fighter and the Mate 60 Pro, DeepSeek’s breakthrough serves as a wake-up call that the pace of the tech competition is continuing unabated.