The 10 Trillion Dollar Question
12 February, 2025

“DeepSeek’s model cost only $5.8 million to train.” This is the headline that shook the markets on Monday. It triggered a selloff across Big Tech, and Nvidia was hit the hardest, with a single-day drop of roughly $600 billion in market capitalisation, the largest ever in the history of the United States. DeepSeek has sown doubts among investors about the American approach to AI and whether the hundreds of billions of dollars that have already been invested will deliver the promised returns. Most news articles and commentators are focusing on the idea that the need for large data centres and Nvidia GPUs has decreased. Some call this an overreaction, and there may be some merit to that, but I believe the market is still missing the main takeaways. DeepSeek is truly disruptive because it will change Big Tech’s business models and investment patterns, the government’s approach to policy and investment, and the value proposition of AI use cases for enterprises.

There is currently scepticism around the reported cost and spending figures for the R1 model. However, the published papers on DeepSeek’s V3 and R1 models describe a few key technological advances. DeepSeek leverages a “teacher-student” distillation approach, which reduces human intervention and makes the training process more autonomous. It also uses a Mixture of Experts (MoE) architecture with a new load-balancing method, so that only about 5% of the model’s parameters are active during inference, a significant gain in efficiency. Additionally, it introduced Multi-head Latent Attention (MLA), which compresses the key-value cache, shrinking the memory footprint per token and slashing overall memory usage during inference. The combination of these novel approaches makes the reported cost figure plausible. The 10 trillion-dollar question is: what does this mean for the future of AI?
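To make the MoE idea concrete, here is a minimal sketch of top-k expert routing in Python. The expert count, top-k value, and dimensions are illustrative assumptions, not DeepSeek’s actual configuration or load-balancing scheme; the point is simply that only a small fraction of the model’s parameters is exercised for each token.

```python
# Minimal sketch of Mixture-of-Experts routing, for illustration only.
# Sizes and structure are assumptions, not DeepSeek's actual implementation;
# the takeaway is that only a small top-k subset of experts runs per token.
import numpy as np

rng = np.random.default_rng(0)

n_experts = 64   # total experts in the layer (hypothetical)
top_k = 4        # experts activated per token (hypothetical)
d_model = 128    # hidden size (hypothetical)

# Each "expert" here is just a random weight matrix standing in for a feed-forward block.
experts = [rng.standard_normal((d_model, d_model)) / np.sqrt(d_model) for _ in range(n_experts)]
router = rng.standard_normal((d_model, n_experts)) / np.sqrt(d_model)

def moe_forward(x: np.ndarray) -> np.ndarray:
    """Route a single token vector to its top-k experts and mix their outputs."""
    logits = x @ router                     # score every expert for this token
    top = np.argsort(logits)[-top_k:]       # keep only the k best-scoring experts
    weights = np.exp(logits[top])
    weights /= weights.sum()                # softmax over the selected experts
    # Only top_k of the n_experts matrices are ever multiplied; the rest stay idle.
    return sum(w * (x @ experts[i]) for w, i in zip(weights, top))

token = rng.standard_normal(d_model)
out = moe_forward(token)
print(f"Active experts per token: {top_k}/{n_experts} "
      f"({top_k / n_experts:.0%} of expert parameters used)")
```

With 4 of 64 experts active, only around 6% of the expert parameters are touched per token, which is the kind of sparsity behind the roughly 5% figure cited above.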

Big Tech’s AI playbook until now has been simple: invest to build massive data centres and secure the computational infrastructure needed to train increasingly large models. The strategy was that once AI truly ramped up productivity, Big Tech would be right at the centre of monetising it, offering enterprise customers both the sophisticated AI models and the hardware to run them. Up to this point, the poster child for this strategy has been OpenAI. They are market leaders in terms of model quality, in part because they had access to significant capital, most notably through Microsoft, whose financial and infrastructure support allowed OpenAI to keep advancing the boundaries of what is possible. The thesis was that creating frontier AI applications requires scale and billions of dollars of investment. DeepSeek’s emergence has compelled the market to question this vision. The fact is that advanced models can now be trained for millions of dollars instead of billions, and as a result, Big Tech is bound to refocus its spending on real-world use cases rather than raw R&D, and to invest in data centres backed by actual consumer demand.

The launch of this model will further accelerate an ongoing shift in the start-up landscape. Instead of building the next large language model, start-ups are pivoting to Agentic AI, offering actual tools and functionality that businesses can adopt. Garry Tan, CEO of Y Combinator, said on a podcast that the past decade was dominated by SaaS, but the next one will be all about Agentic AI start-ups providing specialised, dynamic solutions to enterprises.

As Big Tech’s investment focus shifts, a void will open up, because there is still a need to invest in the development of advanced AGI models. This is where the US government will step in. I think the US government already sensed it would need to intervene when DeepSeek V3 was released in December 2024, and hence it may have been no coincidence that the Stargate Project announcement was timed to coincide with the launch of R1. Beyond investment, this moment will also mark a shift in American AI policy. Necessity is the mother of invention, and the US, through the 2023 Executive Order on AI (EO 14110) and the export controls on advanced chips, created the perfect storm that led to the DeepSeek breakthrough. The Trump administration’s decision to rescind EO 14110 will probably foster a more open-source and collaborative environment, which could accelerate further innovation. On the export-control and tariff side, I believe we can now understand the administration’s hesitancy to increase tariffs.

What does this mean for Nvidia? There are two sides to the story. The first is whether we need as many Nvidia chips as we thought we did. Satya Nadella best described the situation with his tweet “Jevons Paradox strikes again.” The Jevons Paradox refers to the idea that when a technology becomes more efficient at using a resource, falling costs can stimulate so much additional demand that total consumption of that resource actually rises. So it is hard to conclude whether or not we will need as many chips as we thought. The second is whether Nvidia’s moat still exists. Nvidia’s two key advantages were its ability to combine multiple GPUs efficiently into what is effectively a single virtual GPU, and CUDA, the programming platform of choice, which works only with Nvidia’s GPUs. DeepSeek achieved what it did without relying on these key features, which means Nvidia’s moat has weakened.
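A toy calculation, with entirely made-up numbers, shows how the paradox can play out: even if efficiency gains cut the compute needed per model, total demand for chips can still grow if far more players start training and deploying models.

```python
# Toy illustration of the Jevons Paradox. All figures are hypothetical
# assumptions for the sake of the example, not real market data.

cost_per_model_before = 100.0   # arbitrary compute units per model, pre-efficiency gains
cost_per_model_after = 5.0      # ~20x cheaper after efficiency gains

models_trained_before = 10      # few players could afford training at the old cost
models_trained_after = 400      # assumption: lower costs draw in many more players

total_compute_before = cost_per_model_before * models_trained_before   # 1,000 units
total_compute_after = cost_per_model_after * models_trained_after      # 2,000 units

print(f"Total compute demand before: {total_compute_before:,.0f} units")
print(f"Total compute demand after:  {total_compute_after:,.0f} units")
# Each model is far cheaper, yet aggregate demand for compute (and chips) has doubled.
```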

All things considered, the unveiling of the DeepSeek R1 model feels like a watershed moment. Over the last few years, we have seen staggering sums poured into AI, with everyone talking about the massive productivity gains that would eventually come. Training and inference have become drastically cheaper, which will push Big Tech towards targeted investments in applications and real use cases rather than simply scaling up compute power. The US government, seeing the strategic importance of AI research, is prepared to fill the gap and ensure the country remains a leader in the field. Meanwhile, Nvidia finds itself at a crossroads: on the one hand there is uncertainty about its future demand, and on the other its moat may have just narrowed, leaving it facing increased competitive and pricing pressure.

In short, DeepSeek’s debut has kicked off a whole new phase in AI. Its innovative architecture is going to transform how Big Tech invests, how Washington designs AI policy, and how companies large and small approach the AI opportunity. I believe we are at the start of a new wave in which DeepSeek’s $5.8 million training cost is not just a headline. It is a sign that AI is already knocking at our doors, and companies and governments must either adapt or risk being left behind.

Shiv Batra