Look at this word cloud. It’s not just a colorful visualization – it’s the pulse of our technological future captured in a single image. The words that dominated Jensen Huang’s GTC 2025 keynote tell a story that should make every technologist, investor, and futurist sit up straight.
“GPU.” “AI.” “Computing.” “Factory.” “Token.” These aren’t just buzzwords – they’re the vocabulary of a revolution unfolding in real time.
And then Jensen dropped the bombshell that sent shockwaves across the industry:
We need 100x more compute
“The scaling law, this last year, is where almost the entire world got it wrong. The computation requirement, the scaling law of AI is more resilient and in fact, hyper accelerated. The amount of computation we need at this point is easily 100 times more than we thought we needed this time last year.”
Let that sink in. Not 20% more. Not double. One hundred times more compute than anticipated just twelve months ago.
Remember when we thought AI was advancing fast? Turns out, we were dramatically underestimating the compute hunger of truly intelligent systems. This isn’t gradual evolution – it’s a sudden, dramatic reimagining of what our infrastructure needs to become.
Why? Because AI has learned to think and act.
Jensen illustrated this with a seemingly simple problem – organizing a wedding seating chart while accommodating family feuds, photography angles, and traditional constraints. A Llama 3.1 model tackled it with a quick 439 tokens, confidently serving up the wrong answer. But DeepSeek, the reasoning model? It generated over 8,000 tokens, methodically thinking through approaches, checking constraints, and testing solutions.
This is the difference between an AI that simply responds and one that truly reasons. And that reasoning requires exponentially more computational horsepower.
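To put rough numbers on that horsepower, here is a minimal back-of-the-envelope sketch in Python. The two token counts are the ones from Jensen’s example; the per-query pass count and concurrency figures are purely illustrative assumptions, since real per-token cost also depends on model size and hardware.

```python
# Rough proxy (not from the keynote): for a given model, inference compute
# scales roughly with the number of tokens generated, so output length is a
# usable stand-in for cost.

ONE_SHOT_TOKENS = 439      # Llama 3.1's quick answer in the keynote example
REASONING_TOKENS = 8_000   # DeepSeek's reasoning trace in the same example

ratio = REASONING_TOKENS / ONE_SHOT_TOKENS
print(f"A reasoning pass generates ~{ratio:.0f}x more tokens")  # ~18x

# Layer on scale: many users, each triggering multi-step reasoning.
# These two numbers are hypothetical, chosen only to show how demand compounds.
CONCURRENT_SESSIONS = 1_000
REASONING_PASSES_PER_QUERY = 3   # e.g. plan, check constraints, revise

tokens_per_query = REASONING_TOKENS * REASONING_PASSES_PER_QUERY
print(f"Tokens per reasoning query: {tokens_per_query:,}")
print(f"Tokens for {CONCURRENT_SESSIONS:,} concurrent queries: "
      f"{tokens_per_query * CONCURRENT_SESSIONS:,}")
```

Even this crude proxy shows how a single product decision – “make the model reason before it answers” – multiplies compute demand by an order of magnitude or more.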
What does this mean for the industry?
If you’re building AI applications, your infrastructure roadmap just changed dramatically. If you’re investing in tech, the winners will be those who can solve this compute challenge. And if you’re watching from the sidelines, prepare to witness a massive transformation of our digital landscape.
The hunt for 100X compute isn’t just NVIDIA’s problem – it’s the defining challenge for the entire tech ecosystem. And how we respond will reshape industries, markets, and possibly society itself.
The question isn’t whether we need to scale dramatically – it’s how we’ll achieve this scale in ways that are practical, sustainable, and accessible to more than just the tech giants.
The race for the next generation of compute has officially begun. And the stakes couldn’t be higher.
Data centers will be power limited
While Jensen’s 100X revelation left the audience stunned, it was his description of how computing itself is changing that truly illuminates the path forward.
“Every single data center in the future will be power limited. The revenues are power limited.”
This isn’t just a technical constraint – it’s an economic reality that’s reshaping the entire compute landscape. When your ability to generate value is directly capped by how much power you can access and efficiently use, the game changes completely.
The traditional approach? Build bigger data centers. But as Jensen pointed out, we’re approaching a trillion-dollar datacenter buildout globally – a staggering investment that still won’t satisfy our exponentially growing compute demands, especially with these new power constraints.
This is where the industry finds itself at a crossroads, quietly exploring alternative paths that could complement the traditional centralized model.
What if the solution isn’t just building more massive data centers, but also harnessing the vast ocean of underutilized compute that already exists? What if we could tap into even a fraction of the idle processing power sitting in devices worldwide?
Jensen himself hinted at this direction when discussing the transition from retrieval to generative computing:
“Generative AI fundamentally changed how computing is done. From a retrieval computing model, we now have a generative computing model.”
This shift doesn’t just apply to how AI generates responses – it can extend to how we generate and allocate compute resources themselves.
At Spheron, we are exploring precisely this frontier – envisioning a world where compute becomes programmable, decentralized, and accessible through a permissionless protocol. Rather than just building more centralized factories, our approach aims to create fluid marketplaces where compute can flow to where it’s needed most.
Agents, Agents & Agents
Jensen didn’t just talk about more powerful hardware – he laid out a vision for a fundamentally new kind of AI:
“Agentic AI basically means that you have an AI that has agency. It can perceive and understand the context of the circumstance. It can reason, very importantly, can reason about how to answer or how to solve a problem and it can plan an action. It can plan and take action.”
These agentic systems don’t just respond to prompts; they navigate the world, make decisions, and execute plans autonomously.
“There’s a billion knowledge workers in the world. There are probably going to be 10 billion digital workers working with us side-by-side.”
Supporting 10 billion digital workers requires not just computational power, but computational independence – infrastructure that allows these digital workers to acquire and manage their own resources.
An agent that can reason, plan, and act still hits a wall if it can’t secure the computational resources it needs without human intervention.
As Jensen’s presentation made clear, we’re building AIs that can think, reason, and act with increasingly human-like capabilities. But unlike humans, most of these AIs can’t independently acquire the resources they need to function. They remain dependent on API keys, cloud accounts, and payment methods controlled by humans.
Solving this requires more than just powerful hardware – it demands new infrastructure models designed specifically for agent autonomy. This is where Spheron’s programmable infrastructure comes into play where agents can directly lease compute resources through smart contracts without human intermediation.
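To make that concrete, here is a minimal, hypothetical sketch of what agent-initiated leasing could look like. The ComputeMarket, Offer, and lease names are illustrative stand-ins, not Spheron’s actual protocol or API; the point is simply that the agent discovers an offer, checks it against its own budget, and commits to it without a human in the loop.

```python
# Hypothetical sketch of agent-initiated compute leasing. All names are
# illustrative placeholders, not a real marketplace API.

from dataclasses import dataclass

@dataclass
class Offer:
    provider: str
    gpu_model: str
    price_per_hour: float  # denominated in some settlement token

class ComputeMarket:
    """Stand-in for a permissionless marketplace the agent can query."""
    def __init__(self, offers):
        self.offers = offers

    def cheapest(self, gpu_model: str) -> Offer:
        candidates = [o for o in self.offers if o.gpu_model == gpu_model]
        return min(candidates, key=lambda o: o.price_per_hour)

    def lease(self, offer: Offer, hours: float, budget: float) -> dict:
        cost = offer.price_per_hour * hours
        if cost > budget:
            raise ValueError("Lease exceeds the agent's budget")
        # In a real protocol this step would be a signed on-chain transaction
        # that escrows payment and returns connection credentials.
        return {"provider": offer.provider, "hours": hours, "cost": cost}

# The agent's side: pick a GPU, lease it within budget, no human in the loop.
market = ComputeMarket([
    Offer("provider-a", "H100", 2.10),
    Offer("provider-b", "H100", 1.85),
])
offer = market.cheapest("H100")
lease = market.lease(offer, hours=4, budget=10.0)
print(f"Leased {offer.gpu_model} from {lease['provider']} "
      f"for {lease['cost']:.2f} tokens")
```

The design choice that matters here is that the budget check and the commitment both belong to the agent, which is exactly the independence that today’s API-key-and-credit-card model prevents.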
A new approach to increasing efficiency
As Jensen guided us through his roadmap for the next generation of AI hardware, he revealed a fundamental truth that transcends mere technical specifications:
“In a data center, we could save tens of megawatts. Let’s say 10 megawatts, well, let’s say 60 megawatts, 60 megawatts is 10 Rubin Ultra racks… 100 Rubin Ultra racks of power that we can now deploy into Rubins.”
This isn’t just about efficiency – it’s about the compute economics that will govern the AI era. In this world, every watt saved translates directly into computational potential. Energy isn’t just an operating expense; it’s the fundamental limiting factor on what’s possible.
When the computational ceiling is determined by power constraints rather than hardware availability, the economics of AI shift dramatically.
The question becomes not just “How much compute can we build?” but “How can we extract maximum value from every available watt?”
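One way to see this arithmetic is a short sketch of power-limited throughput: with a fixed power envelope, tokens per second is just power multiplied by tokens per joule, so an efficiency gain is indistinguishable from new capacity. All figures below are illustrative placeholders, not numbers from the keynote.

```python
# Power-limited throughput sketch. 1 watt = 1 joule/second, so sustained
# token throughput scales linearly with both wall power and energy efficiency.

def tokens_per_second(power_watts: float, tokens_per_joule: float) -> float:
    return power_watts * tokens_per_joule

FACILITY_POWER_W = 100e6      # hypothetical 100 MW power envelope
BASELINE_EFFICIENCY = 10.0    # hypothetical tokens generated per joule

baseline = tokens_per_second(FACILITY_POWER_W, BASELINE_EFFICIENCY)
improved = tokens_per_second(FACILITY_POWER_W, BASELINE_EFFICIENCY * 1.25)

print(f"Baseline throughput: {baseline:,.0f} tokens/s")
print(f"With a 25% efficiency gain: {improved:,.0f} tokens/s")
print(f"Extra capacity from the same wall power: {improved - baseline:,.0f} tokens/s")
```

When revenue is proportional to tokens served and power is the binding constraint, that extra capacity is pure upside, which is why “performance per watt” has become the headline metric.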
While NVIDIA focuses on squeezing more computation from each watt through better hardware design, we have designed a complementary approach that tackles the problem from a different angle.
What if, instead of just making each processor more efficient, we could more efficiently utilize all the processors that already exist?
This is where decentralized physical infrastructure networks (DePIN) like Spheron find their economic rationale: ensuring that no computational potential goes to waste.
The numbers tell a compelling story. At any given moment, compute worth more than $500B sits idle or underutilized across millions of powerful GPUs in data centers, gaming PCs, workstations, and small server clusters worldwide. Even harnessing a fraction of this latent compute power could significantly expand our collective AI capabilities without requiring additional energy investment.
The new compute economics isn’t just about making chips more efficient – it’s about ensuring that every available chip is working on the most valuable problems.
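As a sketch of that allocation idea (not Spheron’s actual scheduler), the snippet below ranks queued jobs by value and greedily matches them to whatever idle devices can run them, so no eligible capacity sits unused while valuable work waits. Device specs, job names, and values are all hypothetical.

```python
# Illustrative greedy matcher: highest-value jobs claim idle devices first.

from dataclasses import dataclass

@dataclass
class Device:
    name: str
    vram_gb: int
    busy: bool = False

@dataclass
class Job:
    name: str
    vram_needed_gb: int
    value: float  # whatever unit the marketplace uses to price the work

def allocate(devices: list[Device], jobs: list[Job]) -> list[tuple[str, str]]:
    assignments = []
    for job in sorted(jobs, key=lambda j: j.value, reverse=True):
        for device in devices:
            if not device.busy and device.vram_gb >= job.vram_needed_gb:
                device.busy = True
                assignments.append((job.name, device.name))
                break
    return assignments

idle_fleet = [Device("gaming-pc-4090", 24), Device("workstation-a6000", 48)]
queue = [Job("fine-tune-7b", 40, value=120.0), Job("batch-inference", 16, value=45.0)]
print(allocate(idle_fleet, queue))
# [('fine-tune-7b', 'workstation-a6000'), ('batch-inference', 'gaming-pc-4090')]
```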
What lies ahead
The 100X computation requirement isn’t just a technical challenge – it’s an invitation to reimagine our entire approach to infrastructure. It’s pushing us to invent new ways of scaling, new methods of allocation, and new models for access that extend far beyond traditional data center paradigms.
The word cloud we began with captures not just the keywords of Jensen’s keynote, but the vocabulary of this emerging future – a world where “scale,” “AI,” “token,” “factory,” and “compute” converge to create possibilities we’re only beginning to imagine.
As Jensen himself put it: “This is the way to solve this problem is to disaggregate… But as a result, we have done the ultimate scale up. This is the most extreme scale up the world has ever done.”
The next phase of this journey will involve not just scaling up, but scaling out – extending computational capacity across new types of infrastructure, new access models, and new autonomous systems that can manage their own resources.
We’re not just witnessing an evolution in computation, but a revolution in how computation is organized, accessed, and deployed. And in that revolution lies perhaps the greatest opportunity of our technological era – the chance to build systems that don’t just augment human capability, but fundamentally transform what’s possible at the intersection of human and machine intelligence.
The future will require not just better hardware, but smarter infrastructure that’s as programmable, as flexible, and ultimately as autonomous as the AI systems it powers.
That’s the true horizon of possibility that emerged from GTC 2025 – not just more powerful chips, but a fundamentally new relationship between computation and intelligence that will reshape our technological landscape for decades to come.