

Even Nvidia’s own research teams can’t get enough GPUs



Welcome to Eye on AI, with AI reporter Sharon Goldman. The pro-Iran meme machine trolling Trump with AI Lego cartoons…Amazon CEO Andy Jassy defends the company’s $200 billion spending spree…OpenAI pauses its Stargate U.K. data center, citing energy costs.

It’s been another one of those wild weeks in AI, with Anthropic electing not to release its new Claude Mythos model because of concerns about the cybersecurity risks it poses (and forming a coalition to use a preview version of the model to bolster cybersecurity defenses); Meta releasing its first AI model since hiring Alexandr Wang; and mounting expectations about OpenAI’s upcoming new “Spud” model. 

Most of these AI models run on Nvidia GPUs, the sophisticated and expensive AI chips (at over $30,000 a pop) that power their training and output. But across the industry, access to those chips remains a bottleneck. OpenAI president Greg Brockman, for example, has said allocating GPUs at OpenAI is “pain and suffering.”

This week, at the HumanX conference in San Francisco, I discovered that even inside Nvidia, GPUs are scarce.

I sat down with Bryan Catanzaro, who leads applied deep learning research at Nvidia, overseeing teams working on AI-driven graphics, speech recognition, and simulation. Catanzaro was also among the first, back in the early-to-mid 2010s, to notice researchers snapping up Nvidia GPUs to train AI models—a signal that helped push CEO Jensen Huang to double down on AI, setting the stage for the company’s now-historic run.

Today, though, even Catanzaro’s teams are struggling to access enough GPUs. “My team uses AI very deeply in our work, and their primary complaint is they want higher limits,” Catanzaro told me. “They want more GPUs.”

“Efficiency is also intelligence”

In fact, he said one of his main jobs now is simply trying to secure more compute for his teams. “We’re all supply constrained,” he said. “Jensen will say, ‘I’m sorry, Bryan, but those are sold.’ We operate within those constraints.”

One of Catanzaro’s projects has been leading the team building Nvidia’s Nemotron, a family of models that are open source—meaning users can freely download them to use, study, or modify. To be clear, Nvidia isn’t trying to compete in the model-building race with the likes of OpenAI and Anthropic. Instead, it’s building them to strengthen a developer ecosystem that remains tied to Nvidia hardware and software. 

The Nemotron models are known for being particularly GPU-efficient. And Catanzaro said it’s the very constraints on GPU access at Nvidia itself that are driving the push to make Nemotron models more efficient. “In a supply-constrained world, efficiency is also intelligence,” he said.

No longer a science project

But surprisingly, efficiency isn’t bad for business. Catanzaro said it was the Jevons paradox at work: When something becomes more efficient, demand often surges. “People find all sorts of new ways to use a thing when it gets more efficient,” he said.

Still, he acknowledged that Nemotron’s growing visibility inside Nvidia has also helped unlock more resources. “We’ve been working on [Nemotron] for a long time, but it’s really only in the past six months that it’s gotten more attention. As people inside Nvidia better understand the importance of this work, you get better storytelling, better collaboration, and more support across the company.”

Nvidia has realized, he added, that it can no longer take a hands-off approach to the AI ecosystem. In the past, Nvidia could rely on others to build the models and applications that drove demand for its chips. Now, as AI becomes more competitive and chip-constrained, the company sees a more active role for itself in shaping how that ecosystem develops.

“In the past, some people felt like we could just let the ecosystem take care of itself,” he said. “Now it’s much more obvious that Nvidia has a bigger role to play—a real responsibility and opportunity with Nemotron.”

That framing also helps elevate the Nemotron work inside Nvidia, where teams are competing for scarce GPU resources. “This isn’t a science project,” Catanzaro said. “It’s not just me asking for resources for my team. This is about Nvidia’s future.”

With that, here’s more AI news.

Sharon Goldman
sharon.goldman@fortune.com
@sharongoldman

FORTUNE ON AI

Meta unveils Muse Spark, its first AI model since hiring Alexandr Wang and a bellwether for CEO Mark Zuckerberg’s multi-billion dollar AI push–by Jeremy Kahn

Supermicro launches internal probe after cofounder’s arrest on charges of $2.5 billion in chip smuggling–by Amanda Gerut

A Meta employee created a dashboard so coworkers can compete to be the company’s No. 1 AI token user—and Zuckerberg doesn’t even rank in the top 250–by Jacqueline Munis

AI IN THE NEWS

The pro-Iran meme machine trolling Trump with AI Lego cartoons. A new report from Wired describes how a group of young pro-Iranian creators called Explosive Media is using AI-generated, Lego-style videos to spread sophisticated, viral propaganda during the current conflict, reaching millions across TikTok, X, and Instagram. Unlike traditional state messaging, the videos blend humor, internet-savvy cultural references, and simplified storytelling to resonate with American audiences, even incorporating memes and English-language rap. Researchers say the strategy is effective because it distills complex geopolitical events into highly shareable content while tapping into existing disaffection in the U.S., illustrating how AI tools are enabling a new kind of “slopaganda” war—where influence campaigns are faster, more targeted, and far more culturally fluent than in the past.

Amazon’s Andy Jassy defends Amazon’s $200B spending spree. GeekWire reported on Amazon CEO Andy Jassy’s latest shareholder letter, which revealed that AWS’s AI business has already reached a $15 billion annual revenue run rate, which Jassy argued means demand is strong enough to justify roughly $200 billion in planned capex. Jassy framed AI as a “once-in-a-lifetime” opportunity and positioned Amazon squarely in the middle of the current AI “land rush,” pointing to surging demand for its custom chips like Trainium—some of which are already largely sold out years in advance—as well as interest from customers eager to secure future capacity. The letter makes clear that Amazon is betting aggressively on owning more of the AI stack, from infrastructure to chips to potentially selling those capabilities externally.

OpenAI pauses Stargate UK data center, citing energy costs. According to Bloomberg, OpenAI is pausing its planned Stargate data center project in the UK, highlighting how even the most aggressive AI infrastructure buildouts are running up against real-world constraints like energy costs and regulation. The move comes as the company reins in spending ahead of a potential IPO and narrows focus to its core ChatGPT business amid intensifying competition from Anthropic and Google. While OpenAI says it still sees long-term potential in the UK, the decision underscores a broader reality: Massive AI infrastructure bets—from Texas to Norway to the UAE—are increasingly shaped not just by ambition, but by economics, geopolitics, and access to affordable power.

EYE ON AI NUMBERS

75%

That’s how many executives say their AI strategy is more about optics than any actual internal guidance, according to Writer’s new 2026 Enterprise AI Adoption Report, which surveyed 2,400 knowledge workers including 1,200 C-suite executives and 1,200 employees. In addition, 39% have no plan for how AI actually drives revenue. Yet, 69% are planning layoffs this year.

In a LinkedIn post, Writer CEO May Habib called this trend “‘AI theater’ at its worst,” adding, “this strategy vacuum up top is literally tearing companies apart.”

AI CALENDAR

June 8-10: Fortune Brainstorm Tech, Aspen, Colo. Apply to attend here.

July 6-11: International Conference on Machine Learning (ICML), Seoul, South Korea.

July 7-10: AI for Good Summit, Geneva, Switzerland.

Aug. 4-6: Ai4, Las Vegas.


