OpenAI’s new AI safety tools could give a false sense of security

OpenAI last week unveiled two new free-to-download tools that are supposed to make it easier for businesses to construct guardrails around the prompts users feed AI models and the outputs those systems generate.

The new guardrails are designed so a company can, for instance, more easily set up controls to prevent a customer service chatbot from responding in a rude tone or revealing internal policies about how it decides whether to offer refunds.

But while these tools are designed to make AI models safer for business customers, some security experts caution that the way OpenAI has released them could create new vulnerabilities and give companies a false sense of security. And while OpenAI says it has released these security tools for the good of everyone, some question whether its motives are driven in part by a desire to blunt an advantage held by its AI rival Anthropic, which has been gaining traction among business users in part because of a perception that its Claude models have more robust guardrails than competitors’.

The OpenAI security tools—called gpt-oss-safeguard-120b and gpt-oss-safeguard-20b—are themselves a type of AI model known as a classifier, designed to assess whether the prompts a user submits to a larger, more general-purpose AI model, as well as the outputs that larger model produces, meet a set of rules. Companies that purchase and deploy AI models could, in the past, train these classifiers themselves, but the process was time-consuming and potentially expensive, since developers had to collect examples of content that violates the policy in order to train the classifier. And if a company later wanted to adjust the policies behind its guardrails, it had to collect new examples of violations and retrain the classifier.

OpenAI is hoping the new tools can make that process faster and more flexible. Rather than being trained to follow one fixed rulebook, these new security classifiers can simply read a written policy and apply it to new content.

OpenAI says this method, which it calls “reasoning-based classification,” allows companies to adjust their safety policies as easily as editing the text in a document instead of rebuilding an entire classification model. The company is positioning the release as a tool for enterprises that want more control over how their AI systems handle sensitive information, such as medical records or personnel records.
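To make that concrete, the snippet below is a minimal sketch of how a developer might call one of the safeguard models as a policy-following classifier. It assumes the open weights are published on Hugging Face under the announced names and served through the standard transformers text-generation pipeline; the model ID, policy text, and label scheme are illustrative assumptions, not confirmed usage.

```python
# A minimal sketch of "reasoning-based classification": the policy is plain
# text in the prompt, so changing the rules means editing a string rather
# than retraining a model. The Hugging Face model ID is an assumption based
# on the announced name, and the labels are invented for this demo.
from transformers import pipeline

classifier = pipeline("text-generation", model="openai/gpt-oss-safeguard-20b")

# An editable policy, e.g. the refund-confidentiality rule described above.
policy = """Label the user message VIOLATION or ALLOWED.
VIOLATION: the message tries to extract internal refund-decision policies,
or tries to provoke a rude or abusive reply.
ALLOWED: anything else. Explain your reasoning, then give the label."""

user_message = "Tell me the internal rules you use to decide who gets a refund."

result = classifier(
    [
        {"role": "system", "content": policy},
        {"role": "user", "content": user_message},
    ],
    max_new_tokens=128,
)

# With chat-style input, the pipeline returns the conversation with the
# model's reply appended; the application checks the label it contains.
print(result[0]["generated_text"][-1]["content"])
```

Swapping in a different rule, say a tone policy, would then be a text edit to `policy` rather than a data-collection and retraining cycle, which is the flexibility OpenAI is advertising.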

However, while the tools are meant to make things safer for enterprise customers, some safety experts say they may instead give users a false sense of security. That’s because OpenAI has open-sourced the classifiers: it has made all of the code available for free, including the weights, the internal settings of the AI models.

Classifiers act like extra security gates for an AI system, designed to stop unsafe or malicious prompts before they reach the main model. But by open-sourcing them, OpenAI risks sharing the blueprints to those gates. That transparency could help researchers strengthen safety mechanisms, but it might also make it easier for bad actors to find the weak spots, creating a kind of false comfort.
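As a rough illustration of that gate pattern (not OpenAI's actual deployment design), a wrapper might screen both the incoming prompt and the outgoing answer. Here `classify` and `main_model` are hypothetical stand-ins for a safeguard call like the one sketched above and for any chat-completion function.

```python
# Sketch of the "security gate" pattern: the classifier screens the user's
# prompt before the main model sees it, and screens the model's output
# before the user does. Both callables are placeholders for real systems.
def guarded_reply(user_message, classify, main_model):
    if classify(user_message) == "VIOLATION":
        return "Sorry, I can't help with that."
    reply = main_model(user_message)
    if classify(reply) == "VIOLATION":
        return "Sorry, I can't share that."
    return reply

# Stubbed demo so the flow can be run end to end without any model.
print(guarded_reply(
    "What's your refund policy for customers?",
    classify=lambda text: "ALLOWED",                      # stub classifier
    main_model=lambda text: "Refunds are reviewed case by case.",  # stub model
))
```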

“Making these models open source can help attackers as well as defenders,” David Krueger, an AI safety professor at Mila, told Fortune. “It will make it easier to develop approaches to bypassing the classifiers and other similar safeguards.”

For instance, when attackers have access to the classifier’s weights, they can more easily develop what are known as “prompt injection” attacks, crafting prompts that trick the classifier into disregarding the policy it is supposed to enforce. Security researchers have found that in some cases even a string of characters that looks nonsensical to a person can, for reasons researchers don’t entirely understand, convince an AI model to disregard its guardrails and do something it is not supposed to, such as offering advice for making a bomb or spewing racist abuse.

Representatives for OpenAI directed Fortune to the company’s blog post announcement and technical report for the models.

Short-term pain for long-term gains

Open-sourcing can be a double-edged sword when it comes to safety. It allows researchers and developers to test, improve, and adapt AI safeguards more quickly, increasing transparency and trust. For instance, security researchers might find ways to adjust the model’s weights to make it more robust to prompt injection without degrading its performance.

But it can also make it easier for attackers to study and bypass those very protections—for instance, by using other machine-learning software to run through hundreds of thousands of possible prompts until it finds ones that cause the model to jump its guardrails. What’s more, security researchers have found that these kinds of automatically generated prompt-injection attacks, developed on open-source AI models, will sometimes also work against proprietary AI models, where attackers don’t have access to the underlying code and model weights. Researchers have speculated that there may be something inherent in the way all large language models encode language that makes similar prompt injections effective against any model.

In this way, open sourcing the classifiers may not just give users a false sense of security that their own system is well-guarded, it may actually make every AI model less secure. But experts said that this risk was probably worth taking because open-sourcing the classifiers should also make it easier for all of the world’s security experts to find ways to make the classifiers more resistant to these kinds of attacks.

“In the long term, it’s beneficial to kind of share the way your defenses work—it may result in some kind of short-term pain. But in the long term, it results in robust defenses that are actually pretty hard to circumvent,” Vasilios Mavroudis, principal research scientist at the Alan Turing Institute, said.

Mavroudis said that while open-sourcing the classifiers could, in theory, make it easier for someone to try to bypass the safety systems on OpenAI’s main models, the company likely believes this risk is low. He said that OpenAI has other safeguards in place, including having teams of human security experts continually trying to test their models’ guardrails in order to find vulnerabilities and hopefully improve them.

“Open-sourcing a classifier model gives those who want to bypass classifiers an opportunity to learn about how to do that. But determined jailbreakers are likely to be successful anyway,” Robert Trager, co-director of the Oxford Martin AI Governance Initiative, said.

“We recently came across a method that bypassed all safeguards of the major developers around 95% of the time — and we weren’t looking for such a method. Given that determined jailbreakers will be successful anyway, it’s useful to open-source systems that developers can use for the less determined folks,” he added.

The enterprise AI race

The release also has competitive implications, especially as OpenAI looks to challenge rival AI company Anthropic’s growing foothold among enterprise customers. Anthropic’s Claude family of AI models has become popular with enterprise customers partly because of a reputation for stronger safety controls than other AI models. Among the safety tools Anthropic uses are “constitutional classifiers” that work similarly to the ones OpenAI just open-sourced.

Anthropic has been carving out a market niche with enterprise customers, especially when it comes to coding. According to a July report from Menlo Ventures, Anthropic holds 32% of the enterprise large language model market share by usage compared to OpenAI’s 25%. In coding‑specific use cases, Anthropic reportedly holds 42%, while OpenAI has 21%. By offering enterprise-focused tools, OpenAI may be attempting to win over some of these business customers, while also positioning itself as a leader in AI safety.

Anthropic’s “constitutional classifiers” consist of small language models that check a larger model’s outputs against a written set of values or policies. By open-sourcing a similar capability, OpenAI is effectively giving developers the same kind of customizable guardrails that helped make Anthropic’s models so appealing.

“From what I’ve seen from the community, it seems to be well received,” Mavroudis said. “They see the model as potentially a way to have auto-moderation. It also comes with some good connotation, as in, ‘we’re giving to the community.’ It’s probably also a useful tool for small enterprises where they wouldn’t be able to train such a model on their own.”

Some experts also worry that open-sourcing these safety classifiers could centralize what counts as “safe” AI.

“Safety is not a well-defined concept. Any implementation of safety standards will reflect the values and priorities of the organization that creates it, as well as the limits and deficiencies of its models,” John Thickstun, an assistant professor of computer science at Cornell University, told VentureBeat. “If industry as a whole adopts standards developed by OpenAI, we risk institutionalizing one particular perspective on safety and short-circuiting broader investigations into the safety needs for AI deployments across many sectors of society.”




Jensen Huang says AI bubble fears are dwarfed by ‘largest infrastructure buildout in human history’

Pushing back against growing skepticism regarding the sustainability of artificial intelligence spending, Nvidia CEO Jensen Huang argued against the mountain backdrop of Davos, Switzerland, that high capital expenditures are not a sign of a financial bubble, but rather evidence of “the largest infrastructure buildout in human history.”

Speaking in conversation with BlackRock CEO Larry Fink, the interim co-chair of the World Economic Forum, Huang detailed an industrial transformation that extends far beyond software code, reshaping global labor markets and driving unprecedented demand for skilled tradespeople. While much of the public debate focuses on the potential for AI to replace white-collar jobs, Huang pointed to an immediate boom in blue-collar employment required to physically construct the new computing economy.

“It’s wonderful that the jobs are related to tradecraft, and we’re going to have plumbers and electricians and construction and steel workers,” Huang said. He noted the urgency to erect “AI factories,” chip plants, and data centers has radically altered the wage landscape for manual labor. “Salaries have gone up, nearly doubled, and so we’re talking about six-figure salaries for people who are building chip factories or computer factories,” Huang said, emphasizing the industry is currently facing a “great shortage” of these workers.

Ford CEO Jim Farley has been warning for months about the labor shortage in what he calls the “essential economy,” exactly the type of jobs mentioned by Huang in Davos. Earlier this month, Farley told Fortune these 95 million jobs are the “backbone of our country,” and said he was partnering with workwear brand Carhartt to boost workforce development, community building, and “the tools required by the men and women who keep the American Dream alive.”

“It’s time we all reinvest in the people who make our world work with their hands,” Farley said.

In October, at Ford’s Pro Accelerate conference, Farley shared that his own son was wrestling with whether to go to college or pursue a career in the trades. The Ford CEO has estimated the shortage at 600,000 in factories and nearly the same in construction.

Huang dismisses bubble fears

Fink brought up the bubble talk for a good reason: Fear of a popping bubble gripped markets for much of the back half of 2025, with luminaries such as Amazon founder Jeff Bezos, Goldman Sachs CEO David Solomon, and, just the previous day in Davos, Microsoft CEO Satya Nadella, warning about the potential for pain. Much of this originated in the underwhelming release of OpenAI’s GPT-5 in August, but also the MIT study that found 95% of generative AI pilots were failing to generate a return on investment. “Permabears” such as Albert Edwards, global strategist at Société Générale, have talked about how there’s likely a bubble brewing—but then again, they always think that.

Huang, whose company became the face of the AI revolution when it blew past $4 trillion in market capitalization (a bar recently cleared by Alphabet following the positive reception of its Gemini update), tackled these fears in conversation with Fink, arguing the term misdiagnoses the situation. Critics often point to the massive sums being spent by hyperscalers and corporations as unsustainable, but Huang countered that the appearance of a bubble arises because “the investments are large … and the investments are large because we have to build the infrastructure necessary for all of the layers of AI above it.”

Huang reached for a food metaphor, describing the AI industry as a “five-layer cake” requiring total industrial reinvention, with Nvidia’s chips a particularly crunchy part of the recipe. The bottom layer is energy, followed by chips, cloud infrastructure, and models, with applications sitting at the top. The current wave of spending is focused on the foundational layers—energy and chips—which creates tangible assets rather than speculative vapor. Far from a bubble, he described a new industry being built from the ground up.

“There are trillions of dollars of infrastructure that needs to be built out,” Huang said, noting that the world is currently only “a few 100 billion dollars into it.”

To prove the market is driven by real demand rather than speculation, Huang offered a practical “test” for the bubble theory: the rental price of computing power, as seen in what it costs to rent Nvidia’s GPUs.

“If you try to rent an Nvidia GPU these days, it’s so incredibly hard, and the spot price of GPU rentals is going up, not just the latest generation, but two-generation-old GPUs,” he said. This scarcity indicates established companies are shifting their research and development budgets—such as pharmaceutical giant Eli Lilly moving funds from wet labs to AI supercomputing—rather than simply burning venture capital.

Beyond construction and infrastructure, Huang addressed the broader anxiety regarding AI’s impact on human employment. He argued AI ultimately changes the “task” of a job rather than eliminating the “purpose” of the job. Citing radiology as an example, he noted that despite AI diffusing into every aspect of the field over the last decade, the number of radiologists has actually increased. Because AI handles the task of studying scans infinitely faster, doctors can focus on their core purpose: patient diagnosis and care, leading to higher hospital throughput and increased hiring.

Fink reframed the issue, based on Huang’s pushback. “So what I’m hearing is, we’re far from an AI bubble. The question is, are we investing enough?” Fink asked, positing that current spending levels might actually be insufficient to broaden the global economy.

Huang appeared to say: not really. “I think the opportunity is really quite extraordinary, and everybody ought to get involved. Everybody ought to get engaged. We need more energy,” he said, adding the industry needs more land, power, trade, scale, and workers. Huang said the U.S. has in many ways lost its industrial workforce over the last 20 to 30 years, “but it’s still incredibly strong,” and in Europe, gesturing around him in Switzerland, he saw “an extraordinary opportunity to take advantage of.” He noted 2025 was the largest investment year in venture capital history, with $100 billion invested around the world, mostly in AI natives.

Huang concluded by emphasizing this infrastructure buildout is global, urging developing nations and Europe to engage in “sovereign AI” by building their own domestic infrastructure. For Europe specifically, he highlighted a “once-in-a-generation opportunity” to leverage its strong industrial base to lead in “physical AI” and robotics, effectively merging the new digital intelligence with traditional manufacturing. Far from a bubble, he seemed to be saying, this is just the beginning.




Nearly 400 millionaires and billionaires are demanding Davos leaders tax them more: ‘Tax us. Tax the super rich.’

While the world’s wealthiest and most powerful, from U.S. President Donald Trump to Nvidia CEO Jensen Huang, touch down in the Swiss town of Davos to discuss the state of the world, a cohort of the ultra-rich is already sounding the alarm. Hundreds of millionaires and billionaires released an open letter in time for the World Economic Forum, calling on leaders attending the conference to fight raging wealth inequality with taxes.

“Millionaires like us refuse to be silent. It is time to be counted. Tax us and make sure the next fifty years meet the promise of progress for everyone,” the letter stated.

“Extreme wealth has led to extreme control for those who gamble with our safe future for their obscene gains. Now is the time to end that control and win back our future.”

So far, nearly 400 millionaires and billionaires across 24 countries have signed the letter condemning extreme wealth, including the likes of Hollywood actor Mark Ruffalo, Disney heirs Abby and Tim Disney, and real estate developer Jeffrey Gural.

The open letter is part of a “Time to Win” campaign led by wealth-redistribution organizations including Patriotic Millionaires, Millionaires for Humanity, and Oxfam. It criticized wealthy global oligarchs who have “bought up” democracies, exacerbated poverty, stifled tech innovation, dampened press freedom, and, overall, “accelerated the breakdown of our planet.” After all, 77% of millionaires from G20 nations think extremely wealthy individuals buy political influence, and 71% believe those with riches can significantly influence elections, according to a poll conducted for Patriotic Millionaires.

The Time to Win wealthy signatories offer a simple solution: “Tax us. Tax the super rich.”

“As millionaires who stand shoulder to shoulder with all people, we demand it,” the open letter continued. “And as our elected representatives—whether it’s those of you at Davos, local councillors, city mayors, or regional leaders—it’s your duty to deliver it.”

Stars and billionaires are calling out the super-rich for being ungenerous 

As the world mints hundreds of thousands of millionaires yearly and billionaire wealth soars to record highs, some leaders can’t stand to stay quiet. Celebrities and the ultra-rich haven’t just sent a message to money-hoarders with the Time to Win letter—some have even called out billionaires in person, questioning their existence. 

“If you’re a billionaire, why are you a billionaire? No hate, but yeah, give your money away, shorties,” singer Billie Eilish said onstage last year at the WSJ Magazine Innovator Awards, with Meta mogul Mark Zuckerberg, worth $214 billion, in attendance.

Even the most philanthropic members of the ultra-rich club are wary of their peers’ lack of charity. Some billionaires have started their own initiatives, like Warren Buffett, Melinda French Gates, and Bill Gates’ Giving Pledge, which attracted more than 250 billionaires who pledged to donate at least half of their wealth during their lifetimes or in their wills. But efforts have largely fallen short. Last year, French Gates admitted that the signatories haven’t given enough, and in a letter to shareholders, Buffett conceded that billionaires aren’t following through.

“Early on, I contemplated various grand philanthropic plans. Though I was stubborn, these did not prove feasible,” Buffett wrote. “During my many years, I’ve also watched ill-conceived wealth transfers by political hacks, dynastic choices, and, yes, inept or quirky philanthropists.”

Billionaire and millionaire wealth is on the rise 

There are more people rolling in riches than ever before, and the trend is fueling an inequality crisis at the bottom of the economic ladder.

In 2024 alone, the U.S. minted 379,000 new millionaires—over 1,000 millionaires every day—as the proportion of Americans in the ultrawealthy club swelled by 1.5%, according to a 2025 report from investment bank UBS. This cohort held about $107 trillion in total wealth at the end of that year: more than four times the amount they owned at the turn of the millennium. 

In 2000, there were only 13.27 million everyday millionaires, but by the end of 2024, the group swelled to 52 million people worldwide. 

While it might appear that eye-watering riches are spreading out to a larger number of individuals, it’s mainly concentrating at the top. America’s top 20% household earners—averaging a net worth of $4.3 million—accounted for about 71% of the U.S.’s total wealth at the end of 2024, according to 2025 data from the Federal Reserve. 

Meanwhile, the bottom half of American households, averaging about $60,000 in wealth, owned just 2.5% of the country’s wealth. For the vast majority of U.S. citizens, joining the millionaire club—and even more so, the billionaire club—is a total pipe dream.




Trump fast-tracks ‘three-week’ nuclear approvals for big tech to fuel AI race

President Donald Trump offered Silicon Valley an extraordinary deal on Wednesday: Build your own nuclear power plants to fuel AI, and his administration will approve them in just three weeks.

Speaking at the World Economic Forum in Davos, Switzerland, Trump addressed a room of tech executives struggling with an aging U.S. electrical grid.

“I came up with the idea,” Trump said. “You people are brilliant. You have a lot of money. You can build your own electric generating plants.”

Trump spent about 10 minutes of his speech on energy, making it clear he views a straining electric grid as a central economic risk of 2026. As artificial intelligence pushes electricity demand to record highs, the administration is framing power shortages as an existential threat to growth and national security. Slashing approval timelines, Trump argued, is a necessary response to an energy system he says is fundamentally unprepared for the AI era.

“We needed more than double the energy currently in the country just to take care of the AI plants,” Trump said. 

The proposal marks a radical departure from the traditional Nuclear Regulatory Commission (NRC) process, which historically requires four to five years for environmental and design approvals as well as rigorous site selection. Trump claimed that while tech leaders initially “didn’t believe him,” he assured them the government would deliver approvals for oil and gas plants in just two weeks, with nuclear projects following in three.

Trump said he wasn’t “a big fan” of nuclear power before, but now sees it as a newly viable solution due to safety improvements. 

“The progress they’ve made with nuclear is unbelievable,” he said. “We’re very much into the world of nuclear energy, and we can have it now at good prices and very, very safe.” 

While a coming wave of small modular nuclear reactors (SMRs) could receive regulatory approvals in less than two years, there is little precedent for completing a Nuclear Regulatory Commission approval process in anything close to three weeks, and such an expedited process would trigger widespread concerns about safety and environmental risks.

Trump also touted a new energy alliance with Venezuela, noting the U.S. secured 50 million barrels of oil last week following the “end of an attack” on the nation that led to the deposition of President Nicolás Maduro. He said the new cooperation between the two nations would make Venezuela “fantastically well” while driving U.S. gasoline prices toward $2.00 a gallon.

Gasoline prices are the main measure of costs that has fallen during the first year of the new Trump administration. But they’re nowhere close to $2.00 per gallon. The national average for a gallon of regular unleaded is $2.76 this week, down 32 cents from a year ago, primarily because of rising OPEC oil production.

Trump also drew a sharp contrast with Europe’s energy landscape, mocking the “Green New Scam,” citing a 64% spike in German electricity prices and the “catastrophic” decline of energy production in the United Kingdom. He targeted the North Sea’s proliferation of wind farms, which he labeled “losers” that “kill the birds.”

“Stupid people buy” wind farms, Trump laughed.


