
Aliens, rovers and energy crystals: How Lego’s obsession with detail has kept fans hooked for 92 years and counting


Opening a Lego set can feel equal parts overwhelming and exciting. With countless bricks and tiny details packed into every box, the eagerness to build brick castles, rocket ships, city skylines, and more has drawn kids in droves for 92 years.

Few companies have been able to replicate Lego’s success thus far. Its toys span generations, appealing to adult hobbyists reconnecting with a childhood favorite as well as to the kids discovering it for the first time.

Since its humble beginnings in 1932 as no more than a carpenter’s passion project, Lego toys have become an indispensable part of childhood. Name the topic, and there’s likely a set for it, whether architecture, anime, racing, or jazz music.  

6-year-old Philippa Smith plays with a Lego city at Selfridges department store in London, 22nd August 1962. (Photo by Kent Gavin/Keystone/Hulton Archive/Getty Images)

Over the decades, Lego could very well have been replaced by more addictive and appealing electronic gadgets. But that wasn’t the case—if anything, things couldn’t have been better for the family-owned Danish company. It reported record results in 2024, with a 12% sales growth against the toy market’s 1% decline. 

What, then, is Lego’s secret sauce to keep kids (and, more recently, adults) hooked to its colorful bricks? 

Fortune takes an exclusive look behind the scenes of Lego’s product development and the secret to keeping the iconic brand relevant.

One of Lego’s long-standing themes—space—illustrates what makes its approach unique and helps it stand the test of time. Dating back to the 1970s, space was one of the company’s three official toy-development categories (“castle” and “city” were the others). It was meant to represent the mysteries of the future, much like castles did for the past. Space’s popularity has endured through the years because it captures kids’ imaginations as a realm of endless opportunities.

“Lego-building is a passion in its own right,” Julia Goldin, Lego’s chief product and marketing officer, told Fortune in an interview last year.

Listening to kids, for kids

Lego realized early on that there is no substitute for hearing directly from kids about what they want. Goldin said the company made this deliberate decision about 10 years ago, and it has changed how Lego approaches toy-making.

“What makes a Lego set unique is, first and foremost, really understanding the audience,” Goldin said. “Not just understanding what will be of interest for them, but what are the right dynamics of the experience.” 

Julia Goldin

The quality of Lego’s bricks is another factor that sets it apart, as sets can be passed from one generation to the next, according to Frédérique Tutt, global toy industry advisor at market research firm Circana. And unlike with mindless games, parents believe their kids can gain something from Lego toys, whether that’s engineering skills or an outlet for creativity. 

“When parents buy Lego for their child, they think it’s going to help them build their brain,” Tutt told Fortune. “They [Lego] try to develop products for anyone and everyone.”

Turning an idea into reality

As a long-time toy maker, Lego has developed a well-oiled machine for constantly generating new ideas. Once a year the company runs a “boost week”—a rapid brainstorming session of the kind typically associated with startups—to spur new concepts. Designers come up with fresh ideas or work on existing ones, giving them creative freedom outside their day-to-day schedules. There isn’t a checklist of what needs to be achieved, although the goal is to see what can be turned into a potential Lego set, said Daniel Meehan, one of the brick company’s creative leads. 

The next step is to figure out how “decodable” the models are, including finding elements that tell stories and make them easier to play with, like Lego astronauts or purple collectible crystals.   

In addition to drawing ideas from its designated toy developers, the company hears directly from its audience. 

“We play-test stuff as well with kids extensively,” Meehan said. 

The company brings kids together across the world, from Germany to China, to see what they want more of. That process yielded one of the critical elements we see in Lego’s space-themed sets today, said Meehan, who is spearheading the company’s recent space campaign. 

During one of its space “DIY tests,” one kid was flying a wheeled vehicle around, collecting aliens along the way—neither of which was part of the set’s initial design. 

“We’re very practical, we’re adults … but in the eyes of kids, it was a perfect space flying vehicle. But there was one complaint: he [the kid] said we need more aliens. And we actually did put more aliens in the box as a result of that one kid,” Meehan said.  

kids playing with Lego set

The addition of aliens to Lego sets, such as in a Lego space station, adds more layers to what would otherwise be a straightforward set and also marks a common thread that ties sets from other categories together. For instance, Lego aliens can also be found in the space science lab and rover sets. The little green creatures were deliberately designed to look alike as a cue to Lego builders, Meehan tells Fortune.

Lego’s quality and complexity can make its products expensive—sometimes pricier than the latest iPhone. That’s especially true of products pulled from the market, which become rare. The novelty of its products has made them a collector’s dream and even the object of $100,000 heists in the U.S. The company says it offers sets across different price points so no one feels priced out: its simplest products cost single-digit dollars, while its 7,500-piece Millennium Falcon set runs about $960.

For the love of detail

To be sure, Lego’s care for quality and detail isn’t a new phenomenon. The company’s founder, Ole Kirk Kristiansen, instilled it strictly in his son, who was once reprimanded for using two coats of paint instead of three to speed up an order. 

The company’s penchant for detail applies not just to its space creations or toy development process but also to its business. Goldin, for instance, straddles meetings that look at the company’s present performance while also discussing the pipeline for the next few years.     

So much of the Danish company’s legacy as a toy maker is linked to how it makes play accessible across age groups, interests, and experience levels. The theme of space, Meehan explains, can be aimed at three types of audiences: storytellers, who are mostly kids with a fascination for the subject; enthusiasts, who have an interest in learning about the field; and others, who are generally drawn to all things space, including its artistic side.

“Another strength they have is they appeal to the young children as well as the teenagers or adults with intricate pieces. So, they grow with you,” Tutt said. 

MUNICH, GERMANY – MAY 25: A kid is playing with LEGO during the LEGO Summer Birthday Bash on May 25, 2022 in Munich, Germany. (Photo by Marc Mueller/Getty Images for LEGO Summer Birthday Bash)

The granular approach also applies to how Lego prices, designs, and markets sets for its growing adult fanbase, ensuring there’s a toy for everyone. But one thing is certain: whatever the motivation, the company tries not to dial down the details, because they give Lego toys their character. 

Goldin says Lego fans “really notice” the little elements it adds, as they “bring a lot of excitement.”

“It’s much more than a toy because it’s a very immersive experience,” she said.

A version of this story was originally published on Fortune.com on Aug. 25, 2024.

This story was originally featured on Fortune.com




Stop chasing AI benchmarks—create your own


Every few months, a new large language model (LLM) is anointed AI champion, with record-breaking benchmark scores. But these celebrated metrics of LLM performance—such as testing graduate-level reasoning and abstract math—rarely reflect real business needs or represent truly novel AI frontiers. For companies in the market for enterprise AI models, basing the decision of which models to use on these leaderboards alone can lead to costly mistakes—from wasted budgets to misaligned capabilities and potentially harmful, domain-specific errors that benchmark scores rarely capture.

Public benchmarks can be helpful to individual users by providing directional indicators of AI capabilities. And admittedly, some code-completion and software-engineering benchmarks, like SWE-Bench or Codeforces, are valuable for companies within a narrow range of coding-related, LLM-based business applications. But the most common benchmarks and public leaderboards often distract both businesses and model developers, pushing innovation toward marginal improvements in areas unhelpful for businesses or unrelated to areas of breakthrough AI innovation. 

The challenge for executives, therefore, lies in designing business-specific evaluation frameworks that test potential models in the environments where they’ll actually be deployed. To do that, companies will need to adopt tailored evaluation strategies to run at scale using relevant and realistic data.

The mismatch between benchmarks and business needs

The flashy benchmarks that model developers tout in their releases are often detached from the realities of enterprise applications. Consider some of the most popular ones: graduate-level reasoning (GPQA Diamond) and high-school math tests like MATH-500 and AIME 2024. Each of these was cited in the release announcements for GPT o1, Sonnet 3.7, or DeepSeek’s R1. But none of these indicators is helpful in assessing common enterprise applications like knowledge management tools, design assistants, or customer-facing chatbots.

Instead of assuming that the “best” model on a given leaderboard is the obvious choice, businesses should use metrics tailored to their specific needs to work backward and identify the right model. Start by testing models on your actual context and data—real customer queries, domain-specific documents, or whatever inputs your system will encounter in production. When real data is scarce or sensitive, companies can craft synthetic test cases that capture the same challenges. 
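As an illustration, a business-specific evaluation can start as a small harness that replays real (or synthetic) domain queries through a candidate model and scores responses against pass criteria the business defines. The sketch below is hypothetical: `stub_model` stands in for any provider's completion call, and substring matching is a deliberately minimal pass criterion.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class TestCase:
    prompt: str        # a real (or synthetic) input from your domain
    must_contain: str  # minimal pass criterion for this sketch

def evaluate(model: Callable[[str], str], cases: list[TestCase]) -> float:
    """Return the fraction of domain-specific cases the model passes."""
    passed = sum(tc.must_contain.lower() in model(tc.prompt).lower() for tc in cases)
    return passed / len(cases)

# Stub standing in for any provider's completion API.
def stub_model(prompt: str) -> str:
    return "Please share your order number so I can check the refund status."

cases = [
    TestCase("Customer asks: where is my refund?", "order number"),
    TestCase("Customer asks: how do I reset my password?", "reset"),
]
score = evaluate(stub_model, cases)  # 0.5: the stub passes only the first case
```

A real harness would swap in the production prompt templates and richer scoring (rubrics, judges, exact-match answers), but the structure stays the same: domain inputs in, business-defined pass rates out.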

Without real-world tests, companies can end up with ill-fitting models that may, for instance, require too much memory for edge devices, have latency that’s too high for real-time interactions, or lack support for the on-premises deployment sometimes mandated by data governance standards.

Salesforce has tried to bridge this gap between common benchmarks and its actual business requirements by developing an internal benchmark for its CRM-related needs. The company created its own evaluation criteria specifically for tasks like prospecting, nurturing leads, and generating service case summaries—the actual work that marketing and sales teams need AI to perform.

Reaching beyond stylized metrics

Popular benchmarks are not only insufficient for informed business decision-making but can also be misleading. LLM media coverage, including all three major recent release announcements, often uses benchmarks to compare models based on their average performance. Specific benchmarks are distilled into a single dot, number, or bar.

The trouble is that generative AI models are stochastic, highly input-sensitive systems, which means that slight variations of a prompt can make them behave unpredictably. A recent research paper from Anthropic rightly argues that, as a result, single dots on a performance comparison chart are insufficient because of the large error ranges of the evaluation metrics. A recent study by Microsoft found that using a statistically more accurate cluster-based evaluation on the same benchmarks can significantly change the rank ordering of—and public narratives about—models on leaderboards.

That’s why business leaders need to ensure reliable measurements of model performance across a reasonable range of variations, done at scale, even if it requires hundreds of test runs. This thoroughness becomes even more critical when multiple systems are combined through AI and data supply chains, potentially increasing variability. For industries like aviation or healthcare, the margin of error is small and far beyond what current AI benchmarks typically guarantee, such that solely relying on leaderboard metrics can obscure substantial operational risk in real-world deployments. 
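One way to make such measurements statistically honest is to repeat each evaluation many times with prompt variations and compare models by confidence intervals rather than single point scores. A minimal sketch (the scores are illustrative, not from any real benchmark):

```python
import statistics

def confidence_interval(scores: list[float], z: float = 1.96) -> tuple[float, float]:
    """Approximate 95% confidence interval for the mean score over repeated runs."""
    mean = statistics.mean(scores)
    sem = statistics.stdev(scores) / len(scores) ** 0.5  # standard error of the mean
    return mean - z * sem, mean + z * sem

# Hypothetical scores from 8 repeated runs of the same eval with prompt variations.
runs_a = [0.84, 0.79, 0.88, 0.81, 0.86, 0.80, 0.83, 0.85]  # "model A"
runs_b = [0.82, 0.85, 0.78, 0.84, 0.80, 0.83, 0.86, 0.81]  # "model B"

lo_a, hi_a = confidence_interval(runs_a)
lo_b, hi_b = confidence_interval(runs_b)

# If the intervals overlap, the apparent leaderboard ranking is not meaningful.
overlap = lo_a <= hi_b and lo_b <= hi_a
```

Here model A has the higher average, yet the two intervals overlap, so a "model A beats model B" headline would be unsupported by the data.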

Businesses must also test models in adversarial scenarios to ensure the security and robustness of a model—such as a chatbot’s resistance to manipulation by bad actors attempting to bypass guardrails—that cannot be measured by conventional benchmarks. LLMs are notably vulnerable to being fooled by sophisticated prompting techniques. Depending on the use case, implementing strong safeguards against these vulnerabilities could determine your technology choice and deployment strategy. The resilience of a model in the face of a potential bad actor could be a more important metric than the model’s math or reasoning capabilities. In our view, making AI “foolproof” is an exciting and impactful next barrier to break for AI researchers, one that may require novel model development and testing techniques.

Putting evaluation into practice: Four keys to a scalable approach

Start with existing evaluation frameworks. Companies should start by leveraging the strengths of existing automated tools (along with human judgment and practical but repeatable measurement goals). Specialized AI evaluation toolkits, such as DeepEval, LangSmith, TruLens, Mastra, or ARTKIT, can expedite and simplify testing, allowing for consistent comparison across models and over time. 

Bring human experts to the testing ground.  Effective AI evaluation requires that automated testing be supplemented with human judgment wherever possible. Automated evaluation could include a comparison of LLM answers to ground truth answers, or the use of proxy metrics, such as automated ROUGE or BLEU scores, to gauge the quality of text summarization. 
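To make the proxy-metric idea concrete, here is a simplified ROUGE-1-style unigram F1 computed from scratch. This is only a sketch of the underlying overlap idea (set-based rather than count-based, unlike the official metric); a production pipeline would use a maintained scoring package.

```python
def unigram_f1(candidate: str, reference: str) -> float:
    """Simplified ROUGE-1-style F1: unique-word overlap between a model's
    summary and a reference summary. Illustrative only."""
    cand, ref = candidate.lower().split(), reference.lower().split()
    if not cand or not ref:
        return 0.0
    overlap = len(set(cand) & set(ref))
    precision = overlap / len(set(cand))
    recall = overlap / len(set(ref))
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

score = unigram_f1(
    "the model summarized the quarterly report",
    "the quarterly report was summarized",
)  # 0.8: four of five unique words overlap in each direction
```

Scores like this are cheap to compute at scale, which is exactly why they work as automated first-pass filters before the costlier human review described below.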

For nuanced assessments, however, ones where machines still struggle, human evaluation remains vital. This could include domain experts or end-users conducting a “blind” review of a sample of model outputs. Such actions can also flag potential biases in responses, such as LLMs giving responses about job candidates that are biased by gender or race. This human layer of review is labor-intensive, but can provide additional critical insight, like whether a response is actually useful and well-presented.

The value of this hybrid approach can be seen in a recent case study in which a company evaluated an HR-support chatbot using both human and automated tests. The company’s iterative internal evaluation process with human involvement showed that a significant source of LLM response errors was flawed updates to enterprise data. The discovery highlights how human evaluation can uncover systemic issues beyond the model itself.

Focus on tradeoffs, not isolated dimensions of assessment. When evaluating models, companies must look beyond accuracy to consider the full spectrum of business requirements: speed, cost efficiency, operational feasibility, flexibility, maintainability, and regulatory compliance. A model that performs marginally better on accuracy metrics might be prohibitively expensive or too slow for real-time applications. A great example of this is how OpenAI’s GPT o1 (a leader in many benchmarks at release time) performed when applied to the ARC-AGI prize. To the surprise of many, the o1 model performed poorly, largely due to ARC-AGI’s “efficiency limit” on the computing power used to solve the benchmark tasks. The o1 model would often take too long, using more compute time to try to come up with a more accurate answer. Most popular benchmarks don’t have a time limit, even though time would be a critically important factor for many business use cases. 

Tradeoffs become even more important in the growing world of (multi)-agentic applications, where simpler tasks can be handled by cheaper, quicker models (overseen by an orchestration agent), while the most complex steps (such as solving the broken-out series of problems from a customer) could need a more powerful version with reasoning to be successful. 

Microsoft Research’s HuggingGPT, for example, orchestrates specialized models for different tasks under a central language model. Being prepared to change models for different tasks requires building flexible tooling that isn’t hard-coded to a single model or provider. This built-in flexibility allows companies to easily pivot and change models based on evaluation results. While this may sound like a lot of extra development work, there are a number of available tools, like LangChain, LlamaIndex, and Pydantic AI, that can simplify the process.
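The flexibility point can be sketched as a tiny provider-agnostic router: models are plain callables, and a routing function decides which one handles a task. Everything here is illustrative—the length-and-keyword heuristic stands in for the classifier or orchestration agent a real system would use, and the model callables stand in for actual provider clients.

```python
from typing import Callable

ModelFn = Callable[[str], str]  # any provider client can be wrapped to fit this

def route(task: str, cheap: ModelFn, powerful: ModelFn, max_simple_len: int = 80) -> str:
    """Send short, simple tasks to the cheap model and longer or multi-step
    tasks to the powerful one. A real router would use a learned classifier
    or an orchestration agent, not this toy heuristic."""
    needs_reasoning = len(task) > max_simple_len or "step" in task.lower()
    return powerful(task) if needs_reasoning else cheap(task)

# Hypothetical model wrappers; in practice these would call different providers.
cheap_model: ModelFn = lambda t: f"[cheap] {t}"
powerful_model: ModelFn = lambda t: f"[powerful] {t}"

out1 = route("What are your opening hours?", cheap_model, powerful_model)
out2 = route("Walk through each step of migrating my account and billing.",
             cheap_model, powerful_model)
```

Because the router only depends on the `ModelFn` signature, swapping a model after a disappointing evaluation is a one-line change rather than a rewrite.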

Turn model testing into a culture of continuous evaluation and monitoring. As technology evolves, ongoing assessment ensures AI solutions remain optimal while maintaining alignment with business objectives. Much like how software engineering teams implement continuous integration and regression testing to catch bugs and prevent performance degradation in traditional code, AI systems require regular evaluation against business-specific benchmarks. Similar to the practice of pharmacovigilance among users of new medicines, feedback from LLM users and affected stakeholders also needs to be continuously gathered and analyzed to ensure AI “behaves as expected” and doesn’t drift from its intended performance targets.

This kind of bespoke evaluation framework fosters a culture of experimentation and data-driven decision-making. It also enforces the new and critical mantra: AI may be used for execution, but humans are in control and must govern AI.

Conclusion

For business leaders, the path to AI success lies not in chasing the latest benchmark champions but in developing evaluation frameworks for your specific business objectives. Think of this approach as “a leaderboard for every user,” as one Stanford paper suggests. The true value of AI deployment comes from three key actions: defining metrics that directly measure success in your business context; implementing statistically robust testing in realistic situations using your actual data and in your actual context; and fostering a culture of continuous monitoring, evaluation and experimentation that draws on both automated tools and human expertise to assess tradeoffs across models.

By following this approach, executives will be able to identify solutions optimized for their specific needs without paying premium prices for “top-notch models.” Doing this can hopefully help steer the model development industry away from chasing marginal improvements on the same metrics—falling victim to Goodhart’s law with capabilities of limited use for business—and instead free them up to explore new avenues of innovation and the next AI breakthrough. 

Read other Fortune columns by François Candelon

Francois Candelon is a partner at private equity firm Seven2 and the former global director of the BCG Henderson Institute.

Theodoros Evgeniou is a professor at INSEAD and a cofounder of the trust and safety company Tremau.

Max Struever is a principal engineer at BCG-X and an ambassador at the BCG Henderson Institute.

David Zuluaga Martínez is a partner at Boston Consulting Group and an ambassador at the BCG Henderson Institute.

Some of the companies mentioned in this column are past or present clients of the authors’ employers.



Foreign college students are now losing their visas and being ordered to leave over misdemeanor crimes or traffic infractions



A crackdown on foreign students is alarming colleges, which say the Trump administration is using new tactics and vague justifications to push some students out of the country.

College officials worry the new approach will keep foreigners from wanting to study in the U.S.

Students stripped of their entry visas are receiving orders from the Department of Homeland Security to leave the country immediately — a break from past practice that often permitted them to stay and complete their studies.

Some students have been targeted over pro-Palestinian activism or criminal infractions — or even traffic violations. Others have been left wondering how they ran afoul of the government.

At Minnesota State University in Mankato, President Edward Inch told the campus Wednesday that visas had been revoked for five international students for unclear reasons.

He said school officials learned about the revocations when they ran a status check in a database of international students after the detention of a Turkish student at the University of Minnesota in Minneapolis. The State Department said the detention was related to a drunken driving conviction.

“These are troubling times, and this situation is unlike any we have navigated before,” Inch wrote in a letter to campus.

President Donald Trump campaigned on a promise to deport foreign students involved in pro-Palestinian protests, and federal agents started by detaining Columbia graduate student Mahmoud Khalil, a green-card-holder and Palestinian activist who was prominent in protests at Columbia last year. Secretary of State Marco Rubio said last week students are being targeted for involvement in protests along with others tied to “potential criminal activity.”

In the past two weeks, the government apparently has widened its crackdown. Officials from colleges around the country have discovered international students have had their entry visas revoked and, in many cases, their legal residency status terminated by authorities without notice — including students at Arizona State, Cornell, North Carolina State, the University of Oregon, the University of Texas and the University of Colorado.

Some of the students are working to leave the country on their own, but students at Tufts and the University of Alabama have been detained by immigration authorities — in the Tufts case, even before the university knew the student’s legal status had changed.

Feds bypass colleges to move against students

In this new wave of enforcement, school officials say the federal government is quietly deleting foreigners’ student records instead of going through colleges, as was done in the past.

Students are being ordered to leave the country with a suddenness that universities have rarely seen, said Miriam Feldblum, president and CEO of the Presidents’ Alliance on Higher Education and Immigration.

In the past, when international students have had entry visas revoked, they generally have been allowed to keep legal residency status. They could stay in the country to study, but would need to renew their visa if they left the U.S. and wanted to return. Now, increasing numbers of students are having their legal status terminated, exposing them to the risk of being arrested.

“None of this is regular practice,” Feldblum said.

At North Carolina State University, two students from Saudi Arabia left the U.S. after learning their legal status as students was terminated, the university said. N.C. State said it will work with the students to complete their semester from outside the country.

Philip Vasto, who lived with one of the students, said his roommate, in graduate school for engineering management, was apolitical and did not attend protests against the war in Gaza. When the government told his roommate his student status had been terminated, it did not give a reason, Vasto said.

Since returning to Saudi Arabia, Vasto said his former roommate’s top concern is getting into another university.

“He’s made his peace with it,” he said. “He doesn’t want to allow it to steal his peace any further.”

Database checks turn up students in jeopardy

At the University of Texas at Austin, staff checking a federal database discovered two people on student visas had their permission to be in the U.S. terminated, a person familiar with the situation said. The person declined to be identified for fear of retaliation.

One of the people, from India, had their legal status terminated April 3. The federal system indicated the person had been identified in a criminal records check “and/or has had their visa revoked.” The other person, from Lebanon, had their legal status terminated March 28 due to a criminal records check, according to the federal database.

Both people were graduates remaining in the U.S. on student visas, using an option allowing people to gain professional experience after completing coursework. Both were employed full time and apparently had not violated requirements for pursuing work experience, the person familiar with the situation said.

Some students have had visas revoked by the State Department under an obscure law barring noncitizens whose presence could have “serious adverse foreign policy consequences.” Trump invoked the law in a January order demanding action against campus antisemitism.

But some students targeted in recent weeks have had no clear link to political activism. Some have been ordered to leave over misdemeanor crimes or traffic infractions, Feldblum said. In some cases, students were targeted for infractions that had been previously reported to the government.

Some of the alleged infractions would not have drawn scrutiny in the past and will likely be a test of students’ First Amendment rights as cases work their way through court, said Michelle Mittelstadt, director of public affairs at the Migration Policy Institute.

“In some ways, what the administration is doing is really retroactive,” she said. “Rather than saying, ‘This is going to be the standard that we’re applying going forward,’ they’re going back and vetting students based on past expressions or past behavior.”

The Association of Public and Land-grant Universities is requesting a meeting with the State Department over the issue. It’s unclear whether more visas are being revoked than usual, but officials fear a chilling effect on international exchange.

Many of the association’s members have recently seen at least one student have their visa revoked, said Bernie Burrola, a vice president at the group. With little information from the government, colleges have been interviewing students or searching social media for a connection to political activism.

“The universities can’t seem to find anything that seems to be related to Gaza or social media posts or protests,” Burrola said. “Some of these are sponsored students by foreign governments, where they specifically are very hesitant to get involved in protests.”

There’s no clear thread indicating which students are being targeted, but some have been from the Middle East and China, he said.

America’s universities have long been seen as a top destination for the world’s brightest minds — and they’ve brought important tuition revenue and research breakthroughs to U.S. colleges. But international students also have other options, said Fanta Aw, CEO of NAFSA, an association of international educators.

“We should not take for granted that that’s just the way things are and will always be,” she said.


Global recession on the cards



  • In today’s CEO Daily: Geoff Colvin on the effect of Trump’s tariffs on corporate profits.
  • The big story: Forecasters eye a global recession.
  • The markets: Worst since Covid in 2020.
  • Analyst notes from JPMorgan, Wedbush, UBS, and Oxford Economics on the risk of economic contraction under the new global trade rules.
  • Plus: All the news and watercooler chat from Fortune.

Good morning. Today’s worldwide economic chaos, sparked by President Trump’s new tariffs, may be shocking, but it isn’t new. A similar story played out eight years ago, in Trump’s first term as president. A look at what he did, and the repercussions that followed, is instructive for business leaders, investors, and consumers. And it is by no means encouraging.

Unlike in his current term, Trump back then didn’t immediately launch a trade war. He devoted his first year as president to easing business regulation and getting a historic tax cut through Congress. CEOs were jubilant. But then, in January of his second year, he showed why he had declared himself Tariff Man. He imposed tariffs on China and then quickly broadened tariffs to more countries. The party was over. Specifically:

Tariffs helped a few U.S. companies but also injured thousands of others. For example, Trump imposed tariffs on imported steel—great for the handful of U.S. steelmakers but a painful cost increase for the thousands of U.S. manufacturers that use steel. Expand the steel example across the economy and the result was a hard punch to profits. During Trump’s first year in office (2017), before he imposed tariffs, U.S. corporate profits rose 8%. In the following five quarters, with tariffs, profits lurched into reverse, shrinking 1.5%, annualized.

Stock prices got whacked. From Trump’s 2016 election until tariffs began in January 2018, the S&P 500 rose at a 27.3% annualized pace. But with tariffs added, the S&P rose at just 3.8% annualized (January 2018 to November 2019).

CEOs reversed their view of Trump. Immediately after Trump won in 2016, bosses raised their confidence as measured by the Conference Board, and confidence varied slightly up and down around that new level during Trump’s first year in office. But soon after he declared his trade wars, CEO confidence plunged to levels not seen since the worst days of the financial crisis in 2008-09.

Note that Trump is executing his main economic policies in the reverse order he followed in his first term. Back then he got the tax bill done first, then turned to tariffs. Now, having declared a historic trade war, he will spend much of 2025 on that tax bill, many elements of which are scheduled to sunset on December 31. He will try to keep that bill’s tax cuts and even cut taxes further. If he succeeds, he might regain his currently ebbing support from business leaders, investors, and consumers. But that’s a big “if” and a big “might.” — Geoff Colvin

More news below.

Contact CEO Daily via Diane Brady at diane.brady@fortune.com
