AI’s Mathematical Revolution: From High School Problems to Cutting-Edge Research

Artificial intelligence has gone from struggling with high school algebra to tackling research-level problems, a shift that has unsettled the mathematical community. What began as a modest bet between mathematicians has become a reckoning with how AI is reshaping one of humanity’s oldest intellectual pursuits.

The Bet That Changed Everything

In March 2025, Daniel Litt, a mathematician at the University of Toronto, made what seemed like a conservative wager with a colleague: he put the chance that AI could write a mathematical paper at the level of the world’s best mathematicians by 2030 at only 25%. Just twelve months later, Litt publicly revised that view on his blog.

“I now expect to lose this bet,” Litt wrote, acknowledging the breathtaking pace of AI advancement in mathematics. His admission reflects a growing consensus among mathematicians that their field is experiencing one of the fastest evolutions in its history.

From Useless to Unstoppable

The transformation has been nothing short of remarkable. As recently as a couple of years ago, AI systems were essentially useless for solving even basic high school math problems. Today, they’re capable of tackling research-level mathematics that appears in the daily work of professional mathematicians.

“We are running out of places to hide,” wrote Jeremy Avigad, a prominent mathematician at Carnegie Mellon University, in a recent essay. “We have to face up to the fact that AI will soon be able to prove theorems better than we can.”

This isn’t hyperbole. The progress has been so rapid that mathematicians who once dismissed AI as irrelevant to their work are now scrambling to understand its implications. The technology has moved from theoretical curiosity to practical tool seemingly overnight.

Gold Medals and Erdős Problems

The turning point came when AI systems from OpenAI and Google DeepMind achieved gold-medal performances on the International Mathematical Olympiad, an elite competition for high school students that many experts had considered beyond AI’s capabilities.

But the real shock came in January when amateur mathematicians began using similar AI tools to solve long-standing problems posed by the legendary Hungarian mathematician Paul Erdős. These weren’t just any problems—they were puzzles that had stumped mathematicians for decades, representing some of the most challenging questions in the field.

First Proof: A New Benchmark

In February, Nikhil Srivastava and his team at the University of California, Berkeley, launched the First Proof project to create a more realistic benchmark for testing AI’s mathematical capabilities. The project presented 10 problems that researchers had actually needed to solve in their day-to-day work, drawn from wildly different mathematical fields.

“They were naturally occurring problems in our day-to-day research,” Srivastava explained. “They weren’t super hard, but they weren’t routine either. There was really a range.”

The problems represented the kind of mathematical challenges that graduate students and early-career researchers encounter regularly—complex enough to require genuine insight, but not so esoteric that they existed only in theoretical realms.

AI Steps Up to the Challenge

When the problems were made public, solutions began pouring in from researchers at tech companies and academic institutions alike. OpenAI claimed it answered half of the problems correctly, based on feedback from expert mathematicians. Google DeepMind scored 6 out of 10, according to mathematicians the company consulted for each problem.

“Things have changed so fast,” said Thang Luong at Google DeepMind. “For us, now AI has really become a serious collaborator, either to produce serious research work or, in the case of First Proof, it can also actually propose a solution by itself.”

Google’s AI math tool, called Aletheia, uses a computationally intensive version of its Gemini AI chatbot, paired with a verification algorithm to look for flaws in possible solutions. The system can then iteratively produce improvements until it arrives at an answer.
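The article gives no detail on Aletheia’s internals beyond this cycle of generating, checking and improving. As a rough illustration only, the Python sketch below shows the general shape of such a loop on a toy problem. Every name in it (refine_until_verified, propose, verify, revise) is hypothetical, and the toy checker stands in for what would in practice be a far more elaborate verification step.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Attempt:
    candidate: int
    flaw: Optional[str]  # None means the verifier found no flaw


def refine_until_verified(
    propose: Callable[[], int],
    verify: Callable[[int], Optional[str]],
    revise: Callable[[int, str], int],
    max_rounds: int = 50,
) -> list[Attempt]:
    """Propose a candidate, check it, and revise it each time a flaw is reported."""
    history: list[Attempt] = []
    candidate = propose()
    for _ in range(max_rounds):
        flaw = verify(candidate)
        history.append(Attempt(candidate, flaw))
        if flaw is None:
            break  # the verifier accepted this candidate
        candidate = revise(candidate, flaw)
    return history


# Toy stand-ins (assumptions for illustration): find x with x * x == 144.
TARGET = 144


def propose() -> int:
    return 1  # deliberately poor first guess


def verify(x: int) -> Optional[str]:
    if x * x < TARGET:
        return "too small"
    if x * x > TARGET:
        return "too large"
    return None


def revise(x: int, flaw: str) -> int:
    return x + 1 if flaw == "too small" else x - 1


history = refine_until_verified(propose, verify, revise)
final = history[-1]
print(f"accepted {final.candidate} after {len(history)} rounds")
```

The point of the design is that the generator never has to be right the first time: the verifier’s feedback drives each revision, and the loop stops either when no flaw is found or when the round budget runs out.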

The Verification Challenge

Not every claimed solution won unanimous agreement. For problem 8, which dealt with a niche area of geometry, only five of the seven experts Google consulted agreed that the proposed solution was correct.

Ivan Smith at the University of Cambridge, who wasn’t involved in the Google effort, said the AI does appear to be taking a sensible approach to this problem and shows good progress. “If this was a PhD student coming back with their thoughts, it would be encouraging and would build confidence that the result was actually true,” Smith noted.

This highlights a fundamental challenge with AI-generated proofs: checking them is hard work. Mathematicians worry about a scenario where AI can generate proofs faster than humans can verify them. If a theorem is proved by an AI, but nobody is around to check it, has it been proved?

Formalizing the Future

AI is also improving rapidly at translating proofs written by humans in natural language into a format that can be checked by computers, a process called formalization. This capability could revolutionize how mathematics is verified and shared.
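To give a sense of what a formalized proof looks like, here is a minimal sketch in Lean 4, one of the proof assistants used for this kind of work. It is not taken from any of the projects described in this article. The informal claim is "the sum of two even numbers is even", and the names IsEven and even_add_even are chosen here purely for illustration.

```lean
-- Informal definition: n is even if n = 2 * k for some natural number k.
def IsEven (n : Nat) : Prop := ∃ k, n = 2 * k

-- Informal claim: if a and b are even, then a + b is even.
theorem even_add_even (a b : Nat) (ha : IsEven a) (hb : IsEven b) :
    IsEven (a + b) :=
  match ha, hb with
  -- From a = 2 * m and b = 2 * n, the witness for a + b is m + n.
  | ⟨m, hm⟩, ⟨n, hn⟩ => ⟨m + n, by rw [hm, hn, Nat.mul_add]⟩
```

If the file compiles, the computer has checked every step, so a reader only needs to trust the statement and the checker rather than re-verify the argument by hand.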

The AI company Math, Inc. recently stunned mathematicians by announcing that its AI tool, called Gauss, had formalized an award-winning proof and verified it was correct. The proof concerned how densely spheres can be packed in eight-dimensional space and earned Maryna Viazovska a Fields Medal, often called the Nobel Prize of mathematics, in 2022.

The effort to formalize Viazovska’s work began at the end of 2024 with a small group of mathematicians working separately from Math, Inc., who hoped to translate the proof into computer code by hand. They were making steady progress when Math, Inc., which had since begun assisting the researchers, announced that it already had a full proof and, later, a more general version of a result for 24 dimensions.

“We had made all the pieces, but we hadn’t written the instruction manual that explains how to put them together,” said Chris Birkbeck at the University of East Anglia, who was part of the team.

The Human Element

The final proof was around 200,000 lines of code, which constitutes about 10% of all existing formalized mathematics. Although it’s likely that this code is about 10 times longer than a human would have produced to do the same task, it’s still a huge achievement, says Johan Commelin at Utrecht University in the Netherlands.

“This is a big deal. This is Fields medal-winning work, and it’s being auto-formalised,” Commelin emphasized.
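The figures quoted above can be turned into rough back-of-the-envelope estimates. The short Python calculation below takes the article’s approximate numbers at face value (they are all described as "about" or "around", so the results are orders of magnitude, not measurements) and works out what they imply.

```python
# Approximate figures as reported in the article.
ai_proof_lines = 200_000   # length of the AI-produced formal proof
share_percent = 10         # "about 10%" of all existing formalized mathematics
ai_to_human_ratio = 10     # "about 10 times longer" than a human would write

# If 200,000 lines is about 10% of the total, the total is about ten times that.
implied_total_lines = ai_proof_lines * 100 // share_percent

# If the AI proof is about 10 times longer, a human version is about a tenth.
implied_human_lines = ai_proof_lines // ai_to_human_ratio

print(f"Implied size of all formalized mathematics: ~{implied_total_lines:,} lines")
print(f"Implied length of a human-written version: ~{implied_human_lines:,} lines")
```

On these numbers, all existing formalized mathematics would amount to roughly two million lines, and a human-written formalization of the same proof would run to roughly twenty thousand.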

Similar efforts should now be possible for a large number of other fields, says Commelin, which could transform how mathematics is practiced. “The future that we’re all thinking of is that we’ll have tooling that will automatically formalize new research and mathematical papers, and flag whether there are mistakes or not,” he said. “This will have huge implications for, say, peer review and refereeing work.”

The Learning Opportunity

However, not everyone is celebrating this technological revolution. Some mathematicians worry about the detrimental effects AI might have on our ability to practice and come up with new mathematics.

Using machines to solve the types of problems posed in First Proof may produce concrete proofs, says Anna Marie Bohmann at Vanderbilt University, but we lose the “learning opportunity.” “Struggling to create and formulate new ideas and to solve new problems is one of the main ways in which both students and mathematics professionals consolidate their knowledge,” Bohmann explained.

Tony Feng, a member of the Aletheia team at Google DeepMind, feels similarly and is cautious about using the tool himself. “A lot of times I feel like I should be doing my own homework and going through the process of building my own intuition,” Feng said.

Even formalizing proofs can generate important insights, says Mehta, another member of the formalization team, and he and his colleagues will now need to untangle the 200,000-line AI proof to work out what might be useful for other projects.

A New Style of Mathematician

Despite these concerns, mathematicians remain hopeful there will be a place for them in an increasingly machine-led future. Looking to history, Commelin notes that manual calculations were once a large part of being a mathematician, but they are now done automatically.

“I think similar things will happen here, where we radically change what we’re doing, but 10 or 20 years from now, we will still recognize what we’re doing as mathematics, in a new style,” Commelin predicted.

The revolution in mathematical AI represents both an opportunity and a challenge. While AI can solve problems faster and more accurately than humans in many cases, the creative process of mathematical discovery—the struggle to understand, the joy of insight, the satisfaction of proof—remains deeply human.

As AI continues to advance, mathematicians will need to find new ways to collaborate with these powerful tools while preserving the intellectual traditions that have made mathematics one of humanity’s greatest achievements. The future of mathematics may be automated, but it will still require human wisdom to guide it.

