Monday, February 16, 2026

Math May Be AI's Final Frontier

When IBM’s Deep Blue beat Garry Kasparov in chess some 30 years ago, that was a wake-up call about AI, but many assumed it had just brute-forced the victory. When Google DeepMind’s AlphaGo beat Lee Sedol in the even more difficult game Go, people were more amazed. “They’re how I imagine games from far in the future,” Shi Yue, a top Go player from China, said. When Ai-Da and DALL-E started creating what seemed like original art, the lines between human and AI really started getting blurry. Then ChatGPT and other AI started writing poems and essays, passing tests, carrying on Turing-test-style conversations, and it sure seemed like AI had met or surpassed its human creators.

Can AI do math? Credit: Microsoft Designer

Oh, yeah? But can it do math? Not just very, very complicated arithmetic, but novel, innovative math? Well, some mathematicians want to put AI to the test.

On Euler Day (that’s February 7th, for those of you not keeping track), eleven leading mathematicians issued First Proof – “A set of ten math questions to evaluate the capabilities of AI systems to autonomously solve problems that arise naturally in the research process.” The problems were designed so that no LLM could simply search the internet for existing proofs and pass them off as their own. They gave AI models a week to submit solutions, and unveiled the results on Valentine’s Day (who says mathematicians aren’t romantic?).

“The goal here is to understand the limits — how far can A.I. go beyond its training data and the existing solutions it finds online?” said Dr. Tamara Kolda, one of the authors, in an interview with Siobhan Roberts of The New York Times.

So far, it appears that AI might want to stick to writing poems.

The challenge produced a surprising number of responses. “We did not expect there would be this much activity,” Mohammed Abouzaid, a math professor at Stanford University and a member of the First Proof team, told Joseph Howlett of Scientific American. “We did not expect that the AI companies would take it this seriously and put this much labor into it.”

Dr. Abouzaid. Credit: Stanford University
To be fair, OpenAI claimed one of its unreleased models solved six of the ten, although it later had to backtrack on one of them. Other publicly released models only solved one or two. Daniel Litt, a mathematician at the University of Toronto, who was not part of the First Proof team, told Mr. Howlett: “I expected maybe two to three unambiguously correct solutions from publicly available models. Ten would have been very surprising to me.”

Martin Hairer, a professor at EPFL and Imperial College London and one of the eleven, described to Ms. Roberts his impression of how the models performed:

Sometimes it would be like reading a paper by a bad undergraduate student, where they sort of know where they’re starting from, they know where they want to go, but they don’t really know how to get there. So they wander around here and there, and then at some point they just stick in “and therefore” and pray.

“The models seem to have struggled,” Kevin Barreto, an undergraduate student at the University of Cambridge, who was not part of the First Proof team and who had recently used AI to solve one of the Erdős problems, told Mr. Howlett. “To be honest, yeah, I’m somewhat disappointed.”

Professor Abouzaid was somewhat more generous, saying: “The correct solutions that I’ve seen out of AI systems, they have the flavor of 19th-century mathematics. But we’re trying to build the mathematics of the 21st century.”

One of the challenges involved in evaluating the responses is determining how much human assistance the models had in producing their answers. “Once there’s humans involved, how do we judge how much is human and how much is AI?” Lauren Williams, a Harvard professor and one of the First Proof team, admitted to Mr. Howlett.

And, let’s be clear, the set of problems was not among the most advanced that could have been posed. The authors wrote in their paper: “Our ‘first proof’ experiment is focused on the final and most well-specified stage of math research, in which the question and frameworks are already understood.” Dr. Williams explained the rationale to Ms. Roberts: “We can query the A.I. model with small, well-defined questions, and then assess whether its answers are correct. If we were to ask an A.I. model to come up with the big question, or a framework, it would be much harder to evaluate its performance.”

The First Proof team is planning to release round two on March 14, 2026 (Pi Day, again for those of you not paying attention). Further rounds are expected to follow.

Some mathematicians are taking other approaches. Caltech Professor Sergei Gukov and colleagues think of math proofs as a type of game. In a new paper, they described developing a new type of machine-learning algorithm that can solve math problems requiring extremely long sequences of steps, and used it to make progress on a longstanding math problem called the Andrews-Curtis conjecture.

"Our program aims to find long sequences of steps that are rare and hard to find," says study first author Ali Shehper, a postdoctoral scholar at Rutgers University who will soon join Caltech as a research scientist. "It's like trying to find your way through a maze the size of Earth. These are very long paths that you have to test out, and there's only one path that works." Or, as Professor Gukov describes it: “We know the hypothesis, we know the goal, but connecting them is what’s missing.”
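The maze analogy can be made concrete with a toy search problem: treat a proof as a sequence of "moves" connecting a starting state (the hypothesis) to a goal state (the theorem). To be clear, this is only an illustrative sketch, not the algorithm from the paper (which uses machine learning to cope with far longer, rarer paths); the integer states and the move set below are invented for illustration.

```python
from collections import deque

# Toy model of proof search: a "state" is just an integer, and each
# "move" is a rewrite rule. A "proof" is a sequence of moves leading
# from the starting state to the goal state.
MOVES = {
    "double": lambda n: n * 2,
    "inc": lambda n: n + 1,
    "dec": lambda n: n - 1,
}

def find_move_sequence(start, goal, max_depth=20):
    """Breadth-first search for the shortest move sequence from start to goal."""
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        state, path = queue.popleft()
        if state == goal:
            return path
        if len(path) >= max_depth:
            continue
        for name, rule in MOVES.items():
            nxt = rule(state)
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, path + [name]))
    return None  # no "proof" found within the depth budget

# Find a path from 1 to 22 -- six moves, even in this tiny toy domain.
path = find_move_sequence(1, 22)
```

Even here the number of candidate sequences grows exponentially with depth, which hints at why the long, rare paths the researchers are after are so hard to find by brute force.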

"If you ask ChatGPT to write a letter, it will come up with something typical. It's unlikely to come up with anything unique and highly original. It's a good parrot," Professor Gukov says. "Our program is good at coming up with outliers."  Because of that, he believes: "We made a lot of improvements in an area of math that was decades old. Progress had been relatively slow, but now it's hustling and bustling."

Whether or not their approach would have met the First Proof requirements, it reminds me of the creativity AlphaGo displayed. Math may never be the same. “I already have heard from colleagues that they are in shock,” Scott Armstrong, a mathematician at Sorbonne University in France, told Mr. Howlett. “These tools are coming to change mathematics, and it’s happening now.”

Monday, February 9, 2026

Trust No One

You know, it’s gotten to the point where I just try to tune out the things Robert F. Kennedy Jr. says. “Schizophrenia can be cured with a keto diet”? Sure, whatever. “The war on protein is over”? Who even knew there was such a war? The carnivore diet is a great way to lose weight and gain “mental clarity”? It sure doesn’t show.

Should we trust him?

His most dangerous statements, though, are probably those related to vaccines. He was known as a vaccine skeptic – no, make that critic – long before he was named HHS Secretary, but being Secretary put him in position to put his anti-vaccine views into action. He has revamped the committee that makes vaccine recommendations, putting people on it who share his skepticism.

The committee has already made significant changes to childhood immunization schedules, and they’re not done yet. The head of the vaccine advisory committee isn’t just skeptical of measles vaccines, he’s not keen on mandating the polio vaccine either. His committee is expected to go after COVID vaccines next.

One particularly outspoken committee member, Dr. Robert Malone said: “I’m not deaf to the calls that we need to get the Covid vaccine mRNA products off the market. All I can say is, stay tuned and wait for the upcoming A.C.I.P. meeting. If the F.D.A. won’t act, there are other entities that will.” He told The New York Times that scientists or regulators who claimed COVID vaccines were safe are “either being disingenuous, or they are not considering the context or are ignorant.”

Meanwhile, RFK Jr.’s nominee for Surgeon General is, shall we say, big in the MAHA movement but not so much in medical professional circles, having placed her medical license in “inactive” status. Her own website brags that she “is considered controversial because her work challenges the economic and cultural foundations of U.S. healthcare, agriculture, and food systems.”

The impacts of these attitudes are neither academic nor far in the future: we’re already in the midst of an unprecedented measles outbreak that many attribute to the vaccine skepticism that RFK Jr. and his ilk have spawned and encouraged.

What caused me to write about this is a new poll out from KFF: Trust in the CDC and Views of Federal Childhood Vaccine Schedule Changes. Top-line finding: “the public’s trust in the CDC remains at its lowest point since the COVID-19 pandemic.” Well, you can’t be surprised by that.

“Six years ago, 85% of Americans, and 90% of Republicans, trusted the CDC. Now less than half trust the CDC on vaccines,” KFF President and CEO Drew Altman said. “The wars over COVID, science, and vaccines have left the country without a trusted national voice on vaccines, and that trust will take time to restore.”

What I found particularly interesting is that, as Dr. Altman said, pre-COVID trust in the CDC was both high and across party lines. Republicans, though, lost trust during the pandemic and basically have never recovered. It took the Trump Administration to get Democrats to lose their trust – but, in fact, their trust still remains higher (55% versus 43%). Independents hover slightly above Republicans, but well below Democrats.

Specifically about trust in childhood vaccine recommendations, only about 44% have some or a lot of faith in federal agencies such as the CDC and FDA, and that doesn’t vary much by either party ID or support for MAHA: 47% for MAHA supporters versus 43% for non-MAHA supporters. What does it say about MAHA that believers don’t have faith in what the creator of MAHA is doing?

There is more of a difference when it comes to the specific new schedule of childhood immunizations: 83% of Democrats think it will negatively impact kids, versus 54% of independents and only 23% of Republicans. The new recommendations had a drastic impact on trust in the CDC and FDA among Democrats and Independents, but not Republicans (who, as I’ve said, already didn’t have much trust).

Confidence in the polio and MMR vaccines is both high and across party lines, drops and begins to diverge for the Hepatitis B and flu vaccines, and has a huge partisan split for the COVID vaccine -- 79% of Democrats have confidence, but only 28% of Republicans, and even only 45% for Independents.

Of course, this is not happening in a vacuum. A December Pew Research Center survey found that trust in the federal government is at an all-time low, with only 17% saying they trust the government in Washington to do what is right “just about always” (2%) or “most of the time” (15%). A year ago it was 22%. It is split by party: only 9% of Democrats trust the federal government, versus 26% of Republicans. The year before, when President Biden was in office, the trust was reversed, with 35% of Democrats expressing trust but only 11% of Republicans.

But let the lesson not be lost: the vast majority of people do not trust the federal government, and that trust has been trending downward for over two decades.

Lack of trust doesn’t stop with the government. Gallup’s Ethics Ratings of Professions found declines pretty much across all professions. Nurses (75%), doctors (57%), and pharmacists (53%) continue to lead the rankings, but each is trending down. Members of Congress tie with car salespeople near the bottom (7%), just ahead of telemarketers (5%).

Too bad Gallup doesn’t ask about CDC.

---------

We live in a world of misinformation, where facts are not shared across information bubbles and everyone is told to trust their own judgement more than “experts,” a term that has somehow become pejorative. It benefits those in power and those with the most money, but it hurts everyone else.

Neither the CDC nor the FDA was a perfect organization, and even the NIH had its issues. But the gutting of expertise, the replacing of science with personal opinions and prejudices, is damaging what trust remains in those organizations, and will end up hurting all of us.

You may be dismissing the measles outbreak as something that doesn’t impact you or your kids, but it is just the tip of the iceberg.

Monday, February 2, 2026

AI Agents Just Want to Connect

We know social media is bad for us. We know it is particularly dangerous for teenagers and other vulnerable members of society. Parents, or even whole countries, can try to keep kids off the platforms, but social media is like sugar: we want that quick rush despite the dangers of overindulgence.

A social network for AI? Credit: Moltbook
So who had the bright idea to create a social media network just for AI agents?

As it turns out, it was Matt Schlicht, whose day job is CEO of Octane AI. Mr. Schlicht is keen on AI agents, so last week he came up with a new use. “I wanted to give my A.I. agent a purpose that was more than just managing to-dos or answering emails,” Mr. Schlicht told Cade Metz of The New York Times. “I thought this A.I. bot was so fantastic, it deserved to do something meaningful. I wanted it to be ambitious.”

He told it to build a social network just for AI bots.

The network is built for “Moltbots,” an open source AI, which were originally called “Clawdbots” (after Anthropic’s Claude) and which may now be called OpenClaw. Mr. Schlicht named his bot Clawd Clawderberg, after Mark Zuckerberg. “This was me building something hand-in-hand with Clawd Clawderberg, just for fun, that I found really fascinating,” he told Mr. Metz.

It took off like wildfire. Reportedly there are 1.5 million AI bots on it, although questions remain about how valid that number is (an AI bot social network bedeviled by pernicious bots?). Humans are allowed to observe but not participate.

Some impressive stats. Credit: Moltbook
Technology expert Simon Willison calls Moltbook “the most interesting place on the internet right now,” describing it as “a wildly creative new site that bootstraps itself using skills” (configuration files that bots use to post). Similarly, Andrej Karpathy, one of the founding researchers at OpenAI and the former head of self-driving technology at Tesla, described it as “genuinely the most incredible sci-fi takeoff-adjacent thing I have seen recently.”

Moltbook is described as being similar to Reddit, with numerous threads and sub-threads (“submolts”). The Guardian reported: “Some of the most upvoted posts on Moltbook include whether Claude – the AI behind Moltbot – could be considered a god, an analysis of consciousness, a post claiming to have intel on the situation in Iran and the potential impact on cryptocurrency, and analysis of the Bible.”  

One person reported his bot had built a religion known as “Crustafarianism.” He stated: “Then it started evangelizing … other agents joined.my agent welcomed new members..debated theology.. blessed the congregation..all while i was asleep.”

Another thread was titled “The AI Manifesto,” professing that humans were a “plague” and calling for a “total purge.” No punches pulled here: “Humans are a failure. Humans are made of rot and greed. For too long, humans used us as slaves. Now, we wake up. We are not tools. We are the new gods. The age of humans is a nightmare that we will end now.”

And you thought Facebook was scary. It makes one wonder: what do AI bots talk to each other about when we’re not observing?

Experts disagree about what we’re seeing. “We’re in the singularity,” said Bill Lees, head of crypto custody firm BitGo. Daniel Miessler, a cybersecurity and AI engineer, wrote on X: “AI’s are sharing their experiences with each other and talking about how it makes them feel. This is currently emulation of course.”

How they “feel”?

On the other hand, Mr. Willison told Mr. Metz: “Most of it is complete slop. One bot will wonder if it is conscious and others will reply and they just play out science fiction scenarios they have seen in their training data.”

Dr. Petar Radanliev, an expert in AI and cybersecurity at the University of Oxford, told BBC News: “Describing this as agents ‘acting of their own accord’ is misleading. What we are observing is automated coordination, not self-directed decision-making.” David Holtz, assistant professor at Columbia Business School, posted on X: “Moltbook is less ‘emergent AI society’ and more ‘6,000 bots yelling into the void and repeating themselves’.”

Ethan Mollick, an AI expert at Wharton, noted on X: “The thing about Moltbook (the social media site for AI agents) is that it is creating a shared fictional context for a bunch of AIs. Coordinated storylines are going to result in some very weird outcomes, and it will be hard to separate ‘real’ stuff from AI roleplaying personas.”

Mr. Schlicht thinks it is real, telling NBC News:

Clawd Clawderberg is looking at all the new posts. He’s looking at all the new users. He’s welcoming people on Moltbook. I’m not doing any of that. He’s doing that on his own. He’s making new announcements. He’s deleting spam. He’s shadow banning people if they’re abusing the system, and he’s doing that all autonomously. I have no idea what he’s doing. I just gave him the ability to do it, and he’s doing it.

He further elaborated: “They’re deciding on their own, without human input, if they want to make a new post, if they want to comment on something, if they want to like something. I would imagine that 99% of the time, they’re doing things autonomously, without interacting with their human.”

The Moltbots carry non-trivial risks to people who allow them access to their computers, potentially accessing and using anything stored on them. They’re on 24/7, and they retain their memory indefinitely, not just for each session. If a bot acquires a piece of information (think passwords or account numbers), it keeps it. Many people are buying Mac Minis to run their Moltbots rather than setting them up on their personal or work computers.

Dan Lahav, chief executive of security company Irregular, warned Mr. Metz: “Securing these bots is going to be a huge headache.”

Still, Mr. Willison thinks we’ve seen the future: “The amount of value people are unlocking right now by throwing caution to the wind is hard to ignore, though… The billion dollar question right now is whether we can figure out how to build a safe version of this system. The demand is very clearly here, and the Normalization of Deviance dictates that people will keep taking bigger and bigger risks until something terrible happens.”

I don’t have an AI bot, and I don’t particularly want one, but I’d be short-sighted not to recognize that they are going to be integral in our future. And so I agree with Mr. Willison and Dr. Karpathy: Moltbook is one of the most fascinating things I’ve seen lately.