The State of AI in 2025
"Look at these people. Wandering around with absolutely no idea what's about to happen." - Margin Call
Back to work!
Simon Willison has a great post on what we learned about AI in 2024 (mostly technical). Inspired by him, here is a roundup of the top events of 2024 in AI and where we are now.
1. Everybody caught up to OpenAI.
The top tier is OpenAI, Google, and Anthropic. They are very close; for most purposes the differences are insignificant, though for some use cases one is better than another.
With Gemini 2.0, Google seems to lead by a modest amount on overall base-model quality. Gemini is better enough, and cheaper enough, that I would switch to it for some jobs.
Flash is very good, very fast, and very cheap
Long context
Great native multimodality (although OpenAI has caught up on vision/audio/video)
Deep Research, thinking models, NotebookLM with state-of-the-art text-to-speech.
OpenAI is still ahead on product: structured outputs, o1 (LMArena says it's tied with Google's thinking model), and the overall ecosystem. First-mover advantage is real. Google now supports the OpenAI API semantics, which attests to OpenAI being the leader and Google the fast follower.
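To make the fast-follower point concrete: an OpenAI-style chat request can be aimed straight at Gemini. This is only a sketch of the request shape; the endpoint URL and model name are assumptions based on Google's OpenAI-compatibility docs, and a real API key would be needed to actually send it.

```python
import json

# Google's OpenAI-compatible endpoint (assumed from their compatibility docs).
BASE_URL = "https://generativelanguage.googleapis.com/v1beta/openai"

# The request body uses OpenAI chat-completions semantics unchanged;
# only the model name is Google's (model id is an assumption).
request = {
    "model": "gemini-2.0-flash-exp",
    "messages": [{"role": "user", "content": "Say hello."}],
}

# You would POST json.dumps(request) to BASE_URL + "/chat/completions"
# with an Authorization header carrying your API key.
print(json.dumps(request, indent=2))
```

The point is that no field in the body changes: code written against OpenAI's schema works against Gemini by swapping the base URL.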
Claude is very strong on technical discussions and coding. Anthropic pioneered the canvas-style interface and is doing interesting work on API discovery.
Use lmarena.ai and artificialanalysis.ai to compare models for different use cases.
2. The rise of small and open-source models nipping at the leaders' heels.
The thing that shocked me most in 2024 was Gemma2-9b-it-SIMPO. Gemma is Google's excellent open-source model. A group at Princeton fine-tuned it to score higher than early GPT-4 on chat benchmarks, and shrank it to run adequately fast on a MacBook. Bananas.
There is a slew of good dark horses: Llama, Grok, and Nova in the US; Qwen, Yi, and DeepSeek from China; Mistral in Europe.
The gap between the frontier models and the flash/mini or smaller open-source models shrank.
3. The scaling laws hit the wall.
Gemini 2.0 is a significant improvement over 1.0, but not as big a jump as GPT-3.5 to GPT-4.
OpenAI failed to deliver a foundation model that was a worthy successor to GPT-4, despite two gigantic training runs, according to reports.
These first three points reflect the commoditization of AI: no one has a monopoly, and the price of pretty good AI has dropped by multiple orders of magnitude.
4. OpenAI pivoted to inference-time compute with o1 and o3.
Faced with GPT-5 hitting the wall, OpenAI pivoted to models like o1 that think: they run multiple LLM calls and use search algorithms to look for the best answer via a multi-step process.
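The simplest form of this inference-time compute is best-of-n sampling, which can be sketched in a few lines. Everything here is a stub: `fake_llm` and `score` stand in for a real model and a real verifier, and the candidate answers are invented for illustration.

```python
# A minimal sketch of inference-time compute as best-of-n sampling.

def fake_llm(prompt: str, seed: int) -> str:
    # Stand-in for one sampled LLM completion (illustrative candidates).
    candidates = ["22", "four", "4", "5"]
    return candidates[seed % len(candidates)]

def score(prompt: str, answer: str) -> float:
    # Stand-in for a verifier model grading each candidate.
    return 1.0 if answer == "4" else 0.0

def best_of_n(prompt: str, n: int = 4) -> str:
    # Spend n model calls instead of one, keep the best-scoring answer.
    samples = [fake_llm(prompt, seed=i) for i in range(n)]
    return max(samples, key=lambda a: score(prompt, a))

print(best_of_n("What is 2 + 2?"))  # -> 4
```

Real systems replace the flat loop with tree search over reasoning steps, but the trade is the same: more compute at answer time buys a better answer.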
Probably the second most shocking thing I saw in 2024 was this thread by math professor and Fields Medalist Terence Tao, saying that o1 can do math at the level of a strong math major. I thought claims about it solving problems from the Putnam, the leading US undergrad math competition, were probably overblown or training-data contamination, but even if you fuzz the problems it can still solve at least a few of them. Bananas. It seemingly gives the lie to the notion that LLMs can 'only' do language and that their understanding is skin-deep. I don't know what to make of it; either it's overhyped or they found something important.
o1 is very good but very expensive.
o3 was announced; no one outside OpenAI has seen it, but OpenAI has hyped it for release in about a month, saying it is a superstar programmer and beat an AGI test. There is so much hype, and so much pressure to beat Google, that I am in the 'show me' camp. I expect a notable improvement over o1, but with Google already releasing 'thinking' models, it's not clear how big a lead OpenAI has.
Interesting consequences in the chip world. Possibly tired: 100,000-GPU Nvidia training clusters. Wired: fast inference chips from d-Matrix, Groq, Cerebras, and a slew of others.
5. Everything is agents now.
To me, the definition of an agent is the LLM making control-flow decisions, whether with tools or a full-blown autonomy loop. And everything does that now: when you use a chatbot, it looks at your query and routes it to different models and tools. Devin-style full-service software developers are still a bit pie-in-the-sky, but everything short of that, from low-code automation up to superintelligent assistants, is very much in play.
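That definition fits in a few lines of code: the model's output, not hard-coded program logic, picks the branch. Here `router_llm` is a stub standing in for a real LLM call, and the tool names and canned replies are illustrative assumptions.

```python
# A minimal sketch of "agent = LLM makes the control-flow decisions".

def router_llm(query: str) -> str:
    # A real agent would ask the model which tool to use; this stub
    # routes on keywords so the example stays self-contained.
    q = query.lower()
    if "+" in q or "sum" in q:
        return "calculator"
    if "weather" in q:
        return "search"
    return "chat"

TOOLS = {
    "calculator": lambda q: "4",                 # stub tool
    "search": lambda q: "sunny, 20C",            # stub tool
    "chat": lambda q: "Let's talk about: " + q,  # stub fallback
}

def agent(query: str) -> str:
    tool = router_llm(query)  # the model decides where execution goes next
    return TOOLS[tool](query)

print(agent("what is 2 + 2?"))  # -> 4
```

A full autonomy loop just wraps this dispatch in a while-loop, feeding each tool result back to the model until it decides it is done.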
6. Advanced Voice Mode with vision and other crazy sci-fi stuff.
Point your phone at anything and have a conversation about it. Fast conversational multimodality FTW. There was a huge leap in video generation with Sora, Kling, and Runway, and in associated tech like NotebookLM and Suno. We are not that far from a fully synthetic, realistic person to interact with. (Apologies for the NSFW-ish projection, but I think OnlyFans creators might be the first victims of robots, before even truck drivers.)
7. Self-driving cars achieved product-market fit, with one player left standing: Google's Waymo.
And possibly a European and a Chinese competitor. Tesla is not in the game: no robotaxis on the road yet, even with safety drivers; maybe that will change and maybe it won't. Waymo has a big lead.
IMHO we will see this type of highly autonomous robot, static or on wheels, for 5 or maybe 10 years before we see a lot of humanoid robots. And of course fully virtual robots, as discussed above. Power and cost are issues, and normie workplaces can't even keep copiers and printers working.
As an aside, the meme of AI killing Google is overblown. More likely, Google integrates AI into search and laps the competition, cures cancer with AI, and leads the AI revolution, even as search declines a bit in favor of chatbots. As long as the thing we interact with is a text box, Google should do OK. When AI becomes ubiquitous, and phones and homes and smart glasses are environments we talk to that watch us and interact with us, that's a bigger bridge to cross.
8. Politics hit center stage.
US-China trade war, EU regs, California regs, major competition-authority actions against all the Big Techs. With Musk in charge of tech policy in the US, safe AI is so 2023; it takes a back seat to accelerationism. But brace for the Butlerian Jihad (via Christopher Mims). Everything is mediated through big tech, and people view AI with suspicion: IP theft, surveillance state, manipulation, and dark patterns. People look for scapegoats.
9. Dead internet and AI slop everywhere.
Stack Overflow is in freefall. AI answers most questions better, simply because it reads the documentation carefully. The same goes for big social media like X and Facebook. Anything big enough gets overrun by slop and adverse selection. So it's back to artisanal social media like Discords, and painstakingly building your own small social and professional circle.
10. 2025 and beyond?
Smart glasses, pervasive AI
A lot of social issues about IP theft, surveillance, deepfake attacks. We need a new social contract around this stuff and it’s hard.
Hyper-personalization.
Huge medical advances with AI: diagnostics, wearables, AI science, AI medical assistants.
Eventually, a financial crash, because markets are absolutely insane even as AI powers everything.
I think AGI and superintelligence are actually pretty close. Take what we have and add better knowledge representation and world models, plus the ability to learn on the fly instead of moonshot training runs, and you get something like AGI. On a scale of a few years versus decades, it's imminent. On a lot of benchmarks and tasks, say detecting tuberculosis on a scan, the AI is much better than most professionals under a tight time limit like 15 seconds. But the best professionals will be much better than the AI given a higher time limit like 5 minutes. So there is some time-budget crossover where the humans start to beat the AI. Over time, the AI will probably beat more humans at any given time limit, and the crossover where humans outperform the AI will shift to higher time limits. AI assistants will infiltrate more and more knowledge work. Use the AI for what it's good at, processing lots of language; use the tools for what they are good at, looking up knowledge and running predictable tasks; and use the human for judgment and critical thinking. Over time, the human component will shrink.
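The crossover argument can be made concrete with a toy model. Every number below is invented purely for illustration (the shapes of the curves, not the values, are the point): the AI starts strong and gains little from extra time, while the expert starts weak under time pressure but improves a lot.

```python
import math

def ai_accuracy(seconds: float) -> float:
    # Toy curve: very strong almost instantly, gains little from extra time.
    return 0.90 + 0.05 * (1 - math.exp(-seconds / 60))

def expert_accuracy(seconds: float) -> float:
    # Toy curve: weak under time pressure, much stronger given minutes.
    return 0.55 + 0.45 * (1 - math.exp(-seconds / 120))

print(ai_accuracy(15) > expert_accuracy(15))    # -> True: AI wins at 15 s
print(expert_accuracy(300) > ai_accuracy(300))  # -> True: expert wins at 5 min
```

"AI progress" in this picture means the AI curve rising, which pushes the crossover point to ever-higher time budgets until the expert never catches up.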
And the social upheaval will be unmanageable, unfortunately. The techbros think they will solve all of society's problems if you just give them a few trillion. I think everything is breaking and tech is the reason why: it causes as many problems as it solves, and these Elon Musk and Sam Bankman-Fried types are mountebanks, or clowns, or blinded by greed, or in some cases just want to burn everything down.
Every few years, the bandit, the revolutionary, and the priest trade hats. And the tech folks have gone from revolutionaries to bandits. A guy in tech I used to think highly of said that the one big thing in 2024 was Silicon Valley's hostile takeover of the federal government, via an infiltration of Donald Trump's MAGA movement and Trump's subsequent thrashing of Joe Biden's "establishment" government in the November election. The thing is, if you are posting groyper memes, changing your name to Kekius Maximus, and pimping the AfD, who is infiltrating whom? The word of the day is that Muslims, Mexicans, immigrants, whoever you don't like, are not human, and that democracy is for losers, and that does not end well. Never forget that Neville Chamberlain was a 'realist' conservative. These people are going to sell out democracy, sell out to Putin, and blow up the world, literally.
I had AI write the first version of this (Gemini 1.5 with Deep Research). After reading that output I did not use AI at all, so blame me for any typos and bad writing.