AI Reading for Friday January 24
Everyone started freaking out about DeepSeek a couple of weeks ago and I was initially pretty skeptical. For one thing, every single paper with a new model says it beats OpenAI on some cherry-picked metric. And then sometimes gullible journalists write, “new 8B parameter model obliterates OpenAI.” For another, it beat other open source models like Llama in Chatbot Arena, but it didn’t beat OpenAI and Gemini.
Today, something interesting happened: the new DeepSeek R1 reasoning model matched or maybe slightly edged out OpenAI o1 on the Chatbot Arena leaderboard.
That’s news because it’s the first time an open source model you could run on your own hardware or cloud beat out OpenAI. And academics are distilling versions to run on normal hardware like you and I have (though those probably won’t beat OpenAI or the full-size versions).
And it’s the first time a Chinese model really gave the US leaders a run for their money.
Keep in mind that:
OpenAI has already announced o3, coming out within weeks, supposedly a big improvement over o1.
Gemini 2.0 is probably the best straight non-reasoning LLM, and OpenAI also has a lead over the base non-reasoning DeepSeek.
There are allegations DeepSeek might have been trained on pirated OpenAI chats, because it claims to be ChatGPT. Seems plausible. On the other hand, if Llama and maybe other models like ChatGPT train on pirated videos from YouTube and texts from Library Genesis, which is to books what The Pirate Bay is to Oscar movies, OpenAI doesn’t have much of a leg to stand on.
You can’t search for the Tiananmen Square massacre in DeepSeek, naturally. This highlights the issue of how China plays by its own repressive rules, which conflict with Western views on fairness, free speech, property, and human rights, and tilts the field in favor of local, politically connected champions.
Suggests you don’t need massive GPU farms, and that Biden administration export restrictions were ineffective.
Why everyone in AI is freaking out about DeepSeek - VentureBeat
Chinese start-ups such as DeepSeek are challenging global AI giants - FT
Meta is particularly panicking as DeepSeek outclasses Llama - Tech Startups
OpenAI ships Operator - OpenAI
First looks are flowing in. - Simon Willison
OpenAI’s new Operator AI agent can do things on the web for you - The Verge
Agent framework can perform tasks for you, provides a browser in the cloud so it can’t brick your computer, and lets you take over if necessary to e.g. enter credit cards.
Bengio talks about risks of AI agents at Davos - Business Insider
Perplexity assistant on Android can watch your screen, take actions for you in multiple apps. - The Verge
Emotionally available AI agents for the win - Bloomberg
Still trash talking about Stargate - Mashable India
Stargate will be exclusively OpenAI - FT
Very different economics from existing OpenAI business, seems like it complicates the transition to for-profit a lot - Business Insider
The Apple exec tasked with whipping Siri into shape with AI - Bloomberg
Tired of menu overload? Let Just Salad's AI find your perfect bowl - Fast Company
AI shooter detection did not trigger in Nashville school shooting - NBC News
‘Eternal You’ and the Ethics of Using A.I. to ‘Talk’ to Dead Loved Ones - NY Times
Follow the latest AI headlines via SkynetAndChill.com on Bluesky