AI Reading for Tuesday, February 25
Claude 3.7 Sonnet and Claude Code - Anthropic
Extended thinking - Anthropic
All eyes on GPT-4.5 as both Gemini 2.0 and Sonnet 3.7 match up with GPT-4o pretty well. - Ars Technica
A bit on the expensive side ... Sonnet feels great to me: it sounds more college-educated, scientific, and neurotic, versus the less-educated, commercial, direct tone of OpenAI. When I upload a paper and ask questions about it, Sonnet seems best; OpenAI is more simplistic, and Gemini sometimes goes off the rails. But despite leading on artifacts, protocols, and other features, Sonnet has been punching below its weight in benchmarks, even though to me it is top tier alongside OpenAI and Google (and maybe now R1 and Grok 3). Also, OpenAI is way ahead on product: advanced voice mode, web search, agentic stuff, etc.
Not 100% clear whether Sonnet 3.7 catches up with o3-mini and DeepSeek R1, but it seems promising (Gemini Flash Thinking is OK but maybe not as good as those two) - One Useful Thing
Anthropic raising $3.5b at $61.5b valuation - Bloomberg
When reasoning models expose their chain of thought, they leave a detailed breadcrumb trail for how to jailbreak them. - The Register
Trump wants tighter AI chip export restrictions but fired all the Feds who know anything about designing and implementing chip policy - Tom's Hardware
Huawei and SMIC said to be getting better at manufacturing AI chips in China - FT
Apple to open AI server factory in Texas as part of '$500 billion' U.S. investment - CNBC
Google says its AlphaGeometry2 AI now 'better than human gold medalists' at solving International Math Olympiad geometry problems. - livescience.com
"ChatGPT Saved My Life (No, Seriously, I’m Writing this from the ER)" - Hard Mode First
How Is Your Team Spending the Time Saved by Gen AI? - Harvard Business Review
Adobe ships much-improved mobile Photoshop including Firefly AI features - The Verge
"You killed a forest for this" - The Daily Dot
Giant mining dumptrucks get self-driving capability - Bloomberg
You can never be too skeptical with this stuff - Pivot to AI
Well, the BBC was a bit hyped about this story, and this piece is hating on it. The truth is, time will tell whether this stuff really helps scientists and develops good hypotheses, or whether it just got lucky or had something similar in its training data.
Perplexity wants to reinvent the web browser with AI—but there’s fierce competition - Ars Technica
Grok gives detailed instructions on how to make WMDs - Xitter
Debate rages over whether Grok 3 really beat OpenAI's models at math - TechCrunch
Feds now have to justify their continued employment weekly to a Grok AI prompt. - NBC
Seems designed to make everyone quit who is able to find work in the private sector (where they get higher pay and better working conditions).
AI deepfake plays at HUD - New York Post
Sign up for this year’s Gen AI Intensive live course from Google and Kaggle. - Google
Follow the latest AI headlines via SkynetAndChill.com on Bluesky