AI Reading for Tuesday May 14, and Oh! GPT-4o!
Samantha is in the house!
OpenAI showed 3 important things:
GPT-4o with faster response times at half the cost of ‘turbo’. Free in the public-facing ChatGPT, which will also have access to GPTs from the GPT Store.
Multimodality, interchangeably handling text, audio, and video.
A charismatic, human-like assistant that can respond almost instantly to what it hears and sees. This is maybe the big one.
For context, OpenAI is currently still the clear leader, but Gemini, Claude, and open-source models like Llama-3 have started nipping at OpenAI’s heels.
The first point is a bit like clearing technical debt to keep up with the competition. The benchmarks are a modest improvement over GPT-4-turbo.
The second point may also be mostly catch-up to Gemini, which was fully multimodal out of the blocks. The huge question: is GPT-4o better than Gemini at understanding what it sees and hears?
Gemini, in my experience, is OK. It’s more like you can show it a painting and ask what it depicts. I don’t think it understands everything it sees. You’re not putting it in AR glasses and asking it, “Where did I leave my keys?” You’re not handing it a screen recording of yourself performing a task and asking it to write a script to automate it. One thing missing from GPT-4o is Gemini’s 1M-token context window. It seems like OpenAI went for speed instead. I suspect that without a huge context, video understanding is inevitably going to be janky.
The third point, the assistant, introduces something new. Based on the demo, it’s very fast and emotionally expressive, and feels like talking to a human. This leans into AI’s skill at imitating stuff.
There is a strong Eliza effect with ChatGPT, where even smart people are fooled into thinking it’s smarter than it is. ChatGPT is a poet, not a quant. It is skilled at imitating and bullshitting; it knows a lot of facts but lacks deep understanding. There is a real risk that people will be even more bamboozled by this near-human doppelganger, and then eventually feel like AI is a bit of a scam. Some of them will also actually be scammed by AIs that sound human, and conclude that the whole thing is just a scam.
AI today is a lot like the early days of the Web. When Netscape took off in 1995 (or at least by 1999), the corporate benefits of giving information workers access to all the information everywhere were clear. Similarly, the benefits are obvious when you get computing that has a good understanding of language semantics, connotations, and intent, and can make sense of text and multimedia, and even create them.
The Web drove a huge amount of investment in infrastructure companies like Cisco, not to mention WorldCom, Sun Microsystems, and Akamai. Some of them panned out, but most of them didn’t.
The same goes for the consumer-facing stuff like Amazon, Yahoo, and TheGlobe.com: some of them panned out, but most of them didn’t. And it took a long, long time for Amazon to really go mainstream. There’s a learning curve for consumers: how do I find what I need, and do I really trust this thing? I suspect it will be the same with AI. Eventually every site will have this sort of human-like assistant, but it will have to get a lot better, and consumers will have to get comfortable with it.
This event was about driving more mass adoption, and not getting upstaged or buried by Google. But I think that consumer adoption will take years. This Samantha-type thing might drive a hype / valley-of-despair cycle when it disappoints. It might also distract from laying the foundation: the corporate adoption that pays the bills.
But I love the vision! I really look forward to trying it, but it makes me a bit uncomfortable that, with this Samantha / Sora stuff, OpenAI is leaning into the sizzle. Gimme that GPT-5 and Q* steak!
Some other stuff I saw today:
A violin recital composed and played by AI. - YouTube
AI can make up songs now, but who owns the copyright? - The Conversation
Swedish startup aims to use AI to disrupt financial advisory. - Business Insider
Another weird/creepy AI ad, this one by KFC. It mocks AI so maybe the creatives need to chill. I get why they are big mad though. - Fast Company
A short combining Sora with real actors. - Twitter
Google I/O is coming later today; you can probably follow along via live blogs and Google’s Twitter account. - The Verge
Publishers brace for carnage as AI is expected to reduce traffic from search by 25% over the next couple of years. - Washington Post
Follow the latest AI headlines via SkynetAndChill.com on Bluesky