AI predictions for 2025 (Part 3)
As January draws to a close, it’s time for the third and final installment of my AI predictions for 2025 (here’s Part 1 and Part 2).
9.) AI agents will proliferate but won’t be reliable enough outside of a few narrow domains
Most lists of AI predictions for 2025 will tell you that this is the year of AI agents.
However, there isn’t a single agreed definition of what an AI agent is.
ChatGPT just token-predicted me this definition: “An AI agent is a software programme or system designed to autonomously perform tasks, make decisions, or solve problems by perceiving its environment, processing information, and taking actions to achieve specific goals”.
OpenAI CEO Sam Altman gave a pithier definition whilst introducing the ‘research preview’ of its browser-based AI agent, Operator, last week: “AI systems that can do work for you independently”.
What’s missing from both these definitions is the word ‘reliably’.
Cognition launched its autonomous AI software developer, Devin, last March. However, a team who spent a month testing it recently concluded:
“When it worked, it was impressive…but…it rarely worked. Out of 20 tasks we attempted, we saw 14 failures, 3 inconclusive results, and just 3 successes.”
Early reviews of OpenAI’s Operator (which is currently only accessible to US ChatGPT subscribers on the $200 per month Pro plan) have voiced similar frustrations with its unreliability:
“The process was painstaking and inefficient in a way that personally made me laugh but I imagine might drive others insane. In the end, adding six bananas, a 12-pack of seltzer, and a package of raspberries to a cart had taken me 15 minutes.”
Yes, it’s a research preview, but Operator’s unreliability is a reflection of the inherent unpredictability of generative AI, especially when it meets the complexity of the real world.
Whilst I’m confident there will be material leaps forward in the reliability of AI agents within narrow domains in 2025 (including coding), I don’t believe we’ll see the sort of general purpose assistant that the term ‘AI agent’ typically evokes.
At least, not one I would trust to book me flights, wait in an online queue for Glastonbury tickets or even order me a pizza.
Instead, we’ll get AI that can undertake ever more sophisticated research and do more of the navigational/form-filling grunt work for us (see last year’s prediction ‘We’ll see more AIs navigating GUIs’).
As with the current generation of Alexa, a high failure rate is likely to lead adopters of AI agents to delegate only low-stakes tasks they’re confident the agent can handle. AI agents will remain too brittle to be trusted with high-stakes assignments.
As is so often the way, the companies most likely to succeed with AI agents are those who already control our browsers and operating systems. Apple, Google and Microsoft have a significant advantage: their agents can operate across apps and websites running on their platforms, and they already hold our log-in credentials.
Whilst they’ll still be somewhat dependent on services allowing their agents access, they have much bigger carrots and sticks than individual consumer apps (Perplexity Assistant, launched last week, is available for Android but not iOS and OpenAI’s Operator is reportedly blocked from accessing YouTube and Reddit).
The case against: We’ve seen a number of big leaps forward in AI in the last few years (Large Language Models, Low-Rank Adaptation, Retrieval Augmented Generation, natively multi-modal models, ‘reasoning’ models). It’s possible another leap will significantly increase the reliability of AI agents and their ability to navigate the complexity of real world scenarios.
10.) AI assistants will get better at triage / air-traffic control
I’ve written before about the need for AI companies to invest more time and effort in developing product experiences which pick the right tool for the task in hand and not expect users to make sense of an ever-growing dropdown of models.
One of the most interesting things about the recently-released DeepSeek-R1 is its use of a Mixture of Experts architecture, in which a routing layer inside the model decides which specialised ‘experts’ are best placed to handle each piece of input.
As well as making the model more efficient (and therefore less environmentally impactful), it helps address AI assistants’ current ‘Don’t Make Me Think’ problem.
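For the curious, here’s a rough sketch of the general idea behind Mixture of Experts routing. It’s a toy illustration of generic top-k gating, not DeepSeek’s actual implementation: the function names, the dimensions, the random gate weights and the stand-in ‘experts’ are all assumptions made up for the example. It’s also worth noting that in real models this routing happens per token inside the network, rather than per user task.

```python
import math
import random


def softmax(scores):
    """Turn raw scores into weights that sum to 1."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def top_k_route(token, gate_weights, k=2):
    """Return [(expert_index, weight), ...] for the k highest-scoring experts."""
    # One gating score per expert: dot product of the token with that expert's gate vector.
    scores = [sum(t * w for t, w in zip(token, row)) for row in gate_weights]
    top = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:k]
    # Normalise over the chosen experts only, so their weights sum to 1.
    weights = softmax([scores[i] for i in top])
    return list(zip(top, weights))


def moe_layer(token, gate_weights, experts, k=2):
    """Run only the k selected experts and blend their outputs by gate weight."""
    routed = top_k_route(token, gate_weights, k)
    output = [0.0] * len(token)
    for idx, weight in routed:
        expert_out = experts[idx](token)
        output = [o + weight * e for o, e in zip(output, expert_out)]
    return output, [idx for idx, _ in routed]


# Hypothetical toy setup: 8 "experts" that simply scale their input differently.
random.seed(0)
DIM, NUM_EXPERTS = 4, 8
gate_weights = [[random.uniform(-1, 1) for _ in range(DIM)] for _ in range(NUM_EXPERTS)]
experts = [lambda x, s=i + 1: [s * v for v in x] for i in range(NUM_EXPERTS)]

token = [0.5, -1.0, 0.25, 2.0]
output, used = moe_layer(token, gate_weights, experts, k=2)
print(f"experts used: {used} (2 of {NUM_EXPERTS} ran; the rest were skipped)")
```

The efficiency gain comes from the last step: only the selected experts do any work for a given input, so most of the model sits idle on any single token.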
The case against: The current pace of innovation in AI is leading to a proliferation of new models which AI companies are desperate to get into consumers’ hands ahead of the competition. Synthesising different models and abstracting the complexity and decision-making away from the user experience may continue to fall lower down the priority list than adding new models to the dropdown.
11.) Smart glasses will become an AI battleground with Google (re)entering the fray
From the Apple Newton (1993) to Microsoft’s Tablet PC (2001), the history of tech is littered with products that were ‘too soon’. Google Glass (2013) was one such product.
Twelve years on, generative AI has changed the potential utility of smart glasses, and Meta Ray-Bans have shown you can persuade people to buy and wear smart glasses if they don’t look weird.
Smart glasses prototypes keep showing up in Google’s Project Astra demos and last week it confirmed a smart glasses development partnership with Samsung.
Whilst Google haven’t formally announced they’re planning to release their own smart glasses, I think they’ll pull out all the stops to try to get their own frames to market in 2025.
Having their own handset in market has proved advantageous in mobile and I don’t think Google will want to leave Meta to further corner the market (although there’s no shortage of upstarts chasing a slice of the pie: see DreamSmart, Halliday, RayNeo, XREAL, Even Realities, Looktech).
Meanwhile, Apple looks set to continue its ‘best to market rather than first to market’ approach, with Apple smart glasses unlikely to appear in 2025.
The case against: The popularity of Meta Ray-Bans may not extend beyond early adopters, with privacy concerns trumping the functional benefits. Having got egg on its face with Google Glass, Google may be circumspect about re-entering the smart glasses product space and choose to focus on providing the platform.
12.) Microsoft and/or Amazon will acquire a major AI startup
Reverse acquihires were very much in vogue in 2024 (see Google/Character.AI, Amazon/Covariant, Amazon/Adept, Microsoft/Inflection) as a means of securing AI talent and tech without the potential regulatory pain of a regular acquisition.
With Trump back in office and Lina Khan very much out of office, tech companies are likely to be less shy of outright acquisitions in 2025.
Anthropic and Mistral are top of my list of potential targets. Both have built impressive models but neither has much of a moat (Operator is similar to Claude’s computer use and OpenAI recently upgraded its Canvas feature to more closely mirror the functionality of Claude’s Artifacts). As LLMs become increasingly commoditised and open-source (see prediction #8) and investors get spooked, it’s going to be harder for AI companies who are big but not huge to continue to go it alone.
Amazon is the obvious frontrunner to acquire Anthropic. It’s already the company’s biggest investor (it’s put in another $6bn since I created this chart) and it appears it could use more in-house AI firepower (see its reported struggle to upgrade Alexa).
Whilst Mistral’s CEO has denied the company is for sale, Microsoft would do well to further lessen their dependence on OpenAI and they’ve already made a modest investment in Mistral (€15m). Whilst Microsoft’s smaller Phi models are impressive, having Mistral’s team and tech fully integrated could help them improve Copilot without leaning any harder on OpenAI.
The case against: Investing in, rather than acquiring, companies like Anthropic and Mistral may continue to make sense for companies like Amazon and Microsoft. Whilst the US FTC may now be more receptive to acquisitions, it’s unclear whether the UK’s CMA and the European Commission’s DG COMP will be, especially if the target is European (Mistral is French). Google’s investment in Anthropic (recently topped up to $3bn) could also complicate an Amazon takeover.
That’s it for my 2025 predictions. I’ll be checking back in on progress in a few months.
Thanks for reading Dan’s Media & AI Sandwich. If you’ve found this post interesting please consider liking and/or sharing it.