A tool that can handle more than one question at a time is useful. Modern LLMs handle that with ease. So it's completely reasonable to be critical of that limitation.
A plethora of LLMs are available on Apple platforms. If someone wants a chatbot, they can get a chatbot on Apple products. It’s not hard.
Are all Android users using Gemini exclusively? Are all Windows users only using Copilot? Where is the native Linux desktop LLM?
I really don’t understand this criticism. Would it be nice if Siri could do more? Sure. Do I have tolerance for Siri to start hallucinating on simple problems it used to use real math for? No. Do I have other options to use in the meantime to get the best of both worlds? Absolutely. Where is the hardship?
Siri is the default and only voice assistant that has access to all the data on your phone. It doesn't matter if I have ChatGPT, Claude, Gemini, or another SOTA model on my iPhone: I can't easily activate them in the car or another hands-free situation, or use them with any other app or data on my iPhone.
The LLMs aren't necessarily competitors. Apple doesn't need to have the best all-around LLM. They need to create an AI with excellent integration into their OS and the data users store on those systems. Beyond that, they need to have a good system for plugging into whatever other generic LLM a person might want/need. Having something decent out of the box is nice, for basic questions, but being able to easily switch to whatever specialist company is in the lead, or best suited for a user's need, is a lot better than being stuck with one first-party option. Based on how ChatGPT looks in the Apple Settings, I wouldn't be surprised if this is the plan.
Much like with the internet, Apple didn't need to re-invent every website to own it all. From Apple platforms a user can access Amazon, Google, or whatever else. Apple didn't create the internet, they sold a gateway to it. AI could be done largely the same way. This way it doesn't matter who wins, Apple can support it. At the end of the day, an LLM doesn't exist on its own, it needs to be accessed through hardware/software people enjoy using, and not be yet another device to charge and carry. Apple has a very popular phone and the most popular wearable. This positions them very well. They are often late to the party, but tend to be best dressed. The first iPhone didn't even have video, and people clowned them for it, and now iPhone video is largely considered one of the best in the smartphone world.
Sure, what’s not reasonable is expecting Siri to be a modern LLM, when they know it’s not. They asked a question they knew Siri couldn’t handle just to slam it. I’m not critical of a 5-function calculator for not one-shotting complex equations like a computer.
While Siri only does one thing at a time, I trust the answer more, because it’s doing the actual math and not just guessing what the most likely answer is, like an LLM. We need to pick the right tool for the right job. Frankly, I don’t think an LLM is the right tool for conversations like this; jumbling multiple questions into a single prompt is something people do with LLMs to get more use out of them during the day, an adaptation to the limitations of the free tier (and sometimes the speed) of the LLM.
On an Android phone, the equivalent voice assistant (Gemini) handles the question gracefully. Regardless of what you think about Google, having a single-button LLM-powered voice assistant, deeply integrated into the phone's OS, is a very useful feature, and Apple is quite far away from developing a competing version of this. They'll have to buy it or go without.
The unreasonable part is acting like Siri got its big LLM update, when they know it didn’t. Just like it would be unreasonable to expect any famously delayed, or unannounced, feature to magically start happening.
Amazon just needs a generic LLM. Apple, from the sound of it, is trying to create deep integration with the OS and on-device data. That’s a different problem to solve. They also seem to be trying to do it while respecting user privacy, which is something most other companies ignore.
I don’t see what the big deal is. I’d rather wait for something good than have them rush out a half-assed “me too” chatbot that is indistinguishable from the dozens of other chatbots I can simply download as an app.
If we believe what Craig Federighi said, they had something, it just wasn’t up to their standards when talking about rolling it out to a billion devices. Which is fair, I run into bad data from ChatGPT and other LLMs all the time. Letting it mature a little more is not a bad thing.
ChatGPT spent a couple months getting my dad pumped up for an elective open heart surgery; he was almost arrogant going into it about how the recovery would go, thinking ChatGPT gave him all the info he could possibly need and a balanced view of reality. Reality hit him pretty hard in the ICU. He sent me some of the chats he had, it was a lot of mutual ego stroking. He was using ChatGPT to downplay the negatives from the doctors and boosting the positives. While it’s good to feel confident, I think it went too far. I spent the whole week in the hospital trying to pull him out of his depression and recalibrating the unrealistic expectations that ChatGPT reinforced. I hope Apple finds a way to be more responsible. If that takes time, great.
"holding it wrong" was exactly the right phrase given how that phrase was used with the iPhone antenna bridging problem. This is an Apple product failing.
It’s been this way for over a decade. If someone hasn’t figured it out by now, that’s kind of on them.
I’m not even sure why those two things would be asked as a single question. It seems like a very unnatural way to pose those two questions. Most humans would trip on that, especially if it was asked verbally.
I can’t talk to ChatGPT hands-free on my Apple devices the way I can to Siri.
Besides that, many people don’t install any apps, and Apple not pre-installing a reasonable LLM to cater to that market just seems incredibly out of character.
And there’s enough credible reporting and personnel reshuffling happening to suggest that it’s not available yet because they failed to make it work, not because they didn’t try.
OP isn't asking how to use Siri to do his contrived task. OP is saying that Siri in 2025 should be able to handle that relatively simple albeit contrived task.
Your usage of Siri today (probably on an old version of iOS) frankly has nothing to do with the article we are discussing. Sorry to say this, but it is going to take time. Comparing the performance of ChatGPT running in a big data center with a model running locally on a phone... give it a few years.
People have been giving Siri a few years for a decade now. Siri used to run in a data center (and still does for older hardware and things like HomePods) and it has never supported compound queries.
Siri needs to be taken out back and shot. The problem with “upgrading” it is the pull to maintain backwards compatibility for every little thing Siri did, which leads them to try to incorporate existing Siri functionality (and existing Siri engineers) alongside any LLM. That leads to disaster: none of it works, and it all just gets slower. They’ve been trying to do an LLM-assisted Siri for years now and it’s the most public-facing disaster the company has had in a while. Time to start over.
As a user, I'd gladly opt into a slightly less deeply integrated Siri that understands what I want from it.
Build a crude router in front of it, if you must, or give it access to "the old Siri" as a tool it can call, and let the LLM decide whether to return its own or a Siri-generated response!
I bet even smaller LLMs would be able to figure out, given a user input and Siri response pair, whether the request was reasonably answered or whether the model itself could do better, or at least explain that the request is outside its capabilities for now.
> Build a crude router in front of it, if you must, or give it access to "the old Siri" as a tool it can call, and let the LLM decide whether to return its own or a Siri-generated response!
Both of these approaches were tried internally, including even the ability for the LLM to rewrite siri-as-a-tool's response, and none of them shipped, because they all suck. Putting a router in front of it makes multi-turn conversation (when Siri asks for confirmation or disambiguation) a nightmare to implement, and siri-as-a-tool suffers from the same problem. What happens when legacy siri disambiguates? Does the LLM try to guess at an option? Does it proxy the prompt back to the user? What about all the "smart UI" like having a countdown timer with Siri saying "I'll send this" when sending a text message? Does that just pass through? When does the LLM know how/when to intervene in the responses the Siri tool is giving?
This was all an integration nightmare and it's the main reason why none of it shipped. (Well, that and the LLM being underwhelming and the on-device models not being smart enough in the first place. It was just a slower, buggier siri without any new features.)
The answer is that they need to renege on the entire promise of a "private" siri and admit that the only way they can get the experience they want is a _huge_ LLM running with a _ton_ of user context, in the cloud, and don't hinder it all with backwards compatibility with Siri. Give it a toolbox of things it can do with MCP to your device, bake in the stock tools with LoRA or whatever, and let it figure out the best user experience. If it's a frontier-quality LLM it'll be better than Siri on day one, without Apple having to really do anything other than figure out a good system prompt.
The problem is, Apple doesn't want to admit the whole privacy story is a dead-end, so they're going to keep trying to pursue on-device models, and it's going to continue to be underwhelming and "not meeting our quality bar", for the foreseeable future.
Very good details, which I hadn't really considered before, on why just bolting on an LLM isn't that trivial. Thank you!
But regarding Apple not wanting to admit that client-side compute isn't enough: haven't they essentially already done that, with Private Cloud Compute and all that? I believe not even proofreading and Safari summarization work fully on-device, at least according to my private compute privacy logs.
> Your usage of Siri today (probably on an old version of iOS) frankly has nothing to do with the article we are discussing.
Yes, but isn't that infuriating? The technology exists! It even exists, as evidenced by this article, in the same company that provides Siri!
At least I feel that way every time I interact with it – or for that matter my Google Home speaker, ironically made and operated by the company that invented transformer networks.
Despite all the “Apple is evil” or “Apple is behind” (because they don’t do evil) takes: what they made with the Foundation Models framework is great. They built a system within the Swift language that lets you specify structured data models (structs) to be used like any other model in a modern programming language, and you actually get back generated data in that format. That's unlike a lot of other AIs, where you might get back well-formatted JSON after a carefully crafted request, but you can never be sure and need to implement a bunch of safeguards. Obviously it’s still the beginning and other tools might do something similar, but as an iOS developer that makes the usage of AI so much simpler. Especially with the bridge to external AIs that still allows you to map back to the type-safe structured Swift models. I try not to be a hater; any progress, even if slow or underwhelming at first, might lead to improvements everywhere else.
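For anyone who hasn't tried it yet, the guided-generation flow looks roughly like the sketch below. This is a minimal example based on Apple's WWDC material; the exact API spelling may differ between betas, and the TripPlan type is made up for illustration.

    import FoundationModels

    // A plain Swift struct annotated for guided generation. The framework
    // constrains decoding so the result always matches this shape, so there is
    // no "please reply with valid JSON" prompting and no parsing step.
    @Generable
    struct TripPlan {
        @Guide(description: "A short, catchy title for the trip")
        var title: String

        @Guide(description: "One suggested activity per day")
        var activities: [String]
    }

    func makeTripPlan() async throws -> TripPlan {
        // Uses the on-device ~3B system model; no network round trip.
        let session = LanguageModelSession()
        let response = try await session.respond(
            to: "Plan a relaxed weekend in Lisbon",
            generating: TripPlan.self
        )
        return response.content   // Already a typed TripPlan, not a JSON blob.
    }

Because decoding is constrained to the struct's schema, there's no retry-on-malformed-output logic to write.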
Nobody is forcing "hardcore LLM" features on anyone, besides maybe Microsoft. This is that same cope as "I'm glad Apple Car can't crash and threaten people's lives" despite the fact that... yunno, Apple really wanted to bring it to market.
Siri, sideloading and AI features are all the same way; give people options and nobody will complain.
If they give Siri LLMs, there will be headlines that it drove kids to suicide. People really don't need LLMs.
Sideloading is bad for business. Most users don't care. Remember, we, the devs, are not the core target/biggest spenders. They are targeting a large audience of young people who are not tech-savvy.
Sorry if I didn’t use the correct terms; I haven’t caught up on all the terminology, coming from my native language. ;) But yes, I agree: the fact that parts of the model’s output (different parameters) can be completed asynchronously by streaming the output of the model is quite unique. Apple/Swift was late with async/await, but putting it all together, it probably plays well with asynchronous and reactive coding.
I have this toy agent I'm writing, and I always laugh that I, a human, write code that generates human-readable Markdown, which I feed to an LLM and ask to produce JSON, so I can parse it (with code that I, or it, wrote) and output it in a consistent human-readable form.
I'm thinking about letting it output freeform text and then using another model to force that into structured output.
I've found this approach does bring slightly better results. Let the model "think" in natural language, then translate its conclusions to JSON. (Vibe-checked, not benchmarked.)
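A sketch of that two-step pattern using the same Foundation Models API mentioned above; the Verdict type and prompts are invented for illustration, and the API names may shift between betas.

    import FoundationModels

    @Generable
    struct Verdict {
        @Guide(description: "One-sentence conclusion")
        var conclusion: String
        @Guide(description: "Confidence from 0.0 to 1.0")
        var confidence: Double
    }

    // Step 1: let the model reason in free-form prose.
    // Step 2: distill that prose into a typed structure; decoding is
    // constrained, so the second step cannot produce malformed output.
    func analyse(_ question: String) async throws -> Verdict {
        let session = LanguageModelSession()
        let thoughts = try await session.respond(to: "Think step by step: \(question)")
        let structured = try await session.respond(
            to: "Summarise the reasoning below as a verdict.\n\n\(thoughts.content)",
            generating: Verdict.self
        )
        return structured.content
    }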
How do you think their implementation works under the hood? I'm almost certain it's also just a variant of "structured outputs", which many inference providers or LLM libraries have long supported.
Huh? Grammar-based sampling has been commonplace for years. It's a basic feature with guaranteed adherence. There is no "carefully crafting" anything, including safeguards.
Every time I see a paper from Apple I just feel like, OK so why isn’t my iPhone actually doing any of this yet?
Why give this to developers if you haven’t been able to get Siri to use it yet? Does it not work or something? I guess we’ll find out when devs start trying to make stuff
I agree. You can revert this UI frustration. Go to your settings. Then, go into screen time. Next, go into content and privacy restrictions and enable that. Under Apple intelligence in the restrictions you can then disable individual features.
My anonymous friend who wrote that settings pane would like to say that is not what they meant it for, but have fun.
(Since it's meant to restrict your children, using it to restrict yourself will disable some features that'd let you escape it. I forget what exactly, but you might not be able to change the time or something like that.)
Yes, but why do I have to open a third-party app to do these things when Apple, the company that primarily popularized the entire genre of mobile voice assistants, could very feasibly bake all of that into theirs?
I mean, the thing even lets me ask ChatGPT things if I explicitly ask it to! But why do I need to ask in the first place?
I don’t speak for Apple but certainly you can appreciate that there is a fine balance between providing basic functionality and providing apps. Apple works to provide tools that developers can leverage and also tries to not step on those same developers. Defining the line between what should be built-in and what should be add-on needs to be done carefully and often is done organically.
> why isn’t my iPhone actually doing any of this yet?
Probably Apple is trying to distill the models so they can run on your phone locally. Remember, most, if not all, of Siri is running on your device. There's no round trip whatsoever for voice processing.
Also, for larger models, there will be throwaway VMs per request, so building that infra takes time.
The model now available to developers (in beta, not in released versions of iOS) is the same model that powers stuff like the much-maligned notification summaries from iOS 18. So your phone does have features that are powered by this stuff… you may just not be particularly overwhelmed by those features.
That’s kinda my point though - is this only capable of things like this? If it is capable of more, why isn’t there something more yet? It’s been a long time waiting…
They just launched "Private Cloud Compute" with much fanfare to enable server-side LLM processing, so between that and the fact that Siri has been server-based for most of its existence (local processing is fairly new), I don't think that's their main constraint at this point.
That said, "Private Cloud Compute" does run on proprietary Apple hardware, so availability might be a concern (assuming they don't want to start charging for it).
I know Apple is methodical and doesn’t show its hand, but I cannot help but feel they are releasing all this research because they haven’t integrated any of it into the phone or provided compelling AI functionality for their users. This is their only way to say “hey, we are good at AI too”.
AFAICT this is the first commercial model trying to be marketed as responsibly sourced. Love it, but it also seems like the noise around this issue has died down. Is this for legal cover? Or more Apple privacy marketing?
"Sorry we are hilariously far behind everyone else in the industry after having made a huge amount of fanfare about 'Apple Intelligence' for years. It's just that we have shot ourselves in the knee to satisfy Bluesky posters and the NY Time's lawyers"
My son (he's 11 years old now and fairly skilled with all the main AI tools, eg chatgpt, gemini, etc) and I retry her every month or so, and this past time we just laughed. Can't handle basic questions - hears the question wrong, starts, stops, takes us to some random ass webpage, etc, etc.
"She's so jacked up!" he said.
Apple needs to get this under control and figured out, stat!
Looks nice. I just wish they’d improve the models behind dictation on both iPhone and Mac to have better accuracy and on the fly multiple language transcription.
I'd really like to be able to use this 3B model on my little 4GB GPU card!
It looks very capable for a reasonable weight.
Maybe one day on Hugging Face.
This isn’t the Apple I remember. Product integration falls apart at every seam, but don’t worry—we’ve got plenty of impressive technical documentation to compensate. I’m sure Jobs would be thrilled to see his ‘it just works’ philosophy replaced with ‘it barely works, but here’s a 50-page PDF explaining why.’
The philosophy is the same, and since it was never implemented in the mythical era of Jobs, so is the practice. So he'd be as thrilled as he was back then?
As someone who was around in the days of Apple before the near-bankruptcy: it is the same. There is no Jobs around anymore, and it is getting back to the Gil Amelio kind of Apple.
Tim Cook might be better at squeezing the juice, but he is not a product person.
This time around they need another solution; otherwise, regardless of how much money they have, they will stay the iOS/iPad company, given the relevance of macOS in the worldwide desktop market.
It does. You can use it directly on iOS 26 beta - without writing a line of code I can toy with the on-device model through Shortcuts on my 16 Pro. It’s not meant to be a general purpose chatbot… but it can work as a general purpose chatbot in airplane mode which is a novel experience.
It would be interesting to see the tok/s comparison between the ANE and GPU for inference. I bet these small models are a lot friendlier than the 7B/12B models that technically fit on a phone but won't accelerate well without a GPU.
I thought the big difference between the GPU and ANE was that you couldn't use the ANE to train. Does the GPU actually perform faster during inference as well? Is that because the ANE is designed more for efficiency, or is there another, bigger reason?
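One way to check empirically: Core ML lets you pin the same compiled model to different compute units and time it. A rough sketch only; the model URL and input features are placeholders, and real tokens/sec numbers would need an actual LLM export and a proper benchmark harness.

    import CoreML
    import Foundation

    // Load the same compiled model twice, pinned to different compute units,
    // and compare wall-clock prediction time.
    func timePrediction(computeUnits: MLComputeUnits, modelURL: URL,
                        input: MLFeatureProvider, iterations: Int = 20) throws -> Double {
        let config = MLModelConfiguration()
        config.computeUnits = computeUnits            // .cpuAndGPU vs .cpuAndNeuralEngine
        let model = try MLModel(contentsOf: modelURL, configuration: config)

        _ = try model.prediction(from: input)          // warm-up
        let start = Date()
        for _ in 0..<iterations {
            _ = try model.prediction(from: input)
        }
        return Date().timeIntervalSince(start) / Double(iterations)
    }

    // Usage (hypothetical URL/features):
    // let gpuTime = try timePrediction(computeUnits: .cpuAndGPU, modelURL: url, input: features)
    // let aneTime = try timePrediction(computeUnits: .cpuAndNeuralEngine, modelURL: url, input: features)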
It’s “free”, as in it doesn’t charge you anything or require a subscription: it’s part of Apple Intelligence, which is basically something bought with the device. It’s in the cloud, so theoretically one shouldn’t need a quite new iPhone or Mac, but one does.
As someone mentioned, this model is available in the beta version of iOS 26; it's also part of macOS 26, iPadOS 26 and visionOS 26. Anyone with a free developer account can install the developer betas; the public beta is expected next week.
There's a WWDC video "Meet the Foundation Models Framework" [1].
> The new Foundation Models framework gives access to developers to start creating their own reliable, production-quality generative AI features with the approximately 3B parameter on-device language model. The ∼3B language foundation model at the core of Apple Intelligence excels at a diverse range of text tasks like summarization, entity extraction, text understanding, refinement, short dialog, generating creative content, and more. While we have specialized our on-device model for these tasks, it is not designed to be a chatbot for general world knowledge. We encourage app developers to use this framework to design helpful features tailored to their apps.
There are even already some local AFM-to-OpenAI API bridge projects on GitHub that let you point basically any OpenAI-compatible client at the local models. Super nice for basic summarisation and completions.
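Those bridges expose the standard chat-completions endpoint, so calling the on-device model looks like calling any other OpenAI-compatible server. A minimal sketch; the port and model name below are placeholders, so check the specific project's README.

    import Foundation

    // Call a local OpenAI-compatible bridge sitting in front of the on-device model.
    // Host, port, and model identifier are placeholders, not real project defaults.
    func askLocalModel(_ prompt: String) async throws -> String {
        var request = URLRequest(url: URL(string: "http://127.0.0.1:11535/v1/chat/completions")!)
        request.httpMethod = "POST"
        request.setValue("application/json", forHTTPHeaderField: "Content-Type")
        let payload: [String: Any] = [
            "model": "apple-on-device",
            "messages": [["role": "user", "content": prompt]]
        ]
        request.httpBody = try JSONSerialization.data(withJSONObject: payload)

        let (data, _) = try await URLSession.shared.data(for: request)
        let json = try JSONSerialization.jsonObject(with: data) as? [String: Any]
        let choices = json?["choices"] as? [[String: Any]]
        let message = choices?.first?["message"] as? [String: Any]
        return message?["content"] as? String ?? ""
    }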
The more I think about Apple, the more I realize that Apple is so far behind. While other companies are pushing the envelope (OpenAI, Anthropic, Google ..) Apple's ambitions seem much much smaller.
And this is after they made very big claims with Apple Intelligence last year, when they had everyone fooled.
This is like watching a train-wreck in slow motion.
Apple's ambitions are actually bigger than OpenAI's or Anthropic's. Only Google's ambition (surprise, surprise) is similar. Apple fundamentally wants the LLM to be a tool. It doesn't want the LLM to be the product.
They're not a model company. The risk of deploying something half-baked to their users is unacceptable. They're taking it slow and trying to do it in a way that doesn't damage/erode their brand.
Wait it out, let the best model(s) rise to the surface (and the hallucination problems to get sufficiently mitigated), and then either partner with a proprietary provider or deploy one of the open source models. Makes more sense than burning billions of dollars training a new foundation model
This is a reasonable approach, but unfortunately misses what made Apple soooo successful. Apple is the master of controlling the brand. Apple DOES NOT like to highlight their suppliers. Nobody knows who makes iPhone displays, or sensors, or RAM.
They love to "invent" brands that they control, so that they can commoditize the underlying supplier. Hey user, it is a Retina display; don't worry about whether LG or Samsung is making it.
Apple tried this with AI, calling it "Apple Intelligence". Unfortunately that faltered. Now Apple will have to come out and say "iPhone with ChatGPT" or "Siri with Claude". AND APPLE HATES THAT. HATES IT WITH PASSION.
People will start to associate smartness with ChatGPT or Claude, and Apple loses control and OpenAI/Anthropic's leverage goes up.
Apple has painted themselves into a corner. And as I said elsewhere, it is a train-wreck happening in slow motion.
Please go rewatch the iPhone keynote by Steve Jobs. Everyone remembers the beginning; few seem to remember that he brings out 3 other CEOs to highlight the integrations between the iPhone and those companies.
Or consider that they spent a decade highlighting that their computers were powered by Intel, after leaving their proprietary PowerPC architecture—again, under Steve Jobs.
Or go all the way back to 1997 when Steve Jobs had Bill Gates on the screen at Macworld and announced that IE would be the default browser on Mac.
It’s easy to fall into a caricature of Apple, where they insist on making everything themselves. What is more accurate is to say that they are not afraid to make things themselves, when they think they have a better idea. But they are also not afraid to do deals when it is the best way forward right now.
They already deployed half-baked models (e.g. needing to disable news summaries because they were so bad), and haven't delivered on other aspects of Apple Intelligence. This is hard to call being cautious; this is them not being able to keep up.
Exactly. Another MobileMe moment that adversely impacts customers is worse than making something useful that works. Anyone that “needs” AI can use an app.
I wouldn’t go as far as GP, but yes, absolutely, they must compete with large models on the internet. Customers are now used to being able to ask a computer a question and get something better than “I just ran a web search for what you said, here are the uncurated, unsummarized results”.
Yes, this is in fact what people want. Apple is the biggest company in the world (don’t quibble this y’all, you know what I mean) and should be able to deliver this experience. And sure, if they could do it on device that would be aces, but that’s not an item on the menu, and customers seem fine with web-based things like ChatGPT for now. To act like Apple is doing anything other than fumbling right now is cope.
Erm, have you heard of these things called apps? It’s this magical concept where other companies can run code on your iPhone, and deliver all the features you just talked about.
I don’t really understand why Apple has to provide a ChatGPT product, baked directly into their software. Why on earth would Apple want to get involved in the race to the bottom for the cheapest LLMs? Apple doesn’t produce commodity products; they package commodities into something much more unique that gives them a real competitive advantage, so people are willing to pay a premium for Apple’s product rather than just buying the cheapest commodity equivalent.
There is no point in Apple just delivering an LLM. OpenAI, Anthropic, Google etc. already do that, and Apple is never going to get into the pay-per-call API service they all offer. Delivering AI experiences using on-device-only compute, that’s something OpenAI, Anthropic and Google can’t build, which means Apple can easily charge a premium for it, assuming they build it.
> I don’t really understand why Apple has to provide a ChatGPT product
Control. It boils down to control. If you own a platform, you want to make your "suppliers" (apps in this case) as substitutable as possible.
If people start thinking of ChatGPT or Claude or Gemini as the main reason to buy a phone, at some point in the future they'll think - gee, most of what I'm doing on the phone is interacting with $app, and I can get $app elsewhere.
This use case is run-of-the-mill for someone like Google, who used to store and show you your location forever, but it's not in Apple's style.
It's hard to be like "uhhh privacy" when you send all requests to a remote server where they're stored in clear text for god knows how long.
As of right now, there is no way to run big LLMs in a privacy-preserving manner. It just doesn't exist. You can't end-to-end encrypt these services, because the compute is done on the server, so the server has to decrypt the data.
There are some services which will randomize your instance and things like that, but that kind of defeats a big part of what makes LLMs useful: context. Until we can run these models locally, there's no way to get around the privacy-nightmare aspects of it.
I see it as the opposite. Apple is absolutely positioned to own "chat". I am not worried: they'll sort things out soon enough, and eventually we'll have an LLM integrated into the iPhone; call it Siri or otherwise.
With my history encrypted in the cloud, and the trust that Apple has built around privacy ... I think they're going to come out alright.
But they have de facto admitted failure of most of the strategy if the rumours are true that they are switching much harder to OpenAI/Anthropic for upcoming LLM products.
This is the first time in 10+ years I've seen Apple so far on the back foot. They usually launch category defining products that are so far ahead of the competition, even by the time they work through the 'drawbacks' in the first versions of them they are still far ahead. OS X, the iPhone and the iPad were all like that. They are still way ahead of the competition on Apple Silicon as well.
I am not very confident in their on-device strategy, at least in the short to medium term. Nearly all their devices do not have enough RAM, and even if they did, SLMs are very far behind what users "know" as AI; even the free ChatGPT plan is light-years ahead of the best 3B-param on-device model. Maybe there will be huge efficiency gains.
Private Cloud Compute is used, AFAIK, for virtually zero use cases so far. Perhaps it will be more interesting longer term, but it is not very useful at the moment given the lack of a suitable (i.e. non-Chinese), large (>500B param) model. They would also struggle to scale it if they roll it out to billions of iOS devices, especially if they ship features that use a lot of tokens.
Then they've got OpenAI/Gemini/Anthropic via API. But this completely goes against all their private cloud messaging and gives those providers enormous potential control over Apple, which is not a position Apple usually finds itself in. It will also be extremely expensive to pay someone per token for OS-level features across billions of iOS/Mac devices, and unless they can recoup this via some sort of subscription it will hit services margins badly.
To me it's clear the future of the "OS" is going to involve a lot of agentic tool calling. That requires good models, with large context windows and a lot of tokens; this will definitely not work on-device. Indeed, this is exactly what the Siri vapourware demo was.
I'm sure they can potentially get to a great UX (though these missteps are making me question this). But having such a core feature outsourced does not leave them in a good position.
You're right about the RAM, of course. Apple will no doubt have to run that up. At the same time it's an obvious "top tier" feature for the "Apple aiPhone 17 Max". And it will cost dearly.
> Private cloud is used AFIAK for virtually 0 use cases so far.
Applications using Apple's foundation models can seamlessly switch from on-device models to Private Compute Cloud.
Research is already showing the use of LLMs for people's most intimate relationship and medical issues. The usual suspects will try to monetize that, which is why Private Cloud Compute is a thing from the jump.
> Then they've got OpenAI/Gemini/Anthropic via API. But this completely goes against all their private cloud messaging
Using ChatGPT via Siri today, no personally identifying information is shared with OpenAI and those prompts aren't used for training. I suspect Apple would want something similar for Google, Anthropic, etc.
At some point, there will be the inevitable enshittification of AI platforms to recoup the billions VCs have invested, which means ads, which won't happen to Apple users using foundation-model-based apps.
> Nearly all their devices do not have enough RAM and
Every Apple Silicon Mac (going back to the M1 in 2020) can run Apple Intelligence. 8 GB RAM is all they need. Every iPhone 15 Pro, Pro Max and the entire 16 line can all run Apple Intelligence.
Flagship iPhone 17 models are expected to come with 12 GB of RAM and all current Mac models come with at least 16 GB.
Apple sells over 200 million iPhones in a given year.
There's no doubt Apple stumbled out of the gate regarding AI; these are early days. They can't be counted out.
8GB RAM is not enough for a semi-decent model IMO. 12/16GB is better (4GB for model and 8GB for OS) and really if you were going hard on device you'd probably want more like 32GB (24GB for model + 8GB for everything else - you'd be able to run a 13b param model with larger context size with that).
Even so, people are used to the quality of huge frontier models, so it will feel like a massive downgrade on many tasks. The _big_ problem with all this is chained tool calling. It uses context SO quickly, and context needs a lot of (V)RAM (rough numbers below). This also completely undermines the privacy argument you make, because if using OpenAI it will need to put personal data into the prompt.
Yes, I noticed Apple shipping higher RAM, but it will take years for this to feed through to a sizeable userbase, and people are quickly getting used to reaching for an app like ChatGPT instead of OS-level features. Even more so given what a flop Apple Intelligence 1.0 has been.
The key problem they've got is that they've gone hard on privacy (which is hard to square with going all-in on third-party APIs), but they've also been incredibly stingy with RAM historically, which really nerfs their on-device options. Private Cloud Compute is an interesting middle ground, but their model options are incredibly limited currently.
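Rough back-of-the-envelope numbers behind the RAM point above. The quantization levels and KV-cache dimensions are assumptions for a dense 13B-class model, not measurements.

    // Rough LLM memory math (all figures assumed, not measured).
    let params = 13_000_000_000.0                 // 13B-parameter model
    let gib = 1_073_741_824.0

    // Weights:
    let weightsFP16 = params * 2.0 / gib          // ≈ 24 GiB at 16-bit
    let weights4bit = params * 0.5 / gib          // ≈ 6 GiB at ~4-bit quantization

    // KV cache for a dense 13B-class model (assume 40 layers, 40 KV heads,
    // head dim 128, fp16 keys and values):
    let perTokenKV = 40.0 * 40.0 * 128.0 * 2.0 * 2.0   // ≈ 0.78 MiB per token
    let kvCache8k  = 8_192.0 * perTokenKV / gib        // ≈ 6 GiB for an 8k context

    print(weightsFP16, weights4bit, kvCache8k)

Grouped-query attention and KV-cache quantization shrink that last number a lot, which is part of why architecture matters as much as raw parameter count for on-device use.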
> 8GB RAM is not enough for a semi-decent model IMO.
Apple's ~3 billion parameter on-device model is about as good as it gets on a smartphone, especially for the functions it was designed for: writing and refining text, prioritizing and summarizing notifications, creating images for conversations, and taking in-app actions.
Every Mac comes with at least 16 GB of RAM; while every iPhone comes with 8 GB of RAM, some models of the iPhone 17 will have 12 GB.
Remember, an app using the on-device model can seamlessly shift to a much bigger model via Private Cloud Compute without the user having to do anything.
If the user enables it, Apple's Foundation Model can use ChatGPT in a privacy preserving way. By the fall, Gemini and Sonnet/Opus could be options as well.
Again, ChatGPT is used in a privacy preserving way; you don't need an account: "Use ChatGPT with Apple Intelligence on iPhone" [1].
Apple is only "behind" if you think they're in the same race. They haven't shown any interest in developing frontier models or taking on the enormous costs of doing so.
Did you even watch Apple Intelligence ads? They were very much in the race, just that they got ahead of themselves a bit.
They were touting the same features that other companies are now delivering. Point the phone at something, and it'll tell you what you're looking at. Or summarize news articles, etc. Instead we got… the emoji thingy.
I'm having trouble understanding, do you think people are going to stop buying iPhones because Siri isn't as good as ChatGPT? Do you think Apple users are going to flood over to the Pixel phone to use Gemini?
I feel like this is the most exciting news about AI on HN today. I really hope Apple shows that small models can be just as capable as the bigger ones. Maybe they have the people from Perplexity working on these small models.
Lol, and yet Google has had AI image descriptions in their screen reader, TalkBack, before Apple. Apple is supposed to be the accessibility king. But with AI, they just can't, even though they obviously have access to ChatGPT, which has vision capabilities. Granted, I don't know what model Google uses, because tech news don't do Android Accessibility Suite APK teardowns, but it works pretty well, and fast too.
It's hard to know for certain, but there are many other reasons papers list contributors in a flat structure (be it random or alphabetical order), particularly with large numbers of collaborators.
Not very hard to look people up on LinkedIn and figure out who the core researchers are. I think this is just a very surface-level overview paper that encompasses a bunch of different research projects conducted by different teams, and it would be difficult to order the contributors in any meaningful way.
Considering a large portion of the contributors have names originating in a script and language that has no relationship whatsoever to English’s already arbitrary letter ordering, this list configuration is as good as any.
This is the first time that millions of people will actually download and run a model on their own devices.
The question is… will Apple be constantly tweaking these models, or only during OS upgrades?
I for one really like local software. Call me old-fashioned, but I enjoy when a company doesn’t switch up software anytime on the server, or phone the results home all the time in order to extract more profits from their collective users.
> “Adapters produced by the toolkit are fully compatible with the Foundation Models framework. However, each adapter is compatible with a single specific model version, meaning that a new adapter must be trained for each new version of the base model.”
Any changes would require retraining any LoRA adapters that have been built and distributed by third-party developers, so I don't think they would update the models outside OS updates at the drop of a hat.
LoRA adapters can be distributed via Background Assets, but the base model itself should be version-locked to the OS build (e.g. iOS 26.0 → 26.1) and update only when Apple ships a new OS image.
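A tiny sketch of the bookkeeping an app might do around that constraint. None of this is Apple API; the manifest format and version strings are hypothetical, and only illustrate why each adapter is pinned to one base-model version.

    import Foundation

    // Hypothetical manifest an app could ship alongside each LoRA adapter it
    // distributes (e.g. via Background Assets). Not an Apple API.
    struct AdapterManifest: Codable {
        let adapterURL: URL
        let trainedForBaseModel: String   // e.g. "afm-on-device-2025.06" (made up)
    }

    func usableAdapter(from manifests: [AdapterManifest],
                       currentBaseModel: String) -> AdapterManifest? {
        // An adapter trained against an older base model must be retrained;
        // there is no forward compatibility across base-model updates.
        manifests.first { $0.trainedForBaseModel == currentBaseModel }
    }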
In the meantime, when I ask Siri to set a timer for 15 minutes, about 10–15% of the time it just says, “Here’s what I found about setting a timer for 15 minutes,” instead of actually setting the timer.
> We believe in training our models using diverse and high-quality data. This includes data that we’ve licensed from publishers, curated from publicly available or open-sourced datasets, and publicly available information crawled by our web-crawler, Applebot.
> We do not use our users’ private personal data or user interactions when training our foundation models. Additionally, we take steps to apply filters to remove certain categories of personally identifiable information and to exclude profanity and unsafe material.
> Further, we continue to follow best practices for ethical web crawling, including following widely-adopted robots.txt protocols to allow web publishers to opt out of their content being used to train Apple’s generative foundation models. Web publishers have fine-grained controls over which pages Applebot can see and how they are used while still appearing in search results within Siri and Spotlight.
Respect.
When Apple inevitably partners with OpenAI or Anthropic, which by their definition isn't doing "ethical crawling", I wonder how I should be reading that.
They already partnered with OpenAI, right?
To use their APIs at a discount, so what?
Apple aren’t paying OpenAI anything:
https://www.bloomberg.com/news/articles/2024-06-12/apple-to-...
That's a big discount
They paid in exposure, unironically.
That's quite a discount! ;)
The art of the deal
In theory Apple could provide their training data to be used by OpenAI/Anthropic.
It isn't "apple proprietary" data to give it to OpenAI.
Also the bigger problem is, you can't train a good model with smaller data. The model would be subpar.
"Good artists copy; great artists steal"
- Famous Dead Person
I mean they also buy from companies with less ethical supply chain practices than their own. I don’t know that I need to feel anything about that beyond recognizing there’s a big difference between exercising good practices and refusing to deal with anyone who does less.
Same way as the other parts of their supply chain I suppose.
You shouldn't believe Big Tech on their PR statements.
They are decades behind in AI. I have been following AI research for a long time. You can find best papers published by Microsoft, Google, and Facebook over the past 15 years, but not Apple. I don't know why, but they didn't care about AI at all.
I would say this is PR to justify their AI state.
Apple used to be at the edge of AI. They shipped Siri before "AI assistant" went mainstream, they were one of the first to ship an actual NPU in consumer hardware and put neural networks into features people use. They were spearheading computational photography. They didn't publish research, they're fucking Apple, but they did do the work.
And then they just... gave up?
I don't know what happened to them. When AI breakthrough happened, I expected them to put up a fight. They never did.
> I don't know what happened to them.
Tim Cook happened. The fish rots from the head down.
>I don't know what happened to them. When AI breakthrough happened, I expected them to put up a fight. They never did.
Apple always had the luxury of time. They work heavily on integrating deeply into their ecosystems without worrying about the pace of the latest development. eg. Widgets were a 2023 feature for iOS. They do it late, but do it well.
The development in the LLM space was and is too fast for Apple to compete in. They usually pave their own path and stay in their lane as a leader. Apple's brand image will be tarnished if Google, Meta, OpenAI, and MS all leapfrog Apple's models every 2-3 months. That's just not what the Apple brand is associated with.
One problem with Apple's approach here is that they were scraping the web for training data long before they published the details of their activities and told people how to exclude them using robots.txt
They documented it in 2015: https://www.macrumors.com/2015/05/06/applebot-web-crawler-si...
Uncharitable.
Robots.txt is already the understood mechanism for getting robots to avoid scraping a website.
People often use specific user agents in there, which is hard if you don't know what the user agents are in advance!
That seems like a potentially very useful addition to the robots.txt "standard": Crawler categories.
Wanting to disallow LLM training (or optionally only that of closed-weight models), but encouraging search indexing or even LLM retrieval in response to user queries, seems popular enough.
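The closest thing today is per-purpose user agents: Apple documents Applebot for Siri/Spotlight search and Applebot-Extended as the opt-out signal for generative training, and OpenAI similarly splits OAI-SearchBot from GPTBot. A robots.txt expressing roughly the policy above might look like the sketch below; treat it as illustrative and check each vendor's current documentation.

    # Keep normal search indexing, opt out of generative-model training.
    User-agent: Applebot
    Allow: /

    User-agent: Applebot-Extended
    Disallow: /

    User-agent: OAI-SearchBot
    Allow: /

    User-agent: GPTBot
    Disallow: /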
If you're using a specific user agent, then you're saying "I want this specific user agent to follow this rule, and not any others." Don't be surprised when a new bot does what you say! If you don't want any bots reading something, use a wildcard.
Yes, but given the lack of generic "robot types" (e.g. "allow algorithmic search crawlers, allow archival, deny LLM training crawlers"), neither opt-in nor opt-out seems like a particularly great option in an age where new crawlers are appearing rapidly (and often, such as here, are announced only after the fact).
Sure, but I still think it's OK to look at Apple with a raised eyebrow when they say "and our previously secret training data crawler obeys robots.txt so you can always opt out!"
I've been online since before the web existed, and this is the first time I've ever seen this idea of some implicit obligation to give people advance notice before you deploy a crawler. Looks to me like people are making up new rules on the fly because they don't like Apple and/or LLMs.
I stand by what I said.
Apple are saying you can opt out of their training data collection using robots.txt.
But... they collected their training data before they told people how to opt out.
I don't understand why me pointing that out as "eyebrow raising" is controversial here.
It's not controversial, it's just not how the ecosystem works. There has never been an expectation that someone make a notification about impending crawling.
It might be nice if there were categories that well-behaved bots could follow, as noted above, but even then the problem exists for bots doing new things that don't fall into existing categories.
My complaint here isn't what they did. It's that they explain it as "here's how to opt out" when the information was too late to allow people to opt out.
I think that's disingenuous of them.
It's been common knowledge for anyone running a web server since 1994.
I don't think you are reading my posts in full.
Assuming well behaved robots.
> Using our web crawling strategy, we sourced pairs of images with corresponding alt-texts.
An issue for anti-AI people, as seen on Bluesky, is that they're often "insisting you write alt text for all images" people as well. But this is probably the main use for alt text at this point, so they're essentially doing annotation work for free.
I think it is entirely morally consistent to provide alt text for accessibility even if you personally dislike it being used to train AI models.
It's fine if you want to, but I think they should consider that basically nobody is reading it. If it was important for society, photo apps would prompt you to embed it in the image like EXIF.
Computer vision is getting good enough to generate it; it has to be, because real-world objects don't have alt text.
I actually use Claude to generate the first draft of most of my alt text, but I still do a manual review of it because LLMs usually don't have enough contents to fully understand the message I'm trying to convey with an image: https://simonwillison.net/2025/Mar/2/accessibility-and-gen-a...
Context not content.
Why would photo apps do what's "important for society"?
Annotating photos takes time/effort, and I could totally imagine photo apps being resistant to prompting their users for that, some of whom would undoubtedly find it annoying, and many more confusing.
Yet I don't think that one can conclude from that that annotations aren't helpful/important to vision impaired users (at least until very recently, i.e. before the widespread availability of high quality automatic image annotations).
In other words, the primary user base of photo editors isn't the set of people that would most benefit from it, which is probably why we started seeing "alt text nudging" first appear on social media, which has both producer and consumer in mind (at least more than photo editors).
> Why would photo apps do what's "important for society"?
One would hope they're responsive to user demands. I should say Lightroom does have an alt text field, but like phone camera apps don't.
Apple is genuinely obsessed with accessibility (but bad at social media) and I think has never once advocated for people to describe their photos to each other.
> An issue for anti-AI people, as seen on Bluesky, is that they're often "insisting you write alt text for all images" people as well. But this is probably the main use for alt text at this point, so they're essentially doing annotation work for free.
How did you come to the conclusion that those two groups overlap so significantly?
This is a well known fact. A bunch of AI researchers tried to migrate to the platform from Twitter but got a ton of hate and death threats from other users so they went back. Bluesky has a pretty strong anti-AI bias and the community of folks talking about it despite that is very small.
Well that's easy, I read their posts where they say it.
So you found a couple people expressing this conflicting view and assumed it applies to a larger group? Doesn’t sound very reliable to me, but I see this all the time, and it makes sense if you look at it as a mechanism to explain the world.
Respect, but it's going to be terrible compared to every other company. You can only hamstring yourself so much.
Respect actions, not words and PR.
Gotta polish that fig-leaf to hide Apple's real stance towards user privacy: arstechnica.com/tech-policy/2023/12/apple-admits-to-secretly-giving-governments-push-notification-data/
> Apple has since confirmed in a statement provided to Ars that the US federal government "prohibited" the company "from sharing any information,"
I mean if you throw out all contrary examples, I suppose you are left with the simple lack of nuance you want to believe
All examples contrary to what? Admitting to being muzzled by feds?
Take all the space you need to lay out your contrary case. Did the San Bernardino shooter predict this?
You literally said that we should disregard this example and focus on the “real” situation as evidenced by a different example.
It is exactly the same thing as saying “if you ignore the heads, these coins really always come up tails”.
Does the Chewbacca argument method ever work these days?
All I can say is, I asked Siri today (verbatim): What is 75 degrees fahrenheit in celsius, and what is 85 degrees in fahrenheit — and it offered a web search about fahrenheit. The "and" completely disabled its most basic ability to do metric conversions.
So, it's nice to see Apple is doing research and talking about it, but we're out here waiting, still waiting, for anything useful to make of it all on our thousand-dollar devices that literally connect us to the world and contain our entire life data. It's what I would've expected from one of the most valuable companies in the world.
> What is 75 degrees fahrenheit in celsius, and what is 85 degrees in fahrenheit
Err, what? As a native English speaker human that's a pretty confusing question to me, too!
First, most of the English-speaking world is not made up of native speakers.
"As of 2022, there were about 400 million native speakers of English. Including people who speak English as a second language, estimates of the total number of Anglophones vary from 1.5 billion to 2 billion."
Second, all popular models I tested did well with that query, including Gemini on Android (aka "ok Google"), except Apple's.
https://en.m.wikipedia.org/wiki/English-speaking_world
I am not sure why you go off on the subject of the English-speaking world. Anyway, the models you tested with that query (and I am not sure why we think it is a good benchmark): are they local models running on a wireless device, or do they use a data center and only convey the text back and forth?
I'm fairly sure Siri still sends user voice samples to a data center. At least for a while, it used to use multipath TCP to decrease latency over multiple available network connections if I'm not misremembering.
Some modern Apple devices support "local Siri", but it's a limited subset of both voice recognition performance and capabilities.
I just tried this on my phone and got two pop-ups with the conversions appearing in quick succession.
>> What is 75 degrees fahrenheit in celsius, and what is 85 degrees in fahrenheit
Probably wouldn't have made a difference but the second half of that statement isn't exactly clear. 85 degrees what?
I also think when you're chaining these two separate calculations together you get a problem when it comes to displaying the results.
That exact phrase "What is 75 degrees fahrenheit in celsius, and what is 85 degrees in fahrenheit" given to ChatGPT produces the correct result (it infers that the second degrees must be Celsius) and ChatGPT gives me a nicely laid out formula for the math of the conversion.
So yeah, Apple is way behind on this stuff.
The fact is that Gemini responds with this: 75 degrees Fahrenheit is 23.89 degrees Celsius, and 85 degrees Celsius is 185.00 degrees Fahrenheit.
Meanwhile, users have been conditioned to expect a system that understands multiple queries and answers them appropriately.
True. But for most of us, only in the past year. I have a few friends/relatives who have still never conversed with an LLM.
You asked 2 questions in a system made for 1 question at a time. Split these up and Siri answers them fine. You’re holding it wrong.
A tool that can handle more than one question at a time is useful. Modern LLMs handle that with ease. So it's completely reasonable to be critical of that limitation.
Why is Siri being discussed in the context of LLMs and Apple Intelligence? Have they already released Siri 2.0 or am I missing something?
The OP is making a point that Apple is behind. They might be publishing research, but it’s completely useless to the end user buying their products.
A plethora of LLMs are available on Apple platforms. If someone wants a chatbot, they can get a chatbot on Apple products. It’s not hard.
Are all Android users using Gemini exclusively? Are all Windows users only using Copilot? Where is the native Linux desktop LLM?
I really don’t understand this criticism. Would it be nice if Siri could do more, sure. Do I have tolerance for Siri to start hallucinating on simple problems it used to use real math for, no. Do I have other options to use in the meantime to get the best of both worlds, absolutely. Where is the hardship?
Siri is the default and only voice assistant that has access to all the data on your phone. It doesn't matter if I have ChatGPT, Claude, Gemini, or another SOTA model on my iPhone—I can't easily activate them in the car or another handsfree situation or use them with any other app or data on my iPhone.
Replace "LLMs" with "competitors" and maybe you'll see the point.
The LLMs aren't necessarily competitors. Apple doesn't need to have the best all-around LLM. They need to create an AI with excellent integration into their OS and the data users store on those systems. Beyond that, they need to have a good system for plugging into whatever other generic LLM a person might want or need. Having something decent out of the box is nice, for basic questions, but being able to easily switch to whatever specialist company is in the lead, or best suited for a user's needs, is a lot better than being stuck with one first-party option. Based on how ChatGPT looks in the Apple Settings, I wouldn't be surprised if this is the plan.
Much like with the internet, Apple didn't need to re-invent every website to own it all. From Apple platforms a user can access Amazon, Google, or whatever else. Apple didn't create the internet, they sold a gateway to it. AI could be done largely the same way. This way it doesn't matter who wins, Apple can support it. At the end of the day, an LLM doesn't exist on its own, it needs to be accessed through hardware/software people enjoy using, and not be yet another device to charge and carry. Apple has a very popular phone and the most popular wearable. This positions them very well. They are often late to the party, but tend to be best dressed. The first iPhone didn't even have video, and people clowned them for it, and now iPhone video is largely considered one of the best in the smartphone world.
Sure, what’s not reasonable is expecting Siri to be a modern LLM, when they know it’s not. They asked a question they knew Siri couldn’t handle just to slam it. I’m not critical of a 5-function calculator for not one-shotting complex equations like a computer.
While Siri only does one thing at a time, I trust the answer more, because it's doing the actual math and not just guessing what the most likely answer is, like an LLM. We need to pick the right tool for the right job. Frankly, I don't think an LLM is the right tool for conversations like this, and jumbling multiple questions into a single question is something people do with LLMs to get more use out of them during the day; it's an adaptation to a limitation of the free tier (and sometimes the speed) of the LLM.
On Android phone, the equivalent voice assistant (Gemini) handles the question gracefully. Regardless of what you think about Google, having a single-button LLM-powered voice assistant, deeply integrated into the phone's OS, is a very useful feature, and Apple is quite far away from developing a competing version of this. They'll have to buy it or go without.
It’s not unreasonable
Amazon already reworked Alexa to be backed by an LLM months ago, and they were delayed in doing it.
You’re telling me that Apple isn’t capable of the same to Siri?
The unreasonable part is acting like Siri got its big LLM update, when they know it didn’t. Just like it would be unreasonable to expect any famously delayed, or unannounced, feature to magically start happening.
Amazon just needs a generic LLM. Apple, from the sound of it, is trying to create deep integration with the OS and on-device data. That's a different problem to solve. They also seem to be trying to do it while respecting user privacy, which is something most other companies ignore.
I don’t see what the big deal is. I’d rather wait for something good than have them rush out a half-ass “me too” chatbot, that is indistinguishable from the dozens of other chatbots I can simply download as an app for that.
If we believe what Craig Federighi said, they had something, it just wasn’t up to their standards when talking about rolling it out to a billion devices. Which is fair, I run into bad data from ChatGPT and other LLMs all the time. Letting it mature a little more is not a bad thing.
ChatGPT spent a couple months getting my dad pumped up for an elective open heart surgery; he was almost arrogant going into it about how the recovery would go, thinking ChatGPT gave him all the info he could possibly need and a balanced view of reality. Reality hit him pretty hard in the ICU. He sent me some of the chats he had, it was a lot of mutual ego stroking. He was using ChatGPT to downplay the negatives from the doctors and boosting the positives. While it’s good to feel confident, I think it went too far. I spent the whole week in the hospital trying to pull him out of his depression and recalibrating the unrealistic expectations that ChatGPT reinforced. I hope Apple finds a way to be more responsible. If that takes time, great.
Never mind that Infocom games running on my Apple ][+ could handle that sort of command in 1983.
(Well, with multiple direct objects, anyway.)
"holding it wrong" was exactly the right phrase given how that phrase was used with the iPhone antenna bridging problem. This is an Apple product failing.
"You haven't contorted your comically simple query enough to make the brittle tool work. Throw the chicken bones better next time."
It’s been this way for over a decade. If someone hasn’t figured it out by now, that’s kind of on them.
I’m not even sure why those two things would be asked as a single question. It seems like a very unnatural way to pose those two questions. Most humans would trip on that, especially if it was asked verbally.
> It seems like a very unnatural way to pose those two questions. Most humans would trip on that
I'd assume GP only gave an example. As a pretty frequent user, I can unfortunately only confirm that Siri trips over almost every multi-part question.
This would be forgivable if there weren't multiple voice-based AI consumer products available that can handle these kinds of requests perfectly.
And Apple has integrated one of them, ChatGPT, to do just that.
If they wanted an LLM answer they could have got one. They went out of their way just to take shots at Apple.
I can't talk to ChatGPT hands-free on my Apple devices the way I can to Siri.
Besides that, many people don’t install any apps, and Apple not pre-installing a reasonable LLM to cater to that market just seems incredibly out of character.
And there’s enough credible reporting and personnel reshuffling happening to suggest that it’s not available yet because they failed to make it work, not because they didn’t try.
OP isn't asking how to use Siri to do his contrived task. OP is saying that Siri in 2025 should be able to handle that relatively simple albeit contrived task.
Your usage of Siri today (probably on an old version of iOS) frankly has nothing to do with the article we are discussing. Sorry to say this, but it is going to take time. Comparing the performance of ChatGPT running in a big data center with a model running locally on a phone... give it a few years.
People have been giving Siri a few years for a decade now. Siri used to run in a data center (and still does for older hardware and things like HomePods) and it has never supported compound queries.
Siri needs to be taken out back and shot. The problem with “upgrading” it is the pull to maintain backwards compatibility for every little thing Siri did, which leads them to try and incorporate existing Siri functionality (and existing Siri engineers) to work alongside any LLM. Which leads to disaster, and none of it works and just made it all slower. They’ve been trying to do an LLM assisted Siri for years now and it’s the most public facing disaster the company has had in a while. Time to start over.
As a user, I'd gladly opt into a slightly less deeply integrated Siri that understands what I want from it.
Build a crude router in front of it, if you must, or give it access to "the old Siri" as a tool it can call, and let the LLM decide whether to return its own or a Siri-generated response!
I bet even smaller LLMs would be able to figure out, given a user input and Siri response pair, whether the request was reasonably answered or whether the model itself could do better, or at least explain that the request is out of its capabilities for now.
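A toy sketch of that judge-then-route idea, assuming Apple's on-device Foundation Models API (the @Generable, @Guide, LanguageModelSession, and respond(to:generating:) names follow the WWDC 2025 session and may not match the shipped beta exactly); the legacy Siri answer is just a string handed in by hypothetical plumbing:

    import FoundationModels

    // Hypothetical router: ask a small on-device model whether the legacy
    // pipeline's answer actually addresses the request, and fall back to a
    // freeform LLM answer if it doesn't.
    @Generable
    struct RouterVerdict {
        @Guide(description: "True if the legacy answer actually addresses the request")
        var legacyAnswerIsAdequate: Bool
    }

    func route(request: String, legacySiriAnswer: String) async throws -> String {
        let session = LanguageModelSession()
        let verdict = try await session.respond(
            to: "Request: \(request)\nLegacy assistant answer: \(legacySiriAnswer)\nJudge the answer.",
            generating: RouterVerdict.self
        ).content
        if verdict.legacyAnswerIsAdequate {
            return legacySiriAnswer
        }
        return try await session.respond(to: request).content
    }

Multi-turn flows (confirmation, disambiguation) are exactly where a crude router like this struggles, as the reply below explains.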
> Build a crude router in front of it, if you must, or give it access to "the old Siri" as a tool it can call, and let the LLM decide whether to return its own or a Siri-generated response!
Both of these approaches were tried internally, including even the ability for the LLM to rewrite siri-as-a-tool's response, and none of them shipped, because they all suck. Putting a router in front of it makes multi-turn conversation (when Siri asks for confirmation or disambiguation) a nightmare to implement, and siri-as-a-tool suffers from the same problem. What happens when legacy siri disambiguates? Does the LLM try to guess at an option? Does it proxy the prompt back to the user? What about all the "smart UI" like having a countdown timer with Siri saying "I'll send this" when sending a text message? Does that just pass through? When does the LLM know how/when to intervene in the responses the Siri tool is giving?
This was all an integration nightmare and it's the main reason why none of it shipped. (Well, that and the LLM being underwhelming and the on-device models not being smart enough in the first place. It was just a slower, buggier siri without any new features.)
The answer is that they need to renege on the entire promise of a "private" siri and admit that the only way they can get the experience they want is a _huge_ LLM running with a _ton_ of user context, in the cloud, and don't hinder it all with backwards compatibility with Siri. Give it a toolbox of things it can do with MCP to your device, bake in the stock tools with LoRA or whatever, and let it figure out the best user experience. If it's a frontier-quality LLM it'll be better than Siri on day one, without Apple having to really do anything other than figure out a good system prompt.
The problem is, Apple doesn't want to admit the whole privacy story is a dead-end, so they're going to keep trying to pursue on-device models, and it's going to continue to be underwhelming and "not meeting our quality bar", for the foreseeable future.
Very good details on why just bolting on an LLM isn't that trivial I hadn't really considered before, thank you!
But regarding Apple not wanting to admit that client side compute isn't enough: Haven't they essentially already done that, with Private Cloud Computing and all that? I believe not even proofreading and Safari summarization work fully on-device, at least according to my private compute privacy logs.
Those little things have been broken for a while now; it's best to bite the bullet and integrate an LLM into Siri now.
> Your usage of Siri today (probably on an old version of iOS) frankly has nothing to do with the article we are discussing.
Yes, but isn't that infuriating? The technology exists! It even exists, as evidenced by this article, in the same company that provides Siri!
At least I feel that way every time I interact with it – or for that matter my Google Home speaker, ironically made and operated by the company that invented transformer networks.
Despite all the "Apple is evil" or "Apple is behind" (because they don't do evil) talk, what they made with the Foundation Models framework is great. They built a system within the Swift language that lets you specify structured data models (structs) to be used like any other model type in a modern programming language, and you actually get generated data back in that format. Unlike a lot of other AIs, where you might get back well-formatted JSON after a carefully crafted request but can never be sure and need to implement a bunch of safeguards. Obviously it's still early days, and other tools might do something similar, but as an iOS developer that makes using AI so much simpler. Especially with the bridge to external AIs that still lets you map back to the type-safe structured Swift models. I try not to be a hater; every bit of progress, even if slow or underwhelming at first, might lead to improvements everywhere else.
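For anyone curious what that looks like in practice, here is a minimal sketch of guided generation with the Foundation Models framework. The @Generable, @Guide, LanguageModelSession, and respond(to:generating:) names follow Apple's WWDC 2025 materials; treat the exact signatures as approximate while the API is in beta.

    import FoundationModels

    // A struct the model is asked to fill in. Guided generation constrains
    // decoding so the result already conforms to this type; there is no JSON
    // to parse or validate afterwards.
    @Generable
    struct RecipeSummary {
        @Guide(description: "A short, human-readable title")
        var title: String

        @Guide(description: "Total preparation time in minutes")
        var prepMinutes: Int

        var ingredients: [String]
    }

    func summarize(_ recipeText: String) async throws -> RecipeSummary {
        let session = LanguageModelSession()
        let response = try await session.respond(
            to: "Summarize this recipe:\n\(recipeText)",
            generating: RecipeSummary.self
        )
        return response.content  // already a typed RecipeSummary
    }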
Apple is behind. People forget that Google was shipping mobile-scale transformer-based LLMs in 2019: https://github.com/google-research/bert
By the time Apple has an AI-native product ready, people will already associate it with dehumanization and fascism.
I think it's a great move that Apple is cautious with including a hardcore LLM in everything. They are not that useful to the regular user.
Nobody is forcing "hardcore LLM" features on anyone, besides maybe Microsoft. This is that same cope as "I'm glad Apple Car can't crash and threaten people's lives" despite the fact that... yunno, Apple really wanted to bring it to market.
Siri, sideloading, and AI features are all the same way: give people options and nobody will complain.
If they give Siri LLMs, there will be headlines that it drove kids to suicide. People really don't need LLMs.
Sideloading is bad for business. Most users don't care. Remember, we, the devs, are not the core target/biggest spenders. They are targeting a large audience of young people who are not tech-savvy.
Guided generation is called "Structured Output" by other providers?
Well, the partially-generated-content streaming thing is great, and I haven't seen it anywhere else.
Sorry if I didn't use the correct terms; I haven't caught up on all the terminology, which doesn't come from my native language. ;) But yes, I agree: the fact that different parts (different properties) of the generated value can be completed asynchronously by streaming the model's output is quite unique. Apple/Swift was late with async/await, but putting it all together, it probably plays well with Swift's asynchronous and reactive coding style.
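A rough sketch of what that streaming looks like; the streamResponse(to:generating:) call and the partial-snapshot behaviour are my assumptions based on the WWDC session and may differ in the shipped beta:

    import FoundationModels

    @Generable
    struct TripPlan {
        var title: String
        var days: [String]
    }

    func streamPlan() async throws {
        let session = LanguageModelSession()
        // Assumed API: the stream yields partial snapshots of TripPlan whose
        // properties fill in as the model emits tokens, so a UI can render
        // the result incrementally instead of waiting for the full value.
        let stream = session.streamResponse(
            to: "Plan a weekend in Lisbon",
            generating: TripPlan.self
        )
        for try await partial in stream {
            print(partial)
        }
    }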
An issue with this is that model quality can get a lot lower when you force it into a structured form, because it's out of distribution for the model.
(I'm pretty sure this is actually what drove Microsoft Sydney insane.)
Reasoning models can do better at this, because they can write out a good freeform output and then do another pass to transform it.
I have a toy agent I'm writing, and I always laugh that I, a human, write code that generates human-readable Markdown, which I feed to an LLM asking it to produce JSON, which I then parse (with code I, or it, wrote) and output in a consistent human-readable form.
I'm thinking about letting it output freeform text and then using another model to force that into a structured format.
I've found this approach does bring slightly better results. Let the model "think" in natural language, then translate its conclusions to JSON. (Vibe-checked, not benchmarked.)
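A sketch of that two-pass idea using the same on-device framework discussed above (the session and respond(to:generating:) names are the same WWDC-based assumptions as in the earlier examples): let the model write freeform first, then run a guided pass that only has to transcribe the conclusions into a typed value.

    import FoundationModels

    @Generable
    struct Verdict {
        var summary: String
        var actionItems: [String]
    }

    func analyze(_ report: String) async throws -> Verdict {
        let session = LanguageModelSession()

        // Pass 1: unconstrained natural-language reasoning.
        let freeform = try await session.respond(
            to: "Think through this report and state your conclusions:\n\(report)"
        )

        // Pass 2: constrained extraction of those conclusions into a struct.
        let structured = try await session.respond(
            to: "Extract the conclusions below into the requested fields:\n\(freeform.content)",
            generating: Verdict.self
        )
        return structured.content
    }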
IIRC yaml is easier for models than json because you don't need as much recursive syntax.
I doubt this is true anymore, if ever. Both require string escaping, which is the real hurdle. And they are heavily trained on JSON for tool calling.
How do you think their implementation works under the hood? I'm almost certain it's also just a variant of "structured outputs", which many inference providers or LLM libraries have long supported.
Huh? Grammar-based sampling has been commonplace for years. It's a basic feature with guaranteed adherence. There is no "carefully crafting" anything, including safeguards.
Every time I see a paper from Apple I just feel like, OK so why isn’t my iPhone actually doing any of this yet?
Why give this to developers if you haven’t been able to get Siri to use it yet? Does it not work or something? I guess we’ll find out when devs start trying to make stuff
> why isn’t my iPhone actually doing any of this yet?
What exactly are you referring to? Models do run on iPhone and there are features that take advantage of it, today.
None of those features are in any way interesting though. Image playground is a joke, Siri is a joke, that generative emoji thing is a joke.
The AI stuff with photography sure, but that’s more like machine learning.
The photo touch up thing is… useable? Sometimes?
What is it you’ve been so impressed with?
This is available in iOS 26 to all applications; it's available directly to the user through shortcuts.
I'm currently on the beta, and I have a shortcut that pulls in various bits of context and feeds that directly to the on-device model.
Is there another model you'd say it's roughly on a par with?
The main features are text summarization, search, and writing tools.
Yes, and these are all pointless.
‘Writing tools’ is there in the text pop up now instead of ‘look up’ even if you’re selecting text on a webpage. It’s just in the way and useless.
I agree. You can revert this UI frustration. Go to Settings, then Screen Time, then Content & Privacy Restrictions and enable that. Under Apple Intelligence in the restrictions, you can then disable individual features.
My anonymous friend who wrote that settings pane would like to say that is not what they meant it for, but have fun.
(Since it's meant to restrict your children, using it to restrict yourself will disable some features that'd let you escape it. I forget what exactly, but you might not be able to change the time or something like that.)
> why isn’t my iPhone doing any of this yet?
> Ok its doing it in 4 or 5 products, but thats a joke.
Not every AI product is a chatbot.
The joke is that it does it terribly, not whether it does it at all.
Wow.
Yes, but why do I have to open a third-party app to do these things when Apple, the company that primarily popularized the entire genre of mobile voice assistants, could very feasibly bake all of that into theirs?
I mean, the thing even lets me ask ChatGPT things if I explicitly ask it to! But why do I need to ask in the first place?
Excellent question.
I don’t speak for Apple but certainly you can appreciate that there is a fine balance between providing basic functionality and providing apps. Apple works to provide tools that developers can leverage and also tries to not step on those same developers. Defining the line between what should be built-in and what should be add-on needs to be done carefully and often is done organically.
Are we talking about the same Apple whose behavior resulted in the expression “to Sherlock”?
Yup, exactly the same.
Q: Should search be core behavior or third-party functionality?
> why isn’t my iPhone actually doing any of this yet?
Probably Apple is trying to distill the models so they can run on your phone locally. Remember, most, if not all, of Siri is running on your device. There's no round trip whatsoever for voice processing.
Also, for larger models, there will be throwaway VMs per request, so building that infra takes time.
It says there are two models, one of them local. It's already been released to app developers to use locally, I think (it was in the WWDC keynote).
The model now available to developers (in beta, not in released versions of iOS) is the same model that powers stuff like the much-maligned notification summaries from iOS 18. So your phone does have features that are powered by this stuff… you may just not be particularly overwhelmed by those features.
That's kinda my point though: is this only capable of things like this? If it is capable of more, why isn't there something more yet? It's been a long time waiting…
My suspicion with playing with it in the developer betas is that, yes, this is what it’s capable of.
They just launched "Private Cloud Compute" with much fanfare to enable server-side LLM processing, so between that and the fact that Siri has been server-based for most of its existence (local processing is fairly new), I don't think that's their main constraint at this point.
That said, "Private Cloud Compute" does run on proprietary Apple hardware, so availability might be a concern (assuming they don't want to start charging for it).
Apple silicon unified memory is amazing for running things like Ollama. You don't have to wait for them to release their own applications.
I know Apple is methodical and doesn't show its hand, but I cannot help but feel they are releasing all this research because they haven't integrated any of it into the phone or provided compelling AI functionality for their users. This is their only way to say "hey, we are good at AI too".
AFAICT this is the first commercial model trying to be marketed as responsibly sourced. Love it, but it also seems like the noise around this issue has died down. Is this for legal cover? Or more Apple-privacy marketing?
Stockholders are suing them over Apple Intelligence. Definitely legal cover.
"Sorry we are hilariously far behind everyone else in the industry after having made a huge amount of fanfare about 'Apple Intelligence' for years. It's just that we have shot ourselves in the knee to satisfy Bluesky posters and the NY Time's lawyers"
Do people have an issue with the smollm datasets? I guess it isn't really commercial.
Siri is literally a joke!
My son (he's 11 years old now and fairly skilled with all the main AI tools, eg chatgpt, gemini, etc) and I retry her every month or so, and this past time we just laughed. Can't handle basic questions - hears the question wrong, starts, stops, takes us to some random ass webpage, etc, etc.
"She's so jacked up!" he said.
Apple needs to get this under control and figured out, stat!
Apple can’t afford to run models, there are too many iPhones and not enough data centers.
Running on device is also risky because cycle limitations will make it seem dumb in comparison.
Looks nice. I just wish they’d improve the models behind dictation on both iPhone and Mac to have better accuracy and on the fly multiple language transcription.
I'd really like to be able to use this 3B model on my little 4GB GPU card! It looks very capable for a reasonable weight. Maybe one day on Hugging Face.
This isn't the Apple I remember. Product integration falls apart at every seam, but don't worry—we've got plenty of impressive technical documentation to compensate. I'm sure Jobs would be thrilled to see his 'it just works' philosophy replaced with 'it barely works, but here's a 50-page PDF explaining why.'
The philosophy is the same, and since it was never implemented in the mythical era of Jobs, so is the practice. So he'd be as thrilled as he was back then?
What I don't understand is how this happened. What really has changed at Apple in the last decade?
As someone who was around in the days before Apple's near-bankruptcy: the same thing. There is no Jobs around anymore, and it is turning back into a Gil Amelio kind of Apple.
Tim Cook might be better at squeezing the juice, but he is not a product person.
This time around they need another solution; otherwise, regardless of how much money they have, they will stay the iOS/iPad company, given macOS's limited relevance in the worldwide desktop market.
I think that iOS 7 theme update caused their brains to rot.
I wonder if we'll see these models running on the phone (aiPhone) hardware in the future.
It does. You can use it directly on iOS 26 beta - without writing a line of code I can toy with the on-device model through Shortcuts on my 16 Pro. It’s not meant to be a general purpose chatbot… but it can work as a general purpose chatbot in airplane mode which is a novel experience.
https://share.icloud.com/photos/018AYAPEm06ALXciiJAsLGyuA
https://share.icloud.com/photos/0f9IzuYQwmhLIcUIhIuDiudFw
The above took like 3 seconds to generate. That little box that says On-device can be flipped between On-device, Private Cloud Compute, and ChatGPT.
Their LLM runs on the ANE, sipping battery, and leaves the GPU available.
It would be interesting to see the tok/s comparison between the ANE and GPU for inference. I bet these small models are a lot friendlier than the 7B/12B models that technically fit on a phone but won't accelerate well without a GPU.
I thought the big difference between the GPU and ANE was that you couldn't use the ANE to train. Does the GPU actually perform faster during inference as well? Is that because the ANE are designed more for efficiency or is there another bigger reason?
GPUs are usually faster for inference simply because they have more ALUs/FPUs but they are also less efficient.
Fitting a 7B model on a phone with 8 GB of RAM for the whole system is impressive.
Wild to see what improvements might come if there is additional hardware support in future Apple Silicon chips.
What’s the cost of pointing it to Private Cloud Compute? It can’t be free, can it?
It’s “free”, as in it doesn’t charge you anything or require a subscription: it’s a part of Apple Intelligence which is basically something bought with the device. It’s in the cloud so theoretically one shouldn’t need a quite new iPhone or Mac but - one does.
As someone mentioned, this model is available in the beta version of iOS 26; it's also part of macOS 26, iPadOS 26 and visionOS 26. Anyone with a free developer account can install the developer betas; the public beta is expected next week.
There's a WWDC video "Meet the Foundation Models Framework" [1].
[1]: https://developer.apple.com/videos/play/wwdc2025/286
> The new Foundation Models framework gives access to developers to start creating their own reliable, production-quality generative AI features with the approximately 3B parameter on-device language model. The ∼3B language foundation model at the core of Apple Intelligence excels at a diverse range of text tasks like summarization, entity extraction, text understanding, refinement, short dialog, generating creative content, and more. While we have specialized our on-device model for these tasks, it is not designed to be a chatbot for general world knowledge. We encourage app developers to use this framework to design helpful features tailored to their apps
> a ∼3B-parameter on-device model
There are already some local AFM-to-OpenAI API bridge projects on GitHub that let you point basically any OpenAI-compatible client at the local models. Super nice for basic summarisation and completions.
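For anyone who hasn't tried one of those bridges: because they speak the standard OpenAI chat-completions format, the client side can be as small as the sketch below. The localhost URL, port, and the "afm-on-device" model name are hypothetical placeholders; only the /v1/chat/completions request shape is the standard one.

    import Foundation

    // Minimal sketch of calling a hypothetical local bridge that exposes the
    // on-device model behind an OpenAI-compatible endpoint. URL, port, and
    // model name are made up; adjust to whatever the bridge actually uses.
    func askLocalModel(_ prompt: String) async throws -> String {
        var request = URLRequest(url: URL(string: "http://localhost:8080/v1/chat/completions")!)
        request.httpMethod = "POST"
        request.setValue("application/json", forHTTPHeaderField: "Content-Type")
        let body: [String: Any] = [
            "model": "afm-on-device",
            "messages": [["role": "user", "content": prompt]]
        ]
        request.httpBody = try JSONSerialization.data(withJSONObject: body)

        let (data, _) = try await URLSession.shared.data(for: request)
        let json = try JSONSerialization.jsonObject(with: data) as? [String: Any]
        let choices = json?["choices"] as? [[String: Any]]
        let message = choices?.first?["message"] as? [String: Any]
        return message?["content"] as? String ?? ""
    }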
I was worried "device" was a Mac mini, not an iPhone. (I already have been running models on my MacBook Pro.)
The more I think about Apple, the more I realize that Apple is so far behind. While other companies are pushing the envelope (OpenAI, Anthropic, Google ..) Apple's ambitions seem much much smaller.
And this is after they made very big claims with Apple Intelligence last year, when they had everyone fooled.
This is like watching a train-wreck in slow motion.
Apple's ambitions are actually bigger than OpenAI's or Anthropic's. Only Google's ambition (surprise, surprise) is similar. Apple fundamentally wants the LLM to be a tool. It doesn't want the LLM to be the product.
I think it's the right strategy for Apple.
They're not a model company. The risks of deploying something half-baked to their users is unacceptable. They're taking it slow and trying to do it in a way that doesn't damage/erode their brand.
Wait it out, let the best model(s) rise to the surface (and the hallucination problems to get sufficiently mitigated), and then either partner with a proprietary provider or deploy one of the open source models. Makes more sense than burning billions of dollars training a new foundation model
This is a reasonable approach, but unfortunately misses what made Apple soooo successful. Apple is the master of controlling the brand. Apple DOES NOT like to highlight their suppliers. Nobody knows who makes iPhone displays, or sensors, or RAM.
They love to "invent" brands that they control, so that they can commoditize the underlying supplier. Hey user, it is a Retina display; don't worry about whether LG or Samsung is making it.
Apple tried this with AI, calling it "Apple Intelligence". Unfortunately that faltered. Now Apple will have to come out and say "iPhone with ChatGPT" or "Siri with Claude". AND APPLE HATES THAT. HATES IT WITH PASSION.
People will start to associate smartness with ChatGPT or Claude, and Apple loses control and OpenAI/Anthropic's leverage goes up.
Apple has painted themselves into a corner. And as I said elsewhere, it is a train-wreck happening in slow motion.
Please go rewatch the iPhone keynote by Steve Jobs. Everyone remembers the beginning; few seem to remember that he brings out 3 other CEOs to highlight the integrations between the iPhone and those companies.
Or consider that they spent a decade highlighting that their computers were powered by Intel, after leaving the PowerPC architecture—again, under Steve Jobs.
Or go all the way back to 1997 when Steve Jobs had Bill Gates on the screen at Macworld and announced that IE would be the default browser on Mac.
It’s easy to fall into a caricature of Apple, where they insist on making everything themselves. What is more accurate is to say that they are not afraid to make things themselves, when they think they have a better idea. But they are also not afraid to do deals when it is the best way forward right now.
They already deployed half-baked models (eg needing to disable news summaries because they were so bad), and haven't delivered on other aspects of apple intelligence. This is hard to call being cautious, this is them not being able to keep up.
Exactly. Another MobileMe moment that adversely impacts customers is worse than making something useful that works. Anyone that "needs" AI can use an app.
Apple’s AI summary mangled a BBC headline about Luigi Mangione
https://www.theverge.com/2024/12/13/24320689/apple-intellige...
Apple urged to withdraw 'out of control' AI news alerts
https://www.bbc.com/news/articles/cge93de21n0o
iOS 18.3 Temporarily Removes Notification Summaries for News
https://www.reddit.com/r/apple/comments/1i2w65j/ios_183_temp...
Only if you think they _must_ compete with large models on the internet.
I'm fine with Apple chilling on the sidelines for a bit.
I wouldn’t go as far as GP, but yes, absolutely, they must compete with large models on the internet. Customers are now used to being able to ask a computer a question and get something better than “I just ran a web search for what you said, here are the uncurated, unsummarized results”.
Yes, this is in fact what people want. Apple is the biggest company in the world (don’t quibble this y’all, you know what I mean) and should be able to deliver this experience. And sure, if they could do it on device that would be aces, but that’s not an item on the menu, and customers seem fine with web-based things like ChatGPT for now. To act like Apple is doing anything other than fumbling right now is cope.
Erm, have you heard of these things called apps? It's this magical concept where other companies can run code on your iPhone, and deliver all the features you just talked about.
I don’t really understand why Apple has to provide a ChatGPT product, baked directly into their software. Why on earth would Apple want to get involved in the race to the bottom for the cheapest LLMs? Apple doesn’t produce commodity products, they package commodities into something much more unique that gives them a real competitive advantage, so people are willing to pay a premium for the Apple’s product, rather than just buying the cheapest commodity equivalent.
There is no point Apple just delivering an LLM. OpenAI, Anthropic, Google etc already do that, and Apple is never going to get into the pay-per-call API service they all offer. Delivering AI experiences using on-device only compute, that’s something OpenAI, Anthropic and Google can’t build, which means Apple can easily charge an premium for it, assuming they build it.
> I don’t really understand why Apple has to provide a ChatGPT product
Control. It boils down to control. If you own a platform, you want to make your "suppliers" (apps in this case) as substitutable as possible.
If people start associating ChatGPT or Claude or Gemini as the main reasons to buy a phone, at some point in the future, they'll think - gee, most of what I'm doing on the phone is interacting with $app, and I can get the $app elsewhere.
This use case is run-of-the-mill for someone like Google, who used to store and show you your location forever, but it's not Apple's style.
It's hard to be like "uhhh privacy" when you send all requests to a remote server where they're stored in clear text for god knows how long.
As of right now, there is no way to run big LLMs in a privacy preserving manner. It just doesn't exist. You can't E2EE encrypt these services, because the compute is done on the server, so it has to decrypt it.
There are some services which will randomize your instance and things like that, but that kind of defeats a big part of what makes LLMs useful: context. Until we can run these models locally, there's no way to get around the privacy nightmare aspects of it.
Read https://security.apple.com/documentation/private-cloud-compu.... It's very thorough and as good as you could possibly do this.
Doesn't matter if it doesn't work. And by all accounts, Apple Intelligence has been a garbage fire.
Siri, even after more than a decade of investment, is a joke. Apple does NOT have the talent or capability to deliver what people want.
> I wouldn’t go as far as GP, but yes, absolutely, they must compete with large models on the internet
The people running large models want to charge a monthly fee for that.
I'm fine with having a free model that runs on device without slurping up my data.
I see it as the opposite. Apple is absolutely positioned to own "chat". I am not worried: they'll soon sort things out, and eventually we'll have an LLM integrated into the iPhone; call it Siri or otherwise.
With my history encrypted in the cloud, and the trust that Apple has built around privacy ... I think they're going to come out alright.
But they have de facto admitted failure of most of the strategy if the rumours are true that they are switching much harder to OpenAI/Anthropic for upcoming LLM products.
This is the first time in 10+ years I've seen Apple so far on the back foot. They usually launch category defining products that are so far ahead of the competition, even by the time they work through the 'drawbacks' in the first versions of them they are still far ahead. OS X, the iPhone and the iPad were all like that. They are still way ahead of the competition on Apple Silicon as well.
I am not very confident in their on-device strategy, at least in the short to medium term. Nearly all their devices do not have enough RAM, and even if they did, SLMs are very far behind what users "know" as AI - even the free ChatGPT plan is light-years ahead of the best 3B-param on-device model. Maybe there will be huge efficiency gains.
Private cloud is used, AFAIK, for virtually zero use cases so far. Perhaps it will be more interesting longer term, but it is not very useful at the moment given the lack of a suitable (i.e. non-Chinese), large (>500B param) model. They would also struggle to scale it if they roll it out to billions of iOS devices, especially if they ship features that use a lot of tokens.
Then they've got OpenAI/Gemini/Anthropic via API. But this completely goes against all their private cloud messaging and gives those providers enormous potential control over Apple, which is not a position Apple usually finds itself in. It will also be extremely expensive to pay someone per token for OS level features for billions of iOS/Mac devices and unless they can recoup this via some sort of subscription will hit services margins badly.
To me it's clear the future of the "OS" is going to involve a lot of agentic tool calling. That requires good models, with large context windows and a lot of tokens - this will definitely not work on-device. Indeed, this is exactly what the Siri vapourware demo was.
I'm sure they can potentially get to a great UX (though these missteps are making me question this). But having such a core feature outsourced does not leave them in a good position.
You're right about the RAM, of course. Apple will no doubt have to run that up. At the same time it's an obvious "top tier" feature for the "Apple aiPhone 17 Max". And it will cost dearly.
> Private cloud is used, AFAIK, for virtually zero use cases so far.
Applications using Apple's foundation models can seamlessly switch from on-device models to Private Cloud Compute.
Research is already showing the use of LLMs for people's most intimate relationship and medical issues. The usual suspects will try to monetize that, which is why Private Cloud Compute is a thing from the jump.
> Then they've got OpenAI/Gemini/Anthropic via API. But this completely goes against all their private cloud messaging
Using ChatGPT via Siri today, no personally identifying information is shared with OpenAI and those prompts aren't used for training. I suspect Apple would want something similar for Google, Anthropic, etc.
At some point, there will be the inevitable enshitification of AI platforms to recoup the billions VCs have invested, which means ads, which won't happen to Apple users using foundation model-based apps.
> Nearly all their devices do not have enough RAM and
Every Apple Silicon Mac (going back to the M1 in 2020) can run Apple Intelligence. 8 GB RAM is all they need. Every iPhone 15 Pro, Pro Max and the entire 16 line can all run Apple Intelligence.
Flagship iPhone 17 models are expected to come with 12 GB of RAM and all current Mac models come with at least 16 GB.
Apple sells over 200 million iPhones in a given year.
There's no doubt Apple stumbled out of the gate regarding AI; these are early days. They can't be counted out.
8GB RAM is not enough for a semi-decent model IMO. 12/16 GB is better (4 GB for the model and 8 GB for the OS), and if you were really going hard on-device you'd probably want more like 32 GB (24 GB for the model + 8 GB for everything else; that would let you run a 13B-param model with a larger context size).
Even so, people are used to the quality of huge frontier models, so it will feel like a massive downgrade on many tasks. The _big_ problem with all this is chained tool calling. It burns through context SO quickly, and context needs a lot of (V)RAM. This also completely undermines the privacy argument you make, because if it uses OpenAI it will need to put personal data in the prompt.
Yes, I noticed Apple shipping higher RAM, but it will take years for this to feed through to a sizeable userbase, and people are quickly becoming accustomed to using an app like ChatGPT instead of OS-level features. Even more so given what a flop Apple Intelligence 1.0 has been.
The key problem they've got is that they've gone hard on privacy (which is hard to square with going all in on third-party APIs), but they've also been incredibly stingy with RAM historically, which really nerfs their on-device options. Private compute is an interesting middle ground, but their model options are incredibly limited currently.
> 8GB RAM is not enough for a semi-decent model IMO.
Apple's ~3 billion parameter on-device model is about as good as it gets on a smartphone, especially for the functions it was designed for: writing and refining text, prioritizing and summarizing notifications, creating images for conversations, and taking in-app actions.
Every Mac comes with at least 16 GB of RAM; while every iPhone comes with 8 GB of RAM, some models of the iPhone 17 will have 12 GB.
Remember, an app using the on-device model can seamlessly shift to a much bigger model via Private Cloud Compute without the user having to do anything.
If the user enables it, Apple's Foundation Model can use ChatGPT in a privacy preserving way. By the fall, Gemini and Sonnet/Opus could be options as well.
Again, ChatGPT is used in a privacy preserving way; you don't need an account: "Use ChatGPT with Apple Intelligence on iPhone" [1].
[1]: https://support.apple.com/guide/iphone/use-chatgpt-with-appl...
Apple is only "behind" if you think they're in the same race. They haven't shown any interest in developing frontier models or taking on the enormous costs of doing so.
Did you even watch Apple Intelligence ads? They were very much in the race, just that they got ahead of themselves a bit.
They were touting the same features that other companies are now delivering. Point the phone at something, and it'll tell you what you're looking at. Or summarizing news articles, etc. Instead we got... the emoji thing.
I'm having trouble understanding, do you think people are going to stop buying iPhones because Siri isn't as good as ChatGPT? Do you think Apple users are going to flood over to the Pixel phone to use Gemini?
What is this train-wreck you are hallucinating?
The paper was a very nice read, and they did many creative things. It's a pity this model won't be directly accessible, only integrated in some apps.
> It's a pity this model won't be directly accessible, only integrated in some apps.
It's already accessible using Shortcuts, even to non-developers "iOS 26 Shortcuts + Apple Intelligence is POWERFUL " (Youtube) [1].
[1]: https://youtu.be/Msde-lZwOxg?si=KJqTgtWjpdNDxneh
When the Blackberry ruled the Earth, people asked 'Why doesn't Apple do a smartphone?'.
I feel like this is the most exciting news about AI on HN today. I really hope Apple shows that small models can be just as capable as the bigger ones. Maybe they have the people from Perplexity working on these small models.
Lol, and yet Google got AI image descriptions into their screen reader, TalkBack, before Apple. Apple is supposed to be the accessibility king. But with AI, they just can't, even though they obviously have access to ChatGPT, which has vision capabilities. Granted, I don't know what model Google uses because tech news don't do Android Accessibility Suite APK teardowns, but it works pretty well, and fast too.
Hasn't Apple had AI image descriptions in VoiceOver for 5 years now? https://www.idropnews.com/ios-14/ios-14-adds-ai-based-voiceo...
The dozens of "contributors" being presented in random order is, one would suppose, an anti-poaching tactic?
It's hard to know what it isn't for certain but there are many other reasons papers list contributors in a flat structure (be it random or alphabetical order). Particularly with large numbers of collaborators.
"References" section sort of narrows the field anyway.
As someone whose last name is near the end of the alphabet, that's not the first presumption I had seeing that page.
Well, Meta already got Ruoming, so he can obviously give them a ranked list of who to grab.
Most of his team are former Google brain so GDM knows who is good.
Not very hard to look people up on LinkedIn and figure out who the core researchers are. I think this is just a very surface-level overview paper that encompasses a bunch of different research projects conducted by different teams, and it would be difficult to order the contributors in any meaningful way.
Considering a large portion of the contributors have names originating in a script and language that has no relationship whatsoever to English’s already arbitrary letter ordering, this list configuration is as good as any.
Here is my question…
This is the first time that millions of people will actually download and run a model on their own devices.
The question is… will Apple be constantly tweaking these models, or only during OS upgrades?
I for one really like local software. Call me old-fashioned, but I enjoy when a company doesn’t switch up software anytime on the server, or phone the results home all the time in order to extract more profits from their collective users.
> The question is… will Apple be constantly tweaking these models, or only during OS upgrades?
Certainly when new updates are released (e.g., going from macOS 26 to 26.1).
They can probably push model updates between releases if necessary.
Per the PDF in this post:
> “Adapters produced by the toolkit are fully compatible with the Foundation Models framework. However, each adapter is compatible with a single specific model version, meaning that a new adapter must be trained for each new version of the base model.”
Any changes would require retraining any LoRA adapters that have been built and distributed by third-party developers, so I don't think they would update the models outside OS updates at the drop of a hat.
LoRA adapters can be distributed via Background Assets, but the base model itself should be version-locked to the OS build (e.g. iOS 26.0 → 26.1) and updates only when Apple ships a new OS image.
Makes sense; thanks for the clarification.
The model is gigabytes so I doubt they will push updates frequently.
Educate me: is there any work on modifying models in a way that changes relatively few parameters, so an update is a smaller payload?
Yeah, LoRAs. Apple uses them to specialize a single model for different uses.
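For anyone wondering what that means concretely: roughly, LoRA freezes a weight matrix W and learns a low-rank update, so only two small factors need to ship in an adapter. A quick sketch of the math (the dimensions below are an illustrative example, not Apple's actual configuration):

    W' = W + \Delta W, \qquad \Delta W = B A,
    \qquad B \in \mathbb{R}^{d \times r}, \quad A \in \mathbb{R}^{r \times k}, \quad r \ll \min(d, k)

Per adapted matrix that is r(d + k) parameters instead of d k; for example, d = k = 4096 with r = 16 is about 131K parameters versus roughly 16.8M for a full update, which is why adapters can be distributed separately from, and far more cheaply than, the base model.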
Thanks! Very interesting. Lead inventor Edward Hu describes them, and their usage, incredibly well in this video:
https://youtu.be/DhRoTONcyZE?si=vM2N5zNslbQ5z8gv
In the meantime, when I ask Siri to set a timer for 15 minutes, about 10–15% of the time it just says, "Here's what I found about setting a timer for 15 minutes," instead of actually setting the timer.