Apple refuses to call Apple Intelligence 'AI'

corbin@infosec.pub · 6 months ago

Apple refuses to call Apple Intelligence 'AI'

MudMan@fedia.io · 6 months ago

But they do, though.

The use cases they have presented are literally asking for a picture you received last week that contained a particular piece of text, selecting the text and copying it over.

I know Apple made it seem like AI is magic, but here in the real world that uses real-world computers you need to know what’s in the image to do that.

But hey, no, that’s my point. You understand what taking a screenshot of your desktop looks like. You can grok that to the extent that you can feel weird about the idea of somebody doing that to you every five seconds. You can’t wrap your head around the steps of breaking down all your information to the extent Apple is describing. Yeah, they know exactly what you did and when, and what you looked at and what it said and how it relates to everybody you know and to your activity. But since you can’t intuitively understand what that requires you don’t know enough to feel weird about it.

That right there is good UX, even if the ultimate level of intrusion is the same or higher.

Petter1@lemm.ee · 6 months ago

This is not screenshotting; the picture is already a picture, which the AppleAI has access to.

Apple solves it by having the AI daemon run with relatively low rights and analyse stuff directly through an API where apps expose data to it.

This is way less bad than just screenshotting everything and, as an added bonus, apps can give the AppleAI data not even shown on screen, which is impossible with the screenshot idea.
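To make the contrast concrete, here is a rough sketch of the two approaches. It is not Apple's or Microsoft's actual API; the `App` class and both view functions are made up for illustration, assuming only that each app decides what it exposes while a screen capture sees whatever is rendered.

```python
from dataclasses import dataclass, field


@dataclass
class App:
    """Hypothetical app model, not a real OS interface."""
    name: str
    on_screen: list[str]  # text currently rendered for the user
    donated: list[str] = field(default_factory=list)  # data the app chooses to expose


def screen_capture_view(apps: list[App]) -> list[str]:
    # A screen logger sees whatever is rendered, whether or not the app agrees.
    return [text for app in apps for text in app.on_screen]


def api_view(apps: list[App]) -> list[str]:
    # An API-based indexer sees only what apps expose, including off-screen data.
    return [text for app in apps for text in app.donated]


mail = App("mail", on_screen=["Inbox: 3 unread"],
           donated=["Subject: Invoice 1042", "Subject: Lunch on Friday"])
bank = App("bank", on_screen=["Balance: 1,234.56"])  # exposes nothing

apps = [mail, bank]
print(screen_capture_view(apps))
print(api_view(apps))
```

In this toy setup the API view contains the unread mail subjects that never appeared on screen, while the screen view contains the bank balance that the bank app never exposed. Which of the two counts as "less access" depends on which of those you care about.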

MudMan@fedia.io · 6 months ago

Hold on, how is this “low rights” if it’s looking at and reading every single file you have on your device AND every single thing you access online or have remotely stored? Surely from a purely technical standpoint looking at the screen is less access by every reasonable metric. You don’t look at it, the AI doesn’t know about it. Right? Do we have a sense of shared reality here?

Don’t get me wrong, that’s still very effective spyware and I certainly don’t want a screenlogger running on my device, Apple or Microsoft. But if you present to me a system that constantly reads every file you access in any capacity and remembers it, displayed onscreen or not, versus one that looks at your screen… well, the one that looks at your screen knows less about you by any measure. OBS can record your screen, but it doesn’t know what the emails you haven’t read while you’re recording say.

The info is easier to extract, easier to be made human readable, definitely creepier in concept, probably easier to exploit. But less intrusive. Can we at least agree on that?

Petter1@lemm.ee · 6 months ago

You have other daemons on your device that have more rights. It doesn’t need rights if it gets packages delivered from apps by the API. Of course, a big flaw in Apple’s system is that you don’t know exactly which system app gives what data to your personal AppleAI LLM. So, long story short, Microsoft should have let your personal LLM be trained on the screenshots and not let those screenshots be saved to disk, but only held temporarily in RAM. I bet that the Snapdragon chips aren’t fast enough to do that well enough, and this is typical Microsoft brute-force problem solving.

Of course, if someone were able to steal your trained AppleAI (like Apple, for example), they could still ask it anything about you. I don’t know how Apple plans to keep your trained LLM safe, but we will see that soon, I guess. Maybe it is stored in iCloud in order to sync with all devices, which of course could be a problem for many people. I use Arch, btw

MudMan@fedia.io · 6 months ago

I don’t know that this is a matter of performance, considering MS is pushing a specific TOPS spec to support these features. From the spec we have, several of the supported devices Apple is flagging for this feature are below the 40 TOPS spec required for Copilot+. I think that’s more than they’re putting in M4, isn’t it?

Granted, Apple IS in fact sending some of this data to a server to get processed, so on that front they are almost certainly deploying more computing power than MS at the cost of not keeping the processing on-device. Of course I get the feeling that we disagree about which of those is the “brute force” solution.

I also think you’re misunderstanding what Apple and MS are doing here. They’re not “training” a model based on your data. That’d take a lot of additional effort. They presumably have some combination of pre-existing models, some proprietary, some third-party, and they are feeding your data into the models in response to your query to serve as context.

That’s fundamentally different. It’s a different step in the process, it’s a different piece of work. And it’s very similar to the MS solution because in both cases when you ask something the model is pulling your data up and sharing it with the user. The difference is that in MS’s original implementation the data also resided on your drive and was easily accessible even without querying the model as long as you were logged into the user’s local account.
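A toy sketch of what “feeding your data in as context” means, as opposed to training. Everything here is a stand-in: `FrozenModel`, `retrieve` and `ask` are hypothetical names, the lookup is a naive keyword match, and nothing reflects how Apple or MS actually implement it. The only point is that the weights stay the same before and after a query.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FrozenModel:
    """Stand-in for a pre-trained model; its weights are never updated here."""
    weights: tuple = (0.1, 0.2, 0.3)

    def answer(self, prompt: str) -> str:
        # A real model would generate text; this stub only reports its input size.
        return f"[model saw {len(prompt)} chars of prompt]"


def retrieve(index: list[str], query: str, limit: int = 2) -> list[str]:
    """Naive keyword lookup over locally indexed user data."""
    words = set(query.lower().split())
    scored = [(len(words & set(doc.lower().split())), doc) for doc in index]
    hits = [pair for pair in scored if pair[0] > 0]
    hits.sort(key=lambda pair: pair[0], reverse=True)  # stable: ties keep index order
    return [doc for _, doc in hits[:limit]]


def ask(model: FrozenModel, index: list[str], query: str) -> tuple[str, list[str]]:
    # The user's data enters through the prompt, not through the weights.
    context = retrieve(index, query)
    prompt = "Context:\n" + "\n".join(context) + "\n\nQuestion: " + query
    return model.answer(prompt), context


index = [
    "Lunch with Sam on Friday at noon",
    "Invoice 1042 is due Monday",
    "Sam sent a photo of the receipt",
]
model = FrozenModel()
reply, used = ask(model, index, "when is lunch with Sam")
print(used)
print(reply)
```

The data that matters for privacy sits in `index`, which is why a readable on-disk store is the exposure in both designs, regardless of what the model itself “remembers”.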

But the misconception is another interesting reflection of how these things are branded. I suppose Apple spent a ton of time talking about the AI “learning” about you, implying a gradual training process, rather than “we’re just gonna input every single text message you’ve ever sent into this thing whenever you ask a question”. MS was all “we’re watching you and our AI will remember watching you for like a month in case you forget”, which certainly paints a different mental picture, regardless of the underlying similarities.

Petter1@lemm.ee · 6 months ago

I understood it as Apple providing a pre-trained LLM that is then trained on device with user data, directly resulting in new weights and configuration for each person’s personal AppleLLM. To me that seems more reasonable, because the data is way less random and is strictly orchestrated by the limitations Apple defines through the API that needs to be used in order to integrate your app with the user’s personal AppleLLM.

And I still agree, the weights and configuration of the AppleLLM are as critical as 100 GB of Windows screenshots, but definitely harder to understand if extracted.

MudMan@fedia.io · 6 months ago

I just don’t think that’s plausible at all. I mean, they can “train” further by doing stuff like storing certain things somewhere and I imagine there’s a fair amount of “dumb” algorithm and programming work going on under the whole thing…

…but I don’t think there’s any model training on device. That’s orders of magnitude more processing power than running this stuff. Your phone would be constantly draining for months, it’s just not how these things work.

Petter1@lemm.ee · 6 months ago

Ahh, lol, sorry for taking so long to understand 😅 guess many misunderstood Apple like I did, or not; at least I think I get it now.

So, the only difference between Copilot and Apple is that AppleAI has access to an API where app developers decide what is visible to the AI, versus access to everything one has seen on the screen except DRM stuff.

With Apple, an attacker would need to get access to that API to get all the data, and with Copilot they would need access to the stored screenshots.

So the reason anybody prefers Apple’s solution is that their LLM gets butter-clean data, perfectly structured by devs, whereas on Windows the LLM has to work with pretty chaotic data.

Where exactly is Apple’s solution spyware? It is only a process that runs while interacting with and processing data. Or is it enough to be proprietary and have access to this data? Well then, Spotlight is spyware.

MudMan@fedia.io · 6 months ago

It’s spyware in that both applications are a centralized searchable repository that knows exactly what you did, when and how. And no, the supposed ability to limit specific applications is not a difference, MS also said you can block specific apps and devs can block specific screens within an app. They’re both the same on that front, presumably.

What I’m saying is the reason people are reacting differently is down to branding and UX.