As you might have heard, Apple have finally got round to improving Siri. This means that they have jumped on board the LLM train.
Apple are great at two things: project specification and marketing. Of those two powers, marketing is the stronger, to the point where we all forget that Siri was supposed to do all this stuff already.
I like innovation, I like privacy, I dislike deliberately obscure terms designed to misdirect the average person. So with that in mind, we are going to apply this test: Imagine it’s Facebook breathlessly telling you about these privacy features.
Applying the “Facebook test” allows us to cut through the bollocks and get to the nub of the matter: are these features anything new, or just current practice dressed up as “revolutionary”?
Sidebar: Apple is not your privacy friend
Apple routinely collects an astonishing amount of information about you. It knows where you live, who you’re having an affair with, what things you’re buying and where you work. It also knows how much you earn.
All of this data is extremely valuable, and crushingly devastating should it leak. Apple claim they only collect the bare minimum, but that’s almost certainly a mistruth.
If you ask them about it, you’ll get a very slick and lawyered response saying that they have industry-leading processes that ensure no more data is collected than is needed, and that no other company stores data like Apple. Which is true, as the only other company that has as much data on you is Google, and they don’t run their company like a compartmentalised secret religious order from the 17th century.
So let us attempt to extract signal from the Apple marketing noise.
Private Cloud Compute
I will be drawing from the primary source here
Defining the PCC:
What is Apple’s Private Cloud Compute (known from now on as PCC)? Well, let’s see what they say:
[..] [W]e created Private Cloud Compute (PCC), a groundbreaking cloud intelligence system designed specifically for private AI processing. For the first time ever, Private Cloud Compute extends the industry-leading security and privacy of Apple devices into the cloud, making sure that personal user data sent to PCC isn’t accessible to anyone other than the user[..]
Lots of noise there, but PCC is a compute farm to do the heavy lifting that local devices can’t do. Not exactly groundbreaking, but there’s more!
Built with custom Apple silicon and a hardened operating system designed for privacy, we believe PCC is the most advanced security architecture ever deployed for cloud AI compute at scale.
Custom Apple silicon is a wonderful term here. It could mean anything from a simple TPM all the way up to custom processors and inference engines. (A Trusted Platform Module does a lot of things, but for the purpose of this discussion it’s a box to store permission slips to access things, even though it’s not actually that. Inference engines are the things that run AI models. For most people that means Nvidia GPUs, but other, more efficient hardware exists. I strongly suspect that Apple won’t be giving Nvidia money unless they have to.)
We will have to wait until Apple release their OS binaries to find out whether they have decided to run on a custom Apple Arm platform. This might sound like an obvious choice; however, designing, manufacturing and running a custom server platform is exceptionally difficult. Making sure it has decent OS and driver support is also very hard. If Apple have managed to build and deploy a custom server platform at scale, it’s a testament to their project execution ability.
Now, I have fallen into the typical Apple reporting trap of praising them for what are probably quite pragmatic decisions. Let’s stop here and move on.
Apple defines the cloud problem
So, let’s interrogate how Apple have defined the problem. This is instructive as it will allow us to work out what tradeoffs they might have taken. From the same PCC website:
Apple has long championed on-device processing [..] Data that exists only on user devices is by definition disaggregated and not subject to any centralized point of attack. When Apple is responsible for user data in the cloud, we protect it with [end to end encryption]. For cloud services where end-to-end encryption is not appropriate, we strive to process user data ephemerally or under uncorrelated randomized identifiers that obscure the user’s identity.
Apple likes end-to-end (E2E) encryption. It works well for things where there is a clear human destination for the data (e.g. iMessage). This is because the “identity” of the human receiving the message rarely changes (i.e. only when they get a new phone), so the key stays the same and you have a way to verify the identity of the receiver.
It’s more obvious in WhatsApp: when someone gets a new phone, you get a little message saying “Your security code with Dave has changed”. That’s the key updating.
E2E communication provides primitives for you to verify that you are talking to a specific human. You can meet them in person and exchange proofs that show that each person has the right device, with the right encryption keys.
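The in-person check boils down to both people deriving a short code from the pair of public keys and comparing it by eye. Here is a minimal sketch of that idea, assuming placeholder key bytes and a plain SHA-256 digest. The `safety_code` helper is hypothetical, and real messengers use their own, more elaborate derivations:

```python
import hashlib

def safety_code(key_a: bytes, key_b: bytes) -> str:
    """Derive a short, human-comparable code from two public keys.

    Hypothetical helper for illustration only. Sorting the keys means
    both sides compute the same code regardless of argument order.
    """
    first, second = sorted([key_a, key_b])
    digest = hashlib.sha256(first + second).hexdigest()
    # Group the first 24 hex characters into blocks that are easy to read aloud.
    return " ".join(digest[i:i + 4] for i in range(0, 24, 4))

# Placeholder bytes standing in for real public keys (assumption).
alice_key = b"alice-public-key"
dave_old_key = b"dave-public-key-old-phone"
dave_new_key = b"dave-public-key-new-phone"

code_on_alice_phone = safety_code(alice_key, dave_old_key)
code_on_dave_phone = safety_code(dave_old_key, alice_key)
print("codes match:", code_on_alice_phone == code_on_dave_phone)

# When Dave gets a new phone his key changes, and so does the code.
code_after_new_phone = safety_code(alice_key, dave_new_key)
print("code changed:", code_after_new_phone != code_on_alice_phone)
```

The point of the sketch is that the code is only meaningful because a human you can identify is holding the other phone.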
When we are using someone else’s computer (i.e. cloud computing), the owner has the ability to place those keys on any computer they want. So it’s not really practical to go to Apple and say “I want to see the server that my data is being processed on so I can exchange proofs”. Those proofs don’t exist, and the number of servers that could use the keys is large.
Why is there a distinction? Because encryption only protects your data from people who don’t have your keys. If you send your encrypted data to an unknown party who has the keys to decrypt it, they have your data. For your data to be safe, you need to be sure that the party you’re sending it to is who they say they are.
For messaging, you can talk to the person in real life and check their proofs. For Apple (or Google, Facebook, or any other company), your data is a secret shared between you and their millions of servers and thousands of employees.
This is a lot of words to say that Apple inserted the reference to E2E as a distraction. Yes, the data will be sent over an encrypted link, but once inside Apple’s datacenter, it can be decrypted at any point.
So, the main protections that Apple claims to give you are ephemerality (i.e. your data only lasts in the cloud for the length of the request you made) or pseudonymisation (where your data is given a new unique ID instead of your Apple ID, and hopefully irrelevant details are excluded).
But let’s read that more closely:
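To make those two ideas concrete, here is a rough sketch of what they could look like in code. Everything in it is an assumption for illustration: the field names, the allowlist and both helper functions are made up, and nothing here reflects how Apple actually implements either protection.

```python
import secrets

# Hypothetical allowlist of the fields a model actually needs.
ALLOWED_FIELDS = {"prompt", "model"}

def pseudonymise(request: dict) -> dict:
    """Swap the stable account ID for a random per-request ID and drop
    every field that is not on the allowlist."""
    cleaned = {k: v for k, v in request.items() if k in ALLOWED_FIELDS}
    cleaned["request_id"] = secrets.token_hex(16)  # fresh random ID each time
    return cleaned

def handle_ephemerally(request: dict) -> str:
    """Process the request and keep nothing once the response is built."""
    working_copy = pseudonymise(request)
    response = f"processed {len(working_copy['prompt'])} characters"
    # Illustrative only: clearing a dict does not securely wipe memory.
    working_copy.clear()
    return response

raw = {
    "apple_id": "someone@example.com",
    "location": "home",
    "prompt": "does this photo contain a dog",
    "model": "vision-small",
}
safe = pseudonymise(raw)
print(sorted(safe))
print(handle_ephemerally(raw))
```

Note that neither function helps with the “my dog” case: anything that needs history has to get it from somewhere.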
end-to-end encryption is not appropriate, we strive to process user data ephemerally
We strive to process it ephemerally.
Yeahnah, that’s not really convincing. To get fast, relevant results from an ML model, it needs long-term access to your data. Sure, for “does this photo contain a dog”, ephemerality works. For “does this photo contain my dog”, ephemerality does not work.
Designing Private Cloud Compute
This is the meat of the document and it contains a lot of stuff, most of which is standard best practice. However, there are a few interesting nuggets, which we will expand on point by point.
Stateless computation on personal user data. Private Cloud Compute must use the personal user data that it receives exclusively for the purpose of fulfilling the user’s request. This data must never be available to anyone other than the user, not even to Apple staff, not even during active processing. And this data must not be retained, including via logging or for debugging, after the response is returned to the user. In other words, we want a strong form of stateless data processing where personal data leaves no trace in the PCC system.
Nothing overly new here. Making it practical is the hard part. It heavily implies that everything will be encrypted whilst being processed, which is implausible.
This clearly states that nothing lives in the PCC, but it doesn’t actually state where the personal data is coming from. The implication is that it’s coming from your device, but that’s not actually spelled out here. It also doesn’t say where the data will go after the request has left the PCC.
Enforceable guarantees. Security and privacy guarantees are strongest when they are entirely technically enforceable, which means it must be possible to constrain and analyze all the components that critically contribute to the guarantees of the overall Private Cloud Compute system. To use our example from earlier, it’s very difficult to reason about what a TLS-terminating load balancer may do with user data during a debugging session. Therefore, PCC must not depend on such external components for its core security and privacy guarantees. Similarly, operational requirements such as collecting server metrics and error logs must be supported with mechanisms that do not undermine privacy protections.
So there are a number of points covered here. If you’re familiar with access control systems, then you’ll know that they are generally “technically enforceable”. It’s just a list of rules. They are effectively a bunch of burly bouncers at a nightclub looking at IDs. Nothing new or groundbreaking.
The next part is interesting. A TLS-terminating load balancer is basically like taking your lock box full of cash to the front desk of a bank and giving the key to the cashier. They then open the box, extract the cash and take it to your vault to be deposited.
This is a common pattern as it allows good flexibility without too much of a sacrifice of security. You still need to trust the cashier though.
What they are proposing would be like taking your lock box to the bank, and being escorted to your vault. You then need to coordinate shipping a key so that the vault worker can unlock your box and empty it out.
The protocol design isn’t actually all that new or groundbreaking. It’s the implementation that will be interesting, assuming they publish it. Again: interesting, but not new.
No privileged runtime access. Private Cloud Compute must not contain privileged interfaces that would enable Apple’s site reliability staff to bypass PCC privacy guarantees, even when working to resolve an outage or other severe incident. This also means that PCC must not support a mechanism by which the privileged access envelope could be enlarged at runtime, such as by loading additional software.
This is the most interesting claim/design feature. This strongly hints (to me) that they might be doing fancy microkernel type OS stuff.
This would allow really strong process isolation (i.e. not being able to snoop on other processes’ memory, files and other resources) which, if you’re lucky, has the knock-on benefit of being a bit more resistant to stuff crashing and bringing down the entire machine with it.
This also lends credence to the “not allowing SRE staff privileged access” claim.
It’s not beyond the realms of reason. After all, iOS is very microkernel-y, so why not scale that up? (Rhetorical; I know it’s a fuck tonne of work.)
Non-targetability. An attacker should not be able to attempt to compromise personal data[..]
I’ve truncated here, because it’s just standard good practice.
Verifiable transparency. Security researchers need to be able to verify, with a high degree of confidence, that our privacy and security guarantees for Private Cloud Compute match our public promises.[..]
Again truncated. Blah blah blah, under NDA, etc etc etc.
They round up this section with:
This is an extraordinary set of requirements, and one that we believe represents a generational leap over any traditional cloud service security model.
God I wish I could be this confident when I’m presenting my own work.
Logging and metrics
One nugget that stuck out to me was this:
Next, we built the system’s observability and management tooling with privacy safeguards that are designed to prevent user data from being exposed. For example, the system doesn’t even include a general-purpose logging mechanism. Instead, only pre-specified, structured, and audited logs and metrics can leave the node
Which is interesting, because one of the annoying habits of large FAANG-type companies is that they don’t tend to generate metrics directly; they rely on shipping petabytes of logs and boiling them down to become metrics.
It’s slow and nasty work, and introduces a lot of latency (and expense!). It’s nice to see that people are thinking about generating system and application metrics directly on device.
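As a rough illustration of “only pre-specified, structured logs and metrics can leave the node”, here is a sketch of an emitter that accepts a fixed set of typed metrics and has no free-form logging method at all. The metric names and the `MetricsEmitter` class are hypothetical. This is one way to build such a thing, not a description of Apple’s tooling:

```python
from dataclasses import dataclass, field

# Hypothetical allowlist: metric name -> the type its value must have.
ALLOWED_METRICS = {
    "requests_total": int,
    "inference_latency_ms": float,
    "node_temperature_c": float,
}

@dataclass
class MetricsEmitter:
    """Only pre-declared, typed metrics may leave the node. There is
    deliberately no method that accepts a free-form log string."""
    outbox: list = field(default_factory=list)

    def emit(self, name: str, value) -> None:
        expected = ALLOWED_METRICS.get(name)
        if expected is None:
            raise ValueError(f"metric {name!r} is not pre-specified")
        # bool is a subclass of int in Python, so exclude it explicitly.
        if isinstance(value, bool) or not isinstance(value, expected):
            raise TypeError(f"metric {name!r} must be {expected.__name__}")
        self.outbox.append({"name": name, "value": value})

emitter = MetricsEmitter()
emitter.emit("requests_total", 1)
emitter.emit("inference_latency_ms", 41.5)

try:
    # A free-text field could smuggle user data out, so it is rejected.
    emitter.emit("debug_message", "user asked about their dog")
except ValueError as err:
    print("rejected:", err)

print(emitter.outbox)
```

Because values are numbers against a fixed schema, there is nowhere for a stray prompt or filename to hide, which is the whole appeal over shipping raw logs.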
Nuggets on How PCC will be used in practice
If we read the explainer towards the bottom of the page we get this:
When Apple Intelligence needs to draw on Private Cloud Compute, it constructs a request — consisting of the prompt, plus the desired model and inferencing parameters — that will serve as input to the cloud model. The PCC client on the user’s device then encrypts this request directly to the public keys of the PCC nodes that it has first confirmed are valid and cryptographically certified. This provides end-to-end encryption from the user’s device to the validated PCC nodes, ensuring the request cannot be accessed in transit by anything outside those highly protected PCC nodes. Supporting data center services, such as load balancers and privacy gateways, run outside of this trust boundary and do not have the keys required to decrypt the user’s request, thus contributing to our enforceable guarantees.
The critical point here is that the AI service decides when to reach out to PCC to run a model. It’s going to need to have some level of context in order to make that decision. Obviously the service can run locally, but there needs to be some “cloud side” service that is able to authorise requests to the PCC. I suspect the service is much bigger and handles much more private data than is being let on.
This also fleshes out some of the detail I glossed over in the “TLS terminated loadbalancer” bit. A separate cloud side service triggers/manages the exchanging of keys between the PCC and the device.
Also not included in that paragraph is where the result of the inference goes. Does it go back to the device directly? I’m sure more information will follow in the coming days.
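The client-side step in Apple’s description (only encrypt to nodes that have first been confirmed as valid) can be sketched roughly as follows. This is a toy model under loud assumptions: a SHA-256 hash of a software image stands in for whatever attestation Apple actually uses, the `Node` type and all names are made up, and the encryption itself is left out entirely.

```python
import hashlib
from dataclasses import dataclass

@dataclass(frozen=True)
class Node:
    name: str
    public_key: bytes
    software_image: bytes  # stand-in for whatever the node attests to running

def measurement(image: bytes) -> str:
    """Hash of the software image; a stand-in for a real attestation."""
    return hashlib.sha256(image).hexdigest()

def validated_nodes(candidates, published_measurements):
    """Keep only nodes whose measurement appears in the published set."""
    return [
        node for node in candidates
        if measurement(node.software_image) in published_measurements
    ]

# Hypothetical published release and two candidate nodes.
release = b"pcc-os-build-1"
published = {measurement(release)}
nodes = [
    Node("node-a", b"key-a", release),
    Node("node-b", b"key-b", b"pcc-os-build-1-with-debug-shell"),
]

trusted = validated_nodes(nodes, published)
# The request would be encrypted only to these keys, so anything sitting
# in front of the nodes never holds a key that can decrypt it.
recipient_keys = [node.public_key for node in trusted]
print([node.name for node in trusted])
```

The sketch also shows where the trust moves to: whoever controls the published set of measurements decides which nodes count as valid.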
Conclusions
Apple’s PCC is an interesting manifesto for a “secure cloud”. However it’s not the security Jesus. It makes some design choices that limit its application to single shot, zero or limited context ML inference requests. It’s not a generalised cloud system and can’t run the vast majority of Apple’s online services. Not that they’ll clarify that. They are very happy for you to believe that all their services are architected like this.
It’s important to point out that the PCC only deals with model execution. There are still vast amounts of private data being collected and shared with Apple, even if it’s not all directly being fed into ML models.
By itself the PCC will not materially improve privacy, especially as you are now handing over more data to another service in order to get the LLM to interpret it.
Serving personalised assistant ML models is pretty hard to do reliably, securely and privately. Ultimately, for a true personal assistant, a lot of data will need to be shared with the thing running the model. Apple makes a big deal about on-device models. However, that’s not the entire system.
However, it is indeed miles better than the data hoovers created by Google, OpenAI and Anthropic et al. Of course we still have to trust Apple when they say that they are using this system as designed. We do not, as yet, have a mechanism to verify “cloud computing” architecture from outside of the cloud.
It’s also important to point out that, as yet, Apple are not using your data for training (well, they are when they can get away with it).
However quite what data they trained their models on in the first place is an interesting question.
As ever, don’t get cargo-culted into making a system like this. For smaller companies there are far more effective ways to make things private without having to make your own servers. Start by tightening up your own IAM policies, then create and enforce data life cycles.
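For a flavour of what “enforcing a data life cycle” can mean in practice, here is a small sketch of a retention check. The data classes, retention periods and record layout are all invented for the example; a real system would run something like this against its actual datastore and then delete what it finds.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical retention policy: data class -> maximum age.
RETENTION = {
    "access_logs": timedelta(days=30),
    "support_tickets": timedelta(days=365),
}

def expired(records, now):
    """Return records older than their class allows. Unknown classes are
    treated as expired, so nothing is kept without an explicit policy."""
    stale = []
    for record in records:
        max_age = RETENTION.get(record["kind"], timedelta(0))
        if now - record["created"] > max_age:
            stale.append(record)
    return stale

now = datetime(2024, 6, 30, tzinfo=timezone.utc)
records = [
    {"id": 1, "kind": "access_logs", "created": now - timedelta(days=45)},
    {"id": 2, "kind": "access_logs", "created": now - timedelta(days=5)},
    {"id": 3, "kind": "mystery_export", "created": now - timedelta(days=1)},
]

to_delete = expired(records, now)
print([record["id"] for record in to_delete])
```

Defaulting unknown data classes to “expired” is a design choice: it forces someone to write down why a new kind of data is being kept at all.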