TPM 2.0 is cool, actually

Like a lot of developers and computer users, my first real encounter with TPM 2.0 was:

This PC doesn't meet the minimum system requirements for Windows 11.

I didn't know what it was, just that it was annoying me. Over the years, I picked up that it was some kind of DRM-ish Microsoft requirement to force hardware upgrades, but I never got the chance to actually dive into what it is.

But now that I actually need it, I've completely changed my opinion on it.

The problem

For CodSpeed, part of the infrastructure is a fleet of bare-metal runners that execute performance benchmarks. The results need to be trustworthy since we're making claims about performance changes, so if a runner is compromised, the whole system's integrity collapses.

So a big issue to solve was: How do you authenticate that a machine is actually the hardware you enrolled?

Not just that the software on it is correct but that the hardware itself hasn't been swapped, tampered with, or impersonated.

The runners live in a data center. We don't physically control them 24/7. The workloads they receive are sensitive. The threat model includes someone with physical access: either an attacker, or someone inside the hosting provider.

The naive approach

My first instinct: generate a private key on each machine at enrollment, store the corresponding public key fingerprint in the control plane, sign a challenge at registration time.

It's simple, works. Until someone reads the filesystem.

A private key on disk is extractable by anyone with physical access or enough privilege escalation. You can tighten permissions, put it in a secret manager, or rotate it, but all of these add complexity without solving the fundamental problem: the key is still a blob of bytes somewhere accessible.
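To make the failure mode concrete, here is a minimal sketch of that naive flow. It's an illustration under assumptions, not what we ran: the Python standard library has no signature primitive, so HMAC with a shared secret stands in for the key pair and fingerprint, and the function and file names (enroll, respond, verify, runner.key) are made up for the example. The point is only that whoever copies the file can answer challenges just as well as the real machine.

```python
import hashlib
import hmac
import os
import secrets
import tempfile


def enroll(key_path):
    """Generate a machine secret, write it to disk, and hand a copy to the control plane."""
    key = secrets.token_bytes(32)
    with open(key_path, "wb") as f:
        f.write(key)
    os.chmod(key_path, 0o600)  # tight permissions, but still just bytes in a file
    return key


def respond(key_path, challenge):
    """Runner side: read the key from disk and answer the challenge."""
    with open(key_path, "rb") as f:
        key = f.read()
    # HMAC is a stand-in for a real signature (assumption for this sketch).
    return hmac.new(key, challenge, hashlib.sha256).digest()


def verify(enrolled_key, challenge, response):
    """Control-plane side: check the response against the enrolled key."""
    expected = hmac.new(enrolled_key, challenge, hashlib.sha256).digest()
    return hmac.compare_digest(expected, response)


with tempfile.TemporaryDirectory() as tmp:
    runner_key = os.path.join(tmp, "runner.key")
    enrolled = enroll(runner_key)

    # The legitimate runner answers a fresh challenge.
    challenge = secrets.token_bytes(32)
    legit_ok = verify(enrolled, challenge, respond(runner_key, challenge))

    # Someone with filesystem access copies the key file to another "machine".
    stolen_key = os.path.join(tmp, "stolen.key")
    with open(runner_key, "rb") as src, open(stolen_key, "wb") as dst:
        dst.write(src.read())

    # The copy answers a new challenge and is indistinguishable from the original.
    challenge2 = secrets.token_bytes(32)
    impostor_ok = verify(enrolled, challenge2, respond(stolen_key, challenge2))

print(legit_ok, impostor_ok)
```

Both checks pass, which is exactly the problem: the control plane has no way to tell the enrolled machine from a copy of its key.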

What I actually needed: a secret that cannot leave the hardware. Not hard to extract. Not guarded by access controls. Just physically impossible to extract.

I started thinking about mobile, remembered Secure Enclave on iOS. Keys that live in dedicated hardware, never exposed to the OS. I needed the same thing essentially, but for a server.

That's when I started properly reading about TPM. And realized it's been sitting on every modern motherboard all along.

What TPM actually is

TPM stands for Trusted Platform Module. It's a dedicated security microcontroller with its own isolated execution environment, usually either a discrete chip soldered on the motherboard or a firmware TPM running inside the CPU's own secure environment.

Its core promise: it can sign messages with a private key that never leaves the chip. You can ask it to sign things and to generate keys, but you cannot ask it to export raw key material, because that operation purposefully doesn't exist in the command interface.

A discrete TPM communicates with the host over a low-speed bus such as SPI, LPC, or I2C, and software talks to it through a standardized API called the TSS (TCG Software Stack, where TCG is the Trusted Computing Group).

The Windows 11 drama gave TPM a reputation as a bureaucratic gatekeeper. The actual technology is the opposite of that, solving a genuinely hard problem.

EK, AK, and the credential activation protocol

Two keys matter for hardware attestation. Their design is more thoughtful than it first appears.

Endorsement Key (EK)

It's called "endorsement" because it's literally the manufacturer's endorsement of the chip. When Infineon, STMicro, or NXP produces a TPM, they burn a unique asymmetric key pair into it and issue an Endorsement Certificate, a signed document that says "this key lives inside genuine hardware that we manufactured." The certificate chains back to the manufacturer's root CA. You verify it the same way you'd verify a TLS certificate.

The private half of the EK never leaves the chip. But here's the subtle part: the EK is a decryption-only key, not a signing key. This is deliberate. If the EK could sign arbitrary data, it would be a perfect tracking device. Every signature would fingerprint the physical chip. So the spec restricts the EK to decryption only. You can encrypt something to a specific TPM, but that TPM can't use its EK to sign a message back.

This constraint is what makes the AK necessary.

Attestation Key (AK)

A key you generate yourself, inside the TPM, under the EK's hierarchy. Unlike the EK, the AK is a signing key — this is what actually signs challenges, quotes, and attestation evidence.

Why not just use the EK for everything? Privacy. The EK uniquely identifies the physical chip. If every service you attest to sees the same key, they can collude and track the hardware across contexts. The AK breaks that link. You can create multiple AKs — one per service, one per use case — each unlinkable to each other and to the EK from the outside. The only way to prove an AK belongs to a specific EK is through the credential activation protocol, which requires cooperation from the TPM itself.

So now the interesting question: how do you prove to a remote control plane that a given AK actually lives on a genuine TPM, and specifically the same chip as an enrolled EK?

Credential Activation

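This is what the credential activation protocol is for (MakeCredential and ActivateCredential in the spec). It's the handshake at enrollment, proving that an AK and an EK live on the same chip. The machine sends its EK certificate and its AK public key to the control plane. The control plane verifies the certificate, generates a random secret, and uses MakeCredential to wrap it in a blob bound to the EK public key and the AK's name. The machine hands that blob to its TPM through ActivateCredential and sends back the secret it recovers. If the secret matches, the AK is enrolled.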

The critical property: the blob produced by MakeCredential can only be decrypted by a TPM that simultaneously holds the EK private key and the AK. The cryptographic binding enforces co-residency. You can't fake this in software since there is no EK private key to steal.
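The binding is easier to see in a toy model. The sketch below is not the real wire format and not a real TPM: a finite-field Diffie-Hellman exchange over a toy prime stands in for the RSA-OAEP or ECDH step the spec uses to protect the seed, an XOR pad stands in for the symmetric cipher, and ToyTPM, make_credential, and kdf are hypothetical names. What it does try to keep from the real protocol is the shape: the seed is protected to the EK, the encryption key is derived from the seed together with the AK name, and an HMAC covers the ciphertext plus the name.

```python
import hashlib
import hmac
import secrets

# Toy Diffie-Hellman parameters, for illustration only (assumption, not vetted).
P = 2**255 - 19
G = 2


def kdf(seed, label, context):
    """Derive a 32-byte key from the seed, a label, and optional context."""
    return hmac.new(seed, label + b"\x00" + context, hashlib.sha256).digest()


def xor(data, pad):
    return bytes(a ^ b for a, b in zip(data, pad))


def make_credential(ek_pub, ak_name, secret):
    """Control-plane side: uses only public inputs (EK public key, AK name)."""
    if len(secret) > 32:
        raise ValueError("toy model supports secrets up to 32 bytes")
    r = secrets.randbelow(P - 2) + 1
    seed = pow(ek_pub, r, P).to_bytes(32, "big")
    enc = xor(secret, kdf(seed, b"STORAGE", ak_name))
    tag = hmac.new(kdf(seed, b"INTEGRITY", b""), enc + ak_name, hashlib.sha256).digest()
    return pow(G, r, P), enc, tag


class ToyTPM:
    def __init__(self):
        self._ek_priv = secrets.randbelow(P - 2) + 1  # never exposed
        self.ek_pub = pow(G, self._ek_priv, P)
        self._loaded_names = set()

    def create_ak(self):
        """Create an AK and return its name: algorithm id + hash of the public area."""
        ak_public = secrets.token_bytes(32)  # stand-in for a real public area
        name = b"\x00\x0b" + hashlib.sha256(ak_public).digest()  # 0x000B = SHA-256
        self._loaded_names.add(name)
        return name

    def activate_credential(self, ak_name, blob):
        if ak_name not in self._loaded_names:
            raise ValueError("AK is not loaded in this TPM")
        eph, enc, tag = blob
        seed = pow(eph, self._ek_priv, P).to_bytes(32, "big")
        expected = hmac.new(kdf(seed, b"INTEGRITY", b""), enc + ak_name, hashlib.sha256).digest()
        if not hmac.compare_digest(tag, expected):
            raise ValueError("integrity check failed")
        return xor(enc, kdf(seed, b"STORAGE", ak_name))


def rejected(tpm, ak_name, blob):
    try:
        tpm.activate_credential(ak_name, blob)
        return False
    except ValueError:
        return True


genuine = ToyTPM()
ak_name = genuine.create_ak()
secret = secrets.token_bytes(32)
blob = make_credential(genuine.ek_pub, ak_name, secret)
recovered = genuine.activate_credential(ak_name, blob)

# An impostor claims its own AK belongs to the enrolled EK.
impostor = ToyTPM()
fake_ak = impostor.create_ak()
blob_for_fake = make_credential(genuine.ek_pub, fake_ak, secret)
impostor_rejected = rejected(impostor, fake_ak, blob_for_fake)

# The genuine chip also refuses a blob that was bound to a different AK name.
name_mismatch_rejected = rejected(genuine, ak_name, blob_for_fake)

print(recovered == secret, impostor_rejected, name_mismatch_rejected)
```

In this model the secret only comes back when the same device holds both the EK private key and the named AK, which is the co-residency property the real protocol enforces in hardware.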

After enrollment, the control plane stores the AK public key. Every subsequent challenge is signed by the AK inside the TPM and checked on the other end with standard asymmetric signature verification.

What the API actually looks like

The TPM exposes a small, well-defined set of operations through the TSS (TCG Software Stack). Bindings exist for most languages. Before looking at the calls, one detail worth keeping in mind: the TPM has a small amount of volatile RAM that holds loaded keys as transient objects (cleared on reboot or explicit flush), and a limited NVRAM for anything that needs to survive a power cycle. Most of the API is framed around loading a key into RAM and getting back a short-lived handle you can then pass to other calls.

The operations that matter for attestation are essentially five:

CreatePrimary(): derives the EK from the chip's internal seed. This is deterministic and always returns a handle to the same key loaded in the TPM's RAM, as well as its public portion.
Create(parent: EK): generates a new key under a parent's hierarchy. Returns two blobs: the public portion and an opaque encrypted private blob that only this specific TPM can decrypt. So this is safe to write to disk, safe to back up and useless to anyone without the actual hardware. Note that Create() does not load anything into the TPM; you just get the blobs back.
Load(private_blob, public_blob): loads a blob pair into the TPM's RAM and returns a transient handle. This is how the AK is restored across reboots: you persist the blobs on disk and re-load them at boot. If you want a key to stick around without a reload step, you can also make it persistent in NVRAM (EvictControl in the spec), but that space is limited and usually reserved for hot keys.
Sign(key_handle, digest): signs a hash using a handle to a key currently loaded in the TPM's RAM. The signing happens entirely inside the chip. Your code only ever holds a handle, never the private key material.
ActivateCredential(ak_handle, ek_handle, blob): the enrollment handshake call we talked about earlier. Both keys must be loaded in RAM. The TPM decrypts the blob using the EK, verifies the AK name matches, and returns the secret. Errors if either key doesn't match. The whole proof lives in this one call.
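
The blob and handle lifecycle behind Create, Load, and Sign can be sketched with a small mock. This is a toy model in plain Python, not a binding to any real TSS library: MockTPM and its methods are hypothetical, HMAC stands in for a real signature, a hash of the private bytes stands in for a real public key, and the wrapping is a simple XOR pad derived from a per-chip secret. It only aims to show which things are blobs you can store and which are short-lived handles.

```python
import hashlib
import hmac
import secrets


class MockTPM:
    """Toy model of the blob/handle lifecycle. Not real TPM cryptography."""

    def __init__(self):
        self._seed = secrets.token_bytes(32)  # per-chip secret, never exposed
        self._ram = {}                        # transient handle -> private key bytes
        self._next_handle = 0x80000000        # transient handles start in this range

    def _pad(self, public):
        # Wrapping pad derived from the chip secret and the key's public part.
        return hmac.new(self._seed, b"wrap" + public, hashlib.sha256).digest()

    def create(self):
        """Return (public, private_blob). Nothing is loaded into RAM."""
        private = secrets.token_bytes(32)
        public = hashlib.sha256(private).digest()  # stand-in for a public key
        private_blob = bytes(a ^ b for a, b in zip(private, self._pad(public)))
        return public, private_blob

    def load(self, private_blob, public):
        """Unwrap the blob inside the 'chip' and return a transient handle."""
        private = bytes(a ^ b for a, b in zip(private_blob, self._pad(public)))
        if hashlib.sha256(private).digest() != public:
            raise ValueError("blob was not created by this TPM")
        handle = self._next_handle
        self._next_handle += 1
        self._ram[handle] = private
        return handle

    def sign(self, handle, digest):
        """Sign with a loaded key. The caller only ever holds the handle."""
        if handle not in self._ram:
            raise KeyError("handle is not loaded")
        return hmac.new(self._ram[handle], digest, hashlib.sha256).digest()

    def reboot(self):
        self._ram.clear()  # transient objects do not survive a power cycle


tpm = MockTPM()
public, private_blob = tpm.create()  # safe to write both to disk
digest = hashlib.sha256(b"challenge").digest()

handle = tpm.load(private_blob, public)
sig_before = tpm.sign(handle, digest)

tpm.reboot()
try:
    tpm.sign(handle, digest)
    stale_handle_rejected = False
except KeyError:
    stale_handle_rejected = True

# Re-loading the same blobs restores the same key under a new handle.
sig_after = tpm.sign(tpm.load(private_blob, public), digest)

# The blobs are useless on different hardware.
try:
    MockTPM().load(private_blob, public)
    foreign_blob_rejected = False
except ValueError:
    foreign_blob_rejected = True

print(sig_before == sig_after, stale_handle_rejected, foreign_blob_rejected)
```

The design point this mirrors is that persistence lives outside the chip as encrypted blobs, while anything usable for signing exists only as a handle into the chip's own memory.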

It was there the whole time

Every modern motherboard ships with a tiny cryptoprocessor that can answer one question with mathematical certainty: is this the actual machine I believe it to be? Or something impersonating it?

And, surprisingly, it's been sitting there for over a decade. The Windows 11 drama poisoned the brand so completely that most engineers never stopped to find out what was actually in there, even though the tech itself comes in extremely handy if you need a trust anchor.

For CodSpeed's bare-metal runners, that's exactly what we ended up building on: proving at every boot that the machine on the other end is the one we racked and not a lookalike or an impersonator, all using a chip that was already in the box.

Resources

tpm2-tools: CLI tools for TPM 2.0, good for local experimentation before writing any code
tpm2-tss: the reference C implementation of the TSS stack
TCG TPM 2.0 Specification: Parts 1–4, the canonical reference
tss-esapi: Rust bindings (what we use at CodSpeed)
go-tpm: Go bindings from Google, well-documented