One time programs through TEEs, QRACs and garbled circuits

Fabrizio Genovese ,  Lev Stambler · February 26, 2025

In a recent post, we explained how to realize one-time programs on TEEs using QRACs or conjugate coding. The interplay between TEEs and quantum resources effectively protects TEEs from all sorts of reentry attacks, but does very little to protect against stealing the TEE’s root-of-trust. This sucks, especially considering that, if you trust the root-of-trust (pun intended) then you can probably also trust the TEE capacity of producing ephemeral session-keys that cannot be so easily replicated. All in all, in this setting, does quantum really buy us something? And if not, can we do something more than what we already achieved? Yes we can! This piece focuses on the interplay of a bunch of interesting cryptographic primitives/devices: TEEs, QRACs/conjugate coding, and garbled circuits.

You probably already know what TEEs are. Just for the sake of completeness, here is a one-paragraph recap: A TEE is a processor that comes with a hardcoded private key on it called root-of-trust. Every instruction enters or leaves the TEE encrypted using a public key derived from the root-of-trust, aside for the final output of the computation. So the TEE can be used as the only party that really knows what’s going on: It sees all the inputs in plaintext, has access to the full computation trace and whatnot. This allows us to implement a lot of very nice protocols, especially when they involve several parties that do not trust each other. Still, you need to trust the TEE, or better the TEE manufacturer, that everything is done properly. If the root-of-trust is leaked somehow, it’s game over. Moreover, as hinted above TEEs suffer from reentry attacks, where the same program can be executed multiple times on different inputs until the desired outcome is produced. This is usually mitigated by producing ephemeral cryptographic session keys based on a decent-enough source of randomness, but it essentially amounts to putting even more trust in the hardware manufacturer. The possibility of sidechannel/reentry attacks gives the TEE owner extra control that may be used to undermine the protocol completely.

Unfortunately, for garbled circuits and QRACs/conjugate coding one paragraph recaps do not cut it. Bear with us or feel free to skip the next two sections altogether.

Recap: Garbled circuits

If you already know what a garbled circuit is, you probably want to skip this section altogether.

Boolean gates and circuits

The underlying idea of garbled circuits is remarkably simple. First of all, let us recall that a boolean gate is just a very simple function, usually $\{0,1\}^2 \to \{0,1\}$, which we usually define using a truth table. The most used basic gates for garbled circuits are XOR (usually denoted $\oplus$) and AND (usually denoted $\wedge$), corresponding to the logical operations having the same name, which are realised by the truth tables below:

XOR and AND truth tables.

We can then wire these gates together, basically piping one gate’s output into some other gate’s input. Coping information is allowed, so we can feed the same output to multiple inputs, but importantly we want to do so avoiding loops. For the mathematically versed, a boolean circuit is just a sort of acyclic graph with its nodes denoting either gates, inputs or outputs:

A 2-bit comparator.

For instance, the circuit in Figure 2 is a very simple 2-bit comparator: It will return $1$ if $X_0 = Y_0$ and $X_1 = Y_1$, $0$ in any other case. Notice that to work properly the circuit makes use of an auxiliary input, which must always be set to $1$.

There are important theorems that say how any boolean function $f:\{0,1\}^m \to \{0,1\}^n$ can be realised as a boolean circuit composed solely of AND and NOT gates, so this is really all we need to realize pretty much everything computable!

Very roughly, garbled circuits are a way to turn a boolean circuit into something similar to a TEE, where two parties can jointly compute a public function while keeping their inputs private, and learning nothing more than the function output. This protocol comprises two parties:

  • One party prepares the garbled circuit, essentially encrypting it. It then sends the circuit, along with his encrypted input, to…
  • …another party, that evaluates it.

The point here is that both the preparer and the evaluator do not need to know the other party’s inputs. In addition to this, the evaluator doesn’t learn anything about the circuit being evaluated, besides what can be learned by reverse-engineering it starting from the output. This whole business is sometimes called secure multi-party computation.

How to garble a circuit

Garbling a circuit is not hard. The idea is the following: First of all, we tag every wire in our circuit. Going back to the comparator of Figure 2, this step would look like so:

A tagged 2-bit comparator.

Now, for each tag $w_i$ in the circuit, we generate two random strings $w_i^0, w_i^1$. We now look at each gate, and at the wires it is connected to. We proceed by replacing every input/output value in the gate truth table with either $w_i^0$ or $w_i^1$ depending if the value in the table is $0$ or $1$. Here $i$ denotes the corresponding input/output wire.

For instance, consider the AND gate of Figure 3. It has wires $w_7$ and $w_8$ as first and second input, respectively, and wire $w_9$ as output. The corresponding AND truth table is now replaced with:

Encoding stage of a garbled circuit.

Having fixed any symmetric encryption scheme $E$ - for instance AES - we proceed by encrypting any entry in the table with its row and column value. In the case of Figure 4 the result is the following:

Encrypting stage of a garbled circuit.

The garbled circuit consists of a list, for each gate, containing the interior of the corresponding encrypted table, randomly permuted. For Figure 5, this becomes:

Final garbled gate.

Evaluating a garbled circuit

Let us focus for a second on what we have achieved so far: If the party preparing the garbled circuit performs the steps above, and sends the resulting lists to anyone, there is no way the receiving party will be able to either reconstruct the topology of the circuit nor evaluate anything.

On the other hand, if the preparing party sends us the wires $w_i^j$ corresponding to any input, we are able to compute the label corresponding to the result. To see how this works, let’s return to Figure 3. Suppose we receive the input $w^0_0, w^0_1, w^1_2, w^0_3, w^1_4$. This corresponds to the input $00101$ but of course we don’t know that, since the values $w^i_j$ are assigned at random by the preparing party. Still, we can check if some of these values decrypt any entry in any of the lists we have. We quickly realize that $w^0_0, w^0_1$ and $w^1_2, w^0_3$ decrypt two entries in the leftmost top and bottom xor gates, respectively. From this, we obtain the values corresponding to $w^0_4$ and $w^1_5$. We keep going like this until we finally decrypt the value $w^0_9$.

It is important to stress again that from the evaluation step we learn nothing about the actual values of the input or the output. We just work with the assigned strings which are completely random! The evaluator can learn something about the overall structure of the circuit, but that is fine as the circuit structure is supposed to be public.

On the other hand, the preparer doesn’t know anything about how the computation was performed, but it is the only party that knows the cleartext value corresponding to the output tags.

Usually, garbled circuits are paired with another cryptographic primitive called oblivious transfer, which allows the evaluator to request the tags corresponding to a given input without revealing to the preparer what the input does. This is what effectively makes garbled circuits multi-party computation. The ‘problem’ with this protocol is that it is interactive. In the next section we will see how we can use quantum resources to make it both non-interactive and one-shot.

As a closing remark for this section, notice that there are many possible optimizations that either make garbled circuits much easier to evaluate, or drastically reduce the data needed to be transmitted, or both. Garbled circuit optimization is still an open area of research, but this is out of the scope of this post.

Recap: QRACs and conjugate coding

This whole section is copied verbatim from another post. Feel free to skip it.

The quantum world is very different from the classical world. Whereas a classical bit can only be in one of two states – $0$ or $1$ – a quantum bit $\ket{\psi}$ can be represented as any point laying on a surface called the block sphere, which looks like so:

The block sphere,

Quantum operations on the bit consist in reflecting or rotating this vector $\ket{\psi}$ by a given phase. Most importantly for us, we can measure any quantum bit along a particular, chosen axis. The most used axes are the three orthogonal axes showing in the picture above, which are usually referred to as the $\ket{0}$–$\ket{1}$ basis, the $\ket{+}$–$\ket{-}$ basis, and the $\ket{+i}$–$\ket{-i}$ basis.

Measurements in the quantum world yield probabilistic results, and change the system being measured in a non-reversible way. A measurement is specified by choosing a given axis, for instance the $\ket{0}$–$\ket{1}$ basis. Measuring makes our state $\ket{\psi}$ collapse to either the state $\ket{0}$ or the state $\ket{1}$ with a certain probability, which is proportional to how close our state $\ket{\psi}$ was to either of these points. This is known as the Born rule. To visualize this, consider the example below:

A measurement.

Here, $\ket{\psi}$ is tilted towards $\ket{0}$, $\ket{-}$ and $\ket{+i}$. If we were to measure, say, in the $\ket{0}$–$\ket{1}$ basis, it is more likely that our measurement outcome would yield $\ket{0}$ than to $\ket{1}$. The bias towards $\ket{0}$ is proportional to the red highlighted line, which in turn depends on the red angle. Similarly, if we were to measure in the $\ket{+}$–$\ket{-}$ basis, we would be way more likely to obtain $\ket{-}$, with a bias proportional to the blue line. For the mathematically versed, quantum measurements can be realized as projections is the linear-algebraic sense.

Notice how since the axes we are choosing are ortogonal, the state $\ket{0}$ is completely unbiased with respect to the $\ket{+}$–$\ket{-}$ basis: Measuring along this axis would produce outcomes $\ket{+}$ or $\ket{-}$ with equal probability (50%). So, in particular, measuring $\ket{\phi}$ in the $\ket{0}$–$\ket{1}$ basis and then measuring it along the $\ket{+}$–$\ket{-}$ basis would yield $\ket{+}$ or $\ket{-}$ with the same probability (50%), since both possible outcomes of the first measurement are unbiased with respect to the second measurement basis.

Orthogonal states are unbiased with respect to each other.

A last, very important ingredient to our toolbox is the no cloning theorem, which says that there is no procedure to clone arbitrary quantum states. So we cannot make a copy of $\ket{\psi}$, and unless we know how it was prepared, the only way we have to probe it is by measuring it. But in doing so we would modify $\ket{\psi}$ forever. This suggests that quantum states are very different from classical states in that they are more similar to physical resources (like, say, matter): We cannot freely clone or destroy them, and, exactly like running a mass spectrometry to probe the chemical nature of something destroys the sample, so do quantum measurements with our quantum states.

We can use these facts to pack two different classical bits into a quantum system in such a way that only one of the bits can be recovered. There are two different techniques to do so, which we are going to briefly cover.

QRACs

Suppose you have a sequence of two classical bits. There are four possible sequences, namely $00,01,10,11$. Let us consider the circle on the Block sphere that lies on the same plane of the axes $\ket{0}$–$\ket{1}$ and $\ket{+}$–$\ket{-}$. We encode our four possible sequences into the following four states, all laying in the plane spanned by the $\ket{0}$–$\ket{1}$ and $\ket{+}$–$\ket{-}$ bases (shown in detail in the second image below):

QRAC
QRAC, planar

As we can see, each of these states is placed in a precise way so that it is biased both along the $\ket{0}$–$\ket{1}$ basis and the $\ket{+}$–$\ket{-}$ basis. The recipe for extracting the classical bit is simple:

  • If you want the first bit, measure in the $\ket{0}$–$\ket{1}$ basis. Here, obtaining $\ket{0}$ is taken to mean “the first bit is $0$”, whereas obtaining $\ket{1}$ means “the first bit is $1$”;
  • If you want the second bit instead, measure in the $\ket{+}$–$\ket{-}$ basis. Again, $\ket{+}$ means “$0$”, whereas $\ket{-}$ means “$1$”.

We see that in both cases, the user will recover the i-th bit with overwhelming probability. Most importantly, as soon as the first measurement is performed the resulting state becomes completely unbiased towards the other basis, making the recovery of the other bit impossible. All the encoded information about the second bit has been destroyed by the first measurement!

The technique detailed here is a simplified version of a technique called Quantum Random Access Coding – QRAC in short. Optimized QRACs allow for the successful recovery of the chosen bit with ~85% of probability.

Conjugate coding

Conjugate coding was invented in the ’60s by Wiesner, which is also the original quantum money guy. Compared to QRACs, conjugate coding eliminates the probability of error – the end user can always recognize one of the two classical bits with certainty. This happens at the expense of using more quantum resources – double the number of qubits compared to QRACs. The idea of conjugate coding is easy:

  • For the first bit, we encode $0$ in the state $\ket{0}$, $1$ in the state $\ket{1}$;
  • For the second bit, we encode $0$ in the state $\ket{+}$, $1$ in the state $\ket{-}$;

So now we have two quantum bits, $q_0, q_1$, each encoding one of the two classical bits. We know that we can recover the first bit with certainty by measuring $q_0$ in the $\ket{0}–\ket{1}$ basis, whereas $q_1$ must be measured in the $\ket{+}–\ket{-}$ basis.

We now concatenate $q_0$ and $q_1$ randomly in a sequence of two qubits, that can either be $q_0 q_1$ or $q_1 q_0$. Without knowing in which order the qubits were concatenated, it is impossible to recover both bits at the same time:

  • If we measure along the $\ket{0}$–$\ket{1}$ basis, we obtain a sequence that certainly contains the value of the first bit, plus another bit of pure noise;
  • If we measure along the $\ket{+}$–$\ket{-}$ basis, we obtain a sequence that certainly contains the value of the second bit, plus another bit of pure noise;
  • If we measure the first qubit along a basis and the second along another, we may either obtain both bits with certainty, or a purely random string.

…The point is that it is impossible to establish in which of these scenarios we end up being without knowing the concatenation order. So no information can be extracted without knowing it.

One time programs, finally

Now that we have both the garbled circuits part and the quantum part figured out, it should be easy-ish to put all the pieces together. To make a garbled circuit non-interactive and one-shot, we perform the following steps:

  1. We prepare the garbled circuit;
  2. For each wire $w_i$, we use QRACs or conjugate coding to encode both $w_i^0, w_i^1$ in a bunch of quantum resources. We then obtain a list of qbits $[q]_i$ for each wire $w_i$.
  3. We send this list over to the evaluator along with the garbled circuit.

When the evaluator wants to evaluate the circuit, it performs the following steps:

  1. Evaluator selects its own input.
  2. If $w^0_i$ is part of the input, then we measure $[q]_i$ in order to extract $w^0_i$. This in turns ensures us that $w^1_i$ is suitably destroyed.
  3. If instead $w^1_i$ is part of the input, then we measure $[q]_i$ in order to extract $w^1_i$. This in turns ensures us that $w^0_i$ is suitably destroyed.
  4. We evaluate the circuit as usual.
  5. For the last step, suppose $[q]_n$ encodes the output tags. We measure all $[q]_n$ in order to recover the bit $0$. If the tag we got from the previous step corresponds to the outcome of this measurement, then we conclude the output is $0$. Otherwise, we conclude the output is $1$.

This essentially makes the protocol non-interactive, as information is sent from the preparer to the evaluator only once, and one-shot, as the evaluator can only extract one viable input from the quantum resources provided.

Why do we need TEEs?

You are probably wondering: All nice and fine, but why do we need TEEs in all this? After all, we used quantum resources to encode all possible inputs to the garbled circuit, and that seemed to be enough.

Unfortunately, things are more complicated than this: It has been proved that one-shot programs are impossible in both the ideal classical and quantum cases. So if what we did above was all, we’d surely be doing something wrong. To see why we need TEEs, just note the following:

  • If we use QRACs, we have to account for some measurement error;
  • If we use conjugate coding, we need to know which part of the measured string is usable and which part is noise.

The point is that in the absence of a TEE there is no way to go around these limitations without busting security completely. If we used conjugate coding, then it is clear that knowing which qubits encode which of the two bitstrings would be enough to be able to extract both bitstrings at the same time: Just measure every qubit along the ‘correct’ basis! In the case of QRACs, the situation appears a bit more delicate, but it is essentially the same: Being able to account for measurement error means knowing ‘how close’ the two bitstrings really are. This information is again enough to recover both bitstrings by means of strategically placed measurements.

So, we need TEEs to perform the last step of the protocol, that is, we feed the measurement outcomes for each $[q]_i$ to the TEE and in turn we receive the labels corresponding to each input/output in turn.

What we obtained though is that if the root-of-trust is stolen, or the TEE is hijacked after the measurement has been carried out, then the attacker gains nothing! Actually not just thit: Whoever holds the root-of-trust needs to actively communicate with whoever is performing the measurements to gain something. So, if for instance the TEE manifacturer (or, say, the NSA) keeps a database of all the root-of-trusts of all TEEs, active collusion between the manifacturer and the protocol Evaluator is still required to break the protocol. This is because, even if now an attacker would be able to recover both bitstrings, this information has been already destroyed after measuring. For instance, in the case of conjugate coding, immagine you have two classical bits encoded in qubits $q_0, q_1$. Suppose also that $q_0$ is encoded in the $\ket{0}$–$\ket{1}$ basis and $q_1$ in the $\ket{+}$–$\ket{-}$ basis, but you do not know this yet. You want to extract the first bit and measure along the $\ket{0}$–$\ket{1}$ basis. You feed the measurement outcome - that is, two classical bits $b_0,b_1$ - to the TEE. The TEE discards the second bit and returns $b_0$. Suppose that only now you gain access to the root-of-trust of the TEE. Everything you would see is a program that says:

If the first bit is to be read, keep $b_0$ and discard $b_1$. Keep $b_1$ otherwise.

Having this information would have been very useful before measuring $q_0,q_1$, but now it’s worth nothing. In measuring along the $\ket{0}$–$\ket{1}$ basis we already destroyed any useful information encoded in $q_1$, with no way of going back. This effectively renders any store now, decrypt later strategy useless.

So yes, in this protocol we use TEEs just a little, precisely as little as needed to circumvent an impossibility result!

Implementing it

Implementing what we said both in the last post and here is not hard, and indeed we have a finished proof-of-concept library written in Rust that does the job. Importantly, the same library can cover both the ‘pure TEE’ and the ‘TEE + garbled circuits’ approaches. As of now, we’re finalizing the implementation of a couple of examples on a small and super cheap TEE, the ESP32-c6. This has obviously been especially painful since Rust on embedded system is not exactly pleasurable (albeit still more pleasurable than using C/C++).

Developing this library brought a lot of details to our attention, such as, for instance, the fact that the native protocol as we described it above could be replayed - precisely what we worked so hard to avoid! To see why, notice that in the protocol example we gave, the TEE does not know if the data we’re feeding to it comes from a real quantum measurement or not. The evaluator could make it up cleverly and extract the tags corresponding to all inputs.

Again, circumventing this is not hard at all, and since we’ve been crying in Rust for the past couple of weeks, we’re really looking forward to sharing all the relevant details with you in another blog post as soon as the library is out.

Future work: Dualizing the computation effort

Clearly the protocol detailed here leaves a lot to be desired. A direction of future research we are exploring is the following: Can we somehow dualize the situation? That is, at the moment the TEE does very little work, whereas garbled circuits do a lot. This is undesirable as preparing and evaluating garbled circuits is computationally way harder than just using TEEs. Can we then recast this protocol so that:

  • The non-interactive, one-shot part is still taken care of by the quantum word;
  • The overwhelming majority of the computation takes place in the TEE;
  • Just a little bit of work is carried out with garbled circuits, but…
  • …this bit of work is enough to make gaining the TEE root-of-trust after measurement useless?

The short answer is: We don’t know, but we’re working hard to find out. Until the next time!

Twitter, Facebook

Comments