§7.2: Probabilistically checkable proofs of proximity

In the previous section we saw that every subproperty of the dictatorship property has a $3$-query local tester. In this section we will show that any property whatsoever has a $3$-query local tester — if an appropriate “proof” is provided.

To make sense of this statement let’s first generalize the setting in which we study property testing. Definitions 1 and 2 are concerned with testing a boolean function $f : \{0,1\}^n \to \{0,1\}$ by querying its values on various inputs. If we think of $f$’s truth table as a boolean string of length $N = 2^n$, then a testing algorithm simply queries various coordinates of this string. It makes sense to generalize to the notion of testing properties of $N$-bit strings, for any length $N$. Here a property $\mathcal{C}$ will just be a collection $\mathcal{C} \subseteq \{0,1\}^N$ of strings, and we’ll be concerned with the relative Hamming distance $\mathrm{dist}(w, w’) = \frac{1}{N}\mathrm{Dist}(w,w’)$ between strings. For simplicity, we’ll begin to write $n$ instead of $N$.

Definition 10 An $r$-query string testing algorithm for strings $w \in \{0,1\}^n$ is a randomized algorithm which:

• chooses $r$ (or fewer) indices $\boldsymbol{i}_1, \dots, \boldsymbol{i}_r \in [n]$ according to some probability distribution;
• queries $w_{\boldsymbol{i}_1}, \dots, w_{\boldsymbol{i}_r}$;
• based on the outcomes, decides (deterministically) whether to “accept” $w$.

We may also generalize this definition to testing strings $w \in \Omega^n$ over finite alphabets $\Omega$ of cardinality larger than $2$.

Definition 11 Let $\mathcal{C} \subseteq \{0,1\}^n$ be a “property” of $n$-bit boolean strings. We say a string testing algorithm is a local tester for $\mathcal{C}$ (with rejection rate $\lambda > 0$) if it satisfies the following:

• If $w \in \mathcal{C}$ then the tester accepts with probability $1$.
• For all $0 \leq \epsilon \leq 1$, if $\mathrm{dist}(w, \mathcal{C}) > \epsilon$ then the tester rejects $w$ with probability greater than $\lambda \cdot \epsilon$.
• Equivalently, if the tester accepts $w$ with probability at least $1 – \lambda \cdot \epsilon$ then $w$ is $\epsilon$-close to $\mathcal{C}$; i.e., $\exists w’ \in \mathcal{C}$ such that $\mathrm{dist}(w,w’) \leq \epsilon$.

Examples 12 Let $\mathcal{Z} = \{(0, 0, \dots, 0)\} \subseteq \{0,1\}^n$ be the property of being the all-zeroes string. Then the following is a $1$-query local tester for $\mathcal{Z}$ (with rejection rate $1$): pick a uniformly random index $\boldsymbol{i}$ and accept if $w_{\boldsymbol{i}} = 0$.

Let $\mathcal{E} = \{(0, 0, \dots, 0), (1, 1, \dots, 1)\} \subseteq \{0,1\}^n$ be the property of having all coordinates equal. Then the following is a $2$-query local tester for $\mathcal{E}$: pick two independent and uniformly random indices $\boldsymbol{i}$ and $\boldsymbol{j}$ and accept if $w_{\boldsymbol{i}} = w_{\boldsymbol{j}}$. In the exercises you are asked to show that if $\mathrm{dist}(w,\mathcal{E}) = \epsilon$ then this tester rejects $w$ with probability $\tfrac{1}{2} – \tfrac{1}{2}(1-2\epsilon)^2 \geq \epsilon$.

Let $\mathcal{O} = \{w \in {\mathbb F}_2^n : w \text{ has an odd number of 1′s}\}$. This property does not have a local tester making few queries. In fact, in the exercises you are asked to show that any local tester for $\mathcal{O}$ must make the maximum number of queries, $n$.

As the last example shows, not every property has a local tester making a small number of queries; indeed, most properties of $n$-bit strings do not. This is rather too bad — imagine that for any large $n$ and any complicated property $\mathcal{C} \subseteq \{0,1\}^n$ there were an $O(1)$-query local tester. Then if anyone supplied you with a string $w$ claiming it satisfied $\mathcal{C}$, you wouldn’t have to laboriously check this yourself, nor would you have to trust the supplier; you could simply spot-check $w$ in a constant number of coordinates and become convinced that $w$ is (close to being) in $\mathcal{C}$.

But what if, in addition to $w \in \{0,1\}^n$, you could require the supplier to give you some additional side information $\Pi \in \{0,1\}^\ell$ about $w$ so as to assist you in testing that $w \in \mathcal{C}$? One can think of $\Pi$ as a kind of “proof” that $w$ satisfies $\mathcal{C}$. In this case it’s possible that you can spot-check $w$ and $\Pi$ together in a constant number of coordinates and become convinced that $w$ is (close to being) in $\mathcal{C}$ — all without having to “trust” the supplier of the string $w$ and the purported proof $\Pi$. These ideas lead to the notion of probabilistically checkable proofs of proximity (PCPPs).

Definition 13 Let $\mathcal{C} \subseteq \{0,1\}^n$ be a property of $n$-bit boolean strings and let $\ell \in {\mathbb N}$. We say that $\mathcal{C}$ has an $r$-query, length-$\ell$ probabilistically checkable proof of proximity (PCPP) system (with rejection rate $\lambda > 0$) when the following holds: There exists an $r$-query testing algorithm $T$ for $(n+\ell)$-bit strings, thought of as pairs $w \in \{0,1\}^n$ and $\Pi \in \{0,1\}^\ell$, such that:

• (“Completeness.”) If $w \in \mathcal{C}$ then there exists a “proof” $\Pi \in \{0,1\}^\ell$ such that $T$ accepts with probability $1$.
• (“Soundness”.) For all $0 \leq \epsilon \leq 1$, if $\mathrm{dist}(w, \mathcal{C}) > \epsilon$ then for every “proof” $\Pi \in \{0,1\}^\ell$ the tester $T$ rejects with probability greater than $\lambda \cdot \epsilon$.
• Equivalently, if there exists $\Pi \in \{0,1\}^\ell$ which causes $T$ to accept with probability at least $1 – \lambda \cdot \epsilon$ then $w$ must be $\epsilon$-close to $\mathcal{C}$.

PCPP systems are also known as assisted testers, locally testable proofs, or assignment testers.

Remark 14 A word on the three parameters: We are usually interested in fixing the number of queries $r$ to a very small universal constant (such as $3$) while trying to keep the proof length $\ell = \ell(n)$ relatively small (e.g., $\mathrm{poly}(n)$ is a good goal). We are usually not very concerned with the rejection rate $\lambda$ so long as it’s a positive universal constant (independent of $n$).

Example 15 In Examples 12 we stated that ${\mathcal O} = \{w \in {\mathbb F}_2^n : w_1 + \cdots + w_n = 1\}$ has no local tester making fewer than $n$ queries. But it’s easy to give a $3$-query PCPP system for ${\mathcal O}$ with proof length $n-1$ (and rejection rate $1$). The idea is to require the proof string $\Pi$ to contain the partial sums of $w$: $\Pi_{j} = \sum_{i=1}^{j+1} w_i \pmod{2}.$ The tester will perform one of the following checks, uniformly at random: \begin{align*} \Pi_1 &= w_1 + w_2 \\ \Pi_2 &= \Pi_1 + w_3 \\ \Pi_3 &= \Pi_2 + w_4 \\ &\cdots \\ \Pi_{n-1} &= \Pi_{n-2} + w_n \\ \Pi_{n-1} &= 1 \end{align*} Evidently the tester always makes at most $3$ queries. Further, in the “completeness” case $w \in {\mathcal O}$, if $\Pi$ is a correct list of partial sums then the tester will accept with probability $1$. It remains to analyze the “soundness” case, $w \not \in {\mathcal O}$. Here we are significantly aided by the fact that $\mathrm{dist}(w,{\mathcal O})$ must be exactly $1/n$ (since every string is at Hamming distance either $0$ or $1$ from ${\mathcal O}$). Thus to confirm the claimed rejection rate of $1$, we only need to observe that if $w \not \in {\mathcal O}$ then at least one of the tester’s $n$ checks must fail.

This example generalizes to give a very efficient PCPP system for testing that $w$ satisfies any fixed ${\mathbb F}_2$-linear equation. What about testing that $w$ satisfies a fixed system of ${\mathbb F}_2$-linear equations? This interesting question is explored in the exercises; it serves as a good warmup for our next result.

We now extend Theorem 9 to show the rather remarkable fact that any property of $n$-bit strings has a $3$-query PCPP system. (The proof length, however, is enormous.)

Theorem 16 Let $\mathcal{C} \subseteq \{0,1\}^n$ be any class of strings. Then there is a $3$-query, length-$2^{2^n}$ PCPP system for $\mathcal{C}$ (with rejection rate $.001$).

Proof: Let $N = 2^n$ and fix an arbitrary bijection $\iota : \{0,1\}^n \to [N]$. The tester will interpret the string $w \in \{0,1\}^n$ to be tested as an index $\iota(w) \in [N]$ and will interpret the $2^N$-length proof $\Pi$ as a function $\Pi : \{0,1\}^N \to \{0,1\}$. The idea is for the tester to require that $\Pi$ be the dictator function corresponding to index $\iota(w)$; i.e., $\chi_{\iota(w)} : \{0,1\}^N \to \{0,1\}$.

Now under the identification $\iota$, we can think of the string property $\mathcal{C}$ as a subclass of all $N$-bit dictators, namely $\mathcal{C}' = \{\chi_{\iota(w')} : \{0,1\}^N \to \{0,1\} \mid w' \in \mathcal{C}\}.$ In particular, $\mathcal{C}’$ is a property of $N$-bit functions. We can now state the twofold goal of the tester:

1. check that $\Pi \in \mathcal{C}’$;
2. given that $\Pi$ is indeed some dictator $\chi_{\iota(w’)} : \{0,1\}^N \to \{0,1\}$ with $w’ \in \mathcal{C}$, check that $w’ = w$.

To accomplish the latter the tester would like to check $w_{\boldsymbol{j}} = w’_{\boldsymbol{j}}$ for a random $\boldsymbol{j} \in [n]$. The tester can query any $w_{j}$ directly but accessing $w’_{j}$ requires a little thought. The trick is to prepare the string $X^{(j)} \in \{0,1\}^N \text{ defined by } X^{(j)}_{\iota(y)} = y_j.$ and then to locally correct $\Pi$ on $X^{(j)}$ (using Proposition 1.32).

Thus the tester is defined as follows:

1. With probability $1/2$, locally test the function property $\mathcal{C}’$ using Theorem 9.
2. With probability $1/2$, pick $\boldsymbol{j} \sim [n]$ uniformly at random; locally correct $\Pi$ on the string $X^{(\boldsymbol{j})}$ and accept if the outcome equals $w_{\boldsymbol{j}}$.

Note that the tester makes $3$ queries in both of the subtests.

Verifying “completeness” of this PCPP system is easy: if $w \in \mathcal{C}$ and $\Pi$ is indeed the (truth table of) $\chi_{\iota(w)} : \{0,1\}^N \to \{0,1\}$ then the test will accept with probability $1$. It remains to verify the “soundness” condition. Fix $w \in \{0,1\}^n$, $\Pi : \{0,1\}^N \to \{0,1\}$, and $0 \leq \epsilon \leq 1$ and suppose that the tester accepts $(w, \Pi)$ with probability at least $1-\lambda \epsilon$, where $\lambda = .001$. Our goal is to show that $w$ is $\epsilon$-close to some string $w’ \in \mathcal{C}$.

Since the overall test accepts with probability at least $1-\lambda \epsilon$, subtest (2) above accepts with probability at least $1-2\lambda \epsilon$. Thus by Theorem 9, $\Pi$ must be $200\lambda \epsilon$-close to some dictator $\chi_{\iota(w’)}$ with $w’ \in \mathcal{C}$. Since dictators are parity functions, Proposition 1.32 tells us that $$\label{eqn:pcpp-correcting} \forall j,\ \mathop{\bf Pr}[\text{locally correcting \Pi on X^{(j)} produces \chi_{\iota(w')}(X^{(j)}) = w'_j}] \geq 1 – 400\lambda \epsilon \geq 1/2,$$ where we used $400 \lambda \epsilon < 400 \lambda \leq 1/2$ by the choice $\lambda = .001$.

On the other hand, since the overall test accepts with probability at least $1-\lambda \epsilon$, subtest (2) above rejects with probability at most $2\lambda \epsilon$. This means $\mathop{\bf E}_{\boldsymbol{j} \sim [n]}\left[\mathop{\bf Pr}[\text{locally correcting \Pi on X^{(\boldsymbol{j})} doesn't produce w_{\boldsymbol{j}}}]\right] \leq 2\lambda \epsilon.$ By Markov’s inequality we deduce that except for at most a $4\lambda \epsilon$ fraction of coordinates $j \in [n]$ we have $\mathop{\bf Pr}[\text{locally correcting \Pi on X^{(j)} doesn't produce w_{j}}] < 1/2.$ Combining this information with \eqref{eqn:pcpp-correcting} we deduce that $w_j = w’_j$ except for at most a $4\lambda\epsilon \leq \epsilon$ fraction of coordinates $j \in [n]$. Since $w’ \in \mathcal{C}$ we conclude that $\mathrm{dist}(w,C) \leq \epsilon$, as desired. $\Box$

You may feel that the doubly-exponential proof length $2^{2^n}$ in this theorem is quite bad, but bear in mind there are $2^{2^n}$ different properties $\mathcal{C}$. Actually, giving a PCPP system for every property is a bit overzealous since most properties are not interesting or natural. A more reasonable goal would be to give efficient PCPP systems for all “explicit” properties. A good way to formalize this is to consider properties decidable by polynomial-size circuits. Here we consider a standard definition of general boolean circuits which allows for non-constant depth (unlike in Chapter 4):

Definition 17 A (De Morgan) circuit $C$ over boolean variables $x_1, \dots, x_n$ is a directed acyclic graph in which each node (“gate”) is labelled with either an $x_i$ or with $\wedge$, $\vee$, or $\neg$ (logical NOT). Each $x_i$ is used as label exactly once; the associated nodes are called “input” gates and must have in-degree $0$. Each $\wedge$ and $\vee$ node must have in-degree $2$ and each $\neg$ node must have in-degree $1$. Each node “computes” a function $\{-1,1\}^n \to \{-1,1\}$ of the inputs in the natural way. Finally, one node of $C$ is designated as the “output” gate, and $C$ itself is said to compute the function computed by the output node. We often identify $C$ with the function it computes.

We define the size of $C$, denoted $\mathrm{size}(C)$, to be the number of nodes (which is always at least $n$ and at most twice the number of arcs/wires).

Given an $n$-variable circuit $C$ we consider the set of strings which it “accepts” to be a property, $$\label{eqn:ppty-from-ckt} \mathcal{C} = \{w \in \{0,1\}^n : C(w) = 1\}.$$ For properties computed by modest-sized circuits $C$ we may hope for PCPP systems with proof length much less than $2^{2^n}$. We saw such a case in Example 15. Another advantage of considering “explicit” properties is that we can define a notion of constructing a PCPP system, “given” a property. A theorem of the form “for each explicit property $\mathcal{C}$ there exists an efficient PCPP system…” may not be useful, practically speaking, if its proof is non-constructive. We can formalize the issue as follows:

Definition 18 A PCPP reduction is an algorithm which takes as input a circuit $C$ and outputs the description of a PCPP system for the string property $\mathcal{C}$ decided by $C$ as in \eqref{eqn:ppty-from-ckt}, where $n$ is the number of inputs to $C$. If the output PCPP system always makes $r$ queries, has proof length $\ell(n, \mathrm{size}(C))$ (for some function $\ell$), and has rejection rate $\lambda > 0$, we say that the PCPP reduction has the same parameters. Finally, the PCPP reduction should run in time $\mathrm{poly}(\mathrm{size}(C), \ell)$.

(We haven’t precisely specified what it means to output the description of a PCPP system; this will be explained more carefully in the next section. In brief it means to list — for each possible outcome of the tester’s randomness — which bits are queried and what predicate of them is used to decide acceptance.)

Looking back at the results on testing subclasses of dictatorship (Theorem 9) and PCPPs for any property (Theorem 16) we can see they have the desired sort of “constructive” proofs. In Theorem 9 the local tester’s description depends in a very simple way on the input $1_S$. As for Theorem 16, it suffices to note that given an $n$-input circuit $C$ we can write down its truth table (and hence the property it decides) in time $\mathrm{poly}(\mathrm{size}(C)) \cdot 2^n$, whereas the allowed running time is at least $\mathrm{poly}(\mathrm{size}(C), 2^{2^n})$. Hence we may state:

Theorem 19 There exists a $3$-query PCPP reduction with proof length $2^{2^n}$ (and rejection rate $.001$).

In the exercises you are asked to improve this result as follows:

Theorem 20 There exists a $3$-query PCPP reduction with proof length $2^{\mathrm{poly}(\mathrm{size}(C))}$ (and positive rejection rate).

(The fact that we again have just $3$ queries is also explained in the exercises; there is a generic reduction from any constant number of queries down to $3$.)

Indeed, there is a much more dramatic improvement:

The PCPP Theorem There exists a $3$-query PCPP reduction with proof length $\mathrm{poly}(\mathrm{size}(C))$ (and positive rejection rate).

This is (a slightly strengthened version of) the famous “PCP Theorem” [FGL+96,AS98,ALM+98] from the field of computational complexity, which is discussed later in this chapter. Though the PCPP Theorem is far stronger than Theorem 19, the latter is not unnecessary; it’s actually an ingredient in Dinur’s proof of the PCP Theorem [Din07], being applied only to circuits of “constant” size. The current state of the art for PCPP length [Din07,BS08] is highly efficient:

Theorem 21 There exists a $3$-query PCPP reduction with proof length $\mathrm{size}(C) \cdot \mathrm{polylog}(\mathrm{size}(C))$ (and positive rejection rate).

4 comments to §7.2: Probabilistically checkable proofs of proximity

• Avishay Tal

Two small comments regarding the proof of theorem 16.
- In Equation (1) I think it is more understandable to put
$\chi_{\iota(w’)}(X^{(j)}) = w’_j$ instead of $w_j$.
- \lambda should be 0.001 in order for \lambda*400<1/2.

• Thanks on both!

• Amir

In the proof of Theorem 16, and in the equation immediately following the text “The trick is to prepare the string”, the appearance of $y$ and $y_j$ never to recur again is a little confusing. Does $y$ denote a randomly chosen string? This leaves the definition of $X^{(j)}$ somewhat murky.

• Hi Amir. It’s just a ‘dummy variable’; in some sense an X’s coordinates are indexed by length-n strings, and y is just such a string. This whole proof is a bit confusing. I tried to to make it as non-confusing as possible, but I don’t think I succeeded If there’s ever a second edition I’m going to try to figure a better way to present this proof.