# §7.4: Highlight: Håstad’s hardness theorems

In Theorem 36 we saw that it is $\mathsf{NP}$-hard to $(1-\delta_0, 1)$-approximate Max-E$3$Sat for some positive but inexplicit constant $\delta_0$. You might wonder how large $\delta_0$ can be. The natural limit here is $\frac18$ because there is a very simple algorithm which satisfies a $\frac78$-fraction of the constraints in any Max-E$3$Sat instance:

Proposition 37 Consider the Max-E$3$-Sat algorithm which outputs a uniformly random assignment $F$. This is a $(\frac78,\beta)$-approximation for any $\beta$.

Proof: In instance $\mathcal{P}$, each constraint is a logical OR of exactly $3$ literals and will therefore be satisfied by $F$ with probability exactly $\frac78$. Hence in expectation the algorithm will satisfy a $\frac78$-fraction of the constraints. $\Box$

(It’s also easy to “derandomize” this algorithm, giving a deterministic guarantee of at least $\frac78$ of the constraints; see the exercises.)

This algorithm is of course completely brainless — it doesn’t even “look at” the instance it is trying to approximately solve. But rather remarkably, it achieves the best possible approximation guarantee among all efficient algorithms (assuming $\mathsf{P} \neq \mathsf{NP}$). This is a consequence of the following 1997 theorem of Håstad [Has01], improving significantly on Theorem 36:

Håstad’s 3-Sat Hardness For any constant $\delta > 0$, it is $\mathsf{NP}$-hard to $(\frac78 + \delta, 1)$-approximate Max-E$3$-Sat.

Håstad gave similarly optimal hardness-of-approximation results for several other problems, including Max-E$3$-Lin:

Håstad’s 3-Lin Hardness For any constant $\delta > 0$, it is $\mathsf{NP}$-hard to $(\frac12 + \delta, 1 – \delta)$-approximate Max-E$3$-Lin.

In this hardness theorem, both the “$\alpha$” and “$\beta$” parameters are optimal; as we saw in Examples 33 one can efficiently $(\tfrac{1}{2}, \beta)$-approximate and also $(1,1)$-approximate Max-E$3$-Lin. The goal of this section is to sketch the proof of the above theorems, mainly Håstad’s 3-Lin Hardness Theorem. Let’s begin by considering the 3-Sat hardness result. If our goal is to increase the inexplicit constant $\delta_0$ in Theorem 36, it makes sense to look at how the constant arises. From the proof of Theorem 36 we see that it’s just the rejection rate in the PCPP Theorem. We didn’t prove that theorem, but let’s consider its length-$2^{2^n}$ analogue, Theorem 19. The key ingredient in the proof of Theorem 19 is the dictator test. Indeed if we strip away the few local correcting and consistency checks, we see that the dictator test component controls both the rejection rate and the type of predicates output by the PCPP reduction. This observation suggests that to get a strong hardness-of-approximation result for, say, Max-E$3$-Lin, we should seek a local tester for dictatorship which: a) has a large rejection rate; b) makes its accept/reject decision using $3$-variable linear equation predicates.

This approach (which of course needs to be integrated with efficient “PCPP technology”) was suggested in a 1995 paper of Bellare, Goldreich, and Sudan [BGS95]. Using it, they managed to prove $\mathsf{NP}$-hardness of $(1-\delta_0, 1)$-approximating Max-E$3$-Sat with the explicit constant $\delta_0 = .026$. Håstad’s key conceptual contribution (originally from [Has96]) was showing that given known PCPP technology, it suffices to construct a certain kind of relaxed dictator test. Roughly speaking, dictators should still be accepted with probability $1$ (or close to $1$), but only functions which are “very unlike” dictators need to be rejected with substantial probability. Since this is a weaker requirement than in the standard definition of a local tester, we can potentially achieve a much higher rejection rate, and hence a much stronger hardness-of-approximation result.

For these purposes, the most useful formalization of being “very unlike a dictator” turns out to be “having no notable coordinates” in the sense of Definition 6.9. We make the following definition which is appropriate for boolean CSPs.

Definition 38 Let $\Psi$ be a finite set of predicates over the domain $\Omega = \{-1,1\}$. Let $0 < \alpha < \beta \leq 1$ and let $\lambda : [0,1] \to [0,1]$ satisfy $\lambda(\epsilon) \to 0$ as $\epsilon \to 0$. Suppose that for each $n \in {\mathbb N}^+$ there is a local tester for functions $f : \{-1,1\}^n \to \{-1,1\}$ with the following properties:

• If $f$ is a dictator then the test accepts with probability at least $\beta$.
• If $f$ has no $(\epsilon,\epsilon)$-notable coordinates — i.e., $\mathbf{Inf}^{(1-\epsilon)}_i[f] \leq \epsilon$ for all $i \in [n]$ — then the test accepts with probability at most $\alpha + \lambda(\epsilon)$.
• The tester’s accept/reject decision uses predicates from $\Psi$; i.e., the tester can be viewed as an instance of Max-$\mathrm{CSP}(\Psi)$.

Then, abusing terminology, we call this family of testers an $(\alpha,\beta)$-Dictator-vs.-No-Notables using predicate set $\Psi$.

Remark 39 For very minor technical reasons, the above definition should actually be slightly amended. In this section we freely ignore the amendments, but for the sake of correctness we state them here. One is a strengthening, one is a weakening.

• The second condition should be required even for functions $f : \{-1,1\}^n \to [-1,1]$; what this means is explained in the exercises.
• When the tester makes accept/reject decisions by applying $\psi \in \Psi$ to query results $f({\boldsymbol{x}}^{(1)}), \dots, f({\boldsymbol{x}}^{(r)})$, it is allowed that the query strings are not all distinct.

Remark 40 It’s essential in this definition that the “error term” $\lambda(\epsilon) = o_\epsilon(1)$ be independent of $n$. On the other hand, we otherwise care very little about the rate at which it tends to $0$; this is why we didn’t mind using the same parameter $\epsilon$ in the “$(\epsilon,\epsilon)$-notable” hypothesis.

Just as the dictator test was the key component in our PCPP reduction (Theorem 19), Dictator-vs.-No-Notables tests are the key to obtaining strong hardness-of-approximation results. The following result (essentially proved in [KKMO07]) lets you obtain hardness results from Dictator-vs.-No-Notables tests in a black-box way:

Theorem 41 Fix a CSP over domain $\Omega = \{-1,1\}$ with predicate set $\Psi$. Suppose there exists an $(\alpha,\beta)$-Dictator-vs.-No-Notables test using predicate set $\Psi$. Then for all $\delta > 0$, it is “$\mathsf{UG}$-hard” to $(\alpha + \delta, \beta – \delta)$-approximate Max-$\mathrm{CSP}(\Psi)$.

In other words, the distinguishing parameters of a Dictator-vs.-No-Notables test automatically translate to the distinguishing parameters of a hardness result (up to an arbitrarily small $\delta$).

The advantage of Theorem 41 is that it reduces a problem about computational complexity to a purely Fourier-analytic problem, and a constructive one at that. The theorem has two disadvantages, however. The first is that instead of $\mathsf{NP}$-hardness — the gold standard in complexity theory — it merely gives “$\mathsf{UG}$-hardness”, which roughly means “at least as hard as the Unique-Games problem”. We leave the definition of the Unique-Games problem to the exercises, but suffice it to say it’s not as universally believed to be hard as Circuit-Sat is. The second disadvantage of Theorem 41 is that it only has $\beta – \delta$ rather than $\beta$. This can be a little disappointing, especially when you are interested in hardness for satisfiable instances ($\beta = 1$), as in Håstad’s 3-Sat Hardness. In his work, Håstad showed that both disadvantages can be erased provided you construct something similar to, but more complicated than, an $(\alpha, \beta)$-Dictator-vs.-No-Notables test. This is how the Håstad 3-Sat and 3-Lin Hardness Theorems are proved. Describing this extra complication is beyond the scope of this book; therefore we content ourselves with the following theorems:

Theorem 42 For any $0 < \delta < \frac18$, there exists a $(\frac78 + \delta,1)$-Dictator-vs.-No-Notables test which uses logical OR functions on $3$ literals as its predicates.

Theorem 43 For any $0 < \delta < \frac12$, there exists a $(\frac12, 1-\delta)$-Dictator-vs.-No-Notables test using $3$-variable ${\mathbb F}_2$-linear equations as its predicates.

Theorem 43 will be proved below, while the proof of Theorem 42 is left for the exercises. By applying Theorem 41 we immediately deduce the following weakened versions of Håstad’s Hardness Theorems:

Corollary 44 For any $\delta > 0$, it is $\mathsf{UG}$-hard to $(\frac78 + \delta, 1-\delta)$-approximate Max-E$3$-Sat.

Corollary 45 For any $\delta > 0$, it is $\mathsf{UG}$-hard to $(\frac12+\delta, 1-\delta)$-approximate Max-E$3$-Lin.

Remark 46 For Max-E$3$-Lin, we don’t mind the fact that Theorem 41 has $\beta-\delta$ instead of $\beta$ because our Dictator-vs.-No-Notables test only accepts dictators with probability $1-\delta$ anyway. Note that the $1-\delta$ in Theorem 43 cannot be improved to $1$; see the exercises.)

To prove a result like Theorem 43 there are two components: the design of the test, and its analysis. We begin with the design. Since we are looking for a test using $3$-variable linear equation predicates, the BLR Test naturally suggests itself; indeed, all of its checks are of the form $f(x)+f(y)+f(z) = 0$. It also accepts dictators with probability $1$. Unfortunately it’s not true that it accepts functions with no notable coordinates with probability close to $\tfrac{1}{2}$. There are two problems: the constant $1$ function and “large” parity functions are both accepted with probability $1$, despite having no notable coordinates. The constant $1$ function is easy to deal with: we can replace the BLR Test by the “Odd BLR Test”.

Odd BLR Test Given query access to $f : {\mathbb F}_2^n \to {\mathbb F}_2$:

• Choose ${\boldsymbol{x}} \sim {\mathbb F}_2^n$ and $\boldsymbol{y} \sim {\mathbb F}_2^n$ independently.
• Choose $\boldsymbol{b} \sim {\mathbb F}_2$ uniformly at random and set $\boldsymbol{z} = {\boldsymbol{x}} + \boldsymbol{y} + (\boldsymbol{b}, \boldsymbol{b}, \dots, \boldsymbol{b}) \in {\mathbb F}_2^n$.
• Accept if $f({\boldsymbol{x}}) + f(\boldsymbol{y}) + f(\boldsymbol{z}) = \boldsymbol{b}$.

Note that this test uses both kinds of $3$-variable linear equations as its predicates. For the test’s analysis, we as usual switch to $\pm 1$ notation and think of testing $f({\boldsymbol{x}})f(\boldsymbol{y})f(\boldsymbol{z}) = \boldsymbol{b}$. It is easy to show the following (see the proof of Theorem 43, or the exercises for a generalization):

Proposition 47 The Odd BLR Test accepts $f : \{-1,1\}^n \to \{-1,1\}$ with probability $\tfrac{1}{2} + \tfrac{1}{2} \sum_{\substack{S \subseteq [n] \\ |S| \text{ odd}}} \widehat{f}(S)^3 \leq \tfrac{1}{2} + \tfrac{1}{2} \max_{\substack{S \subseteq [n] \\ |S| \text{ odd}}} \{\widehat{f}(S)\}.$

This twist rules out the constant $1$ function; it passes the Odd BLR Test with probability $\tfrac{1}{2}$. It remains to deal with large parity functions. Håstad’s innovation here was to add a small amount of noise to the Odd BLR Test. Specifically, given a small $\delta > 0$ we replace $\boldsymbol{z}$ in the above test with $\boldsymbol{z}’ \sim N_{1-\delta}(\boldsymbol{z})$; i.e., we flip each of its bits with probability $\delta/2$. If $f$ is a dictator then there is only a $\delta/2$ chance this will affect the test. On the other hand, if $f$ is a parity of large cardinality, the cumulative effect of the noise will destroy its chance of passing the linearity test. Note that parities of small odd cardinality will also pass the test with probability close to $1$; however we don’t need to worry about them since they have notable coordinates. We can now present Håstad’s Dictator-vs.-No-Notables test for Max-E$3$-Lin.

Proof of Theorem 43: Given a parameter $0 < \delta < 1$, define the following test, which uses Max-E$3$-Lin predicates:

Håstad$_{\boldsymbol{\delta}}$ Test Given query access to $f : \{-1,1\}^n \to \{-1,1\}$:

• Choose ${\boldsymbol{x}}, \boldsymbol{y} \sim \{-1,1\}^n$ uniformly and independently.
• Choose bit $\boldsymbol{b} \sim \{-1,1\}$ uniformly and set $\boldsymbol{z} = \boldsymbol{b} \cdot ({\boldsymbol{x}} \circ \boldsymbol{y}) \in \{-1,1\}^n$ (where $\circ$ denotes entry-wise multiplication).
• Choose $\boldsymbol{z}’ \sim N_{1-\delta}(\boldsymbol{z})$.
• Accept if $f({\boldsymbol{x}})f(\boldsymbol{y})f(\boldsymbol{z}’) = \boldsymbol{b}$.

We will show that this is a $(\frac12, 1-\delta/2)$-Dictator-vs.-No-Notables test. First, let us analyze the test assuming $\boldsymbol{b} = 1$. \begin{align*} \mathop{\bf Pr}[\text{Håstad$_{{\delta}}$ Test accepts $f$} \mid \boldsymbol{b} = 1] &= \mathop{\bf E}[\tfrac{1}{2} + \tfrac{1}{2} f({\boldsymbol{x}})f(\boldsymbol{y})f(\boldsymbol{z}')] \\ &= \tfrac{1}{2} + \tfrac{1}{2} \mathop{\bf E}[f({\boldsymbol{x}})\cdot f(\boldsymbol{y}) \cdot \mathrm{T}_{1-\delta}f({\boldsymbol{x}}\circ\boldsymbol{y})]]\\ &= \tfrac{1}{2} + \tfrac{1}{2} \mathop{\bf E}_{{\boldsymbol{x}}}[f({\boldsymbol{x}}) \cdot (f * \mathrm{T}_{1-\delta}f)({\boldsymbol{x}})] \\ &= \tfrac{1}{2} + \tfrac{1}{2} \sum_{S \subseteq [n]} \widehat{f}(S) \cdot \widehat{f * \mathrm{T}_{1-\delta} f}(S)\\ &= \tfrac{1}{2} + \tfrac{1}{2} \sum_{S \subseteq [n]} (1-\delta)^{|S|}\widehat{f}(S)^3. \end{align*} On the other hand, when $\boldsymbol{b} = -1$ we take the expectation of $\tfrac{1}{2} – \tfrac{1}{2} f({\boldsymbol{x}})f(\boldsymbol{y})f(\boldsymbol{z}’)$ and note that $\boldsymbol{z}’$ is distributed as $N_{-(1-\delta)}({\boldsymbol{x}} \circ \boldsymbol{y})$. Thus $\mathop{\bf Pr}[\text{Håstad_{{\delta}} Test accepts f} \mid \boldsymbol{b} = -1] = \tfrac{1}{2} – \tfrac{1}{2} \sum_{S \subseteq [n]} (-1)^{|S|}(1-\delta)^{|S|}\widehat{f}(S)^3.$ Averaging the above two results we deduce \begin{equation} \label{eqn:hastad-test-acc} \mathop{\bf Pr}[\text{Håstad$_{{\delta}}$ Test accepts $f$}] = \tfrac{1}{2} + \tfrac{1}{2} \sum_{|S| \text{ odd}} (1-\delta)^{|S|}\widehat{f}(S)^3. \end{equation} (Incidentally, by taking $\delta = 0$ here we obtain the proof of Proposition 47.)

From \eqref{eqn:hastad-test-acc} we see that if $f$ is a dictator, $f = \chi_S$ with $|S| = 1$, then it is accepted with probability $1-\delta/2$. (It’s also easy to see this directly from the definition of the test.) To complete the proof that we have a $(\tfrac{1}{2}, 1-\delta/2)$-Dictator-vs.-No-Notables test, we need to bound the probability that $f$ is accepted given that it has $(\epsilon,\epsilon)$-small stable influences. More precisely, assuming \begin{equation} \label{eqn:hast-assump} \phantom{\quad \text{for all $i \in [n]$}} \mathbf{Inf}_i^{(1-\epsilon)}[f] = \sum_{S \ni i} (1-\epsilon)^{|S|-1} \widehat{f}(S)^2 \leq \epsilon \quad \text{for all $i \in [n]$} \end{equation} we will show that \begin{equation} \label{eqn:hast-concl} \mathop{\bf Pr}[\text{Håstad$_{{\delta}}$ Test accepts $f$}] \leq \tfrac{1}{2} + \tfrac{1}{2}\sqrt{\epsilon}, \quad \text{provided } \epsilon \leq \delta. \end{equation} This is sufficient because we can take $\lambda(\epsilon)$ in Definition 38 to be $\lambda(\epsilon) = \begin{cases} \tfrac{1}{2}\sqrt{\epsilon} & \text{for \epsilon \leq \delta,} \\ \tfrac{1}{2} & \text{for \epsilon > \delta.} \end{cases}$ Now to obtain \eqref{eqn:hast-concl}, we continue from \eqref{eqn:hastad-test-acc}: \begin{align*} \mathop{\bf Pr}[\text{Håstad$_{{\delta}}$ Test accepts $f$}] &\leq \tfrac{1}{2} + \tfrac{1}{2} \max_{|S| \text{ odd}}\{(1-\delta)^{|S|}\widehat{f}(S)\} \cdot \sum_{|S| \text{ odd}} \widehat{f}(S)^2 \\ &\leq \tfrac{1}{2}+ \tfrac{1}{2} \max_{|S| \text{ odd}}\{(1-\delta)^{|S|}\widehat{f}(S)\} \\ &\leq \tfrac{1}{2}+ \tfrac{1}{2} \sqrt{\max_{|S| \text{ odd}}\{(1-\delta)^{2|S|}\widehat{f}(S)^2\}} \\ &\leq \tfrac{1}{2}+ \tfrac{1}{2} \sqrt{\max_{|S| \text{ odd}}\{(1-\delta)^{|S|-1}\widehat{f}(S)^2\}} \\ &\leq \tfrac{1}{2}+ \tfrac{1}{2} \sqrt{\max_{i \in [n]}\{\mathbf{Inf}_i^{(1-\delta)}[f]\}}, \end{align*} where we used that $|S|$ odd implies $S$ nonempty. And the above is indeed at most $\tfrac{1}{2} + \tfrac{1}{2} \sqrt{\epsilon}$ provided $\epsilon \leq \delta$, by \eqref{eqn:hast-assump}. $\Box$

• Albert Atserias

Very minor: In remark 39, second bullet, a word is missing. Probably “tester”.

• Ryan O'Donnell

Thanks Albert!

• Sasho

(Caveat: this is above all a matter of taste) Maybe you can define a notion of approximation-preserving reduction and then define UG-hard and NP-hard in terms of approximation-preserving reductions to unique games and circuit-sat, respectively. I see you are trying to save space, but I can also imagine a reader being confused – especially a reader who comes from say math and not computer science background.

• Ryan O'Donnell

Sorry for the long delay in responding to your comments! I’ll get back to you soon.

• Ryan O'Donnell

Yeah, in a perfect world I’d explain this better, but indeed I’m trying to save space, and I want to make the book as much about Fourier analysis (and not complexity theory) as possible. I.e., trying to get Theorem 41 out there with a modicum of motivation and then get on with the business of constructing and analyzing tests. But I’ll try to think about how to explain it all better… Thanks!