In Chapter 8.4 we described the problem of “threshold phenomena” for monotone functions $f : \{-1,1\}^n \to \{-1,1\}$.

As $p$ increases from $0$ to $1$, we are interested in whether $\mathop{\bf Pr}_{{\boldsymbol{x}} \sim \pi_p^{\otimes n}}[f({\boldsymbol{x}}) = -1]$ has a “sharp threshold”, jumping quickly from near $0$ to near $1$ around the critical probability $p = p_c$. The “sharp threshold principle” tells us that this occurs (roughly speaking) if and only if the total influence of $f$ under its critical distribution, $\mathbf{I}[f^{(p_c)}]$, is $O(1)$. (See Exercise 8.24 for more precise statements.) This motivates finding a characterization of functions with small total influence. Indeed, finding such a characterization is a perfectly natural question even for not-necessarily-monotone boolean-valued functions $f \in L^2(\Omega^n,\pi^{\otimes n})$.

For the usual uniform distribution on $\{-1,1\}^n$, Friedgut’s Junta Theorem from Chapter 9.6 provides a very good characterization: $f : \{-1,1\}^n \to \{-1,1\}$ can only have $O(1)$ total influence if it’s (close to) an $O(1)$-junta. By the version of Friedgut’s Junta Theorem for general product spaces (Section 3), the same holds for boolean-valued $f \in L^2(\{-1,1\}^n, \pi_p^{\otimes n})$ so long as $p$ is not too close to $0$ or to $1$. However for $p$ as small as $1/n^{\Theta(1)}$, the “junta”-size promised by Friedgut’s Junta Theorem may be larger than $n$. (Cf. the breakdown of Friedgut and Kalai’s sharp threshold result Theorem 28 for $p \leq 1/n^{\Theta(1)}$.) This is a shame, as many natural graph properties for which we’d like to show a sharp threshold — e.g., (non-)$3$-colourability — have $p = 1/n^{\Theta(1)}$. At a technical level, the reason for the breakdown for very small $p$ is the dependence on the “$\lambda$” parameter in the General Hypercontractivity Theorem. But there’s a more fundamental reason for its failure, as suggested by the example at the end of Section 3: Friedgut’s Junta Theorem simply isn’t true for such small $p$. Let’s give some examples:

- The logical OR function $\mathrm{OR}_n : \{-1,1\}^n \to \{-1,1\}$ has critical probability $p_c \sim \frac{\ln 2}{n}$, and its total influence at this probability is $\mathbf{I}[\mathrm{OR}_n^{(p_c)}] \sim 2\ln2$, a small constant. Yet it’s easy to see that under the $p_c$-biased distribution, $\mathrm{OR}_n$ is not even, say, $.1$-close to any junta on $o(n)$ coordinates. (I.e., for every $o(n)$-junta $h$, $\mathop{\bf Pr}_{{\boldsymbol{x}} \sim \pi_{p_c}^{\otimes n}}[f({\boldsymbol{x}}) \neq h({\boldsymbol{x}})] > .1$.)
- As another example, consider the function $f : \{-1,1\}^n \to \{-1,1\}$ which is $\mathsf{True}$ ($-1$) if and only if there exists a “run” of three consecutive $-1$’s in its input. (We allow runs to “wrap around”, thus making $f$ a transitive-symmetric function.) It’s not hard to show that the critical probability for this $f$ satisfies $p_c = \Theta(1/n^{1/3})$. Furthermore, since $f$ is a computable by a DNF of width $3$, it is an exercise to show that $\mathbf{I}[f^{(p_c)}] \leq 12$, a small constant. But again, this $f$ is not close to any $o(n)$-junta under the $p_c$-biased distribution. A similar example is $\mathrm{Clique}_3 : \{\mathsf{True}, \mathsf{False}\}^{\binom{v}{2}} \to \{\mathsf{True}, \mathsf{False}\}$, the graph property of containing a triangle.

We see from these examples that for $p$ very small, we can’t hope to show that low-influence functions are close to juntas. However these counterexample functions still have low complexity in a weaker sense — they are computable by narrow DNFs. Indeed, Friedgut **[Fri99]** suggests this as a characterization:

Friedgut’s ConjectureThere is a function $w : {\mathbb R}^+ \times (0,1) \to {\mathbb R}^+$ such that the following holds: If $f : \{\mathsf{True}, \mathsf{False}\}^n \to \{\mathsf{True}, \mathsf{False}\}$ is a monotone function, $0 < p \leq 1/2$, and $\mathbf{I}[f^{(p)}] \leq K$, then $f$ is $\epsilon$-close under $\pi_p^{\otimes n}$ to a monotone DNF of width at most $w(K,\epsilon)$.

The assumption of monotonicity is essential in this conjecture; see the exercises.

Short of proving his conjecture, Friedgut managed to show:

Friedgut’s Sharp Threshold TheoremThe above conjecture holds when $f$ is a graph property.

This gives a very good characterization of monotone graph properties with low total influence, one that works no matter how small $p$ is. Friedgut also extended his result to monotone hypergraph properties; this was sufficient for him to show that several interesting hypergraph (or hypergraph-like) properties have sharp thresholds — for example, the property of a random $3$-uniform hypergraph containing a perfect matching, or the property of a random width-$3$ DNF formula being a tautology. (Interestingly, for neither of these properties do we know precisely where the critical probability $p_c$ is; nevertheless, we know there is a sharp threshold around it.) Roughly speaking one needs to show that at the critical probability, these properties can’t be well-approximated by narrow DNFs because they are almost surely not determined just by “local” information about the (hyper)graph. This kind of deduction takes some effort in random graph theory and we won’t discuss it further here (beyond one exercise at the end of the chapter); for a survey, see **[Fri05]**.

Friedgut’s proof is rather long and it relies heavily on the function being a graph or hypergraph property. Following Friedgut’s work, Bourgain **[Bou99]** gave a shorter proof of an alternative characterization. Bourgain’s characterization is not as strong as Friedgut’s for monotone graph properties; however it has the advantage that it works for low-influence functions on *any* product probability space. (In particular, there is no monotonicity assumption since the domain need not be $\{\mathsf{True}, \mathsf{False}\}^n$.) We first make a quick definition and then state Bourgain’s theorem.

Definition 44Let $f \in L^2(\Omega^n, \pi^{\otimes n})$ be $\{-1,1\}$-valued. For $T \subseteq [n]$, $y \in \Omega^T$, and $\tau > 0$, we say that the restriction $y_T$ is a$\tau$-boosterif $f^{\subseteq T}(y) \geq \mathop{\bf E}[f] + \tau$. (Recall that $f^{\subseteq T}(y) = \mathop{\bf E}[f_{\overline{T} \mid y}]$.) In case $\tau < 0$ we say that $y_T$ is a $\tau$-booster if $f^{\subseteq T}(y) \leq \mathop{\bf E}[f] – |\tau|$.

Bourgain’s Sharp Threshold TheoremLet $f \in L^2(\Omega^n, \pi^{\otimes n})$ be $\{-1,1\}$-valued with $\mathbf{I}[f] \leq K$. Assume $\mathop{\bf Var}[f] \geq .01$. Then there is some $\tau$ (either positive or negative) with $|\tau| \geq \exp(-O(K^2))$ such that \[ \mathop{\bf Pr}_{{\boldsymbol{x}} \sim \pi^{\otimes n}}[\exists T \subseteq [n], |T| \leq O(K) \text{ such that ${\boldsymbol{x}}_T$ is a } \tau\text{-booster}] \geq |\tau|. \]

Thinking of $K$ as an absolute constant, the theorem says that for a typical input string $x$, there is a large chance that it contains a constant-sized substring which is an $\Omega(1)$-booster for $f$. In the particular case of monotone $f \in L^2(\{\mathsf{True}, \mathsf{False}\}^n, \pi_p^{\otimes n})$ with $p$ small, it’s not hard to deduce (exercise) that in fact there exists a $T$ with $|T| \leq O(K)$ such that restricting all coordinates in $T$ to be $\mathsf{True}$ increases $\mathop{\bf Pr}_{\pi_p^{\otimes n}}[f = \mathsf{True}]$ by $\exp(-O(K^2))$. This is a qualitatively weaker conclusion then what you get from Friedgut’s theorem when $f$ is a graph property with $\mathbf{I}[f] \leq O(1)$ — in that case, by taking $T$ to be any of the width-$O(1)$ terms in the approximating DNF one can increase $\mathop{\bf Pr}_{\pi_p^{\otimes n}}[f = \mathsf{True}]$ not just by $\Omega(1)$ but up to almost $1$. Nevertheless, Bourgain’s theorem apparently suffices to deduce any of the sharp thresholds results obtainable from Friedgut’s theorem **[Fri05]**. For a very high-level sketch of how Bourgain’s theorem would apply in the case of $3$-colourability of random graphs, see the exercises.

The last part of this section will be devoted to proving Bourgain’s Sharp Threshold Theorem. Before doing this, we add one more remark. Hatami **[Hat12]** has significantly generalized Bourgain’s work, establishing the following characterization of boolean-valued functions with low total influence:

Hatami’s TheoremLet $f \in L^2(\Omega^n, \pi^{\otimes n})$ be a $\{-1,1\}$-valued function with $\mathbf{I}[f] \leq K$. Then for every $\epsilon > 0$, the function $f$ is $\epsilon$-close (under $\pi^{\otimes n}$) to an $\exp(O(K^3/\epsilon^3))$-“pseudo-junta” $h : \Omega^n \to \{-1,1\}$.

The term “pseudo-junta” is defined in the exercises. A $K$-pseudo-junta $h$ has the property that $\mathbf{I}[h] \leq 4K$; thus Hatami’s Theorem shows that having $O(1)$ total influence is essentially equivalent to being an $O(1)$-pseudo-junta. A downside of the result, however, is that being a $K$-pseudo-junta is not a “syntactic” property; it depends on the probability distribution $\pi^{\otimes n}$.

Let’s now turn to proving Bourgain’s Sharp Threshold Theorem. In fact, Bourgain proved the theorem as a corollary of the following main result:

Theorem 45Let $(\Omega, \pi)$ be a finite probability space and let $f : \Omega^n \to \{-1,1\}$. Let $0 < \epsilon < 1/2$ and write $k = \mathbf{I}[f]/\epsilon$. Then for each $x \in \Omega^n$ it’s possible to define a set of “notable coordinates” $J_x \subseteq [n]$ satisfying $|J_x| \leq \exp(O(k))$ such that \[ \mathop{\bf E}_{{\boldsymbol{x}} \sim \pi^{\otimes n}}\left[\sum_{S \not \in \mathcal{F}_{{\boldsymbol{x}}}} f^{=S}({\boldsymbol{x}})^2\right] \leq 2\epsilon. \] Here $ \mathcal{F}_x = \{S : S \subseteq J_x, |S| \leq k\} $, a collection always satisfying $ |\mathcal{F}_x| \leq \exp(O(k^2)) $.

You may notice that this theorem looks extremely similar to Friedgut’s Junta Theorem from Chapter 9.6 (and the $\exp(-O(\mathbf{I}[f]^2))$ quantity in Bourgain’s Sharp Threshold Theorem looks similar to the Fourier coefficient lower bound in Corollary 9.32). Indeed, the only difference between Theorem 45 and Friedgut’s Junta Theorem is that in the latter, the “notable coordinates” $J$ can be “named in advance” — they’re simply the coordinates $j$ with $\mathbf{Inf}_j[f] = \sum_{S \ni j} \widehat{f}(S)^2$ large. By contrast, in Theorem 45 the notable coordinates depend on the input $x$. As we will see in the proof, they are precisely the coordinates $j$ such that $\sum_{S \ni j} f^{=S}(x)^2$ is large. Of course, in the setting of $f : \{-1,1\}^n \to \{-1,1\}$ we have $f^{=S}(x)^2 = \widehat{f}(S)^2$ for all $x$, so the two definitions coincide. But in the general setting of $f \in L^2(\Omega^n, \pi^{\otimes n})$ it makes sense that we can’t name the notable coordinates in advance and rather have to “wait until $x$ is chosen”. E.g., for the $\mathrm{OR}_n$ function as in Examples 43, there are no notable coordinates to be named in advance, but once $x$ is chosen the few coordinates on which $x$ takes the value $\mathsf{True}$ (if any exist) will be the notable ones.

The proof of Theorem 45 mainly consists of adding the randomization/symmetrization technique to the proof of Friedgut’s Junta Theorem (more precisely, Theorem 9.28) to avoid dependence on the minimum probability of $\pi$. This randomization/symmetrization is applied to what are essentially the key inequalities in that proof: \[ \|\mathrm{T}_{\frac{1}{\sqrt{3}}} \mathrm{L}_i f\|_2^2 \leq \|\mathrm{L}_i f\|_{4/3}^2 = \|\mathrm{L}_i f\|_{4/3}^{2/3} \cdot \|\mathrm{L}_i f\|_{4/3}^{4/3} \leq \|\mathrm{L}_i f\|_{4/3}^{2/3} \cdot \mathbf{Inf}_i[f]. \] (The last inequality here is Exercise 8.9½.) The overall proof needs one more minor twist: since we work on a “per-$x$” basis and not in expectation, it’s possible that the set of notable coordinates can be improbably large. (Think again about the example of $\mathrm{OR}_n$; for ${\boldsymbol{x}} \sim \pi_{1/n}^{\otimes n}$ we expect only a constant number of coordinates of ${\boldsymbol{x}}$ to be $\mathsf{True}$, but it’s not always uniformly bounded.) This is combated using the principle that low-degree functions are “reasonable” (together with randomization/symmetrization).

*Proof of Theorem 45:* By the simple “Markov argument” (see Proposition 3.2) we have \[ \mathop{\bf E}_{{\boldsymbol{x}} \sim \pi^{\otimes n}}\left[\sum_{|S| > k} f^{=S}({\boldsymbol{x}})^2\right] = \sum_{|S| > k} \|f^{=S}\|_2^2 \leq \mathbf{I}[f] / k = \epsilon. \] Thus it suffices to define the sets $J_x$ so that \begin{equation} \label{eqn:bourgain-final} \mathop{\bf E}_{{\boldsymbol{x}} \sim \pi^{\otimes n}}\left[\sum_{|S| \leq k,\ S \not \subseteq J_{{\boldsymbol{x}}}} f^{=S}({\boldsymbol{x}})^2\right] \leq \epsilon. \end{equation} We’ll first define “notable coordinate” sets $J’_x \subseteq [n]$ which almost do the trick: \[ J'_x = \left\{ j \in [n] : \sum_{S \ni j} f^{=S}(x)^2 \geq \tau\right\}, \quad \tau = c^{-k}. \] (where $c > 1$ is a universal constant). Using this definition, the main effort of the proof will be to show \begin{equation} \label{eqn:bourgain-main-goal} \mathop{\bf E}_{{\boldsymbol{x}} \sim \pi^{\otimes n}}\left[\sum_{|S| \leq k,\ S \not \subseteq J'_{{\boldsymbol{x}}}} f^{=S}({\boldsymbol{x}})^2\right] \leq \epsilon/2. \end{equation} This looks better than \eqref{eqn:bourgain-final}; the only problem is that the sets $J’_x$ don’t always satisfy $|J’_x| \leq \exp(O(k))$ as needed. However “in expectation” $|J’_x|$ ought not be much larger than $1/\tau = c^k$. Thus we introduce the event \[ \text{``$J'_{{\boldsymbol{x}}}$ is too big''} \quad\iff\quad |J'_{{\boldsymbol{x}}}| \geq C^k \] (where $C > c$ is another universal constant) and define \[ J_x = \begin{cases} J'_x & \text{if $J'_x$ is not too big,} \\ \emptyset & \text{if $J'_x$ is too big.} \end{cases} \] The last part of the proof will be to show that \begin{equation} \label{eqn:bourgain-too-big} \mathop{\bf E}_{{\boldsymbol{x}} \sim \pi^{\otimes n}}\left[\boldsymbol{1}[\text{$J'_{{\boldsymbol{x}}}$ is too big}] \cdot \sum_{0 < |S| \leq k} f^{=S}({\boldsymbol{x}})^2\right] \leq \epsilon/2. \end{equation} Together, \eqref{eqn:bourgain-too-big} and \eqref{eqn:bourgain-main-goal} establish \eqref{eqn:bourgain-final}. We will first prove \eqref{eqn:bourgain-main-goal} and then prove \eqref{eqn:bourgain-too-big}. As a small aside, we’ll see that for both inequalities we could obtain a bound much less than $\epsilon/2$ if desired. To prove \eqref{eqn:bourgain-main-goal}, we mimic the proof of Theorem 9.28 but add in randomization/symmetrization. The key step is encapsulated in the following lemma. Note that the lemma also holds with the more natural definition $g = \mathrm{L}_i f$; the additional $\mathrm{T}_{\frac25}$ is to facilitate future “un-randomization/symmetrization”.

Lemma 46Fix $x \in \Omega^n$ and $i \in J’_x$. Then writing $g = \mathrm{T}_{\frac25} \mathrm{L}_i f$ we have \[ \|\mathrm{T}_{\frac{1}{\sqrt{3}}} \widetilde{g}_{ \mid x}\|_2^2 \leq \tau^{1/3} \|\widetilde{g}_{ \mid x}\|_{4/3}^{4/3}. \]

*Proof:* Here $\widetilde{g}$ is the randomization/symmetrization of $g$, so $\widetilde{g}_{ \mid x} = \widetilde{g}_{ \mid x}(r)$ is a function on the uniform-distribution hypercube. Applying the basic $(4/3,2)$-Hypercontractivity Theorem we have \[ \|\mathrm{T}_{\frac{1}{\sqrt{3}}} \widetilde{g}_{ \mid x}\|_2^2 \leq \|\widetilde{g}_{ \mid x}\|_{4/3}^2 = (\|\widetilde{g}_{ \mid x}\|_{4/3}^2)^{1/3} \cdot \|\widetilde{g}_{ \mid x}\|_{4/3}^{4/3} \leq (\|\widetilde{g}_{ \mid x}\|_{2}^2)^{1/3} \cdot \|\widetilde{g}_{ \mid x}\|_{4/3}^{4/3}. \] But by the usual Parseval Theorem, \[ \|\widetilde{g}_{ \mid x}\|_{2}^2 = \sum_{S \subseteq [n]} g^{=S}(x)^2 = \sum_{S \ni i} (2/5)^{2|S|} f^{=S}(x)^2 \leq \sum_{S \ni i} f^{=S}(x)^2 \leq \tau, \] the last inequality due to the assumption that $i \in J’_x$. $\Box$

We now establish \eqref{eqn:bourgain-main-goal}: \begin{align} \mathop{\bf E}_{{\boldsymbol{x}}}\left[\sum_{|S| \leq k,\ S \not \subseteq J'_{{\boldsymbol{x}}}} f^{=S}({\boldsymbol{x}})^2\right] &\leq (5\sqrt{3}/2)^{2k} \cdot \mathop{\bf E}_{{\boldsymbol{x}}}\left[\sum_{S \not \subseteq J'_{{\boldsymbol{x}}}} (\mathrm{T}_{\frac{2}{5\sqrt{3}}} f^{=S})({\boldsymbol{x}})^2\right] \nonumber\\ &\leq 20^{k} \cdot \mathop{\bf E}_{{\boldsymbol{x}}}\left[\sum_{i \not \in J'_{{\boldsymbol{x}}}} \sum_{S \ni j} (\mathrm{T}_{\frac{2}{5\sqrt{3}}} f^{=S})({\boldsymbol{x}})^2\right] \nonumber\\ &= 20^k \cdot \mathop{\bf E}_{{\boldsymbol{x}}}\left[\sum_{i \not \in J'_{{\boldsymbol{x}}}} \|\mathrm{T}_{\frac{1}{\sqrt{3}}} \widetilde{g^i}_{ \mid {\boldsymbol{x}}}\|_2^2\right] \tag{for $g^i = \mathrm{T}_{\frac25} \mathrm{L}_i f$}\\ &\leq 20^k \tau^{1/3} \cdot \mathop{\bf E}_{{\boldsymbol{x}}}\left[\sum_{i \not \in J'_{{\boldsymbol{x}}}} \| \widetilde{g^i}_{ \mid {\boldsymbol{x}}}\|_{4/3}^{4/3}\right] \tag{Lemma 46}\\ &\leq 20^k \tau^{1/3} \cdot \sum_{i=1}^n \|\mathrm{L}_i f\|_{4/3}^{4/3} \tag{Theorem 34}\\ &\leq 20^k \tau^{1/3} \cdot \sum_{i=1}^n \mathbf{Inf}_i[f] \tag{Exercise 8.9½}\\ &= 20^k \tau^{1/3} \cdot \mathbf{I}[f] = (20c^{-1/3})^k k \epsilon \leq \epsilon/2, \nonumber \end{align} the last inequality because $(20 c^{-1/3})^{k}k \leq 1/2$ for all $k \geq 0$ once $c$ is a large enough constant.

The last task in the proof is to establish \eqref{eqn:bourgain-too-big}. Using Cauchy–Schwarz, \begin{multline} \mathop{\bf E}_{{\boldsymbol{x}} \sim \pi^{\otimes n}}\left[\boldsymbol{1}[\text{$J'_{{\boldsymbol{x}}}$ is too big}] \cdot \sum_{0 < |S| \leq k} f^{=S}({\boldsymbol{x}})^2\right] \\ \leq \sqrt{\mathop{\bf E}_{{\boldsymbol{x}}}\left[\boldsymbol{1}[\text{$J'_{{\boldsymbol{x}}}$ is too big}]^2\right]}\sqrt{\mathop{\bf E}_{{\boldsymbol{x}}}\left[\Bigl(\sum_{0 < |S| \leq k} f^{=S}({\boldsymbol{x}})^2\Bigr)^2\right]}. \label{eqn:bourgain-too-big1} \end{multline} For the first factor on the right of \eqref{eqn:bourgain-too-big1} we use Markov’s inequality: \begin{multline} \mathop{\bf E}_{{\boldsymbol{x}}}\left[\boldsymbol{1}[\text{$J'_{{\boldsymbol{x}}}$ is too big}]^2\right] = \mathop{\bf Pr}_{{\boldsymbol{x}}}[\text{$J'_{{\boldsymbol{x}}}$ is too big}] = \mathop{\bf Pr}_{{\boldsymbol{x}}}[|J'_{{\boldsymbol{x}}}| \geq C^k]\\ \leq C^{-k} \mathop{\bf E}_{{\boldsymbol{x}}}[|J'_{{\boldsymbol{x}}}|] \leq C^{-k} \mathop{\bf E}_{{\boldsymbol{x}}}\left[\Bigl(\sum_{i=1}^n \sum_{S\ni i} f^{=S}({\boldsymbol{x}})^2\Bigr)/\tau\right] = C^{-k} c^k \cdot \mathbf{I}[f]. \label{eqn:bourgain-too-big2} \end{multline} As for the second factor on the right of \eqref{eqn:bourgain-too-big1}, let’s write $h = \mathrm{T}_{\frac25}(f – f^{=\emptyset})$. (We are being slightly finicky about $f^{=\emptyset}$ just in case it’s very large.) Then \begin{align} \mathop{\bf E}_{{\boldsymbol{x}}}\left[\Bigl(\sum_{0 < |S| \leq k} f^{=S}({\boldsymbol{x}})^2\Bigr)^2\right] &\leq (5/2)^{4k} \cdot \mathop{\bf E}_{{\boldsymbol{x}}}\left[\left(\sum_{S \neq \emptyset} (\mathrm{T}_{\frac25} f^{=S})({\boldsymbol{x}})^2\right)^2\right] \nonumber\\ &= 40^k \cdot \mathop{\bf E}_{{\boldsymbol{x}}}\left[\|\widetilde{h}_{ \mid {\boldsymbol{x}}}\|_2^4\right]\nonumber\\ &\leq 40^k \cdot \mathop{\bf E}_{{\boldsymbol{x}}}\left[\|\widetilde{h}_{ \mid {\boldsymbol{x}}}\|_4^4\right]\nonumber\\ &\leq 40^k \cdot \|f – f^{=\emptyset}\|_4^4\tag{Theorem 34} \\ &\leq 40^k \cdot 2^2 \mathop{\bf E}_{{\boldsymbol{x}}}[(f - f^{=\emptyset})^2] \tag{since $|f – f^{=\emptyset}| \leq 2$ always}\\ &= 4 \cdot 40^k \cdot \mathop{\bf Var}[f] \leq 4 \cdot 40^k \cdot \mathbf{I}[f]. \label{eqn:bourgain-too-big3} \end{align} Substituting \eqref{eqn:bourgain-too-big2} and \eqref{eqn:bourgain-too-big3} into \eqref{eqn:bourgain-too-big1} gives \begin{multline*} \mathop{\bf E}_{{\boldsymbol{x}} \sim \pi^{\otimes n}}\left[\boldsymbol{1}[\text{$J'_{{\boldsymbol{x}}}$ is too big}] \cdot \sum_{0 < |S| \leq k} f^{=S}({\boldsymbol{x}})^2\right] \\ \leq \sqrt{C^{-k} c^k \cdot 4 \cdot 40^k} \cdot \mathbf{I}[f] = 2(\tfrac{40c}{C})^{k/2} k \epsilon \leq \epsilon/2, \end{multline*} the last inequality again holding for all $k \geq 0$ once $C$ is chosen large enough compared to $c$. $\Box$

We end this chapter by deducing Bourgain’s Sharp Threshold Theorem from Theorem 45.

*Proof of Bourgain’s Sharp Threshold Theorem:* We take $\epsilon = .001$ in Theorem 45 and obtain the associated collections of subsets $\mathcal{F}_x$, where each $|\mathcal{F}_x| \leq \exp(O(K^2))$ and each $S \in \mathcal{F}_x$ satisfies $|S| \leq O(K)$. Using the fact that $f^{=\emptyset}(x)^2 = 1 – \mathop{\bf Var}[f] \leq .99$ for each $x$ we get \[ \mathop{\bf E}_{{\boldsymbol{x}} \sim \pi^{\otimes n}} \left[\sum_{S \in \mathcal{F}_{\boldsymbol{x}} \setminus \{\emptyset\}} f^{=S}({\boldsymbol{x}})^2\right] \geq 1 – 2\epsilon – .99 = .008. \] We always have $|\mathcal{F}_x \setminus \{\emptyset\}| \leq \exp(O(K^2))$, and there’s also no harm in assuming $|\mathcal{F}_x \setminus \{\emptyset\}| > 0$. It follows that \[ \mathop{\bf E}_{{\boldsymbol{x}} \sim \pi^{\otimes n}} \left[\max_{S \in \mathcal{F}_{\boldsymbol{x}} \setminus \{\emptyset\}} \{f^{=S}({\boldsymbol{x}})^2\}\right] \geq \frac{.008}{\exp(O(K^2))} = \exp(-O(K^2)). \] Thus for each $x$ we can define a set $S_x$ with $0 < |S_x| \leq O(K)$ such that \begin{equation} \label{eqn:bourgain-calculation} \mathop{\bf E}_{{\boldsymbol{x}} \sim \pi^{\otimes n}} \left[f^{=S_{\boldsymbol{x}}}({\boldsymbol{x}})^2\right] \geq \exp(-O(K^2)). \end{equation} By Exercise 8.17 we have $|f^{=S_x}(x)| \leq 2^{|S_x|} \leq 2^{O(K)}$ and hence $f^{=S_{\boldsymbol{x}}}({\boldsymbol{x}})^2 \leq \exp(O(K))$ always. It follows from \eqref{eqn:bourgain-calculation} that we must have \[ \mathop{\bf Pr}_{{\boldsymbol{x}} \sim \pi^{\otimes n}} \left[f^{=S_{\boldsymbol{x}}}({\boldsymbol{x}})^2 \geq \exp(-O(K^2))\right] \geq \exp(-O(K^2)). \] We will complete the proof by showing that whenever $f^{=S_{\boldsymbol{x}}}({\boldsymbol{x}})^2 \geq \exp(-O(K^2))$ occurs, there exists $T \subseteq S_{{\boldsymbol{x}}}$ such that ${\boldsymbol{x}}_T$ is a $\pm \exp(-O(K^2))$-booster for $f$. Thus we either have a $+\exp(-O(K^2))$-booster with probability at least $\tfrac{1}{2} \exp(-O(K^2))$, or a $-\exp(-O(K^2))$ with probability at least $\tfrac{1}{2} \exp(-O(K^2))$; either way, the proof will be complete.

Assume then that $f^{=S_x}(x)^2 \geq \exp(-O(K^2))$; equivalently, \[ |f^{=S_x}(x)| \geq \exp(-O(K^2)). \] Let’s now work with $g = f – \mathop{\bf E}[f]$. Of course $g^{=T} = f^{=T}$ for all $T \neq \emptyset$; since $S_x \neq \emptyset$ the above inequality tells us that $|g^{=S_x}(x)| \geq \exp(-O(K^2))$. Recall the formula \[ g^{=S_x}(x) = \sum_{\emptyset \neq T \subseteq S_x} (-1)^{|S_x|-|T|}g^{\subseteq T}(x); \] we dropped the $T = \emptyset$ term since it’s $0$. As there are only $2^{|S_x|}-1 = \exp(O(K))$ terms in the above sum, we deduce there must exist some $T \subseteq S_x$ with $0 < |T| \leq O(K)$ such that \[ |g^{\subseteq T}(x)| \geq \exp(-O(K^2))/\exp(O(K)) = \exp(-O(K^2)). \] But $g^{\subseteq T} = f^{\subseteq T} – \mathop{\bf E}[f]$, so the above gives us $|f^{\subseteq T}(x) – \mathop{\bf E}[f]| \geq \exp(-O(K^2))$. This precisely says that $x_T$ is a $\pm \exp(-O(K^2))$-booster, as desired. $\Box$

For a relaxation of the assumption $\mathop{\bf Var}[f] \geq .01$ in this theorem, see the exercises.

There may be a small typo at the end of the third sentence of the first paragraph:

“.. is $O(1)$” maybe should be “… is not $O(1)$.”

Right, thanks!

maybe small typo in 2nd line of establishing (10.24) on p. 307 of book:

$\sum_{i \not\in J’_x} \sum_{S \ni j}$ maybe should be $\sum_{i \not\in J’_x} \sum_{S \ni i}$.

Great catch, thanks!

Maybe small typo in 2nd line of analysis of “second factor on the right of (10.26)” on p. 308 of book:

$= 40^k$ maybe should be $\leq 40^k$.

Yep, thanks!