# §9.2: Small subsets of the hypercube are noise-sensitive

An immediate consequence of the Bonami Lemma is that for any $f : \{-1,1\}^n \to {\mathbb R}$ and $k \in {\mathbb N}$, $$\label{eqn:2-4-hypercon-k} \|\mathrm{T}_{1/\sqrt{3}} f^{=k}\|_4 = \tfrac{1}{\sqrt{3}^k} \|f^{=k}\|_4 \leq \|f^{=k}\|_2.$$

This is a special case of the $(2,4)$-Hypercontractivity Theorem (whose name will be explained shortly), which says that the assumption of degree-$k$ homogeneity is not necessary:

$\mathbf{(2,4)}$-Hypercontractivity Theorem Let $f : \{-1,1\}^n \to {\mathbb R}$. Then $\|\mathrm{T}_{1/\sqrt{3}} f\|_4 \leq \|f\|_2.$
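As a sanity check, the theorem can be verified exactly for small $n$ by computing both norms as averages over the whole hypercube, applying $\mathrm{T}_{1/\sqrt{3}}$ via the Fourier expansion. The short script below is illustrative and not part of the text; the choices $n = 4$ and Gaussian-random test functions are arbitrary.

```python
# Exact check of ||T_{1/sqrt(3)} f||_4 <= ||f||_2 on random f : {-1,1}^4 -> R.
import itertools, math, random

n = 4
cube = list(itertools.product([-1, 1], repeat=n))
subsets = list(itertools.chain.from_iterable(
    itertools.combinations(range(n), k) for k in range(n + 1)))

def fourier_coeffs(f):
    # hat{f}(S) = E_x[f(x) chi_S(x)], where chi_S(x) = prod_{i in S} x_i
    return {S: sum(f[x] * math.prod(x[i] for i in S) for x in cube) / 2**n
            for S in subsets}

def T_rho(f, rho):
    # (T_rho f)(x) = sum_S rho^{|S|} hat{f}(S) chi_S(x)
    c = fourier_coeffs(f)
    return {x: sum(rho**len(S) * cS * math.prod(x[i] for i in S)
                   for S, cS in c.items()) for x in cube}

def norm(f, p):
    # ||f||_p = E[|f|^p]^{1/p} under the uniform distribution on the cube
    return (sum(abs(f[x])**p for x in cube) / 2**n) ** (1 / p)

random.seed(0)
for _ in range(100):
    f = {x: random.gauss(0, 1) for x in cube}
    Tf = T_rho(f, 1 / math.sqrt(3))
    assert norm(Tf, 4) <= norm(f, 2) + 1e-12
```

Since both sides are finite averages, the inequality is checked exactly (up to floating-point rounding) rather than by sampling.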

It almost looks as though you could prove this theorem simply by summing \eqref{eqn:2-4-hypercon-k} over $k$. In fact that proof strategy can be made to work given a few extra tricks (see the exercises), but it’s just as easy to repeat the induction technique used for the Bonami Lemma.

Proof of the $(2,4)$-Hypercontractivity Theorem: We’ll prove $\mathop{\bf E}[\mathrm{T}_{1/\sqrt{3}} f({\boldsymbol{x}})^4] \leq \mathop{\bf E}[f({\boldsymbol{x}})^2]^2$ using the same induction as in the Bonami Lemma. Retaining the notation $\boldsymbol{d}$ and $\boldsymbol{e}$, and using the shorthand $\mathrm{T} = \mathrm{T}_{1/\sqrt{3}}$, we have $\mathrm{T} \boldsymbol{f} = {\boldsymbol{x}}_n \cdot \tfrac{1}{\sqrt{3}} \mathrm{T} \boldsymbol{d} + \mathrm{T} \boldsymbol{e}.$ Similar computations to those in the Bonami Lemma proof yield \begin{align*} \mathop{\bf E}[(\mathrm{T} \boldsymbol{f})^4] &= \bigl(\tfrac{1}{\sqrt{3}}\bigr)^4\mathop{\bf E}[(\mathrm{T} \boldsymbol{d})^4] + 6 \bigl(\tfrac{1}{\sqrt{3}}\bigr)^2 \mathop{\bf E}[(\mathrm{T} \boldsymbol{d})^2(\mathrm{T} \boldsymbol{e})^2] + \mathop{\bf E}[(\mathrm{T} \boldsymbol{e})^4] \\ &\leq \mathop{\bf E}[(\mathrm{T} \boldsymbol{d})^4] + 2 \mathop{\bf E}[(\mathrm{T} \boldsymbol{d})^2(\mathrm{T} \boldsymbol{e})^2] + \mathop{\bf E}[(\mathrm{T} \boldsymbol{e})^4] \\ &\leq \mathop{\bf E}[(\mathrm{T} \boldsymbol{d})^4] + 2 \sqrt{\mathop{\bf E}[(\mathrm{T} \boldsymbol{d})^4]}\sqrt{\mathop{\bf E}[(\mathrm{T} \boldsymbol{e})^4]} + \mathop{\bf E}[(\mathrm{T} \boldsymbol{e})^4] \\ &\leq \mathop{\bf E}[\boldsymbol{d}^2]^2 + 2 \mathop{\bf E}[\boldsymbol{d}^2]\mathop{\bf E}[\boldsymbol{e}^2] + \mathop{\bf E}[\boldsymbol{e}^2]^2 \\ &= \bigl(\mathop{\bf E}[\boldsymbol{d}^2] + \mathop{\bf E}[\boldsymbol{e}^2]\bigr)^2 = \mathop{\bf E}[\boldsymbol{f}^2]^2, \end{align*} where the second inequality is Cauchy–Schwarz, the third is induction, and the final equality is a simple computation analogous to equation (2) in the proof of the Bonami Lemma. $\Box$

The name “hypercontractivity” in this theorem describes the fact that not only is $\mathrm{T}_{1/\sqrt{3}}$ a “contraction” on $L^2(\{-1,1\}^n)$ — meaning $\|\mathrm{T}_{1/\sqrt{3}} f\|_2 \leq \|f\|_2$ for all $f$ — it’s even a contraction when viewed as an operator from $L^2(\{-1,1\}^n)$ to $L^4(\{-1,1\}^n)$. You should think of hypercontractivity theorems as quantifying the extent to which $\mathrm{T}_\rho$ is a “smoothing”, or “reasonable-izing” operator.

Unfortunately the quantity $\|\mathrm{T}_{1/\sqrt{3}}f\|_4$ in the $(2,4)$-Hypercontractivity Theorem does not have an obvious combinatorial meaning. On the other hand, the quantity $\|\mathrm{T}_{1/\sqrt{3}}f\|_2 = \sqrt{\langle \mathrm{T}_{1/\sqrt{3}}f, \mathrm{T}_{1/\sqrt{3}}f \rangle} = \sqrt{\langle f, \mathrm{T}_{1/\sqrt{3}} \mathrm{T}_{1/\sqrt{3}} f\rangle} = \sqrt{\mathbf{Stab}_{1/3}[f]},$ does have a nice combinatorial meaning. And we can make this quantity appear in the Hypercontractivity Theorem via a simple trick from analysis, just using the fact that $\mathrm{T}_{1/\sqrt{3}}$ is a self-adjoint operator. We “flip the norms across $2$” using Hölder’s inequality:

$\mathbf{(4/3,2)}$-Hypercontractivity Theorem Let $f : \{-1,1\}^n \to {\mathbb R}$. Then $\|\mathrm{T}_{1/\sqrt{3}} f\|_2 \leq \|f\|_{4/3};$ i.e., $$\label{eqn:4/3-2} \mathbf{Stab}_{1/3}[f] \leq \|f\|_{4/3}^2.$$

Proof: Writing $\mathrm{T} = \mathrm{T}_{1/\sqrt{3}}$ for brevity we have $$\label{eqn:hc-holder} \|\mathrm{T} f\|_2^2 = \langle \mathrm{T} f, \mathrm{T} f \rangle = \langle f, \mathrm{T} \mathrm{T} f \rangle \leq \|f\|_{4/3} \|\mathrm{T} \mathrm{T} f\|_4 \leq \|f\|_{4/3} \|\mathrm{T} f\|_2$$ by Hölder’s inequality and the $(2,4)$-Hypercontractivity Theorem. Dividing through by $\|\mathrm{T} f\|_2$ (which we may assume is nonzero) completes the proof. $\Box$
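The flipped-norm statement admits the same kind of exact small-$n$ verification. The sketch below (again with the arbitrary choices $n = 4$ and random Gaussian-valued test functions) checks $\|\mathrm{T}_{1/\sqrt{3}} f\|_2 \leq \|f\|_{4/3}$; it is illustrative only.

```python
# Exact check of ||T_{1/sqrt(3)} f||_2 <= ||f||_{4/3} on random f : {-1,1}^4 -> R.
import itertools, math, random

n = 4
cube = list(itertools.product([-1, 1], repeat=n))
subsets = list(itertools.chain.from_iterable(
    itertools.combinations(range(n), k) for k in range(n + 1)))

def T_rho(f, rho):
    # Apply T_rho via the Fourier expansion: attenuate hat{f}(S) by rho^{|S|}.
    coeffs = {S: sum(f[x] * math.prod(x[i] for i in S) for x in cube) / 2**n
              for S in subsets}
    return {x: sum(rho**len(S) * c * math.prod(x[i] for i in S)
                   for S, c in coeffs.items()) for x in cube}

def norm(f, p):
    return (sum(abs(f[x])**p for x in cube) / 2**n) ** (1 / p)

random.seed(4)
for _ in range(100):
    f = {x: random.gauss(0, 1) for x in cube}
    assert norm(T_rho(f, 1 / math.sqrt(3)), 2) <= norm(f, 4/3) + 1e-12
```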

In the inequality \eqref{eqn:4/3-2} the left-hand side is a natural quantity. The right-hand side is just $1$ when $f : \{-1,1\}^n \to \{-1,1\}$, which is not very interesting. But if we instead look at $f : \{-1,1\}^n \to \{0,1\}$ we get something very interesting:

Corollary 8 Let $A \subseteq \{-1,1\}^n$ have volume $\alpha$; i.e., let $1_A : \{-1,1\}^n \to \{0,1\}$ satisfy $\mathop{\bf E}[1_A] = \alpha$. Then $\mathbf{Stab}_{1/3}[1_A] = \mathop{\bf Pr}_{\substack{{\boldsymbol{x}} \sim \{-1,1\}^n \\ \boldsymbol{y} \sim N_{1/3}({\boldsymbol{x}})}}[{\boldsymbol{x}} \in A, \boldsymbol{y} \in A] \leq \alpha^{3/2}.$ Equivalently (for $\alpha > 0$), $\mathop{\bf Pr}_{\substack{{\boldsymbol{x}} \sim A \\ \boldsymbol{y} \sim N_{1/3}({\boldsymbol{x}})}}[\boldsymbol{y} \in A] \leq \alpha^{1/2}.$

Proof: This is immediate from inequality \eqref{eqn:4/3-2}, since $\|1_A\|_{4/3}^2 = \Bigl(\mathop{\bf E}_{{\boldsymbol{x}}}[|1_A({\boldsymbol{x}})|^{4/3}]^{3/4}\Bigr)^2 = \mathop{\bf E}_{{\boldsymbol{x}}}[1_A({\boldsymbol{x}})]^{3/2} = \alpha^{3/2}. \qquad \Box$
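Corollary 8 can be spot-checked exactly on small examples, using the Fourier formula $\mathbf{Stab}_\rho[f] = \sum_S \rho^{|S|} \widehat{f}(S)^2$. The following sketch (random subsets of $\{-1,1\}^4$; all parameters arbitrary) is illustrative, not part of the text.

```python
# Exact check of Stab_{1/3}[1_A] <= alpha^{3/2} for random A subseteq {-1,1}^4.
import itertools, math, random

n = 4
cube = list(itertools.product([-1, 1], repeat=n))
subsets = list(itertools.chain.from_iterable(
    itertools.combinations(range(n), k) for k in range(n + 1)))

def stab(f, rho):
    # Stab_rho[f] = sum_S rho^{|S|} hat{f}(S)^2
    total = 0.0
    for S in subsets:
        cS = sum(f[x] * math.prod(x[i] for i in S) for x in cube) / 2**n
        total += rho**len(S) * cS**2
    return total

random.seed(1)
for _ in range(200):
    A = {x for x in cube if random.random() < 0.3}
    alpha = len(A) / 2**n
    ind = {x: 1.0 if x in A else 0.0 for x in cube}
    assert stab(ind, 1/3) <= alpha**1.5 + 1e-12
```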

See Section 5 of this chapter for the generalization of this corollary to noise rates other than $1/3$.

Example 9 Assume $\alpha = 2^{-k}$, $k \in {\mathbb N}^+$, and $A$ is a subcube of codimension $k$; e.g., $1_A : {\mathbb F}_2^n \to \{0,1\}$ is the logical $\mathrm{AND}$ function on the first $k$ coordinates. For every $x \in A$, when we form $\boldsymbol{y} \sim N_{1/3}(x)$ we’ll have $\boldsymbol{y} \in A$ if and only if the first $k$ coordinates of $x$ do not change, which happens with probability $(2/3)^k = (2/3)^{\log(1/\alpha)} = \alpha^{\log(3/2)} \approx \alpha^{.585} \leq \alpha^{1/2}$ (logarithms here are to base $2$). In fact, the bound $\alpha^{1/2}$ in Corollary 8 is essentially sharp when $A$ is a Hamming ball; see the exercises.
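For concreteness, the subcube computation can be confirmed by brute-force enumeration of the noise distribution; in the sketch below the choices $n = 6$, $k = 3$ are arbitrary.

```python
# Exact verification of Example 9: for a codimension-k subcube,
# Pr_{x ~ A, y ~ N_{1/3}(x)}[y in A] = (2/3)^k = alpha^{log2(3/2)} <= alpha^{1/2}.
import itertools, math

n, k = 6, 3
cube = list(itertools.product([-1, 1], repeat=n))
A = [x for x in cube if all(x[i] == 1 for i in range(k))]  # codim-k subcube
alpha = len(A) / 2**n  # = 2^{-k}

# y ~ N_{1/3}(x) flips each coordinate independently w.p. delta = 1/3;
# y stays in A iff none of the first k coordinates flip. Enumerate all
# flip-sets F that avoid the first k coordinates and sum their probabilities.
p_exact = sum((1/3)**len(F) * (2/3)**(n - len(F))
              for r in range(n + 1)
              for F in itertools.combinations(range(k, n), r))

p_stay = (2/3)**k
assert abs(p_exact - p_stay) < 1e-12
assert abs(p_stay - alpha**math.log2(3/2)) < 1e-9
assert p_stay <= alpha**0.5
```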

We can phrase Corollary 8 in terms of the expansion in a certain graph:

Definition 10 For $n \in {\mathbb N}^+$ and $\rho \in [-1,1]$, the $n$-dimensional $\rho$-stable hypercube graph is the edge-weighted, complete directed graph on vertex set $\{-1,1\}^n$ in which the weight on directed edge $(x,y) \in \{-1,1\}^n \times \{-1,1\}^n$ is equal to $\mathop{\bf Pr}[({\boldsymbol{x}},\boldsymbol{y}) = (x,y)]$ when $({\boldsymbol{x}},\boldsymbol{y})$ is a $\rho$-correlated pair. If $\rho = 1-2\delta$ for $\delta \in [0,1]$, we also call this the $\delta$-noisy hypercube graph. Here the weight on $(x,y)$ is $\mathop{\bf Pr}[({\boldsymbol{x}}, \boldsymbol{y}) = (x,y)]$ where ${\boldsymbol{x}} \sim \{-1,1\}^n$ is uniform and $\boldsymbol{y}$ is formed from ${\boldsymbol{x}}$ by negating each coordinate independently with probability $\delta$.

Remark 11 The edge weights in this graph are nonnegative and sum to $1$. The graph is also “regular” in the sense that for each $x \in \{-1,1\}^n$ the sum of all the edge weights leaving (or entering) $x$ is $2^{-n}$. You can also consider the graph to be undirected, since the weight on $(x,y)$ is the same as the weight on $(y,x)$; in this viewpoint, the weight on the undirected edge $(x,y)$ would be $2^{1-n}\delta^{\Delta(x,y)}(1-\delta)^{n-\Delta(x,y)}$. In fact, the graph is perhaps best thought of as the discrete-time Markov chain on state space $\{-1,1\}^n$ in which a step from state $x \in \{-1,1\}^n$ consists of moving to state $\boldsymbol{y} \sim N_\rho(x)$. This is a reversible chain with the uniform stationary distribution. Each discrete step is equivalent to running the “usual” continuous-time Markov chain on the hypercube for time $t = \ln(1/\rho)$ (assuming $\rho \in [0,1]$).
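In the Markov-chain view, the basic properties claimed in this remark are easy to confirm exactly for small $n$. The following sketch (with the arbitrary choices $n = 3$, $\delta = 1/3$) builds the transition matrix of the $\delta$-noisy hypercube chain and checks regularity, total edge weight, and symmetry (hence reversibility with respect to the uniform distribution); it is illustrative only.

```python
# The delta-noisy hypercube chain: M(x,y) = delta^{Delta(x,y)} (1-delta)^{n-Delta(x,y)}.
import itertools

n, delta = 3, 1/3
cube = list(itertools.product([-1, 1], repeat=n))

def hamming(x, y):
    return sum(a != b for a, b in zip(x, y))

M = {(x, y): delta**hamming(x, y) * (1 - delta)**(n - hamming(x, y))
     for x in cube for y in cube}

# Each row sums to 1 (M is a transition matrix).
for x in cube:
    assert abs(sum(M[x, y] for y in cube) - 1) < 1e-12

# The edge weights 2^{-n} M(x,y) of the noisy hypercube graph sum to 1.
assert abs(sum(M[x, y] for x in cube for y in cube) / 2**n - 1) < 1e-12

# Symmetric weights: the chain is reversible w.r.t. the uniform distribution.
assert all(abs(M[x, y] - M[y, x]) < 1e-15 for x in cube for y in cube)
```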

With this definition in place, we can see Corollary 8 as saying that the $1/3$-stable (equivalently, $1/3$-noisy) hypercube graph is a “small-set expander”: given any small $\alpha$-fraction of the vertices $A$, almost all of the edge weight touching $A$ is on its boundary. More precisely, if we choose a random vertex ${\boldsymbol{x}} \in A$ and take a random edge out of ${\boldsymbol{x}}$ (with probability proportional to its edge weight), we end up outside $A$ with probability at least $1 - \alpha^{1/2}$. You can compare this with the discussion surrounding the Level-$1$ Inequality in Chapter 5.4, which is the analogous statement for the $\rho$-stable hypercube graph “in the limit $\rho \to 0^+$”. The appropriate statement for general $\rho$ appears in Section 5 of this chapter as the “Small-Set Expansion Theorem”.

Corollary 8 would apply equally well if $1_A$ were replaced by a function $g : \{-1,1\}^n \to \{-1,0,1\}$, with $\alpha$ denoting $\mathop{\bf Pr}[g \neq 0] = \mathop{\bf E}[|g|] = \mathop{\bf E}[g^2]$. This situation occurs naturally when $g = \mathrm{D}_i f$ for some boolean-valued $f : \{-1,1\}^n \to \{-1,1\}$. In this case $\mathbf{Stab}_{1/3}[g] = \mathbf{Inf}^{(1/3)}_i[f]$, the $1/3$-stable influence of $i$ on $f$. We conclude that for a boolean-valued function, if the influence of $i$ is small then its $1/3$-stable influence is much smaller:

Corollary 12 Let $f : \{-1,1\}^n \to \{-1,1\}$. Then $\mathbf{Inf}_i^{(1/3)}[f] \leq \mathbf{Inf}_i[f]^{3/2}$ for all $i$.
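Corollary 12 can likewise be checked exactly on small examples, using the Fourier formulas $\mathbf{Inf}_i[f] = \sum_{S \ni i} \widehat{f}(S)^2$ and $\mathbf{Inf}_i^{(\rho)}[f] = \sum_{S \ni i} \rho^{|S|-1} \widehat{f}(S)^2$. The script below (random boolean functions on $n = 4$ bits, an arbitrary choice) is an illustrative sketch, not part of the text.

```python
# Exact check of Inf_i^{(1/3)}[f] <= Inf_i[f]^{3/2} for random f : {-1,1}^4 -> {-1,1}.
import itertools, math, random

n = 4
cube = list(itertools.product([-1, 1], repeat=n))
subsets = list(itertools.chain.from_iterable(
    itertools.combinations(range(n), k) for k in range(n + 1)))

def influences(f, rho):
    # Returns (Inf_i[f])_i and (Inf_i^{(rho)}[f])_i via the Fourier expansion.
    inf = [0.0] * n
    sinf = [0.0] * n
    for S in subsets:
        cS = sum(f[x] * math.prod(x[i] for i in S) for x in cube) / 2**n
        for i in S:
            inf[i] += cS**2                       # Inf_i = sum_{S ni i} hat{f}(S)^2
            sinf[i] += rho**(len(S) - 1) * cS**2  # rho-stable influence
    return inf, sinf

random.seed(2)
for _ in range(50):
    f = {x: random.choice([-1, 1]) for x in cube}
    inf, sinf = influences(f, 1/3)
    for i in range(n):
        assert sinf[i] <= inf[i]**1.5 + 1e-12
```

Note that boolean-valuedness is essential here, just as in the corollary: for general real-valued $f$ the bound can fail.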

We remark that the famous KKL Theorem (stated in Chapter 4.2) more or less follows by summing the above inequality over $i \in [n]$; we could now prove it quite easily in a paragraph or so.

Let’s take one more look at the “small-set expansion result”, Corollary 8. Since noise stability roughly measures how “low” a function’s Fourier weight is, this corollary implies that a function $f : \{-1,1\}^n \to \{0,1\}$ with small mean $\alpha$ cannot have much of its Fourier weight at low degree. More precisely, for any $k \in {\mathbb N}$ we have $$\label{eqn:weak-level-k} \alpha^{3/2} \geq \mathbf{Stab}_{1/3}[f] \geq (1/3)^k \mathbf{W}^{\leq k}[f] \quad\Rightarrow\quad \mathbf{W}^{\leq k}[f] \leq 3^k \alpha^{3/2}.$$ For $k = 1$ this gives $\mathbf{W}^{\leq 1}[f] \leq 3\alpha^{3/2}$, which is nontrivial but not as strong as the Level-1 Inequality from Chapter 5.4. But \eqref{eqn:weak-level-k} also gives us “level-$k$ inequalities” for larger values of $k$. For example, $\mathbf{W}^{\leq .25 \log(1/\alpha)}[f] \leq \alpha^{-.25 \log 3 + 3/2} \leq \alpha^{1.1} \ll \alpha = \|f\|_2^2;$ i.e., almost all of $f$’s Fourier weight is above degree $.25 \log(1/\alpha)$. We will give slightly improved versions of these level-$k$ inequalities in Section 5 of this chapter.
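The level-$k$ bound $\mathbf{W}^{\leq k}[f] \leq 3^k \alpha^{3/2}$ from \eqref{eqn:weak-level-k} can also be confirmed exactly for small $n$; the sketch below (random $\{0,1\}$-valued functions on $4$ bits, parameters arbitrary) is illustrative only.

```python
# Exact check of W^{<=k}[f] <= 3^k alpha^{3/2} for random f : {-1,1}^4 -> {0,1}.
import itertools, math, random

n = 4
cube = list(itertools.product([-1, 1], repeat=n))
subsets = list(itertools.chain.from_iterable(
    itertools.combinations(range(n), j) for j in range(n + 1)))

def low_weight(f, k):
    # W^{<=k}[f] = sum over |S| <= k of hat{f}(S)^2
    return sum((sum(f[x] * math.prod(x[i] for i in S) for x in cube) / 2**n)**2
               for S in subsets if len(S) <= k)

random.seed(3)
for _ in range(100):
    f = {x: 1.0 if random.random() < 0.2 else 0.0 for x in cube}
    alpha = sum(f.values()) / 2**n
    for k in range(n + 1):
        assert low_weight(f, k) <= 3**k * alpha**1.5 + 1e-12
```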

### 2 comments to §9.2: Small subsets of the hypercube are noise-sensitive

• Mitchell Johnston

Hi Ryan,

For the equations in the (2,4) theorem you leave the Ts in when you apply the induction hypothesis which gives you the wrong result in the end.

Best,
Mitchell

• Thanks Mitchell — that was a big mistake! I think it’s fixed now.