§9.2: Small subsets of the hypercube are noise-sensitive

An immediate consequence of the Bonami Lemma is that for any $f : \{-1,1\}^n \to {\mathbb R}$ and $k \in {\mathbb N}$, \begin{equation} \label{eqn:2-4-hypercon-k} \|\mathrm{T}_{1/\sqrt{3}} f^{=k}\|_4 = \tfrac{1}{\sqrt{3}^k} \|f^{=k}\|_4 \leq \|f^{=k}\|_2. \end{equation}

This is a special case of the $(2,4)$-Hypercontractivity Theorem (whose name will be explained shortly), which says that the assumption of degree-$k$ homogeneity is not necessary:

$\mathbf{(2,4)}$-Hypercontractivity Theorem Let $f : \{-1,1\}^n \to {\mathbb R}$. Then \[ \|\mathrm{T}_{1/\sqrt{3}} f\|_4 \leq \|f\|_2. \]

It almost looks as though you could prove this theorem simply by summing \eqref{eqn:2-4-hypercon-k} over $k$. In fact that proof strategy can be made to work given a few extra tricks (see the exercises), but it’s just as easy to repeat the induction technique used for the Bonami Lemma.

Proof of the $(2,4)$-Hypercontractivity Theorem: We’ll prove $\mathop{\bf E}[\mathrm{T}_{1/\sqrt{3}} f({\boldsymbol{x}})^4] \leq \mathop{\bf E}[f({\boldsymbol{x}})^2]^2$ using the same induction as in the Bonami Lemma. Retaining the notation $\boldsymbol{d}$ and $\boldsymbol{e}$, and using the shorthand $\mathrm{T} = \mathrm{T}_{1/\sqrt{3}}$, we have \[ \mathrm{T} \boldsymbol{f} = {\boldsymbol{x}}_n \cdot \tfrac{1}{\sqrt{3}} \mathrm{T} \boldsymbol{d} + \mathrm{T} \boldsymbol{e}. \] Similar computations to those in the Bonami Lemma proof yield \begin{align*} \mathop{\bf E}[(\mathrm{T} \boldsymbol{f})^4] &= \bigl(\tfrac{1}{\sqrt{3}}\bigr)^4\mathop{\bf E}[(\mathrm{T} \boldsymbol{d})^4] + 6 \bigl(\tfrac{1}{\sqrt{3}}\bigr)^2 \mathop{\bf E}[(\mathrm{T} \boldsymbol{d})^2(\mathrm{T} \boldsymbol{e})^2] + \mathop{\bf E}[(\mathrm{T} \boldsymbol{e})^4] \\ &\leq \mathop{\bf E}[(\mathrm{T} \boldsymbol{d})^4] + 2 \mathop{\bf E}[(\mathrm{T} \boldsymbol{d})^2(\mathrm{T} \boldsymbol{e})^2] + \mathop{\bf E}[(\mathrm{T} \boldsymbol{e})^4] \\ &\leq \mathop{\bf E}[(\mathrm{T} \boldsymbol{d})^4] + 2 \sqrt{\mathop{\bf E}[(\mathrm{T} \boldsymbol{d})^4]}\sqrt{\mathop{\bf E}[(\mathrm{T} \boldsymbol{e})^4]} + \mathop{\bf E}[(\mathrm{T} \boldsymbol{e})^4] \\ &\leq \mathop{\bf E}[\boldsymbol{d}^2]^2 + 2 \mathop{\bf E}[\boldsymbol{d}^2]\mathop{\bf E}[\boldsymbol{e}^2] + \mathop{\bf E}[\boldsymbol{e}^2]^2 \\ &= \bigl(\mathop{\bf E}[\boldsymbol{d}^2] + \mathop{\bf E}[\boldsymbol{e}^2]\bigr)^2 = \mathop{\bf E}[\boldsymbol{f}^2]^2, \end{align*} where the second inequality is Cauchy–Schwarz, the third is induction, and the final equality is a simple computation analogous to equation (2) in the proof of the Bonami Lemma. $\Box$
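The theorem just proved can likewise be sanity-checked by exhaustive computation for small $n$. The brief Python sketch below (illustrative code of ours; the helper names are not from the text) computes the Fourier expansion of a random real-valued $f$ on $\{-1,1\}^4$, applies $\mathrm{T}_{1/\sqrt{3}}$ on the Fourier side, and verifies $\|\mathrm{T}_{1/\sqrt{3}} f\|_4 \leq \|f\|_2$:

```python
import itertools, math, random

def fourier(f, n):
    """Fourier coefficients: hat{f}(S) = E[f(x) * prod_{i in S} x_i]."""
    pts = list(itertools.product([-1, 1], repeat=n))
    return {S: sum(f[x] * math.prod(x[i] for i in S) for x in pts) / 2 ** n
            for r in range(n + 1)
            for S in itertools.combinations(range(n), r)}

def T(rho, coeffs, n):
    """Noise operator on the Fourier side: hat{f}(S) -> rho^|S| hat{f}(S)."""
    pts = list(itertools.product([-1, 1], repeat=n))
    return {x: sum(rho ** len(S) * c * math.prod(x[i] for i in S)
                   for S, c in coeffs.items()) for x in pts}

def norm(f, p):
    vals = list(f.values())
    return (sum(abs(v) ** p for v in vals) / len(vals)) ** (1 / p)

random.seed(0)
n, rho = 4, 1 / math.sqrt(3)
for _ in range(50):
    f = {x: random.gauss(0, 1)
         for x in itertools.product([-1, 1], repeat=n)}
    # (2,4)-hypercontractivity: ||T_{1/sqrt(3)} f||_4 <= ||f||_2
    assert norm(T(rho, fourier(f, n), n), 4) <= norm(f, 2) + 1e-9
print("(2,4)-hypercontractivity held on all random trials")
```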

The name “hypercontractivity” in this theorem describes the fact that not only is $\mathrm{T}_{1/\sqrt{3}}$ a “contraction” on $L^2(\{-1,1\}^n)$ — meaning $\|\mathrm{T}_{1/\sqrt{3}} f\|_2 \leq \|f\|_2$ for all $f$ — it’s even a contraction when viewed as an operator from $L^2(\{-1,1\}^n)$ to $L^4(\{-1,1\}^n)$. You should think of hypercontractivity theorems as quantifying the extent to which $\mathrm{T}_\rho$ is a “smoothing”, or “reasonable-izing” operator.

Unfortunately the quantity $\|\mathrm{T}_{1/\sqrt{3}}f\|_4$ in the $(2,4)$-Hypercontractivity Theorem does not have an obvious combinatorial meaning. On the other hand, the quantity \[ \|\mathrm{T}_{1/\sqrt{3}}f\|_2 = \sqrt{\langle \mathrm{T}_{1/\sqrt{3}}f, \mathrm{T}_{1/\sqrt{3}}f \rangle} = \sqrt{\langle f, \mathrm{T}_{1/\sqrt{3}} \mathrm{T}_{1/\sqrt{3}} f\rangle} = \sqrt{\mathbf{Stab}_{1/3}[f]}, \] does have a nice combinatorial meaning. And we can make this quantity appear in the Hypercontractivity Theorem via a simple trick from analysis, just using the fact that $\mathrm{T}_{1/\sqrt{3}}$ is a self-adjoint operator. We “flip the norms across $2$” using Hölder’s inequality:

$\mathbf{(4/3,2)}$-Hypercontractivity Theorem Let $f : \{-1,1\}^n \to {\mathbb R}$. Then \[ \|\mathrm{T}_{1/\sqrt{3}} f\|_2 \leq \|f\|_{4/3}; \] i.e., \begin{equation} \label{eqn:4/3-2} \mathbf{Stab}_{1/3}[f] \leq \|f\|_{4/3}^2. \end{equation}

Proof: Writing $\mathrm{T} = \mathrm{T}_{1/\sqrt{3}}$ for brevity we have \begin{equation} \label{eqn:hc-holder} \|\mathrm{T} f\|_2^2 = \langle \mathrm{T} f, \mathrm{T} f \rangle = \langle f, \mathrm{T} \mathrm{T} f \rangle \leq \|f\|_{4/3} \|\mathrm{T} \mathrm{T} f\|_4 \leq \|f\|_{4/3} \|\mathrm{T} f\|_2 \end{equation} by Hölder’s inequality and the $(2,4)$-Hypercontractivity Theorem. Dividing through by $\|\mathrm{T} f\|_2$ (which we may assume is nonzero) completes the proof. $\Box$
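Inequality \eqref{eqn:4/3-2} can also be spot-checked numerically. In the Python sketch below (again ours, not the text's), $\mathbf{Stab}_{1/3}[f]$ is computed via the Fourier formula $\mathbf{Stab}_\rho[f] = \sum_S \rho^{|S|} \widehat{f}(S)^2$:

```python
import itertools, math, random

random.seed(1)
n = 4
pts = list(itertools.product([-1, 1], repeat=n))

def stab(rho, f):
    """Stab_rho[f] = sum over S of rho^|S| * hat{f}(S)^2 (brute force)."""
    total = 0.0
    for r in range(n + 1):
        for S in itertools.combinations(range(n), r):
            c = sum(f[x] * math.prod(x[i] for i in S) for x in pts) / 2 ** n
            total += rho ** r * c * c
    return total

for _ in range(50):
    f = {x: random.gauss(0, 1) for x in pts}
    lhs = stab(1 / 3, f)                                  # Stab_{1/3}[f]
    # ||f||_{4/3}^2 = (E|f|^{4/3})^{3/2}
    rhs = (sum(abs(v) ** (4 / 3) for v in f.values()) / 2 ** n) ** 1.5
    assert lhs <= rhs + 1e-9
print("Stab_{1/3}[f] <= ||f||_{4/3}^2 held on all random trials")
```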

In the inequality \eqref{eqn:4/3-2} the left-hand side is a natural quantity. The right-hand side is just $1$ when $f : \{-1,1\}^n \to \{-1,1\}$, which is not very interesting. But if we instead look at $f : \{-1,1\}^n \to \{0,1\}$ we get something very interesting:

Corollary 8 Let $A \subseteq \{-1,1\}^n$ have volume $\alpha$; i.e., let $1_A : \{-1,1\}^n \to \{0,1\}$ satisfy $\mathop{\bf E}[1_A] = \alpha$. Then \[ \mathbf{Stab}_{1/3}[1_A] = \mathop{\bf Pr}_{\substack{{\boldsymbol{x}} \sim \{-1,1\}^n \\ \boldsymbol{y} \sim N_{1/3}({\boldsymbol{x}})}}[{\boldsymbol{x}} \in A, \boldsymbol{y} \in A] \leq \alpha^{3/2}. \] Equivalently (for $\alpha > 0$), \[ \mathop{\bf Pr}_{\substack{{\boldsymbol{x}} \sim A \\ \boldsymbol{y} \sim N_{1/3}({\boldsymbol{x}})}}[\boldsymbol{y} \in A] \leq \alpha^{1/2}. \]

Proof: This is immediate from inequality \eqref{eqn:4/3-2}, since \[ \|1_A\|_{4/3}^2 = \Bigl(\mathop{\bf E}_{{\boldsymbol{x}}}[|1_A({\boldsymbol{x}})|^{4/3}]^{3/4}\Bigr)^2 = \mathop{\bf E}_{{\boldsymbol{x}}}[1_A({\boldsymbol{x}})]^{3/2} = \alpha^{3/2}. \qquad \Box \]
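For a concrete check of the corollary, the quantity $\mathop{\bf Pr}[{\boldsymbol{x}} \in A, \boldsymbol{y} \in A]$ can be computed exactly by summing the noisy-pair weights $2^{-n}\delta^{\Delta(x,y)}(1-\delta)^{n-\Delta(x,y)}$ with $\delta = 1/3$ (so $\rho = 1 - 2\delta = 1/3$). A short Python sketch of ours doing this for random sets $A$:

```python
import itertools, random

random.seed(2)
n, rho = 4, 1 / 3
delta = (1 - rho) / 2          # y ~ N_rho(x): flip each coordinate w.p. delta
pts = list(itertools.product([-1, 1], repeat=n))

def stab_indicator(A):
    """Exact Pr[x in A, y in A] for a rho-correlated pair (x, y)."""
    total = 0.0
    for x in A:
        for y in A:
            d = sum(xi != yi for xi, yi in zip(x, y))
            total += (delta ** d) * ((1 - delta) ** (n - d)) / 2 ** n
    return total

for _ in range(30):
    A = [x for x in pts if random.random() < 0.3]
    alpha = len(A) / 2 ** n
    # Corollary 8: Stab_{1/3}[1_A] <= alpha^{3/2}
    assert stab_indicator(A) <= alpha ** 1.5 + 1e-12
print("Corollary 8 held on all random sets")
```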

See Section 5 of this chapter for the generalization of this corollary to noise rates other than $1/3$.

Example 9 Assume $\alpha = 2^{-k}$, $k \in {\mathbb N}^+$, and $A$ is a subcube of codimension $k$; e.g., $1_A : \{-1,1\}^n \to \{0,1\}$ is the logical $\mathrm{AND}$ function on the first $k$ coordinates. For every $x \in A$, when we form $\boldsymbol{y} \sim N_{1/3}(x)$ we’ll have $\boldsymbol{y} \in A$ if and only if the first $k$ coordinates of $x$ do not change, which happens with probability $(2/3)^k = (2/3)^{\log(1/\alpha)} = \alpha^{\log(3/2)} \approx \alpha^{.585} \leq \alpha^{1/2}$. In fact, the bound $\alpha^{1/2}$ in Corollary 8 is essentially sharp when $A$ is a Hamming ball; see the exercises.
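The subcube calculation in Example 9 is exact and can be replicated mechanically. This Python sketch (ours) takes the codimension-$2$ subcube in $\{-1,1\}^5$ and confirms that the exact escape probability is $(2/3)^k$, consistent with the $\alpha^{1/2}$ bound:

```python
import itertools

n, k, delta = 5, 2, 1 / 3      # rho = 1/3 noise; A is a codim-k subcube
pts = list(itertools.product([-1, 1], repeat=n))
A = [x for x in pts if all(x[i] == 1 for i in range(k))]

# Pr[y in A | x in A]: exact sum over the noise distribution
total = 0.0
for x in A:
    for y in A:
        d = sum(xi != yi for xi, yi in zip(x, y))
        total += delta ** d * (1 - delta) ** (n - d)
escape = total / len(A)

# only the first k coordinates matter: probability (1-delta)^k = (2/3)^k
assert abs(escape - (2 / 3) ** k) < 1e-12
alpha = len(A) / 2 ** n
assert escape <= alpha ** 0.5   # consistent with Corollary 8
print("subcube staying probability matches (2/3)^k")
```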

We can phrase Corollary 8 in terms of the expansion in a certain graph:

Definition 10 For $n \in {\mathbb N}^+$ and $\rho \in [-1,1]$, the $n$-dimensional $\rho$-stable hypercube graph is the edge-weighted, complete directed graph on vertex set $\{-1,1\}^n$ in which the weight on directed edge $(x,y) \in \{-1,1\}^n \times \{-1,1\}^n $ is equal to $\mathop{\bf Pr}[({\boldsymbol{x}},\boldsymbol{y}) = (x,y)]$ when $({\boldsymbol{x}},\boldsymbol{y})$ is a $\rho$-correlated pair. If $\rho = 1-2\delta$ for $\delta \in [0,1]$, we also call this the $\delta$-noisy hypercube graph. Here the weight on $(x,y)$ is $\mathop{\bf Pr}[({\boldsymbol{x}}, \boldsymbol{y}) = (x,y)]$ where ${\boldsymbol{x}} \sim \{-1,1\}^n$ is uniform and $\boldsymbol{y}$ is formed from ${\boldsymbol{x}}$ by negating each coordinate independently with probability $\delta$.

Remark 11 The edge weights in this graph are nonnegative and sum to $1$. The graph is also “regular” in the sense that for each $x \in \{-1,1\}^n$ the sum of the weights of all edges leaving (or entering) $x$ is $2^{-n}$. You can also consider the graph to be undirected, since the weight on $(x,y)$ is the same as the weight on $(y,x)$; in this viewpoint, the weight on the undirected edge $(x,y)$ would be $2^{1-n}\delta^{\Delta(x,y)}(1-\delta)^{n-\Delta(x,y)}$. In fact, the graph is perhaps best thought of as the discrete-time Markov chain on state space $\{-1,1\}^n$ in which a step from state $x \in \{-1,1\}^n$ consists of moving to state $\boldsymbol{y} \sim N_\rho(x)$. This is a reversible chain with the uniform stationary distribution. Each discrete step is equivalent to running the “usual” continuous-time Markov chain on the hypercube for time $t = \ln(1/\rho)$ (assuming $\rho \in (0,1]$).
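The basic properties claimed in Remark 11 (total weight $1$, regularity, symmetry of the weights) are easy to verify directly. A small Python sketch of ours, for the $\delta = 1/3$ noisy hypercube graph on $\{-1,1\}^4$:

```python
import itertools

n, delta = 4, 1 / 3
pts = list(itertools.product([-1, 1], repeat=n))

def w(x, y):
    """Weight on directed edge (x, y): Pr[(x*, y*) = (x, y)] for a noisy pair."""
    d = sum(a != b for a, b in zip(x, y))
    return (delta ** d) * ((1 - delta) ** (n - d)) / 2 ** n

total = sum(w(x, y) for x in pts for y in pts)
assert abs(total - 1) < 1e-12                       # weights sum to 1
for x in pts:
    out = sum(w(x, y) for y in pts)
    assert abs(out - 2 ** -n) < 1e-12               # "regular": 2^{-n} per vertex
    assert all(abs(w(x, y) - w(y, x)) < 1e-15 for y in pts)  # symmetric
print("weights sum to 1; graph is regular and symmetric")
```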

With this definition in place, we can see Corollary 8 as saying that the $1/3$-stable (equivalently, $1/3$-noisy) hypercube graph is a “small-set expander”: given any small $\alpha$-fraction of the vertices $A$, almost all of the edge weight touching $A$ is on its boundary. More precisely, if we choose a random vertex ${\boldsymbol{x}} \in A$ and take a random edge out of ${\boldsymbol{x}}$ (with probability proportional to its edge weight), we end up outside $A$ with probability at least $1 - \alpha^{1/2}$. You can compare this with the discussion surrounding the Level-$1$ Inequality in Chapter 5.4, which is the analogous statement for the $\rho$-stable hypercube graph “in the limit $\rho \to 0^+$”. The appropriate statement for general $\rho$ appears in Section 5 of this chapter as the “Small-Set Expansion Theorem”.

Corollary 8 would apply equally well if $1_A$ were replaced by a function $g : \{-1,1\}^n \to \{-1,0,1\}$, with $\alpha$ denoting $\mathop{\bf Pr}[g \neq 0] = \mathop{\bf E}[|g|] = \mathop{\bf E}[g^2]$. This situation occurs naturally when $g = \mathrm{D}_i f$ for some boolean-valued $f : \{-1,1\}^n \to \{-1,1\}$. In this case $\mathbf{Stab}_{1/3}[g] = \mathbf{Inf}^{(1/3)}_i[f]$, the $1/3$-stable influence of $i$ on $f$. We conclude that for a boolean-valued function, if the influence of $i$ is small then its $1/3$-stable influence is much smaller:

Corollary 12 Let $f : \{-1,1\}^n \to \{-1,1\}$. Then $\mathbf{Inf}_i^{(1/3)}[f] \leq \mathbf{Inf}_i[f]^{3/2}$ for all $i$.
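Corollary 12, too, can be checked exhaustively for small $n$. In this Python sketch (our own; it uses the formulas $\mathbf{Inf}_i[f] = \sum_{S \ni i} \widehat{f}(S)^2$ and $\mathbf{Inf}_i^{(\rho)}[f] = \sum_{S \ni i} \rho^{|S|-1} \widehat{f}(S)^2$), we test random boolean-valued functions on $\{-1,1\}^4$:

```python
import itertools, math, random

random.seed(3)
n, rho = 4, 1 / 3
pts = list(itertools.product([-1, 1], repeat=n))

def fhat(f, S):
    """hat{f}(S) = E[f(x) * prod_{i in S} x_i]."""
    return sum(f[x] * math.prod(x[i] for i in S) for x in pts) / 2 ** n

for _ in range(20):
    f = {x: random.choice([-1, 1]) for x in pts}
    for i in range(n):
        inf = stable = 0.0
        for r in range(1, n + 1):
            for S in itertools.combinations(range(n), r):
                if i in S:
                    c2 = fhat(f, S) ** 2
                    inf += c2                          # Inf_i[f]
                    stable += rho ** (r - 1) * c2      # Inf_i^{(1/3)}[f]
        assert stable <= inf ** 1.5 + 1e-9             # Corollary 12
print("Corollary 12 held for all random boolean functions")
```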

We remark that the famous KKL Theorem (stated in Chapter 4.2) more or less follows by summing the above inequality over $i \in [n]$; we could now prove it quite easily in a paragraph or so.

Let’s take one more look at the “small-set expansion result”, Corollary 8. Since noise stability roughly measures how “low” a function’s Fourier weight is, this corollary implies that a function $f : \{-1,1\}^n \to \{0,1\}$ with small mean $\alpha$ cannot have much of its Fourier weight at low degree. More precisely, for any $k \in {\mathbb N}$ we have \begin{equation} \label{eqn:weak-level-k} \alpha^{3/2} \geq \mathbf{Stab}_{1/3}[f] \geq (1/3)^k \mathbf{W}^{\leq k}[f] \quad\Rightarrow\quad \mathbf{W}^{\leq k}[f] \leq 3^k \alpha^{3/2}. \end{equation} For $k = 1$ this gives $\mathbf{W}^{\leq 1}[f] \leq 3\alpha^{3/2}$, which is nontrivial but not as strong as the Level-1 Inequality from Chapter 5.4. But \eqref{eqn:weak-level-k} also gives us “level-$k$ inequalities” for larger values of $k$. For example, \[ \mathbf{W}^{\leq .25 \log(1/\alpha)}[f] \leq \alpha^{-.25 \log 3 + 3/2} \leq \alpha^{1.1} \ll \alpha = \|f\|_2^2; \] i.e., almost all of $f$’s Fourier weight is above degree $.25 \log(1/\alpha)$. We will give slightly improved versions of these level-$k$ inequalities in Section 5 of this chapter.
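Finally, the level-$k$ inequalities \eqref{eqn:weak-level-k} can be confirmed numerically as well. The Python sketch below (ours) computes $\mathbf{W}^{\leq k}[f] = \sum_{|S| \leq k} \widehat{f}(S)^2$ for random sparse $\{0,1\}$-valued functions and checks $\mathbf{W}^{\leq k}[f] \leq 3^k \alpha^{3/2}$:

```python
import itertools, math, random

random.seed(4)
n = 4
pts = list(itertools.product([-1, 1], repeat=n))

for _ in range(20):
    f = {x: 1 if random.random() < 0.2 else 0 for x in pts}
    alpha = sum(f.values()) / 2 ** n
    for k in range(n + 1):
        wk = 0.0                                      # W^{<= k}[f]
        for r in range(k + 1):
            for S in itertools.combinations(range(n), r):
                c = sum(f[x] * math.prod(x[i] for i in S)
                        for x in pts) / 2 ** n
                wk += c * c
        assert wk <= 3 ** k * alpha ** 1.5 + 1e-9     # level-k inequality
print("level-k inequalities held on all random trials")
```

Note that for $k = n$ the quantity `wk` equals $\|f\|_2^2 = \alpha$ by Parseval, which gives a handy internal consistency check.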
