In this section we’ll prove the full Hypercontractivity Theorem for uniform $\pm 1$ bits stated at the beginning of Chapter 9:

The Hypercontractivity TheoremLet $f : \{-1,1\}^n \to {\mathbb R}$ and let ${1 \leq p \leq q \leq \infty}$. Then $\|\mathrm{T}_\rho f\|_q \leq \|f\|_p$ for $0 \leq \rho \leq \sqrt{\tfrac{p-1}{q-1}}$.

Actually, when neither $p$ nor $q$ is $2$, the following equivalent form of theorem seems easier to interpret:

Two-Function Hypercontractivity TheoremLet $f, g : \{-1,1\}^n \to {\mathbb R}$, let $r, s \geq 0$, and assume $0 \leq \rho \leq \sqrt{rs} \leq 1$. Then \[ \mathop{\bf E}_{{\substack{({\boldsymbol{x}}, \boldsymbol{y}) \\ \text{ $\rho$-correlated}}}}[f({\boldsymbol{x}}) g(\boldsymbol{y})] \leq \|f\|_{1+r} \|g\|_{1+s}. \]

As a reminder, the only difference between this theorem and its “weak” form (proven in Chapter 9.4) is that we don’t assume $r, s \leq 1$. Below we will show that the two theorems *are* equivalent, via Hölder’s inequality. Given the Two-Function Hypercontractivity Induction Theorem from Chapter 9.4, this implies that to prove the Hypercontractivity Theorem for general $n$ we only need to prove it for $n = 1$. This is an elementary but technical inequality which we defer to the end of the section.

Before carrying out these proofs, let’s take some time to interpret the Two-Function Hypercontractivity Theorem. One interpretation is simply as a generalization of Hölder’s inequality. Consider the case that the strings ${\boldsymbol{x}}$ and $\boldsymbol{y}$ in the theorem are fully correlated; i.e., $\rho = 1$. Then the theorem states that \begin{equation} \label{eqn:its-holder} \mathop{\bf E}[f({\boldsymbol{x}})g({\boldsymbol{x}})] \leq \|f\|_{1+r} \|g\|_{1+1/r} \end{equation} because the condition $\sqrt{rs} = 1$ is equivalent to $s = 1/r$. This statement is identical to Hölder’s inequality, since $(1+r)’ = 1+1/r$. Hölder’s inequality is often used to “break the correlation” between two random variables; in the absence of any information about how $f$ and $g$ correlate then we can at least bound $\mathop{\bf E}[f({\boldsymbol{x}})g({\boldsymbol{x}})]$ by the product of certain norms of $f$ and $g$. (If $f$ and $g$ have different “sizes” then Hölder lets us choose different norms for them; if $f$ and $g$ have roughly the same “size” then we can take $r = s = 1$ and get Cauchy–Schwarz.) Now suppose we are considering $\mathop{\bf E}[f({\boldsymbol{x}})g(\boldsymbol{y})]$ for $\rho$-correlated ${\boldsymbol{x}},\boldsymbol{y}$ with $\rho < 1$. In this case we might hope to improve \eqref{eqn:its-holder} by using smaller norms on the right-hand side; in the extreme case of independent ${\boldsymbol{x}}, \boldsymbol{y}$ (i.e., $\rho = 0$) we can use $\mathop{\bf E}[f({\boldsymbol{x}})g(\boldsymbol{y})] = \mathop{\bf E}[f] \mathop{\bf E}[g] \leq \|f\|_1 \|g\|_1$. The Two-Function Hypercontractivity Theorem gives a precise interpolation between these two cases; the smaller the correlation $\rho$ is, the smaller the norms we may take on the right-hand side.

In the case that $f$ and $g$ have range $\{0,1\}$, these ideas yield another interpretation of the Two-Function Hypercontractivity Theorem, namely a two-set generalization of the Small-Set Expansion Theorem:

Generalized Small-Set Expansion TheoremLet $0 \leq \rho \leq 1$. Let $A,B \subseteq \{-1,1\}^n$ have volumes $\exp(-\frac{a^2}{2}),\ \exp(-\frac{b^2}{2})$ and assume $0 \leq \rho a \leq b \leq a$. Then \[ \mathop{\bf Pr}_{\substack{({\boldsymbol{x}}, \boldsymbol{y}) \\ \text{ $\rho$-correlated}}}[{\boldsymbol{x}} \in A, \boldsymbol{y} \in B] \leq \exp\left(-\tfrac12 \tfrac{a^2 – 2\rho ab + b^2}{1-\rho^2}\right). \]

*Proof:* Apply the Two-Function Hypercontractivity Theorem with $f = 1_A$, $g = 1_B$ and minimize the right-hand side by selecting $r = \rho \frac{a-\rho b}{b-\rho a}$, $s = \rho \frac{b – \rho a}{a – \rho b}$. $\Box$

Remark 1When $a$ and $b$ are not too close the optimal choice of $r$ in the proof exceeds $1$. Thus the Generalized Small-Set Expansion Theorem really needs the full (non-weak) Two-Function Hypercontractivity Theorem; equivalently, the full Hypercontractivity Theorem.

Remark 2This theorem is essentially sharp in the case that $A$ and $B$ are concentric Hamming balls; see the exercises. In case $b = a$ we recover the Small-Set Expansion Theorem. In case $b = \rho a$ we get only the trivial bound that $\mathop{\bf Pr}[{\boldsymbol{x}} \in A, \boldsymbol{y} \in B] \leq \exp(-\frac{a^2}{2}) = \mathop{\bf Pr}[{\boldsymbol{x}} \in A]$. However not much better than this can be expected; in the concentric Hamming ball case it indeed holds that $\mathop{\bf Pr}[{\boldsymbol{x}} \in A, \boldsymbol{y} \in B] \sim \mathop{\bf Pr}[{\boldsymbol{x}} \in A]$ whenever $b < \rho a$.

Remark 3There is also areverseform of the Hypercontractivity Theorem and its Two-Function version; see the exercises. It directly implies the following:

Reverse Small-Set Expansion TheoremLet $0 \leq \rho \leq 1$. Let $A,B \subseteq \{-1,1\}^n$ have volumes $\exp(-\frac{a^2}{2}),\ \exp(-\frac{b^2}{2})$, where $a, b \geq 0$. Then \[ \mathop{\bf Pr}_{\substack{({\boldsymbol{x}}, \boldsymbol{y}) \\ \text{ $\rho$-correlated}}}[{\boldsymbol{x}} \in A, \boldsymbol{y} \in B] \geq \exp\left(-\tfrac12 \tfrac{a^2 + 2\rho ab + b^2}{1-\rho^2}\right). \]

We now turn to the proofs. We begin by showing that the Hypercontractivity Theorem and the Two-Function version are indeed equivalent. This is a consequence of the following general fact (take $T = \mathrm{T}_\rho$, $p = 1+r$, $q = 1+1/s$):

Proposition 4Let $T$ be an operator on $L^2(\Omega, \pi)$ and let $1 \leq p,\,q \leq \infty$. Then \begin{equation} \label{eqn:hc-one-to-two} \|Tf\|_q \leq \|f\|_p \end{equation} holds for all $f \in L^2(\Omega, \pi)$ if and only if \begin{equation} \label{eqn:hc-two-to-one} \langle Tf, g\rangle \leq \|f\|_p \|g\|_{q’} \end{equation} holds for all $f,g \in L^2(\Omega, \pi)$.

*Proof:* For the “only if” statement, $\langle Tf, g \rangle \leq \|Tf\|_q \|g\|_{q’} \leq \|f\|_p \|g\|_{q’}$ by Hölder’s inequality and \eqref{eqn:hc-one-to-two}. As for the “if” statement, by Hölder’s inequality and \eqref{eqn:hc-two-to-one} we have \[ \|T f\|_{q} = \sup_{\|g\|_{q'} = 1} \langle Tf, g \rangle \leq \sup_{\|g\|_{q'} = 1} \|f\|_p \|g\|_{q'}= \|f\|_p. \qquad \Box\]

Now suppose we prove the Hypercontractivity Theorem in the case $n = 1$. By the above proposition we deduce the Two-Function version in the case $n = 1$. Then the Two-Function Hypercontractivity Induction Theorem from Chapter 9.4 yields the general-$n$ case of the Two-Function Hypercontractivity Theorem. Finally, applying the above proposition again we get the general-$n$ case of the Hypercontractivity Theorem, thereby completing all needed proofs. These observations all hold in the context of more general product spaces, so let’s record the following for future use:

Hypercontractivity Induction TheoremLet $0 \leq \rho \leq 1$, $1 \leq p, q \leq \infty$, and assume that $\|\mathrm{T}_\rho f\|_q \leq \|f\|_p$ holds for every $f \in L^2(\Omega_1, \pi_1), \dots, L^2(\Omega_n, \pi_n)$. Then it also holds for every $f \in L^2(\Omega_1 \times \cdots \times \Omega_n, \pi_1 \otimes \cdots \otimes \pi_n)$.

Remark 5In traditional proofs of the Hypercontractivity Theorem for $\pm 1$ bits, this theorem is proven directly; it’s a slightly tricky induction by derivatives (see the exercises). For more general product spaces the same direct induction strategy also works but the notation becomes quite complicated.

Our remaining task, therefore, is to prove the Hypercontractivity Theorem in the case $n = 1$; in other words, to show that a uniformly random $\pm 1$ bit is $(p,q,\sqrt{(p-1)/(q-1)})$-hypercontractive. This fact is often called the “Two-Point Inequality” because (for fixed $p$, $q$, and $\rho$) it’s just an “elementary” inequality about two real variables.

Two-Point InequalityLet $1 \leq p \leq q \leq \infty$ and let $0 \leq \rho \leq \sqrt{(p-1)/(q-1)}$. Then $\|\mathrm{T}_\rho f\|_q \leq \|f\|_p$ for any $f : \{-1,1\} \to {\mathbb R}$. Equivalently (for $\rho \neq 1$), a uniformly random bit ${\boldsymbol{x}} \sim \{-1,1\}$ is $(p,q,\rho)$-hypercontractive; i.e., $\|a + \rho b {\boldsymbol{x}} \|_q \leq \|a+b {\boldsymbol{x}}\|_p$ for all $a,b \in {\mathbb R}$.

*Proof:* As in Section 9.3, our main task will be to prove the inequality for $1 \leq p < q \leq 2$. Having done this, the $2 \leq p < q \leq \infty$ cases follow from Proposition 9.19, the $p < 2 < q$ cases follow using the semigroup property of $\mathrm{T}_\rho$ (exercise), and the $p = q$ cases follow from continuity. The proof for $1 \leq p < q \leq 2$ will be very similar to that of Theorem 9.18 (the $q = 2$ case). As in that proof we may reduce to the case that $\rho = \sqrt{(p-1)/(q-1)}$, $a = 1$, and $b = \epsilon$ satisfies $|\epsilon| < 1$. It then suffices to show \begin{align} \|1 + \rho \epsilon {\boldsymbol{x}} \|_q^p &\leq \|1+\epsilon {\boldsymbol{x}}\|_p^p \nonumber\\ \iff\quad \left(\tfrac{1}{2} (1+\rho\epsilon)^q + \tfrac{1}{2} (1-\rho\epsilon)^q\right)^{p/q} &\leq \tfrac{1}{2} (1+\epsilon)^p + \tfrac{1}{2} (1-\epsilon)^p \nonumber\\ \iff\quad \left(1 + \sum_{k = 1}^\infty \tbinom{q}{2k} \rho^{2k} \epsilon^{2k}\right)^{p/q} &\leq 1 + \sum_{k = 1}^\infty \tbinom{p}{2k} \epsilon^{2k}. \label{eqn:hypercon2-proveme} \end{align} Again we used $|\epsilon| < 1$ to drop the absolute value signs and justify the Generalized Binomial Theorem. For each of the binomial coefficients on the left in \eqref{eqn:hypercon2-proveme} we have \[ \tbinom{q}{2k} = \tfrac{q(q-1)(q-2)(q-3)\cdots(q-(2k-2))(q-(2k-1))}{(2k)!} = \tfrac{q(q-1)(2-q)(3-q)\cdots((2k-2)-q)((2k-1)-q)}{(2k)!} \geq 0. \] (Here we reversed an even number of signs, since $1 \leq q \leq 2$. We will later do the same when expanding $\tbinom{p}{2k}$.) Thus we can again employ the inequality $(1+t)^\theta \leq 1+\theta t$ for $t \geq 0$ and $0 \leq \theta \leq 1$ to deduce that the left-hand side of \eqref{eqn:hypercon2-proveme} is at most \[ 1 + \sum_{k = 1}^\infty \tfrac{p}{q}\tbinom{q}{2k} \rho^{2k} \epsilon^{2k} = 1 + \sum_{k = 1}^\infty \tfrac{p}{q}\left(\tfrac{p-1}{q-1}\right)^{k}\tbinom{q}{2k} \epsilon^{2k}. \] We can now complete the proof of \eqref{eqn:hypercon2-proveme} by showing the following term-by-term inequality: for all $k \geq 1$, \begin{align*} \tfrac{p}{q}\left(\tfrac{p-1}{q-1}\right)^{k}\tbinom{q}{2k} &\leq \tbinom{p}{2k} \\ \iff\quad \tfrac{p}{q} \left(\tfrac{p-1}{q-1}\right)^{k} \tfrac{q(q-1)(2-q)\cdots((2k-1)-q)}{(2k)!} &\leq \tfrac{p(p-1)(2-p)\cdots((2k-1)-p)}{(2k)!} \\ \iff\quad \tfrac{2-q}{\sqrt{q-1}} \cdot \tfrac{3-q}{\sqrt{q-1}} \cdots \tfrac{(2k-1) -q}{\sqrt{q-1}} &\leq \tfrac{2-p}{\sqrt{p-1}} \cdot \tfrac{3-p}{\sqrt{p-1}} \cdots \tfrac{(2k-1) -p}{\sqrt{p-1}}. \end{align*} And indeed this inequality holds factor-by-factor. This is because $p < q$ and $\frac{j-r}{\sqrt{r-1}}$ is a decreasing function of $r \geq 1$ for all $j \geq 2$, as is evident from $\frac{d}{dr} \frac{j-r}{\sqrt{r-1}} = -\frac{j-2 + r}{2(r-1)^{3/2}}$. $\Box$

Remark 6The upper-bound $\rho \leq \sqrt{(p-1)/(q-1)}$ in this theorem is best possible; see the exercises.

In the final displayed inequality of the proof of Proposition 4, should the last two occurrences of $Tf$ be $f$?

Yes, oops! Thanks!