Although you can get a lot of mileage out of studying the $4$-norm of random variables, it’s also natural to consider other norms.

For example, we would get improved versions of our concentration and anticoncentration results, Propositions 3 and 4, if we could bound the higher norms of a random variable in terms of its $2$-norm. As we’ll see, we can also get stronger “level-$k$ inequalities” by bounding the $(2+\epsilon)$-norm of a boolean function for small $\epsilon > 0$.

We started with the $4$-norm due to the simplicity of the proofs of the Bonami Lemma and the $(2,4)$-Hypercontractivity Theorem. To generalize these results to other norms it’s a bit more elegant to work with the latter. Partly this is because it’s “formally stronger” (as we’ll see later). But the main reason is that the hypercontractivity version alleviates the inelegant issue that being “$B$-reasonable” is not translation-invariant. Thus instead of generalizing the condition that $\|\rho \boldsymbol{X}\|_4 \leq \|\boldsymbol{X}\|_2$ (“$\boldsymbol{X}$ is $\rho^{-4}$-reasonable”) we’ll generalize the condition that $\|a + \rho b \boldsymbol{X}\|_4 \leq \|a + b\boldsymbol{X}\|_2$ (cf. the $n = 1$ case of the $(2,4)$-Hypercontractivity Theorem).

Definition 13Let $1 \leq p \leq q \leq \infty$ and let $0 \leq \rho < 1$. We say that a real random variable $\boldsymbol{X}$ (with $\|\boldsymbol{X}\|_q < \infty$) is$(p,q,\rho)$-hypercontractiveif \[ \|a + \rho b \boldsymbol{X}\|_q \leq \|a + b \boldsymbol{X}\|_p \quad \text{for all constants } a, b \in {\mathbb R}. \]

Remark 14By homogeneity, it suffices to check the condition for $a = 1$, $b \in {\mathbb R}$ or for $a \in {\mathbb R}$, $b = 1$ (exercise). It’s also true (see the exercises) that if $\boldsymbol{X}$ is $(p,q,\rho)$-hypercontractive then it is $(p,q,\rho’)$-hypercontractive for $\rho’ < \rho$ as well.

In the exercises you will show that if $\boldsymbol{X}$ is hypercontractive then $\mathop{\bf E}[\boldsymbol{X}]$ must be $0$. Thus hypercontractivity, like reasonableness, is not a translation-invariant notion. Nevertheless, the fact that the definition involves translation by an arbitrary $a$ greatly facilitates proofs by induction. For example, an elegant property we gain from the definition is the following (which will be an exercise Chapter 10):

Proposition 15Let $\boldsymbol{X}$ and $\boldsymbol{Y}$ be independent $(p,q,\rho)$-hypercontractive random variables. Then $\boldsymbol{X} + \boldsymbol{Y}$ is also $(p,q,\rho)$-hypercontractive.

The $n = 1$ case of our $(2,4)$-Hypercontractivity Theorem precisely says that a single uniformly random $\pm 1$ bit ${\boldsymbol{x}}$ is $(2,4,1/\sqrt{3})$-hypercontractive; the $(4/3,2)$-Hypercontractivity Theorem says that ${\boldsymbol{x}}$ is also $(4/3,2,1/\sqrt{3})$-hypercontractive. We’ll spend the remainder of this section generalizing these facts to $(2,q,\rho)$- and $(p,2,\rho)$-hypercontractivity for other values of $p$ and $q$. We remark that in our study of hypercontractivity we’ll focus mainly on the cases of $p = 2$ or $q = 2$. The study of hypercontractivity for $p, q \neq 2$ and for random variables other than uniform $\pm 1$ bits is deferred to Chapter 10. We now consider hypercontractivity of a uniformly random $\pm 1$ bit ${\boldsymbol{x}}$. We know that ${\boldsymbol{x}}$ is $(2,q,1/\sqrt{3})$-hypercontractive for $q = 4$; what about other values of $q$? Things are most pleasant when $q$ is an even integer because then you don’t need to take the absolute value when computing $\|a + \rho b \boldsymbol{X}\|_q$. So let’s try $q = 6$.

Proposition 16For ${\boldsymbol{x}}$ a uniform $\pm 1$ bit, we have $\|a + \rho b {\boldsymbol{x}}\|_6 \leq \|a + b {\boldsymbol{x}}\|_2$ for all $a, b \in {\mathbb R}$ if (and only if) $\rho \leq 1/\sqrt{5}$. I.e., ${\boldsymbol{x}}$ is $(2,6,1/\sqrt{5})$-hypercontractive.

*Proof:* Raising the inequality to the $6$th power, we need to show \begin{equation} \label{eqn:2-6-hypercon1} \mathop{\bf E}[(a+\rho b {\boldsymbol{x}})^6] \leq \mathop{\bf E}[(a+b {\boldsymbol{x}})^2]^3. \end{equation} The result is trivial when $a = 0$; otherwise, we may assume $a = 1$ by homogeneity. We expand both quantities inside expectations and use the fact that $\mathop{\bf E}[{\boldsymbol{x}}^k]$ is $0$ when $k$ is odd and $1$ when $k$ is even. Thus \eqref{eqn:2-6-hypercon1} is equivalent to \begin{equation} \label{eqn:2-6-hypercon2} 1+ 15 \rho^2 b^2 + 15 \rho^4 b^4 + \rho^6 b^6 \leq (1 + b^2)^3 = 1 + 3b^2 + 3b^4 + b^6. \end{equation} Comparing the two sides term-by-term we see that the coefficient on $b^2$ is the limiting factor: in order for \eqref{eqn:2-6-hypercon2} to hold for all $b \in {\mathbb R}$ it is sufficient that $15 \rho^2 \leq 3$; i.e., $\rho \leq 1/\sqrt{5}$. By considering $b \to 0$ it’s also easy to see that this condition is necessary. $\Box$

If you repeat this analysis for the case of $q = 8$ you’ll find that again the limiting factor is the coefficient on $b^2$, and that ${\boldsymbol{x}}$ is $(2,8,\rho)$-hypercontractive if (and only if) $\binom{8}{2} \rho^2 \leq \binom{4}{1}$; i.e., $\rho \leq 1/\sqrt{7}$. In light of this it is natural to guess that the following is true:

Theorem 17Let ${\boldsymbol{x}}$ be a uniform $\pm 1$ bit and let $q \in (2, \infty]$. Then $\|a + \rho b {\boldsymbol{x}} \|_q \leq \|a+b {\boldsymbol{x}}\|_2$ for all $a,b \in {\mathbb R}$ assuming $\rho \leq 1/\sqrt{q-1}$.Equivalent statements are that $\|a + (1/\sqrt{q-1}) b {\boldsymbol{x}} \|_q^2 \leq a^2 + b^2$, that ${\boldsymbol{x}}$ is $(2,q,1/\sqrt{q-1})$-hypercontractive, and that $\|\mathrm{T}_{1/\sqrt{q-1}} f\|_q \leq \|f\|_2$ holds for any $f : \{-1,1\} \to {\mathbb R}$.

For $q$ an even integer it is not hard to prove Theorem 17 just as we did for $q = 6$ (see the exercises). Indeed, the proof works even under more general moment conditions on ${\boldsymbol{x}}$, as in Corollary 6. Unfortunately, obtaining Theorem 17 for all real $q > 2$ takes some more tricks. A natural idea is to try forging ahead as in Proposition 16, using the series expansions for $(1+\rho b x)^q$ and $(1+b^2)^{q/2}$ provided by the Generalized Binomial Theorem. However even when $|b| < 1$ (so that convergence is not an issue) there is a difficulty because the coefficients in the expansion of $(1+b^2)^{q/2}$ are sometimes negative.

Luckily, this issue of negative coefficients in the series expansion goes away if you try to prove the analogous $(p,2,\rho)$-hypercontractivity statement. Thus the slick proof of Theorem 17 proceeds by first proving that statement, then “flipping the norms across $2$”.

Theorem 18Let ${\boldsymbol{x}}$ be a uniform $\pm 1$ bit and let $1 \leq p < 2$. Then $\|a + \rho b {\boldsymbol{x}} \|_2 \leq \|a+b {\boldsymbol{x}}\|_p$ for all $a,b \in {\mathbb R}$ assuming $0 \leq \rho \leq \sqrt{p-1}$. I.e., ${\boldsymbol{x}}$ is $(p,2,\sqrt{p-1})$-hypercontractive.

*Proof:* By Remark 14 we may assume $a = 1$ and $\rho = \sqrt{p-1}$. We may also assume without loss of generality (exercise) that $1+bx \geq 0$ for $x \in \{-1,1\}$; i.e., that $|b| \leq 1$. It then suffices to prove the result for all $|b| < 1$ because the $|b| = 1$ case follows by continuity. Writing $b = \epsilon$ for the sake of intuition, we need to show \begin{align} \|1 + \sqrt{p-1}\cdot \epsilon {\boldsymbol{x}}\|_2^p &\leq \|1+\epsilon{\boldsymbol{x}}\|_p^p \nonumber\\ \quad\Leftrightarrow\quad \mathop{\bf E}[(1+\sqrt{p-1}\cdot \epsilon {\boldsymbol{x}})^2]^{p/2} &\leq \mathop{\bf E}[(1+\epsilon {\boldsymbol{x}})^p]. \label{eqn:hypercon-proveme} \end{align} Here we were able to drop the absolute value on the right-hand side because $|\epsilon| < 1$. The left-hand side of \eqref{eqn:hypercon-proveme} is \begin{equation} \label{eqn:hypercon-in-light} (1+(p-1) \epsilon^2)^{p/2} \leq 1 + \tfrac{p(p-1)}{2} \epsilon^2, \end{equation} where we used the inequality $(1+t)^{\theta} \leq 1+\theta t$ for $t \geq 0$ and $0 \leq \theta \leq 1$ (easily proved by comparing derivatives in $t$). As for the right-hand side of \eqref{eqn:hypercon-proveme}, since $|\epsilon {\boldsymbol{x}}| < 1$ we may use the Generalized Binomial Theorem to show it equals \begin{align*} &\mathop{\bf E}\left[1 + p \epsilon{\boldsymbol{x}} + \tfrac{p(p-1)}{2!} \epsilon^2{\boldsymbol{x}}^2 + \tfrac{p(p-1)(p-2)}{3!} \epsilon^3{\boldsymbol{x}}^3 + \tfrac{p(p-1)(p-2)(p-3)}{4!} \epsilon^4 {\boldsymbol{x}}^4 + \cdots\right] \\ =\ &1 + p \epsilon\mathop{\bf E}[{\boldsymbol{x}}] + \tfrac{p(p-1)}{2!} \epsilon^2\mathop{\bf E}[{\boldsymbol{x}}^2] + \tfrac{p(p-1)(p-2)}{3!} \epsilon^3\mathop{\bf E}[{\boldsymbol{x}}^3] + \tfrac{p(p-1)(p-2)(p-3)}{4!} \epsilon^4\mathop{\bf E}[{\boldsymbol{x}}^4] + \cdots \\ =\ & 1 + \tfrac{p(p-1)}{2} \epsilon^2 + \tfrac{p(p-1)(p-2)(p-3)}{4!} \epsilon^4 + \tfrac{p(p-1)(p-2)(p-3)(p-4)(p-5)}{6!} \epsilon^6 + \cdots. \end{align*} In light of \eqref{eqn:hypercon-in-light}, to verify \eqref{eqn:hypercon-proveme} it suffices to note that each “post-quadratic” term above, \[ \tfrac{p(p-1)(p-2)(p-3) \cdots (p-(2k+1))}{(2k)!}\epsilon^{2k}, \] is nonnegative. This follows from $1 \leq p \leq 2$: the numerator has two positive factors and an even number of negative factors. $\Box$

To deduce Theorem 17 from Theorem 18 we again just need to flip the norms across $2$ using the fact that $\mathrm{T}_\rho$ is self-adjoint. This is accomplished by taking $\Omega = \{-1,1\}$, $\pi = \pi_{1/2}$, $q = 2$, and $T = \mathrm{T}_{\sqrt{p-1}}$ in the following proposition (and noting that $1/\sqrt{p’-1} = \sqrt{p – 1}$):

Proposition 19Let $T$ be a self-adjoint operator on $L^2(\Omega, \pi)$, let $1 \leq p,\,q \leq \infty$, and let $p’,\,q’$ be their conjugate Hölder indices. Assume $\|T f\|_q \leq \|f\|_p$ for all $f$. Then $\|T g\|_{p’} \leq \|g\|_{q’}$ for all $g$.

*Proof:* This follows from \[ \|T g\|_{p'} = \sup_{\|f\|_p = 1} \langle f, Tg \rangle = \sup_{\|f\|_p = 1} \langle Tf, g \rangle \leq \sup_{\|f\|_p = 1} \|Tf\|_q \|g\|_{q'} \leq \|g\|_{q'}, \] where the first equality is the sharpness of Hölder’s inequality, the second equality holds because $T$ is self-adjoint, the third inequality is Hölder’s, and the final inequality uses the hypothesis $\|T f\|_{q} \leq \|f\|_p$. $\Box$

At this point we have established that if ${\boldsymbol{x}}$ is a uniform $\pm 1$ bit then it is $(2,q,1/\sqrt{q-1})$-hypercontractive and $(p,2,\sqrt{p-1})$-hypercontractive. In the next section we will give a very simple induction which transforms these facts into the full $(2,q)$- and $(p,2)$-Hypercontractivity Theorems stated at the beginning of the chapter.

Hi Ryan,

Thanks for the notes. Great as usual !

Assume a (real-valued) random variable $X$ and a Markov operator $T$ satisfy a $(p,2,\sqrt{p-1})$ and $(2,q,1/\sqrt{q-1})$ hypercontractivity. I think this information by itself would be sufficient to prove some $\|T\|_{p’\rightarrow q’}$ hypercontractive theorems for any $p’ \in [1,\infty)$ say, using interpolation theorems. However, I don’t know whether the optimal $(p,q, \sqrt{(p-1)/(q-1) )$ hypercontractivity of Bernoulli random variables can be proved just by relaying on optimal $(2,q)$ and $(p,2)$ statements and interpolation, H\”{o}lder, etc. Could someone (or Ryan) tell me what’s the answer here?

In general hypercontractive statement are rather geometric statements which might be much harder to establish when dealing with a more complicated setting than Bernoulli RV. (Say you want to prove hypercontracivity for a RW on a rather complicated graph.) So I would really like to know how much purely analytic statements (such as interpolation of operators, etc) and some modest $(2,q)$ hypercontractivity can take you. (for example, $(2,q)$ hypercontracvity seem to me easier to establish when you have some estimates of eigenvalues. But more general hypercontractive statements seem harder to me to prove.)

Mohammad

Hi Mohammad. Yes, one can use Riesz-Thorin interpolation

http://en.wikipedia.org/wiki/Riesz%E2%80%93Thorin_theorem

to deduce some hypercontractivity results from others; e.g., if you prove (2,q)-hypercontractivity for all even integers q you can also deduce it for all real $q \geq 2$, but not with an optimal constant. Bonami herself pointed this out in her paper.

But as I said, I think reaching optimal constants in this way is doubtful. On the other hand, most of the time it’s not too important to have optimal constants.

Thanks for the question!

A possible typo: In the second line of the equation after equation $\eqref{eqn:hypercon-in-light}$, the terms are separated by alternating signs; should those all be plus signs?

Yep, that’s a mistake. Fixed, thanks!

In the first line of the proof of Thm 18, did you mean we may WOLOG assume that 1 + bx >= 0 (without absolute value signs)?

Yes, thanks!

PS: please send me an email reminding me of that AOBF problem you asked me at the last theory lunch

In the proof of Thm 18, you said “(exercise)”, a fitting exercise is missing from the chapter exercises. Is it intentional?

Actually, exercise 7 is the intended exercise. (The book will eventually have correct pointers to exercises; due to the way I transform TeX into HTML it was too complicated to get correct exercise-references on the blog.) Thanks though!