§10.2: Hypercontractivity of general random variables

Let’s now study hypercontractivity for general random variables. By the end of this section we will have proved the General Hypercontractivity Theorem stated at the beginning of the chapter.

Recall Definition 9.13 which says that $\boldsymbol{X}$ is $(p,q,\rho)$-hypercontractive if $\mathop{\bf E}[|\boldsymbol{X}|^q] < \infty$ and \[ \|a + \rho b \boldsymbol{X}\|_q \leq \|a + b \boldsymbol{X}\|_p \quad \text{for all constants } a, b \in {\mathbb R}. \] (By homogeneity, it’s sufficient to check this either with $a$ fixed to $1$ or with $b$ fixed to $1$.) Let’s also collect some additional basic facts regarding the concept:

Fact 7 Suppose $\boldsymbol{X}$ is $(p,q,\rho)$-hypercontractive ($1 \leq p \leq q \leq \infty$, $0 \leq \rho < 1$). Then:

  1. $\mathop{\bf E}[\boldsymbol{X}] = 0$ (Exercise 9.10).
  2. $c\boldsymbol{X}$ is $(p,q,\rho)$-hypercontractive for any $c \in {\mathbb R}$ (Exercise 9.9).
  3. $\boldsymbol{X}$ is $(p,q,\rho')$-hypercontractive for any $0 \leq \rho' < \rho$ (Exercise 9.11).
  4. $\rho \leq \sqrt{\frac{p-1}{q-1}}$ and $\rho \leq \frac{\|\boldsymbol{X}\|_p}{\|\boldsymbol{X}\|_q}$ (Exercises 9.10, 9.9).
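As a small numerical sanity check of these facts (a Python sketch, not part of the text): for the uniformly random bit $\boldsymbol{r} \sim \{-1,1\}$ the first bound in Fact 7(4) with $p = 2$ is attained, i.e., $\boldsymbol{r}$ is $(2,q,\rho)$-hypercontractive precisely for $\rho \leq 1/\sqrt{q-1}$ (the Two-Point Inequality from Section 1). The grid of test values below is of course only illustrative.

```python
import math

# Sanity check (illustrative): the uniform bit r ~ {-1,1} is
# (2,q,rho)-hypercontractive exactly at rho = 1/sqrt(q-1),
# matching the first bound in Fact 7(4) with p = 2.

def norm(dist, s):
    """s-norm of a discrete random variable given as [(value, prob), ...]."""
    return sum(p * abs(v) ** s for v, p in dist) ** (1 / s)

for q in (3.0, 4.0, 6.0):
    rho = 1 / math.sqrt(q - 1)
    for b in [k / 20 for k in range(1, 101)]:          # b in (0, 5]
        lhs = norm([(1 + rho * b, 0.5), (1 - rho * b, 0.5)], q)
        rhs = math.sqrt(1 + b * b)                     # ||1 + b*r||_2
        assert lhs <= rhs + 1e-12
    # just above 1/sqrt(q-1), the inequality already fails for small b:
    bad, b = 1.05 * rho, 0.05
    lhs = norm([(1 + bad * b, 0.5), (1 - bad * b, 0.5)], q)
    assert lhs > math.sqrt(1 + b * b)
print("uniform bit: rho = 1/sqrt(q-1) is tight")
```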

Proposition 8 Let $\boldsymbol{X}$ be $(2,q,\rho)$-hypercontractive. Then $\boldsymbol{X}$ is also $(q',2,\rho)$-hypercontractive, where $q'$ is the conjugate Hölder index of $q$.

Proof: The deduction is essentially the same as inequality (3) from Chapter 9.2. Since $\mathop{\bf E}[\boldsymbol{X}] = 0$ (by the above Fact) we have \[ \|a + \rho b\boldsymbol{X}\|_2^2 = \mathop{\bf E}[a^2 + 2\rho ab \boldsymbol{X} + \rho^2 b^2 \boldsymbol{X}^2] = \mathop{\bf E}[(a+b\boldsymbol{X})(a+\rho^2 b\boldsymbol{X})]. \] By Hölder’s inequality and then the $(2,q,\rho)$-hypercontractivity of $\boldsymbol{X}$ this is at most \[ \|a+b\boldsymbol{X}\|_{q'} \|a+\rho^2b\boldsymbol{X}\|_q \leq \|a+b\boldsymbol{X}\|_{q'} \|a+\rho b\boldsymbol{X}\|_2. \] Dividing through by $\|a+\rho b \boldsymbol{X}\|_2$ (which can’t be $0$ unless $\boldsymbol{X} \equiv 0$) gives $\|a+\rho b \boldsymbol{X}\|_2 \leq \|a+b\boldsymbol{X}\|_{q'}$ as needed. $\Box$
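Proposition 8 can be spot-checked numerically (an illustrative Python sketch, not part of the text). The uniform bit $\boldsymbol{r} \sim \{-1,1\}$ is $(2,4,1/\sqrt{3})$-hypercontractive by the Two-Point Inequality from Section 1, so the proposition predicts it is also $(4/3,2,1/\sqrt{3})$-hypercontractive; by homogeneity it suffices to test $(a,b)$ on the unit circle.

```python
import math

# Spot-check of Proposition 8: r ~ {-1,1} is (2,4,1/sqrt(3))-hypercontractive,
# so it should also be (4/3, 2, 1/sqrt(3))-hypercontractive.

def norm(dist, s):
    """s-norm of a discrete random variable given as [(value, prob), ...]."""
    return sum(p * abs(v) ** s for v, p in dist) ** (1 / s)

rho = 1 / math.sqrt(3)
for k in range(720):
    theta = math.pi * k / 360
    a, b = math.cos(theta), math.sin(theta)
    lhs = math.sqrt(a * a + rho * rho * b * b)        # ||a + rho*b*r||_2
    rhs = norm([(a + b, 0.5), (a - b, 0.5)], 4 / 3)   # ||a + b*r||_{4/3}
    assert lhs <= rhs + 1e-12
print("(4/3, 2, 1/sqrt(3))-hypercontractivity verified on a grid")
```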

Remark 9 The converse does not hold; see the exercises.

Remark 10 As mentioned in Proposition 9.15, the sum of independent hypercontractive random variables is equally hypercontractive. Furthermore, low-degree polynomials of independent hypercontractive random variables are “reasonable”. See the exercises.

Given $\boldsymbol{X}$, $p$, and $q$, computing the largest $\rho$ for which $\boldsymbol{X}$ is $(p,q,\rho)$-hypercontractive can often be quite a chore. However if you’re not overly concerned about constant factors then things become much easier. Let’s focus on the most useful case, $p = 2$ and $q > 2$. By Fact 7(2) we may assume $\|\boldsymbol{X}\|_2 = 1$. Then we can ask:

Question Let $\mathop{\bf E}[\boldsymbol{X}] = 0$, $\|\boldsymbol{X}\|_2 = 1$, and assume $\|\boldsymbol{X}\|_q < \infty$. For what $\rho$ is $\boldsymbol{X}$ $(2,q,\rho)$-hypercontractive?

In this section we’ll answer the question by showing that $\rho = \Theta_q(1/\|\boldsymbol{X}\|_q)$ is sufficient. By the second part of Fact 7(4), $\rho \leq 1/\|\boldsymbol{X}\|_q$ is also necessary. So for a mean-zero random variable $\boldsymbol{X}$, the largest $\rho$ for which $\boldsymbol{X}$ is $(2,q,\rho)$-hypercontractive is always within a constant (depending only on $q$) of $\frac{\|\boldsymbol{X}\|_2}{\|\boldsymbol{X}\|_q}$.

Let’s arrive at this result in steps, introducing the useful techniques of symmetrization and randomization along the way. When studying hypercontractivity of a random variable $\boldsymbol{X}$, things are much more convenient if $\boldsymbol{X}$ is a symmetric random variable, meaning $-\boldsymbol{X}$ has the same distribution as $\boldsymbol{X}$. One advantage of symmetric random variables $\boldsymbol{X}$ is that they have $\mathop{\bf E}[\boldsymbol{X}^k] = 0$ for all odd $k \in {\mathbb N}$. Using this it is easy to prove (exercise) the following fact, an analogue of Corollary 9.6; the proof is similar to that of Proposition 9.16.

Proposition 11 Let $\boldsymbol{X}$ be a symmetric random variable with $\|\boldsymbol{X}\|_2 = 1$. Assume $\|\boldsymbol{X}\|_4 = C$ (hence $\boldsymbol{X}$ is “$C^4$-reasonable”). Then $\boldsymbol{X}$ is $(2,4,\rho)$-hypercontractive if and only if $\rho \leq \min(\frac{1}{\sqrt{3}}, \frac{1}{C})$.
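Here is a numerical illustration of Proposition 11 (a Python sketch; the three-valued $\boldsymbol{X}$ below is just an example, not from the text): take $\boldsymbol{X} = \pm 1/\sqrt{p}$ with probability $p/2$ each and $0$ otherwise, so that $\|\boldsymbol{X}\|_2 = 1$ and $C^4 = \mathop{\bf E}[\boldsymbol{X}^4] = 1/p$. The grid check confirms that $\rho = \min(1/\sqrt{3}, 1/C)$ works while anything slightly larger fails.

```python
import math

def norm(dist, s):
    """s-norm of a discrete random variable given as [(value, prob), ...]."""
    return sum(p * abs(v) ** s for v, p in dist) ** (1 / s)

def ok(dist, rho, q=4.0, pts=720):
    # grid test of ||a + rho*b*X||_q <= ||a + b*X||_2 on the unit circle
    # (by homogeneity, the unit circle covers all (a, b))
    for k in range(pts):
        t = math.pi * k / pts
        a, b = math.cos(t), math.sin(t)
        lhs = norm([(a + rho * b * v, pr) for v, pr in dist], q)
        rhs = norm([(a + b * v, pr) for v, pr in dist], 2)
        if lhs > rhs + 1e-12:
            return False
    return True

for p in (0.1, 0.9):
    t = 1 / math.sqrt(p)
    X = [(t, p / 2), (-t, p / 2), (0.0, 1 - p)]
    C = norm(X, 4)                       # here C**4 == 1/p
    rho_star = min(1 / math.sqrt(3), 1 / C)
    assert ok(X, rho_star)               # holds at the threshold...
    assert not ok(X, 1.02 * rho_star)    # ...and fails just above it
print("Proposition 11 threshold confirmed on examples")
```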

Given a symmetric random variable $\boldsymbol{X}$, the randomization trick is to replace $\boldsymbol{X}$ by the identically distributed random variable $\boldsymbol{r}\boldsymbol{X}$, where $\boldsymbol{r} \sim \{-1,1\}$ is an independent uniformly random bit. This trick sometimes lets you reduce a probabilistic statement about $\boldsymbol{X}$ to a related one about $\boldsymbol{r}$.

Theorem 12 Let $\boldsymbol{X}$ be a symmetric random variable with $\|\boldsymbol{X}\|_2 = 1$ and let $\|\boldsymbol{X}\|_q = C$, where $q > 2$. Then $\boldsymbol{X}$ is $(2, q, \rho)$-hypercontractive for $\rho = \frac{1}{C\sqrt{q-1}}$.

Proof: Let $\boldsymbol{r} \sim \{-1,1\}$ be uniformly random and let $\widetilde{\boldsymbol{X}}$ denote $\boldsymbol{X}/C$. Then for any $a \in {\mathbb R}$, \begin{align*} \|a + \rho \boldsymbol{X}\|_q^2 & = \|a + \rho \boldsymbol{r} \boldsymbol{X}\|_q^2 \tag{by symmetry of $\boldsymbol{X}$} \\ &= \mathop{\bf E}_{\boldsymbol{X}}\left[ \mathop{\bf E}_{\boldsymbol{r}}[|a+\rho \boldsymbol{r} \boldsymbol{X}|^q]\right]^{2/q}\\ &\leq \mathop{\bf E}_{\boldsymbol{X}}\left[ \mathop{\bf E}_{\boldsymbol{r}}[|a+\tfrac{1}{C}\boldsymbol{r} \boldsymbol{X}|^2]^{q/2}\right]^{2/q} \tag{$\boldsymbol{r}$ is $(2,q,\frac{1}{\sqrt{q-1}})$-hypercontractive} \\ &= \mathop{\bf E}_{\boldsymbol{X}}[(a^2 + \widetilde{\boldsymbol{X}}^2)^{q/2}]^{2/q} \tag{Parseval}\\ &= \|a^2 + \widetilde{\boldsymbol{X}}^2\|_{q/2} \tag{norm with respect to $\boldsymbol{X}$}\\ &\leq a^2 + \|\widetilde{\boldsymbol{X}}^2\|_{q/2} \tag{triangle inequality for $\|\cdot\|_{q/2}$}\\ &= a^2 + \|\widetilde{\boldsymbol{X}}\|_q^2 \\ &= a^2 + 1 = a^2 + \mathop{\bf E}[\boldsymbol{X}^2] = \|a + \boldsymbol{X}\|_2^2, \end{align*} where the last step also used $\mathop{\bf E}[\boldsymbol{X}] = 0$. $\Box$
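Theorem 12 is easy to test numerically as well (illustrative Python, not part of the text; the particular $\boldsymbol{X}$ and the choice $q = 3$ are arbitrary — a non-even exponent exercises the theorem beyond the fourth-moment case of Proposition 11).

```python
import math

# Grid check of Theorem 12: a symmetric three-valued X with ||X||_2 = 1,
# at q = 3 and rho = 1/(C*sqrt(q-1)) with C = ||X||_3.

def norm(dist, s):
    """s-norm of a discrete random variable given as [(value, prob), ...]."""
    return sum(p * abs(v) ** s for v, p in dist) ** (1 / s)

p = 0.2
t = 1 / math.sqrt(p)
X = [(t, p / 2), (-t, p / 2), (0.0, 1 - p)]
q = 3.0
C = norm(X, q)
rho = 1 / (C * math.sqrt(q - 1))
for k in range(720):
    th = math.pi * k / 720           # unit circle covers all (a, b)
    a, b = math.cos(th), math.sin(th)
    lhs = norm([(a + rho * b * v, pr) for v, pr in X], q)
    rhs = norm([(a + b * v, pr) for v, pr in X], 2)
    assert lhs <= rhs + 1e-12
print("Theorem 12 verified on a grid for q = 3")
```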

Next, if $\boldsymbol{X}$ is not symmetric then we can use a symmetrization trick to make it so. One way to do this is to replace $\boldsymbol{X}$ with the symmetric random variable $\boldsymbol{X} - \boldsymbol{X}'$, where $\boldsymbol{X}'$ is an independent copy of $\boldsymbol{X}$. In general $\boldsymbol{X} - \boldsymbol{X}'$ has similar properties to $\boldsymbol{X}$. In particular, if $\mathop{\bf E}[\boldsymbol{X}] =0$ we can compare norms using the following one-sided bound:

Lemma 13 Let $\boldsymbol{X}$ be a random variable satisfying $\mathop{\bf E}[\boldsymbol{X}] = 0$ and $\|\boldsymbol{X}\|_q < \infty$, where $q \geq 1$. Then for any $a \in {\mathbb R}$, \[ \|a + \boldsymbol{X}\|_q \leq \|a + \boldsymbol{X} - \boldsymbol{X}'\|_q, \] where $\boldsymbol{X}'$ denotes an independent copy of $\boldsymbol{X}$.

Proof: We have \[ \|a + \boldsymbol{X}\|_q^q = \mathop{\bf E}[|a + \boldsymbol{X}|^q] = \mathop{\bf E}[|a + \boldsymbol{X} - \mathop{\bf E}[\boldsymbol{X}']|^q], \] where we used $\mathop{\bf E}[\boldsymbol{X}'] = 0$. But since $\boldsymbol{X}'$ is independent of $\boldsymbol{X}$, \[ \mathop{\bf E}[|a + \boldsymbol{X} - \mathop{\bf E}[\boldsymbol{X}']|^q] = \mathop{\bf E}_{\boldsymbol{X}}\bigl[|\mathop{\bf E}_{\boldsymbol{X}'}[a + \boldsymbol{X} - \boldsymbol{X}']|^q\bigr] \leq \mathop{\bf E}_{\boldsymbol{X}}\bigl[\mathop{\bf E}_{\boldsymbol{X}'}[|a + \boldsymbol{X} - \boldsymbol{X}'|^q]\bigr] = \|a + \boldsymbol{X} - \boldsymbol{X}'\|_q^q, \] where the inequality is Jensen's, applied to $\mathop{\bf E}_{\boldsymbol{X}'}$ using convexity of $t \mapsto |t|^q$. $\Box$
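A quick numerical check of Lemma 13 (illustrative Python; the two-valued mean-zero $\boldsymbol{X}$ is an arbitrary example): $\boldsymbol{X} - \boldsymbol{X}'$ is computed by convolving $\boldsymbol{X}$ with $-\boldsymbol{X}'$.

```python
import math

# Check of Lemma 13: X = 2 w.p. 1/5, -1/2 w.p. 4/5 has mean zero;
# compare ||a + X||_q with ||a + X - X'||_q.

X = [(2.0, 0.2), (-0.5, 0.8)]
assert abs(sum(p * v for v, p in X)) < 1e-12           # E[X] = 0

def norm(dist, s):
    """s-norm of a discrete random variable given as [(value, prob), ...]."""
    return sum(p * abs(v) ** s for v, p in dist) ** (1 / s)

# distribution of X - X' (X' an independent copy of X)
diff = [(v - w, p * r) for v, p in X for w, r in X]

q = 3.0
for a in [k / 10 - 5 for k in range(101)]:             # a in [-5, 5]
    lhs = norm([(a + v, p) for v, p in X], q)
    rhs = norm([(a + d, p) for d, p in diff], q)
    assert lhs <= rhs + 1e-12
print("Lemma 13 verified on a grid")
```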

A combination of the randomization and symmetrization tricks is to replace an arbitrary random variable $\boldsymbol{X}$ by $\boldsymbol{r} \boldsymbol{X}$, where $\boldsymbol{r} \sim \{-1,1\}$ is an independent uniformly random bit. This often lets you extend results about symmetric random variables to the case of general mean-zero random variables. For example, the following hypercontractivity lemma lets us reduce to the case of a symmetric random variable while only “spending” a factor of $\tfrac{1}{2}$:

Lemma 14 Let $\boldsymbol{X}$ be a random variable satisfying $\mathop{\bf E}[\boldsymbol{X}] = 0$ and $\|\boldsymbol{X}\|_q < \infty$, where $q \geq 1$. Then for any $a \in {\mathbb R}$, \[ \|a + \tfrac{1}{2} \boldsymbol{X}\|_q \leq \|a + \boldsymbol{r} \boldsymbol{X}\|_q, \] where $\boldsymbol{r} \sim \{-1,1\}$ is an independent uniformly random bit.

Proof: Letting $\boldsymbol{X}'$ be an independent copy of $\boldsymbol{X}$ we have \begin{align*} \|a + \tfrac{1}{2} \boldsymbol{X}\|_q &\leq \|a + \tfrac{1}{2} \boldsymbol{X} - \tfrac{1}{2} \boldsymbol{X}'\|_q \tag{Lemma~13 applied to $\tfrac{1}{2}\boldsymbol{X}$}\\ &= \|a + \boldsymbol{r}(\tfrac{1}{2} \boldsymbol{X} - \tfrac{1}{2} \boldsymbol{X}')\|_q \tag{since $\tfrac{1}{2} \boldsymbol{X} - \tfrac{1}{2} \boldsymbol{X}'$ is symmetric}\\ &= \|\tfrac{1}{2} a + \tfrac{1}{2} \boldsymbol{r}\boldsymbol{X} + \tfrac{1}{2} a - \tfrac{1}{2} \boldsymbol{r} \boldsymbol{X}'\|_q \\ &\leq \|\tfrac{1}{2} a + \tfrac{1}{2} \boldsymbol{r}\boldsymbol{X}\|_q + \|\tfrac{1}{2} a - \tfrac{1}{2} \boldsymbol{r}\boldsymbol{X}'\|_q \tag{triangle inequality for $\|\cdot\|_q$}\\ &= \|\tfrac{1}{2} a + \tfrac{1}{2} \boldsymbol{r}\boldsymbol{X}\|_q + \|\tfrac{1}{2} a + \tfrac{1}{2} \boldsymbol{r}\boldsymbol{X}' \|_q \tag{$-\boldsymbol{r}$ distributed as $\boldsymbol{r}$}\\ &= \|a + \boldsymbol{r}\boldsymbol{X}\|_q. \tag*{$\Box$} \end{align*}
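Lemma 14 can likewise be verified numerically (illustrative Python, same example $\boldsymbol{X}$ as the Lemma 13 check above — an arbitrary mean-zero two-valued random variable):

```python
import math

# Check of Lemma 14: for a mean-zero X, compare ||a + X/2||_q with
# ||a + r*X||_q where r ~ {-1,1} is a uniformly random sign.

X = [(2.0, 0.2), (-0.5, 0.8)]

def norm(dist, s):
    """s-norm of a discrete random variable given as [(value, prob), ...]."""
    return sum(p * abs(v) ** s for v, p in dist) ** (1 / s)

rX = [(s * v, p / 2) for v, p in X for s in (+1, -1)]  # distribution of r*X

q = 3.0
for a in [k / 10 - 5 for k in range(101)]:             # a in [-5, 5]
    lhs = norm([(a + v / 2, p) for v, p in X], q)
    rhs = norm([(a + v, p) for v, p in rX], q)
    assert lhs <= rhs + 1e-12
print("Lemma 14 verified on a grid")
```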

By employing these randomization/symmetrization techniques we obtain a $(2,q)$-hypercontractivity statement for all mean-zero random variables $\boldsymbol{X}$ with $\frac{\|\boldsymbol{X}\|_q}{\|\boldsymbol{X}\|_2}$ bounded, giving a good answer to the above Question:

Theorem 15 Let $\boldsymbol{X}$ satisfy $\mathop{\bf E}[\boldsymbol{X}] = 0$, $\|\boldsymbol{X}\|_2 = 1$, $\|\boldsymbol{X}\|_q = C$, where $q > 2$. Then $\boldsymbol{X}$ is $(2, q, \frac{1}{2} \rho)$-hypercontractive for $\rho = \frac{1}{\sqrt{q-1}\|\boldsymbol{X}\|_q}$. (If $\boldsymbol{X}$ is symmetric then the factor of $\tfrac{1}{2}$ may be omitted.)

Proof: By Lemma 14 we have \[ \|a + \tfrac{1}{2} \rho \boldsymbol{X}\|_q^2 \leq \|a + \rho \boldsymbol{r} \boldsymbol{X}\|_q^2. \] Since $\boldsymbol{r}\boldsymbol{X}$ is a symmetric random variable satisfying $\|\boldsymbol{r}\boldsymbol{X}\|_2 = 1$, $\|\boldsymbol{r}\boldsymbol{X}\|_q = C$, Theorem 12 implies \[ \|a + \rho \boldsymbol{r} \boldsymbol{X}\|_q^2 \leq \|a + \boldsymbol{r} \boldsymbol{X}\|_2^2 = a^2 + 1 = \|a + \boldsymbol{X}\|_2^2. \] This completes the proof. $\Box$
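And here is a direct grid check of Theorem 15 itself (illustrative Python; the non-symmetric two-valued $\boldsymbol{X}$ below, which happens to satisfy $\|\boldsymbol{X}\|_2 = 1$ exactly, is just an example):

```python
import math

# Grid check of Theorem 15: X = 2 w.p. 1/5, -1/2 w.p. 4/5 has E[X] = 0 and
# ||X||_2 = 1 but is not symmetric.  With q = 4 and
# rho = 1/(sqrt(q-1)*||X||_q), test ||a + (rho/2)*b*X||_q <= ||a + b*X||_2.

X = [(2.0, 0.2), (-0.5, 0.8)]

def norm(dist, s):
    """s-norm of a discrete random variable given as [(value, prob), ...]."""
    return sum(p * abs(v) ** s for v, p in dist) ** (1 / s)

q = 4.0
rho = 1 / (math.sqrt(q - 1) * norm(X, q))
for k in range(720):
    th = math.pi * k / 720               # unit circle covers all (a, b)
    a, b = math.cos(th), math.sin(th)
    lhs = norm([(a + 0.5 * rho * b * v, p) for v, p in X], q)
    rhs = math.sqrt(a * a + b * b)       # ||a + b*X||_2, since ||X||_2 = 1
    assert lhs <= rhs + 1e-12
print("Theorem 15 verified on a grid")
```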

If $\boldsymbol{X}$ is a discrete random variable then instead of computing $\frac{\|\boldsymbol{X}\|_2}{\|\boldsymbol{X}\|_q}$ it can sometimes be convenient to use a bound based on the minimum value of $\boldsymbol{X}$’s probability mass function. The following is a simple generalization of Proposition 9.5, whose proof is left for the exercises:

Proposition 16 Let $\boldsymbol{X}$ be a discrete random variable with probability mass function $\pi$. Write \[ \lambda = \min(\pi) = \min_{x \in \mathrm{range}(\boldsymbol{X})}\{\mathop{\bf Pr}[\boldsymbol{X} = x]\}. \] Then for any $q > 2$ we have $\|\boldsymbol{X}\|_q \leq (1/\lambda)^{1/2 - 1/q} \cdot \|\boldsymbol{X}\|_2$.

As a consequence of Theorem 15, if in addition $\mathop{\bf E}[\boldsymbol{X}] = 0$ then $\boldsymbol{X}$ is $(2, q, \tfrac{1}{2} \rho)$-hypercontractive for $\rho = \frac{1}{\sqrt{q-1}} \cdot \lambda^{1/2 - 1/q}$, and also $(q',2,\tfrac{1}{2} \rho)$-hypercontractive by Proposition 8. (If $\boldsymbol{X}$ is symmetric then the factor of $\tfrac{1}{2}$ may be omitted.)
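The norm bound in Proposition 16 is easy to check on examples (illustrative Python; the two distributions below are arbitrary):

```python
import math

# Check of Proposition 16: for a discrete X whose least probability mass
# is lam, ||X||_q <= (1/lam)**(1/2 - 1/q) * ||X||_2.

def norm(dist, s):
    """s-norm of a discrete random variable given as [(value, prob), ...]."""
    return sum(p * abs(v) ** s for v, p in dist) ** (1 / s)

examples = [
    [(3.0, 0.2), (-1.0, 0.3), (0.5, 0.5)],
    [(10.0, 0.01), (-0.1, 0.99)],        # nearly tight when lam is small
]
for X in examples:
    lam = min(p for _, p in X)
    for q in (3.0, 4.0, 6.0):
        assert norm(X, q) <= (1 / lam) ** (0.5 - 1 / q) * norm(X, 2) + 1e-12
print("Proposition 16 bound verified on examples")
```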

For each $q > 2$, the value $\rho = \Theta_q(\lambda^{1/2 - 1/q})$ obtained from Proposition 16 has the optimal dependence on $\lambda$, up to a constant. In fact, a perfectly sharp version of this hypercontractivity bound is known. The most important case is when $\boldsymbol{X}$ is a $\lambda$-biased bit; more precisely, when $\boldsymbol{X} = \phi({\boldsymbol{x}}_i)$ for ${\boldsymbol{x}}_i \sim \pi_\lambda$ in the notation of Definition 8.39. In that case, the theorem below (whose very technical proof is left to the exercises) is due to Latała and Oleszkiewicz [LO94]. The case of general discrete random variables follows via a reduction to the two-valued case due to Wolff [Wol07].

Theorem 17 Let $\boldsymbol{X}$ be a mean-zero discrete random variable and let $\lambda < 1/2$ be the least value of its probability mass function, as in Proposition 16. Then for $q > 2$ it holds that $\boldsymbol{X}$ is $(2,q, \rho)$-hypercontractive and $(q',2,\rho)$-hypercontractive for \begin{multline} \label{eqn:LO-bound} \rho = \sqrt{\frac{\exp(u/q) - \exp(-u/q)}{\exp(u/q')- \exp(-u/q')}} = \sqrt{\frac{\sinh(u/q)}{\sinh(u/q')}}, \\
\text{ with $u$ defined by } \exp(-u) = \tfrac{\lambda}{1-\lambda}. \end{multline} This value of $\rho$ is optimal, even under the assumption that $\boldsymbol{X}$ is two-valued.

Remark 18 It’s not hard to see that for $\lambda \to 1/2$ (hence $u \to 0$) we get $\rho \to \sqrt{\frac{1/q - (-1/q)}{1/q' - (-1/q')}} = \frac{1}{\sqrt{q-1}}$, consistent with the Two-Point Inequality from Section 1. Also, for $\lambda \to 0$ (hence $u \to \infty$) we get $\rho \sim \sqrt{\frac{\lambda^{-1/q}}{\lambda^{-1/q'}}} = \lambda^{1/2 - 1/q}$, showing that Proposition 16 is sharp up to a constant. In the exercises you are asked to investigate the function defining $\rho$ in \eqref{eqn:LO-bound} more carefully. In particular, you’ll show that $\rho \geq \frac{1}{\sqrt{q-1}} \cdot \lambda^{1/2 - 1/q}$ holds for all $\lambda$. Hence we can omit the factor of $\tfrac{1}{2}$ from the simpler bound in Proposition 16 even for non-symmetric random variables.
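The two limiting behaviors and the lower bound described in Remark 18 can be explored numerically (illustrative Python; the grids are arbitrary):

```python
import math

# Numerical exploration of the Latala-Oleszkiewicz bound:
# rho(lam, q) = sqrt(sinh(u/q) / sinh(u/q')), where exp(-u) = lam/(1-lam).

def rho_LO(lam, q):
    qp = q / (q - 1)                      # conjugate Holder index q'
    u = math.log((1 - lam) / lam)
    return math.sqrt(math.sinh(u / q) / math.sinh(u / qp))

for q in (3.0, 4.0, 6.0):
    # lam -> 1/2 recovers the Two-Point Inequality value 1/sqrt(q-1):
    assert abs(rho_LO(0.4999999, q) - 1 / math.sqrt(q - 1)) < 1e-6
    # and rho >= lam**(1/2 - 1/q) / sqrt(q-1), as claimed in Remark 18:
    for k in range(1, 500):
        lam = k / 1000                    # lam in (0, 1/2)
        assert rho_LO(lam, q) >= lam ** (0.5 - 1 / q) / math.sqrt(q - 1)
print("Latala-Oleszkiewicz formula checks out")
```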

Corollary 19 Let $(\Omega, \pi)$ be a finite probability space, $|\Omega| \geq 2$, in which every outcome has probability at least $\lambda$. Let $f \in L^2(\Omega, \pi)$. Then for any $q > 2$ and $0 \leq \rho \leq \frac{1}{\sqrt{q-1}} \cdot \lambda^{1/2-1/q}$, \[ \|\mathrm{T}_\rho f\|_q \leq \|f\|_2 \quad\text{and}\quad \|\mathrm{T}_\rho f\|_2 \leq \|f\|_{q'}. \]

Proof: Recalling Chapter 8.3, this follows from the decomposition $f = f^{=\emptyset} + f^{=\{1\}}$, under which $\mathrm{T}_\rho f = f^{=\emptyset} + \rho f^{=\{1\}}$. Note that for ${\boldsymbol{x}} \sim \pi$ the random variable $f^{=\{1\}}({\boldsymbol{x}})$ has mean zero, and the least value of its probability mass function is at least $\lambda$; the result then follows from Theorem 17 together with the bound $\rho \geq \frac{1}{\sqrt{q-1}} \cdot \lambda^{1/2 - 1/q}$ noted in Remark 18. $\Box$

The General Hypercontractivity Theorem stated at the beginning of the chapter now follows by applying the Hypercontractivity Induction Theorem from Section 1.
