- Let $\boldsymbol{X}$ be a random variable and let $1 \leq r \leq \infty$. Recall that the triangle (Minkowski) inequality implies that for real-valued functions $f_1, f_2$, \[ \|f_1(\boldsymbol{X}) + f_2(\boldsymbol{X})\|_r \leq \|f_1(\boldsymbol{X})\|_r + \|f_2(\boldsymbol{X})\|_r. \] More generally, if $w_1, \dots, w_m$ are nonnegative reals and $f_1, \dots, f_m$ are real functions then \[ \|w_1 f_1(\boldsymbol{X}) + \cdots + w_m f_m(\boldsymbol{X})\|_r \leq w_1 \|f_1(\boldsymbol{X})\|_r + \cdots + w_m \|f_m(\boldsymbol{X})\|_r. \] Still more generally, if $\boldsymbol{Y}$ is a random variable independent of $\boldsymbol{X}$ and $f(\boldsymbol{X},\boldsymbol{Y})$ is a (measurable) real-valued function then it holds that \[ \bigl\| \mathop{\bf E}_{\boldsymbol{Y}}[f(\boldsymbol{X},\boldsymbol{Y})] \bigr\|_{r, \boldsymbol{X}} \leq \mathop{\bf E}_{\boldsymbol{Y}}[\|f(\boldsymbol{X},\boldsymbol{Y})\|_{r,\boldsymbol{X}}]. \] Using this last fact, show that whenever $0 < p \leq q \leq \infty$, \[ \left\|\ \| f(\boldsymbol{X},\boldsymbol{Y})\|_{p,\boldsymbol{Y}}\ \right\|_{q, \boldsymbol{X}} \leq \left\|\ \| f(\boldsymbol{X},\boldsymbol{Y})\|_{q,\boldsymbol{X}}\ \right\|_{p, \boldsymbol{Y}}. \] (Hint: raise the inequality to the power of $p$ and use $r = q/p$.)
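A numerical sanity check of this last inequality (not a proof): tabulate $f$ on a grid with $\boldsymbol{X}$ indexing rows and $\boldsymbol{Y}$ indexing columns, both uniform on small finite sets; the helper `norm` is an assumed name.

```python
import numpy as np

rng = np.random.default_rng(0)

def norm(a, r, axis):
    # L^r norm with respect to the uniform distribution along `axis`
    return np.mean(np.abs(a) ** r, axis=axis) ** (1.0 / r)

# f(X, Y) tabulated on a grid: uniform X (rows) and uniform Y (columns)
f = rng.standard_normal((8, 8))

for p, q in [(0.5, 1.0), (1.0, 2.0), (1.5, 4.0)]:       # 0 < p <= q
    lhs = norm(norm(f, p, axis=1), q, axis=0)   # || ||f||_{p,Y} ||_{q,X}
    rhs = norm(norm(f, q, axis=0), p, axis=0)   # || ||f||_{q,X} ||_{p,Y}
    assert lhs <= rhs + 1e-12
```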
- The goal of this exercise is to prove Proposition 9.15: if $\boldsymbol{X}$ and $\boldsymbol{Y}$ are independent $(p,q,\rho)$-hypercontractive random variables then so is $\boldsymbol{X}+\boldsymbol{Y}$. Let $a, b \in {\mathbb R}$.
- First obtain \[ \|a+\rho b(\boldsymbol{X}+\boldsymbol{Y})\|_{q, \boldsymbol{X},\boldsymbol{Y}} \leq \left\|\ \|a+\rho b \boldsymbol{X} + b\boldsymbol{Y}\|_{p, \boldsymbol{Y}}\ \right\|_{q,\boldsymbol{X}}. \]
- Next, upper-bound this by \[ \left\|\ \|a+b\boldsymbol{Y} + \rho b \boldsymbol{X}\|_{q, \boldsymbol{X}}\ \right\|_{p,\boldsymbol{Y}}. \] (Hint: Exercise 1.)
- Finally, upper-bound this by \[ \left\|\ \|a+b\boldsymbol{Y} + b \boldsymbol{X}\|_{p, \boldsymbol{X}}\ \right\|_{p,\boldsymbol{Y}} = \|a+b(\boldsymbol{X}+\boldsymbol{Y})\|_{p,\boldsymbol{X},\boldsymbol{Y}}. \]
- Let $\boldsymbol{X}_1, \dots, \boldsymbol{X}_n$ be independent $(p,q,\rho)$-hypercontractive random variables. Let $F(x) = \sum_{S \subseteq [n]} \widehat{F}(S)x^S$ be an $n$-variate multilinear polynomial. Define formally the multilinear polynomial $\mathrm{T}_\rho F(x) = \sum_{S \subseteq [n]} \rho^{|S|}\widehat{F}(S)x^S$. The goal of this exercise is to show \begin{equation} \label{eqn:poly-hc-hc} \|\mathrm{T}_\rho F(\boldsymbol{X}_1, \dots, \boldsymbol{X}_n)\|_q \leq \|F(\boldsymbol{X}_1, \dots, \boldsymbol{X}_n)\|_p. \end{equation} Note that this result yields an alternative deduction of the Hypercontractivity Theorem for $\pm 1$ bits from the Two-Point Inequality. A (notationally intense) generalization of this exercise can also be used as an alternative inductive strategy for deducing the General Hypercontractivity Theorem from Proposition 16 or Theorem 17.
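Before proving \eqref{eqn:poly-hc-hc}, it can be checked numerically in the model case of uniform $\pm 1$ bits, which (by the Two-Point Inequality) are $(p,q,\rho)$-hypercontractive for $\rho = \sqrt{(p-1)/(q-1)}$; a brute-force sketch over $\{-1,1\}^n$:

```python
import itertools, math
import numpy as np

rng = np.random.default_rng(1)
n = 3
p, q = 2.0, 4.0
rho = math.sqrt((p - 1) / (q - 1))   # uniform +-1 bits are (p,q,rho)-hypercontractive

# random multilinear polynomial F(x) = sum_S Fhat(S) x^S
coeffs = {S: rng.standard_normal() for S in itertools.chain.from_iterable(
    itertools.combinations(range(n), k) for k in range(n + 1))}

def evaluate(c, x):
    return sum(v * math.prod(x[i] for i in S) for S, v in c.items())

def norm(c, r):
    pts = list(itertools.product([-1, 1], repeat=n))
    return (sum(abs(evaluate(c, x)) ** r for x in pts) / len(pts)) ** (1 / r)

T_rho = {S: rho ** len(S) * v for S, v in coeffs.items()}   # attenuate level |S|
assert norm(T_rho, q) <= norm(coeffs, p) + 1e-12
```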
- Why is Exercise 2 a special case of \eqref{eqn:poly-hc-hc}?
- Begin the inductive proof of \eqref{eqn:poly-hc-hc} by showing that the base case $n = 0$ is trivial.
- For the case of general $n$, first establish \[ \|\mathrm{T}_\rho F(\boldsymbol{X})\|_q \leq \left\| \ \|\mathrm{T}'_\rho E(\boldsymbol{X}') + \boldsymbol{X}_n \mathrm{T}'_\rho D(\boldsymbol{X}') \|_{p, \boldsymbol{X}_n} \ \right\|_{q, \boldsymbol{X}'}, \] where we are using the notation $x' = (x_1, \dots, x_{n-1})$, $F(x) = E(x') + x_n D(x')$, and $\mathrm{T}'_\rho$ for the operator acting formally on $(n-1)$-variate multilinear polynomials.
- Complete the inductive step, using steps similar to Exercises 2(b),(c). (Hint: for $X_n$ a real constant, why is $\mathrm{T}'_\rho E(\boldsymbol{X}') + X_n \mathrm{T}'_\rho D(\boldsymbol{X}') = \mathrm{T}'_\rho (E + X_n D)(\boldsymbol{X}')$?)

- This exercise is concerned with the possibility of a converse for Proposition 8.
- In our proof of the Two-Point Inequality we used Proposition 9.19 to deduce that a uniform bit ${\boldsymbol{x}} \sim \{-1,1\}$ is $(p,q,\rho)$-hypercontractive if it's $(q',p',\rho)$-hypercontractive. Why can't we use Proposition 9.19 to deduce this for a general random variable $\boldsymbol{X}$?
- For each $1 < p < 2$, exhibit a random variable $\boldsymbol{X}$ which is $(p, 2, \rho)$-hypercontractive (for some $\rho$) but not $(2,p',\rho)$-hypercontractive.
-
- Regarding Remark 2, heuristically justify (in the manner of Exercise 9.24) the following statement: If $A, B \subseteq \{-1,1\}^n$ are concentric Hamming balls with volumes $\exp(-\frac{a^2}{2})$ and $\exp(-\frac{b^2}{2})$ and $\rho a \leq b \leq a$ (where $0 < \rho < 1$) then \[ \mathop{\bf Pr}_{\substack{({\boldsymbol{x}}, \boldsymbol{y}) \\ \text{ $\rho$-correlated}}}[{\boldsymbol{x}} \in A, \boldsymbol{y} \in B] \gtrapprox \exp\left(-\tfrac12 \tfrac{a^2 - 2\rho ab + b^2}{1-\rho^2}\right); \] and further, if $b < \rho a$ then $\mathop{\bf Pr}[{\boldsymbol{x}} \in A, \boldsymbol{y} \in B] \sim \mathop{\bf Pr}[{\boldsymbol{x}} \in A]$. Here you should treat $\rho$ as fixed and $a, b \to \infty$.
- Similarly, heuristically justify that the Reverse Small-Set Expansion Theorem is essentially sharp by considering diametrically opposed Hamming balls.
- The goal of this exercise (and Exercise 7) is to prove the Reverse Hypercontractivity Theorem and its equivalent Two-Function version:
Reverse Hypercontractivity Theorem Let $f : \{-1,1\}^n \to {\mathbb R}^{\geq 0}$ be a nonnegative function and let $-\infty \leq q < p \leq 1$. Then $\|\mathrm{T}_\rho f\|_q \geq \|f\|_p$ for $0 \leq \rho \leq \sqrt{(1-p)/(1-q)}$.
Reverse Two-Function Hypercontractivity Theorem Let $f, g : \{-1,1\}^n \to {\mathbb R}^{\geq 0}$ be nonnegative, let $r, s \leq 0$, and assume $0 \leq \rho \leq \sqrt{rs} \leq 1$. Then \[ \mathop{\bf E}_{{\substack{({\boldsymbol{x}}, \boldsymbol{y}) \\ \text{ $\rho$-correlated}}}}[f({\boldsymbol{x}}) g(\boldsymbol{y})] \geq \|f\|_{1+r} \|g\|_{1+s}. \]
Recall that for $-\infty < p < 0$ and for positive functions $f \in L^2(\Omega, \pi)$ the “norm” $\|f\|_p$ retains the definition $\mathop{\bf E}[f^p]^{1/p}$. (The cases of $p = -\infty$, $p = 0$, and nonnegative functions are defined by appropriate limits; in particular $\|f\|_{-\infty}$ is the minimum of $f$’s values, $\|f\|_0$ is the geometric mean of $f$’s values, and $\|f\|_p$ is $0$ whenever $f$ is not everywhere positive. We also define $p'$ by $\frac1p + \frac1{p'} = 1$, with $0' = 0$.)
The Reverse Two-Function Hypercontractivity Theorem can be thought of as a generalization of the lesser known “reverse Hölder inequality” in the setting of $L^2(\{-1,1\}^n, \pi_{1/2}^{\otimes n})$:
Reverse Hölder inequality Let $f \in L^2(\Omega, \pi)$ be a positive function. Then for any $p < 1$, \[ \|f\|_p = \inf\ \{ \mathop{\bf E}[fg] : g > 0, \|g\|_{p’} = 1\}. \] In particular, for $r < 0$ and $f, g > 0$ we have $\mathop{\bf E}[fg] \geq \|f\|_{1+r}\|g\|_{1+1/r}$.
- Show that to prove these two Reverse Hypercontractivity Theorems it suffices to consider the case of $f, g : \{-1,1\}^n \to {\mathbb R}^+$; i.e., strictly positive functions.
- Show that the Reverse Two-Function Hypercontractivity Theorem is equivalent (via the reverse Hölder inequality) to the Reverse Hypercontractivity Theorem.
- Reduce the Reverse Two-Function Hypercontractivity Theorem to the $n = 1$ case. (Hint: virtually identical to the Two-Function Hypercontractivity Induction.) Further reduce to the following:
Reverse Two-Point Inequality Let $-\infty \leq q < p \leq 1$ and let $0 \leq \rho \leq \sqrt{(1-p)/(1-q)}$. Then $\|\mathrm{T}_\rho f\|_q \geq \|f\|_p$ for any $f : \{-1,1\} \to {\mathbb R}^+$.
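Since everything here lives on a single bit, the Reverse Two-Point Inequality can be probed numerically before proving it; in this sketch `norm` is an assumed helper implementing the extended definition of $\|\cdot\|_p$ above (including negative and zero exponents for positive functions):

```python
import math
import numpy as np

rng = np.random.default_rng(2)

def norm(vals, r):
    # ||f||_r for a positive function, with r possibly negative or zero
    vals = np.asarray(vals, dtype=float)
    if r == 0:                                  # geometric mean
        return math.exp(np.mean(np.log(vals)))
    return np.mean(vals ** r) ** (1.0 / r)

def T(f, rho):
    # T_rho f for f = (f(-1), f(+1)) on a single uniform +-1 bit
    mean = (f[0] + f[1]) / 2
    d = (f[1] - f[0]) / 2
    return (mean - rho * d, mean + rho * d)

for _ in range(100):
    f = tuple(rng.uniform(0.1, 5.0, size=2))    # random positive f
    q, p = -2.0, 0.5                            # q < p <= 1
    rho = math.sqrt((1 - p) / (1 - q))          # the critical rho
    assert norm(T(f, rho), q) >= norm(f, p) - 1e-10
```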
- The goal of this exercise is to prove the Reverse Two-Point Inequality.
- Similar to the non-reverse case, the main effort is proving the inequality assuming that $0 < q < p \leq 1$ and that $\rho = \sqrt{(1-p)/(1-q)}$. Do this by mimicking the proof of the Two-Point Inequality. (Hint: you will need the inequality $(1+t)^\theta \geq 1 + \theta t$ for $\theta \geq 1$ and you will need to show that $\tfrac{j-r}{\sqrt{1-r}}$ is an increasing function of $r$ on $[0,1)$ for all $j \geq 2$.)
- Extend to the case of $0 \leq \rho \leq \sqrt{(1-p)/(1-q)}$. (Hint: use the fact that for any $f : \{-1,1\}^n \to {\mathbb R}^{\geq 0}$ and $-\infty \leq p \leq q \leq \infty$ we have $\|f\|_p \leq \|f\|_q$. You can prove this generalization of Exercise 1.14 by reducing the case of negative $p$ and $q$ to the case of positive $p$ and $q$.)
- Establish the $q = -\infty$ case of the Reverse Two-Point Inequality.
- Show that the cases $-\infty < q < p < 0$ follow by ``duality''. (Hint: like Proposition 9.19 but with the reverse Hölder inequality.)
- Show that the cases $q < 0 < p$ follow by the semigroup property of $\mathrm{T}_\rho$.
- Finally, treat the cases of $p = 0$ or $q = 0$.
- Give a simple proof of the $n = 1$ case of the Reverse Two-Function Hypercontractivity Theorem when $r = s = -1/2$. (Hint: replace $f$ and $g$ by $f^2$ and $g^2$; then you don't even need to assume $f$ and $g$ are nonnegative.) Can you also give a simple proof when $r = s = -1 + 1/k$ for integers $k > 2$?
- By selecting $\text{``}r\text{''} = -\rho \frac{\rho a + b}{a+\rho b}$ and $\text{``}s\text{''} = -\rho \frac{a +\rho b}{\rho a+ b}$, prove the Reverse Small-Set Expansion Theorem mentioned in Remark 3. (Hint: the negative norm of a $0$-$1$-indicator is $0$, so be sure to verify no negative norms arise.)
- Let $g \in L^2(\Omega^n, \pi^{\otimes n})$. Writing $x = (x_1, x')$, where $x' = (x_2, \dots, x_n)$, carefully justify the following identity of one-input functions: $(\mathrm{T}^1_{\rho} g)_{ \mid x'} = \mathrm{T}_\rho (g_{ \mid x'})$.
- Prove Proposition 11.
- Let $\boldsymbol{X}$ be a random variable and let $\boldsymbol{Y}$ denote its symmetrization $\boldsymbol{X}-\boldsymbol{X}'$, where $\boldsymbol{X}'$ is an independent copy of $\boldsymbol{X}$. Show for any $t, \theta \in {\mathbb R}$ that $\mathop{\bf Pr}[|\boldsymbol{Y}| \geq t] \leq 2\mathop{\bf Pr}[|\boldsymbol{X} - \theta| \geq t/2]$.
- The goal of this exercise is to establish Lemma 41.
- Show that we may take $c_2 = 1$ (and that equality holds). Henceforth assume $q > 2$.
- By following the idea of our $q = 4$ proof, reduce to showing that there exists $0 < c_q < 1$ such that \[ |1 - c_qx|^q + c_qqx \leq |1+x|^q - qx \quad \forall x \in {\mathbb R}. \]
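It may help to first convince yourself numerically that such a $c_q$ exists; the values of $c_q$ below are illustrative guesses (certainly not optimal), checked only on a grid:

```python
import numpy as np

x = np.linspace(-50, 50, 200001)

# illustrative (non-optimal) choices of c_q, checked pointwise on the grid
for q, c in [(4.0, 0.4), (6.0, 0.25), (10.0, 0.1)]:
    lhs = np.abs(1 - c * x) ** q + c * q * x
    rhs = np.abs(1 + x) ** q - q * x
    assert np.all(lhs <= rhs + 1e-8)
```

Note how $c_q$ has to shrink as $q$ grows: the right-hand side dips to $1 + 2q$ at $x = -2$, while the left-hand side there is $(1+2c_q)^q - 2qc_q$.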
- Further reduce to showing there exists $0 < c_q < 1$ such that \begin{equation} \label{eqn:unsymm-ex1} \frac{|1 - c_qx|^q + c_qqx - 1}{x^2} \leq \frac{|1+x|^q - qx - 1}{x^2} \quad \forall x \in {\mathbb R}. \end{equation} Here you should also establish that both sides are continuous functions of $x \in {\mathbb R}$ once the value at $x = 0$ is defined appropriately.
- Show that there exists $M > 0$ such that for every $0 < c_q < \tfrac{1}{2}$, inequality \eqref{eqn:unsymm-ex1} holds once $|x| \geq M$. (Hint: consider the limit of both sides as $|x| \to \infty$.)
- Argue that it suffices to show that \begin{equation} \label{eqn:unsymm-ex2} \frac{|1+x|^q - qx -1}{x^2} \geq \eta \end{equation} for some universal constant $\eta > 0$. (Hint: a uniform continuity argument for $(x,c_q) \in [-M,M] \times [0,\tfrac{1}{2}]$.)
- Establish \eqref{eqn:unsymm-ex2}. (Hint: the best possible $\eta$ is $1$ but to just achieve some positive $\eta$, argue using Bernoulli’s inequality that $\frac{|1+x|^q - qx -1}{x^2}$ is everywhere positive and then observe that it tends to $\infty$ as $|x| \to \infty$.)
- Possibly using a different argument, what is the best asymptotic bound you can achieve for $c_q$? Is $c_q \geq \Omega(\frac{\log q}{q})$ possible?
- In the proof of Lemma 41, show that the largest $c$ for which inequality (11) holds is the smaller real root of $c^4-2c^3-2c+1 = 0$, namely $c \approx .435$.
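Since the quartic has palindromic coefficients $(1,-2,0,-2,1)$, its roots come in reciprocal pairs $c, 1/c$; the claimed value of the smaller real root is easy to confirm numerically (a sketch):

```python
import numpy as np

# roots of c^4 - 2c^3 - 2c + 1 = 0
roots = np.roots([1, -2, 0, -2, 1])
real = sorted(r.real for r in roots if abs(r.imag) < 1e-7)

c = real[0]                       # the smaller real root (the other is 1/c)
assert abs(c - 0.435) < 5e-3      # matches the quoted value c ~ .435
assert abs(c**4 - 2*c**3 - 2*c + 1) < 1e-8
assert abs(real[1] - 1 / c) < 1e-8
```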
-
- Show that $1 + 6c^2 x^2 + c^4 x^4 \leq 1 + 6x^2 + 4x^3 + x^4$ holds for all $x \in {\mathbb R}$ when $c = 1/2$. (Can you also establish it for $c \approx .5269$?)
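A grid-based numerical check of both claims ($c = 1/2$ and the quoted $c \approx .5269$); this is a sanity check, not a proof:

```python
import numpy as np

x = np.linspace(-20, 20, 400001)
for c in (0.5, 0.5269):
    lhs = 1 + 6 * c**2 * x**2 + c**4 * x**4
    rhs = 1 + 6 * x**2 + 4 * x**3 + x**4
    assert np.all(lhs <= rhs + 1e-6)
```

Dividing the difference of the two sides by $x^2$ leaves a quadratic in $x$, so the inequality reduces to a discriminant condition; at $c \approx .5269$ the margin is already tiny, which is why the tolerance above is loose.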
- Show that if $\boldsymbol{X}$ is a random variable satisfying $\mathop{\bf E}[\boldsymbol{X}] = 0$ and $\|\boldsymbol{X}\|_4 < \infty$ then $\|a + \tfrac{1}{2} \boldsymbol{r} \boldsymbol{X}\|_4 \leq \|a + \boldsymbol{X}\|_4$ for all $a \in {\mathbb R}$, where $\boldsymbol{r} \sim \{-1,1\}$ is a uniformly random bit independent of $\boldsymbol{X}$. (Cf. Lemma 14.)
- Establish the following improvement of Theorem 42 in the case of $q = 4$: for all $f \in L^2(\Omega^n, \pi^{\otimes n})$, \[ \|\mathrm{T}_{\frac12 \boldsymbol{r}} f({\boldsymbol{x}})\|_{4,\boldsymbol{r},{\boldsymbol{x}}} \leq \|f({\boldsymbol{x}})\|_{4, {\boldsymbol{x}}} \] (where ${\boldsymbol{x}} \sim \pi^{\otimes n}$, $\boldsymbol{r} \sim \{-1,1\}^n$).
- Complete the proof of Theorem 37. (Hint: you’ll need to rework Exercise 9.8 as in Lemma 36.)
- Prove Proposition 16.
- Recall from Theorem 17 the function $\rho = \rho(\lambda)$ defined for $\lambda \in (0,1/2)$ (and fixed $q > 2$) by \[ \rho = \rho(\lambda) = \sqrt{\frac{\exp(u/q) - \exp(-u/q)}{\exp(u/q')- \exp(-u/q')}} = \sqrt{\frac{\sinh(u/q)}{\sinh(u/q')}}, \] where $u = u(\lambda)$ is defined by $\exp(-u) = \tfrac{\lambda}{1-\lambda}$.
- Show that $\rho$ is an increasing function of $\lambda$. (Hint: one route is to reduce to showing that $\rho^2$ is a decreasing function of $u \in (0,\infty)$, reduce to showing that $q\tanh(u/q)$ is an increasing function of $q \in (1, \infty)$, reduce to showing $\frac{\tanh r}{r}$ is a decreasing function of $r \in (0,\infty)$, and reduce to showing $\sinh(2r) \geq 2r$.)
- Verify the following statements from Remark 18: \begin{align*} \text{for fixed $q$ and $\lambda \to 1/2$,} \quad &\rho \to \frac{1}{\sqrt{q-1}}; \\ \text{for fixed $q$ and $\lambda \to 0$,} \quad &\rho \sim \lambda^{1/2 - 1/q}. \end{align*} Also show: \[ \text{for fixed $\lambda$ and $q \to \infty$,} \quad \rho \sim \sqrt{\frac{u}{\sinh u}} \sqrt{\frac{1}{q}}, \] and $\frac{u}{\sinh u} \sim 2\lambda\ln(1/\lambda)$ for $\lambda \to 0$.
- Show that $\rho \geq \frac{1}{\sqrt{q-1}} \lambda^{1/2-1/q}$ holds for all $\lambda$.
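The monotonicity, the $\lambda \to 1/2$ limit, and this lower bound can all be probed numerically for a fixed $q$ (a sketch, not a proof):

```python
import math

def rho(lam, q):
    u = math.log((1 - lam) / lam)         # exp(-u) = lam / (1 - lam)
    qp = q / (q - 1)                      # conjugate exponent q'
    return math.sqrt(math.sinh(u / q) / math.sinh(u / qp))

q = 3.0
lams = [0.01 * k for k in range(1, 50)]
vals = [rho(l, q) for l in lams]

assert all(a < b for a, b in zip(vals, vals[1:]))          # increasing in lambda
assert abs(rho(0.4999, q) - 1 / math.sqrt(q - 1)) < 1e-3   # limit as lambda -> 1/2
assert all(rho(l, q) >= l ** (0.5 - 1 / q) / math.sqrt(q - 1) for l in lams)
```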
- Let $(\Omega, \pi)$ be a finite probability space, $|\Omega| \geq 2$, in which every outcome has probability at least $\lambda$. Let $1 < p < 2$ and $0 < \rho < 1$. The goal of this exercise is to prove the result of Wolff [Wol07] that, subject to $\|\mathrm{T}_\rho f\|_2 = 1$, every $f \in L^2(\Omega,\pi)$ which minimizes $\|f\|_p$ takes on at most two values (and there is at least one minimizing $f$).
- We consider the equivalent problem of minimizing $F(f) = \|f\|_p^p$ subject to $G(f) = \|\mathrm{T}_\rho f\|_2^2 = 1$. Show that both $F(f)$ and $G(f)$ are $\mathcal{C}^1$ functionals (identifying functions $f$ with points in ${\mathbb R}^\Omega$).
- Argue from continuity that the minimum value for $\|f\|_p^p$ subject to $\|\mathrm{T}_\rho f\|_2^2 = 1$ is attained. Henceforth write $f_0$ to denote any minimizer; the goal is to show that $f_0$ takes on at most two values.
- Show that $f_0$ is either everywhere nonnegative or everywhere nonpositive. (Hint: by homogeneity our problem is equivalent to maximizing $\|\mathrm{T}_\rho f\|_2$ subject to $\|f\|_p = 1$; now use Exercise 2.29.) Replacing $f_0$ by $|f_0|$ if necessary, henceforth assume $f_0$ is nonnegative.
- Show that $\nabla F(f_0) = \pi \cdot p f_0^{p-1}$ and $\nabla G(f_0) = \pi \cdot 2 \mathrm{T}_{\rho^2} f_0$. Here $\pi \cdot g$ signifies the pointwise product of functions on $\Omega$, with $\pi$ thought of as a function $\Omega \to {\mathbb R}^{\geq 0}$. (Hint: for the latter, write $G(f) = \langle \mathrm{T}_{\rho^2} f, f \rangle$.)
- Use the method of Lagrange Multipliers to show that $c f_0^{p-1} = \mathrm{T}_{\rho^2} f_0$ for some $c \in {\mathbb R}^+$. (Hint: you’ll need to note that $\nabla G(f_0) \neq 0$.)
- Writing $\mu = \mathop{\bf E}[f_0]$, argue that each value $y = f_0(\omega)$ satisfies the equation \begin{equation} \label{eqn:wolff-equation} c y^{p-1} = \rho^2 y + (1-\rho^2) \mu. \end{equation}
- Show that \eqref{eqn:wolff-equation} has at most two solutions for $y \in {\mathbb R}^+$, thereby completing the proof that $f_0$ takes on at most two values. (Hint: strict concavity of $y^{p-1}$.)
- Suppose $q > 2$. By slightly modifying the above argument, show that subject to $\|g\|_2 = 1$, every $g \in L^2(\Omega,\pi)$ which maximizes $\|\mathrm{T}_{\rho} g\|_q$ takes on at most two values (and there is at least one maximizing $g$). (Hint: at some point you might want to make the substitution $g = \mathrm{T}_\rho f$; note that $g$ is two-valued if $f$ is.)
- Fix $1 < p < 2$ and $0 < \lambda < 1/2$. Let $\Omega = \{-1,1\}$ and $\pi = \pi_\lambda$, meaning $\pi(-1) = \lambda$, $\pi(1) = 1-\lambda$. The goal of this exercise is to show the result of Latała and Oleszkiewicz [LO94]: the largest value of $\rho$ for which $\|\mathrm{T}_\rho f\|_2 \leq \|f\|_p$ holds for all $f \in L^2(\Omega, \pi)$ is as given in Theorem 17; i.e., it satisfies \begin{equation} \label{eqn:LO-bound2} \rho^2 = r^* = \frac{\exp(u/p') - \exp(-u/p')}{\exp(u/p)- \exp(-u/p)}, \end{equation} where $u$ is defined by $\exp(-u) = \tfrac{\lambda}{1-\lambda}$. (Here we are using $p = q'$ to facilitate the proof; we get the $(2,q)$-hypercontractivity statement by Proposition 9.19.)
- Let’s introduce the notation $\alpha = \lambda^{1/p}$, $\beta = (1-\lambda)^{1/p}$. Show that \[ r^* = \frac{\alpha^p\beta^{2-p} - \alpha^{2-p}\beta^p}{\alpha^2-\beta^2}. \]
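This identity (using $\exp(u) = \beta^p/\alpha^p$, so that $\exp(u/p) = \beta/\alpha$ and $\exp(u/p') = (\beta/\alpha)^{p-1}$) can be confirmed numerically:

```python
import math

p = 1.5
pp = p / (p - 1)          # conjugate exponent p'
for lam in (0.05, 0.2, 0.4):
    u = math.log((1 - lam) / lam)
    rstar = (math.exp(u / pp) - math.exp(-u / pp)) / (math.exp(u / p) - math.exp(-u / p))
    a = lam ** (1 / p)            # alpha = lambda^{1/p}
    b = (1 - lam) ** (1 / p)      # beta  = (1 - lambda)^{1/p}
    alt = (a**p * b**(2 - p) - a**(2 - p) * b**p) / (a**2 - b**2)
    assert abs(rstar - alt) < 1e-10
```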
- Let $f \in L^2(\Omega, \pi)$. Write $\mu = \mathop{\bf E}[f]$ and $\delta = \mathrm{D}_1 f = \hat{f}(1)$. Our goal will be to show \begin{equation} \label{eqn:lo-goal} \mu^2 + \delta^2 r^* = \|\mathrm{T}_{\sqrt{r^*}} f\|_2^2 \leq \|f\|_p^2. \end{equation} In the course of doing this, we’ll also exhibit a nonconstant function $f$ which makes the above inequality sharp. Why does this establish that no larger value of $\rho$ is possible?
- Show that without loss of generality we may assume \[ f(-1) = \frac{1+y}{\alpha}, \quad f(1) = \frac{1-y}{\beta} \] for some $-1 < y < 1$. (Hint: first use Exercise 2.29 and a continuity argument to show that we may assume $f > 0$; then use homogeneity of \eqref{eqn:lo-goal}.)
- The left-hand side of \eqref{eqn:lo-goal} is now a quadratic function of $y$. Show that our $r^*$ is precisely such that \[ \text{LHS}\eqref{eqn:lo-goal} = Ay^2 + C \] for some constants $A, C$. I.e., $r^*$ makes the linear term in $y$ drop out. (Hint: work exclusively with the $\alpha, \beta$ notation and recall from Definition 8.44 that $\delta^2 = \lambda(1-\lambda)(f(1)-f(-1))^2 = \alpha^p \beta^p(f(1)-f(-1))^2$.)
- Compute that \begin{equation} \label{eqn:loA} A = 2\frac{\beta^{p-1}-\alpha^{p-1}}{\beta-\alpha}. \end{equation} (Hint: you’ll want to multiply the above expression by $\alpha^p + \beta^p = 1$.)
- Show that \[ \text{RHS}\eqref{eqn:lo-goal} = ((1+y)^p + (1-y)^p)^{2/p}. \] Why does it now suffice to show \eqref{eqn:lo-goal} just for $0 \leq y < 1$?
- Let $y^* = \frac{\beta-\alpha}{\beta+\alpha} > 0$. Show that if $y = -y^*$ then $f$ is a constant function and both sides of \eqref{eqn:lo-goal} are equal to $\frac{4}{(\alpha+\beta)^2}$.
- Deduce that both sides of \eqref{eqn:lo-goal} are equal to $\frac{4}{(\alpha+\beta)^2}$ for $y = y^*$. Verify that after scaling, this yields the following nonconstant function for which \eqref{eqn:lo-goal} is sharp: $f(x) = \exp(-xu/p)$.
- Write $y = \sqrt{z}$ for $0 \leq z < 1$. By now we have reduced to showing \[ Az+C \leq ((1+\sqrt{z})^p + (1-\sqrt{z})^p)^{2/p}, \] knowing that both sides are equal when $\sqrt{z} = y^*$. Calling the expression on the right $\phi(z)$, show that \[ \frac{d}{dz} \phi(z)\Bigr|_{\sqrt{z}=y^*} = A. \] (Hint: you’ll need $\alpha^p+\beta^p = 1$, as well as the fact from part (h) that $\phi(z) = \frac{4}{(\alpha+\beta)^2}$ when $\sqrt{z}=y^*$.) Deduce that we can complete the proof by showing that $\phi(z)$ is convex for $z \in [0,1)$.
- Show that $\phi$ is indeed convex on $[0,1)$ by showing that its derivative is a nondecreasing function of $z$. (Hint: use the Generalized Binomial Theorem as well as $1 < p < 2$ to show that $(1+\sqrt{z})^p + (1-\sqrt{z})^p$ is expressible as $\sum_{j=0}^\infty b_j z^j$ where each $b_j$ is positive.)
- Complete the proof of Theorem 17. (Hint: besides Exercises 19 and 20, you'll also need Exercise 18(a).)
-
- Let $\Phi : [0, \infty) \to {\mathbb R}$ be defined by $\Phi(x) = x \ln x$, where we take $0\ln0 = 0$. Verify that $\Phi$ is a smooth, strictly convex function.
- Consider the following:
Definition 47 Let $g \in L^2(\Omega, \pi)$ be a nonnegative function. The entropy of $g$ is defined by \[ \mathbf{Ent}[g] = \mathop{\bf E}_{{\boldsymbol{x}} \sim \pi}[\Phi(g({\boldsymbol{x}}))] - \Phi\Bigl(\mathop{\bf E}_{{\boldsymbol{x}} \sim \pi}[g({\boldsymbol{x}})]\Bigr). \]
Verify that $\mathbf{Ent}[g] \geq 0$ always, that $\mathbf{Ent}[g] = 0$ if and only if $g$ is constant, and that $\mathbf{Ent}[c g] = c\mathbf{Ent}[g]$ for any constant $c \geq 0$.
- Suppose $\varphi$ is a probability density on $\{-1,1\}^n$ (recall Definition 1.20). Show that $\mathbf{Ent}[\varphi] = D_{\mathrm{KL}}(\varphi || \pi_{1/2}^{\otimes n})$, the Kullback–Leibler divergence of the distribution with density $\varphi$ from the uniform distribution.
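A numerical sketch of the facts in the last two parts (nonnegativity, vanishing on constants, homogeneity, and the KL identity); the helper `ent` is an assumed name:

```python
import numpy as np

rng = np.random.default_rng(3)

def ent(g, pi):
    # Ent[g] = E[Phi(g)] - Phi(E[g]) with Phi(t) = t ln t, Phi(0) = 0
    Phi = lambda t: np.where(t > 0, t * np.log(np.where(t > 0, t, 1.0)), 0.0)
    return float(np.sum(pi * Phi(g)) - Phi(np.sum(pi * g)))

pi = np.full(8, 1 / 8)                  # uniform distribution on {-1,1}^3
g = rng.uniform(0.1, 3.0, size=8)

assert ent(g, pi) >= -1e-12                         # Ent[g] >= 0
assert abs(ent(np.full(8, 2.0), pi)) < 1e-12        # zero for constants
c = 1.7
assert abs(ent(c * g, pi) - c * ent(g, pi)) < 1e-9  # Ent[cg] = c Ent[g]

# for a density (E[phi] = 1), Ent[phi] equals the KL divergence (in nats)
phi = g / np.sum(pi * g)
kl = np.sum(pi * phi * np.log(phi))
assert abs(ent(phi, pi) - kl) < 1e-9
```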
- The goal of this exercise is to establish:
The Log-Sobolev Inequality Let $f: \{-1,1\}^n \to {\mathbb R}$. Then ${\tfrac{1}{2} \mathbf{Ent}[f^2] \leq \mathbf{I}[f]}$.
- Writing $\rho = \exp(-t)$, the $(p,2)$-Hypercontractivity Theorem tells us that \[ \|\mathrm{T}_{\exp(-t)} f\|_2^2 \leq \|f\|^2_{1+\exp(-2t)} \] for all $t \geq 0$. Denote the left- and right-hand sides as $\textrm{LHS}(t)$, $\textrm{RHS}(t)$. Verify that these are smooth functions of $t \in [0,\infty)$ and that $\textrm{LHS}(0) = \textrm{RHS}(0)$. Deduce that $\textrm{LHS}'(0) \leq \textrm{RHS}'(0)$.
- Compute $\textrm{LHS}'(0) = -2 \mathbf{I}[f]$. (Hint: pass through the Fourier representation.)
- Compute $\textrm{RHS}'(0) = -\mathbf{Ent}[f^2]$, thereby deducing the Log-Sobolev Inequality. (Hint: as an intermediate step, define $F(t) = \mathop{\bf E}[|f|^{1+\exp(-2t)}]$ and show that $\textrm{RHS}'(0) = F(0)\ln F(0) + F'(0)$.)
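The inequality itself can be tested by brute force on random functions, computing $\mathbf{I}[f] = \sum_S |S|\,\widehat{f}(S)^2$ over $\{-1,1\}^n$ (a sketch):

```python
import itertools
import numpy as np

rng = np.random.default_rng(4)
n = 3
pts = np.array(list(itertools.product([-1, 1], repeat=n)))
f = rng.standard_normal(len(pts))       # random f : {-1,1}^3 -> R

# total influence via the Fourier expansion: I[f] = sum_S |S| fhat(S)^2
I = 0.0
for k in range(n + 1):
    for S in itertools.combinations(range(n), k):
        chi = np.prod(pts[:, list(S)], axis=1) if S else np.ones(len(pts))
        fhat = np.mean(f * chi)
        I += len(S) * fhat**2

# (1/2) Ent[f^2] <= I[f]
g = f**2
Phi = lambda t: np.where(t > 0, t * np.log(np.where(t > 0, t, 1.0)), 0.0)
ent = np.mean(Phi(g)) - Phi(np.mean(g))
assert 0.5 * ent <= I + 1e-9
```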
-
- Let $f : \{-1,1\}^n \to {\mathbb R}$. Show that $\mathbf{Ent}[(1+\epsilon f)^2] \sim 2 \mathop{\bf Var}[f] \epsilon^2$ as $\epsilon \to 0$.
- Deduce the Poincaré Inequality for $f$ from the Log-Sobolev Inequality.
-
- Deduce from the Log-Sobolev Inequality that for $f : \{-1,1\}^n \to \{-1,1\}$ with $\alpha = \min\{\mathop{\bf Pr}[f = 1], \mathop{\bf Pr}[f=-1]\}$, \begin{equation} \label{eqn:weak-edge-iso} 2 \alpha \ln(1/\alpha) \leq \mathbf{I}[f]. \end{equation} This is off by a factor of $\ln 2$ from the optimal edge-isoperimetric inequality Theorem 2.38. (Hint: apply the inequality to either $\tfrac{1}{2} - \tfrac{1}{2} f$ or $\tfrac{1}{2} + \tfrac{1}{2} f$.)
- Give a more streamlined direct derivation of \eqref{eqn:weak-edge-iso} by differentiating the Small-Set Expansion Theorem.
- This exercise gives a direct proof of the Log-Sobolev Inequality.
- The first step is to establish the $n = 1$ case. Towards this, show that we may assume $f : \{-1,1\} \to {\mathbb R}$ is nonnegative and has mean $1$. (Hint: Exercise 22(b).)
- Thus it remains to establish $\tfrac{1}{2} \mathbf{Ent}[(1+b{\boldsymbol{x}})^2] \leq b^2$ for $b \in [-1,1]$. Show that $g(b) = b^2 - \tfrac{1}{2} \mathbf{Ent}[(1+b{\boldsymbol{x}})^2]$ is smooth on $[-1,1]$ and satisfies $g(0) = 0$, $g'(0) = 0$, and $g''(b) = \frac{2b^2}{1+b^2} + \ln\frac{1+b^2}{1-b^2} \geq 0$ for $b \in (-1,1)$. Explain why this completes the proof of the $n = 1$ case of the Log-Sobolev Inequality.
- Show that for any two functions $f_+, f_- : \{-1,1\}^n \to {\mathbb R}$, \[ \left(\tfrac{\sqrt{\mathop{\bf E}[f_+^2]} – \sqrt{\mathop{\bf E}[f_-^2]}}{2}\right)^2 \leq \mathop{\bf E}\left[\Bigl(\tfrac{f_+ - f_-}{2}\Bigr)^2\right]. \] (Hint: the triangle inequality for $\| \cdot \|_2$.)
- Prove the Log-Sobolev Inequality via “induction by restrictions” (as described in Section 9.4). (Hint: for the right-hand side, establish $\mathbf{I}[f] = \mathop{\bf E}[(\tfrac{f_+ - f_-}{2})^2] + \tfrac{1}{2} \mathbf{I}[f_+] + \tfrac{1}{2} \mathbf{I}[f_-]$. For the left-hand side, apply induction, then the $n = 1$ base case, then part (c).)
-
- By following the strategy of Exercise 23, establish the following:
Log-Sobolev Inequality for general product space domains Let $f \in L^2(\Omega^n, \pi^{\otimes n})$ and write $\lambda = \min(\pi)$, $\lambda' = 1-\lambda$, $\exp(-u) = \frac{\lambda}{\lambda'}$. Then $ \tfrac{1}{2} \varrho \mathbf{Ent}[f^2] \leq \mathbf{I}[f]$, where \[ \varrho = \varrho(\lambda) = \frac{\tanh(u/2)}{u/2} = 2 \frac{\lambda' - \lambda}{\ln \lambda' - \ln \lambda}. \]
- Show that $\varrho(\lambda) \sim 2/\ln(1/\lambda)$ as $\lambda \to 0$.
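Both the two expressions for $\varrho$ and this asymptotic are easy to confirm numerically (a sketch):

```python
import math

def varrho(lam):
    lamp = 1 - lam
    u = math.log(lamp / lam)              # exp(-u) = lam / lam'
    return math.tanh(u / 2) / (u / 2)

# the two formulas for varrho agree
for lam in (0.01, 0.1, 0.3, 0.49):
    lamp = 1 - lam
    alt = 2 * (lamp - lam) / (math.log(lamp) - math.log(lam))
    assert abs(varrho(lam) - alt) < 1e-12

# varrho(lambda) ~ 2 / ln(1/lambda) as lambda -> 0
lam = 1e-6
assert abs(varrho(lam) / (2 / math.log(1 / lam)) - 1) < 0.05
```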
- Let $f : \{-1,1\}^n \to \{-1,1\}$ and treat $\{-1,1\}^n$ as having the $p$-biased distribution $\pi_p^{\otimes n}$. Write $q = 1-p$. Show that if $\alpha = \min\{\mathop{\bf Pr}_{\pi_p}[f = 1], \mathop{\bf Pr}_{\pi_p}[f = -1]\}$ then \[ 4\frac{q - p}{\ln q - \ln p} \alpha \ln(1/\alpha) \leq \mathbf{I}[f^{(p)}] \] and hence, for $p \to 0$, \begin{equation} \label{eqn:weak-p-biased-isoperim} \alpha \log_p \alpha \leq (1+o_p(1))p \cdot \mathop{\bf E}_{{\boldsymbol{x}} \sim \pi_p^{\otimes n}}[\mathrm{sens}_f({\boldsymbol{x}})]. \end{equation} We remark that \eqref{eqn:weak-p-biased-isoperim} is known to hold without the $o_p(1)$ for all $p \leq 1/2$.
- Prove Theorem 20. (Hint: recall Proposition 8.28.)
- Let $\boldsymbol{X}_1, \dots, \boldsymbol{X}_n$ be independent $(2,q,\rho)$-hypercontractive random variables and let $F(x) = \sum_{|S| \leq k} \widehat{F}(S)\,x^S$ be an $n$-variate multilinear polynomial of degree at most $k$. Show that \[ \|F(\boldsymbol{X}_1, \dots, \boldsymbol{X}_n)\|_q \leq (1/\rho)^k \|F(\boldsymbol{X}_1, \dots, \boldsymbol{X}_n)\|_2. \] (Hint: you’ll need Exercise 3.)
- Let $0 < \lambda \leq 1/2$ and let $(\Omega, \pi)$ be a finite probability space in which some outcome $\omega_0 \in \Omega$ has $\pi(\omega_0) = \lambda$. (For example, $\Omega = \{-1,1\}$, $\pi = \pi_\lambda$.) Define $f \in L^2(\Omega, \pi)$ by setting $f(\omega_0) = 1$, $f(\omega) = 0$ for $\omega \neq \omega_0$. For $q \geq 2$, compute $\|f\|_q/\|f\|_2$ and deduce (in light of the proof of Theorem 20) that Corollary 19 cannot hold for $\rho > \lambda^{1/2 - 1/q}$.
- Prove Theorem 21.
- Prove Theorem 22.
- Prove Theorem 23. (Hint: immediately worsen $q-1$ to $q$ so that finding the optimal choice of $q$ is easier.)
- Prove Theorem 24.
- Prove Friedgut’s Junta Theorem for general product spaces as stated in Section 3.
- Show that in the proof of Theorem 28, inequality (4) implies $F(p_c + \eta p_c) \geq 1-\epsilon$. (Hint: consider $\frac{d}{dp} \ln(1-F(p))$.)
- Justify the various calculations and observations in Examples 43.
-
- Let $p = \frac1n$ and let $f \in L^2(\{-1,1\}^n, \pi_p^{\otimes n})$ be any boolean-valued function. Show that $\mathbf{I}[f] \leq 4$. (Hint: Proposition 8.45.)
- Let us specialize to the case $f = \chi_{[n]}$. Show that $f$ is not $.1$-close to any width-$O(1)$ DNF (under the $\frac1n$-biased distribution, for $n$ sufficiently large). This shows that the assumption of monotonicity can’t be removed from Friedgut’s Conjecture. (Hint: show that fixing any constant number of coordinates cannot change the bias of $\chi_{[n]}$ very much.)
- A function $h : \Omega^n \to \Sigma$ is said to be expressed as a pseudo-junta if the following hold: There are “juntas” $f_1, \dots, f_m : \Omega^n \to \{\mathsf{True}, \mathsf{False}\}$ with domains $J_1, \dots, J_m \subseteq [n]$ respectively. Further, $g : (\Omega \cup \{\ast\})^n \to \Sigma$, where $\ast$ is a new symbol not in $\Omega$. Finally, for each input $x \in \Omega^n$ we have $h(x) = g(y)$, where for $j \in [n]$, \[ y_j = \begin{cases} x_j & \text{if $j \in J_i$ for some $i$ with $f_i(x) = \mathsf{True}$,} \\ \ast & \text{else}. \end{cases} \] An alternative explanation is that on input $x$, the junta $f_i$ decides whether the coordinates in its domain are “notable”; then, $h(x)$ must be determined based only on the set of all notable coordinates. Finally, if $\pi$ is a distribution on $\Omega$ we say that the pseudo-junta has width-$k$ under $\pi^{\otimes n}$ if \[ \mathop{\bf E}_{{\boldsymbol{x}} \sim \pi^{\otimes n}}[\#\{j : \boldsymbol{y}_j \neq \ast\}] \leq k; \] in other words, the expected number of notable coordinates is at most $k$. For $h \in L^2(\Omega^n, \pi^{\otimes n})$ we simply say that $h$ is a $k$-pseudo-junta. Show that if such a $k$-pseudo-junta $h$ is $\{-1,1\}$-valued then $\mathbf{I}[h] \leq 4k$. (Hint: referring to the second statement in Proposition 8.24, consider the notable coordinates for both ${\boldsymbol{x}}$ and ${\boldsymbol{x}}' = ({\boldsymbol{x}}_1, \dots, {\boldsymbol{x}}_{i-1}, {\boldsymbol{x}}'_i, {\boldsymbol{x}}_{i+1}, \dots, {\boldsymbol{x}}_n)$.)
- Establish the following further consequence of Bourgain’s Sharp Threshold Theorem: Let $f : \{\mathsf{True}, \mathsf{False}\}^n \to \{\mathsf{True}, \mathsf{False}\}$ be a monotone function with $\mathbf{I}[f^{(p)}] \leq K$. Assume $\mathop{\bf Var}[f] \geq .01$ and $0 < p \leq \exp(-cK^2)$, where $c$ is a large universal constant. Then there exists $T \subseteq [n]$ with $|T| \leq O(K)$ such that \[ \mathop{\bf Pr}_{{\boldsymbol{x}} \sim \pi_p^{\otimes n}}[f({\boldsymbol{x}}) = \mathsf{True} \mid {\boldsymbol{x}}_i = \mathsf{True} \text{ for all $i \in T$}] \geq \mathop{\bf Pr}_{{\boldsymbol{x}} \sim \pi_p^{\otimes n}}[f({\boldsymbol{x}})= \mathsf{True}] + \exp(-O(K^2)). \] (Hint: Bourgain’s Sharp Threshold Theorem yields a booster either towards $\mathsf{True}$ or towards $\mathsf{False}$. In the former case you’re easily done; to rule out the latter case, use the fact that $p|T| \ll \exp(-O(K^2))$.)
- Suppose that in Bourgain’s Sharp Threshold Theorem we drop the assumption that $\mathop{\bf Var}[f] \geq .01$. (Assume at least that $f$ is nonconstant.) Show that there is some $\tau$ with $|\tau| \geq \mathop{\bf stddev}[f] \cdot \exp(-O(\mathbf{I}[f]^2/\mathop{\bf Var}[f]^2))$ such that \[ \mathop{\bf Pr}_{{\boldsymbol{x}} \sim \pi^{\otimes n}}[\exists T \subseteq [n], |T| \leq O(\mathbf{I}[f]/\mathop{\bf Var}[f]) \text{ such that ${\boldsymbol{x}}_T$ is a } \tau\text{-booster}] \geq |\tau|. \] (Cf. Exercise 9.32.)
- In this exercise we give the beginnings of the idea of how Bourgain’s Sharp Threshold Theorem can be used to show sharp thresholds for interesting monotone properties. We will consider $\neg\text{3Col}$, the property of a random $v$-vertex graph $\boldsymbol{G} \sim \mathcal{G}(v,p)$ being non-$3$-colourable.
- Prove that the critical probability $p_c$ satisfies $p_c \leq O(1/v)$. I.e., establish that there is a universal constant $C$ such that $\mathop{\bf Pr}[\boldsymbol{G} \sim \mathcal{G}(v, C/v) \text{ is } 3\text{-colourable}] = o_v(1)$. (Hint: union-bound over all potential $3$-colourings.)
- Towards showing (non-)$3$-colourability has a sharp threshold, suppose the property had constant total influence at the critical probability. Bourgain’s Sharp Threshold Theorem would imply that there is a $\tau$ of constant magnitude such that for $\boldsymbol{G} \sim \mathcal{G}(v,p_c)$, there is a $|\tau|$ chance that $\boldsymbol{G}$ contains a $\tau$-boosting induced subgraph $\boldsymbol{G}_T$. There are two cases, depending on the sign of $\tau$. It’s easy to rule out that the boost is in favour of $3$-colourability; the absence of a few edges shouldn’t increase the probability of $3$-colourability by much (cf. Exercise 41). On the other hand, it might seem plausible that the presence of a certain constant number of edges should boost the probability of non-$3$-colourability by a lot. For example, the presence of a $4$-clique immediately boosts the probability to $1$. However, the point is that at the critical probability it is very unlikely that $\boldsymbol{G}$ contains a $4$-clique (or indeed, any “local” witness to non-$3$-colourability). Short of showing this, prove at least that the expected number of $4$-cliques in $\boldsymbol{G} \sim \mathcal{G}(v,p)$ is $o_v(1)$ unless $p = \Omega(v^{-2/3}) \gg p_c$.