Chapter 11 notes

The subject of Gaussian space is too enormous to be surveyed here; some recommended texts include Janson [Jan97] and Bogachev [Bog98], the latter having an extremely thorough bibliography.

The Ornstein–Uhlenbeck semigroup dates back to the work of Uhlenbeck and Ornstein [UO30] whose motivation was to refine Einstein’s theory of Brownian motion [Ein05] to take into account the inertia of the particle. The relationship between the action of $\mathrm{U}_\rho$ on functions and on Hermite expansions (i.e., Proposition 31) dates back even further, to Mehler [Meh66]. Hermite polynomials were first defined by Laplace [Lap11], and then studied by Chebyshev [Che60] and Hermite [Her64]. See Lebedev [Leb72] (Chapter 4.15) for a proof of the pointwise convergence of a piecewise-$\mathcal{C}^1$ function’s Hermite expansion.

As mentioned in the notes on Chapter 9, the Gaussian Hypercontractivity Theorem is originally due to Nelson [Nel66] and now has many known proofs. The idea behind the proof we presented — first proving the Boolean hypercontractivity result and then deducing the Gaussian case by the Central Limit Theorem — is due to Gross [Gro75] (see also Trotter [Tro58]). Gross actually used the idea to prove his Gaussian Log-Sobolev Inequality, and thereby deduced the Gaussian Hypercontractivity Theorem. Direct proofs of the Gaussian Hypercontractivity Theorem have been given by Neveu [Nev76] (using stochastic calculus), Brascamp and Lieb [BL76a] (using rearrangement [BL76a]), and Ledoux [Led13] (using a variation on Exercises 2629); direct proofs of the Gaussian Log-Sobolev Inequality have been given by Adams and Clarke [AC79], by Bakry and Emery [BE85a], and by Ledoux [Led92], the latter two using semigroup techniques. Bakry’s survey [Bak94] on these topics is also recommended.

The Gaussian Isoperimetric Inequality was first proved independently by Borell [Bor75] and by Sudakov and Tsirel’son [ST78]. Both works derived the result by taking the isoperimetric inequality on the sphere (due to Lévy [Lev22] and Schmidt [Sch48], see also Figiel, Lindenstrauss, and Milman [FLM77]) and then taking “Poincaré’s limit” — i.e., viewing Gaussian space as a projection of the sphere of radius $\sqrt{n}$ in $n$ dimensions, with $n \to \infty$ (see Lévy [Lev22], McKean [McK73], and Diaconis and Freedman [DF87]). Ehrhard [Ehr83] gave a different proof using a symmetrization argument intrinsic to Gaussian space. This may be compared to the alternate proof of the spherical isoperimetric inequality [Ben84] based on the “two-point symmetrization” of Baernstein and Taylor [BT76] (analogous to Riesz rearrangement in Euclidean space and to the polarization operation from Exercise 2.45).

To carefully define Gaussian surface area for a broad class of sets requires venturing into the study of geometric measure theory and functions of bounded variation. For a clear and comprehensive development in the Euclidean setting (including the remark in Exercise 15), see the book by Ambrosio, Fusco, and Pallara [AFP00]. There’s not much difference between the Euclidean and finite-dimensional Gaussian settings; research on Gaussian perimeter tends to focus on the trickier infinite-dimensional case. For a thorough development of surface area in this latter setting (which of course includes finite-dimensional Gaussian space as a special case) see the work of Ambrosio, Miranda, Maniglia, and Pallara [AMMP10]; in particular, Theorem 4.1 in that work gives several additional equivalent definitions for $\text{surf}_\gamma$ besides those in Definition 48. Regarding the fact that $\mathbf{RS}_A’(0^+)$ is an equivalent definition, the Euclidean analogue of this statement was proven in Miranda et al. [MPPP07] and the statement itself follows similarly [Mir13] using Ambrosio et al. [AFR13]. (Our heuristic justification of 11.4.(2) is similar to the one given by Kane [Kan11a].) Additional related results can be found in Hino [Hin10] (which includes the remark about convex sets at the end of Definition 48), Ambrosio and Figalli [AF11], Miranda et al. [MNP12], and Ambrosio et al. [AFR13].

The inequality of Theorem 51 is explicit in Ledoux [Led94] (see also the excellent survey [Led96]); he used it to deduce the Gaussian Isoperimetric Inequality. He also noted that it’s essentially deducible from an earlier inequality of Pisier and Maurey [Pis86] (Theorem 2.2). Theorem 43, which expresses the subadditivity of rotation sensitivity, can be viewed as a discretization of the Pisier–Maurey inequality. This theorem appeared in work of Kindler and O’Donnell [KO12], which also made the observations about the volume-$\tfrac{1}{2}$ case of Borell’s Isoperimetric Theorem at the end of Section 3 and in Remark 75.

Bobkov’s Inequality [Bob97] in the special case of Gaussian space had already been implicitly established by Ehrhard [Ehr84]; the striking novelty of Bobkov’s work (partially inspired by Talagrand [Tal93]) was his reduction to the two-point Boolean inequality. The proof of this inequality which we presented is, as mentioned a discretization of the stochastic calculus proof of Barthe and Maurey [BM00a]. (In turn, they were extending the stochastic calculus proof of Bobkov’s Inequality in the Gaussian setting due to Capitaine, Hsu, and Ledoux [CHL97].) The idea that it’s enough to show that Claim 54 is “nearly true” by computing two derivatives — as opposed to showing it’s exactly true by computing four derivatives — was communicated to the author by Yuval Peres. Following Bobkov’s paper, Bakry and Ledoux [BL96b] established Theorem 55 in very general infinite-dimensional settings including Gaussian space; Ledoux [Led98] further pointed out that the Gaussian version of Bobkov’s Inequality has a very short and direct semigroup-based proof. See also Bobkov and Götze [BG99] and Tillich and Zémor [TZ00] for results similar to Bobkov’s Inequality in other discrete settings. Borell’s Isoperimetric Theorem is from Borell [Bor85]. Borell’s proof used “Ehrhard symmetrization” and actually gave much stronger results — e.g., that if $f, g \in L^2({\mathbb R}^n, \gamma)$ are nonnegative and $q \geq 1$, then $\langle (\mathrm{U}_\rho f)^q, g\rangle$ can only increase under simultaneous Ehrhard symmetrization of $f$ and $g$. There are at least four other known proofs of the basic Borell Isoperimetric Theorem. Beckner [Bec92] observed that the analogous isoperimetric theorem on the sphere follows from two-point symmetrization; this yields the Gaussian result via Poincaré’s limit (for details, see Carlen and Loss [CL90]). (This proof is perhaps the conceptually simplest one, though carrying out all the technical details is a chore.) Mossel and Neeman [MN12] gave the proof based on semigroup methods outlined in Exercises 2629, and later together with De [DMN12] gave a “Bobkov-style” Boolean proof (see Exercise 30). Finally, Eldan [Eld13] gave a proof using stochastic calculus.

As mentioned in Section 5 there are several known ways to prove the Berry–Esseen Theorem. Aside from the original method (characteristic functions), there is also Stein’s Method [Ste72,Ste86a]; see also, e.g., [Bol84,BH84,CGS11]. The Replacement Method approach we presented originates in the work of Lindeberg [Lin22]. The mollification techniques used (e.g., those in Exercise 40) are standard. The Invariance Principle as presented in Section 6 is from Mossel, O’Donnell, and Oleszkiewicz [MOO10]. Further extensions (e.g., Exercise 48) appear in the work of Mossel [Mos10]. In fact the Invariance Principle dates back to the 1971 work of Rotar’ [Rot73,Rot74]; therein he essentially proved the Invariance Principle for degree-$2$ multilinear polynomials (even employing the term “influence” as we do for the quantity in Definition 63). Earlier work on extending the Central Limit Theorem to higher-degree polynomials had focused on obtaining sufficient conditions for polynomials (especially quadratics) to have a Gaussian limit distribution; this is the subject of U-statistics. Rotar’ emphasized the idea of invariance and of allowing any (quadratic) polynomial with low influences. Rotar’ also credited Girko [Gir73] with related results in the case of positive definite quadratic forms. In 1975, Rotar’ [Rot75] generalized his results to handle multilinear polynomials of any constant degree, and also random vectors (as in Exercise 48). (Rotar’ also gave further refinements in 1979 [Rot79].)

The difference between the results of Rotar’ [Rot75] and Mossel et al. [MOO10] comes in the treatment of the error bounds. It’s somewhat difficult to extract simple-to-state error bounds from Rotar’ [Rot75], as the error there is presented as a sum over $i \in [n]$ of expressions $\mathop{\bf E}[F({\boldsymbol{x}}) \boldsymbol{1}_{|F({\boldsymbol{x}})| > u_i}]$, where $u_i$ involves $\mathbf{Inf}_i[F]$. (Partly this is so as to generalize the statement of the Lindeberg CLT.) Nevertheless, the work of Rotar’ implies a Lévy distance bound as in Corollary 69, with some inexplicit function $o_\epsilon(1)$ in place of $(1/\rho)^{O(k)} \epsilon^{1/8}$. By contrast, the work of Mossel et al. [MOO10] shows that a straightforward combination of the Replacement Method and hypercontractivity yields good, explicit error bounds. Regarding the Carbery–Wright Theorem [CW01], an alternative exposition appears in Nazarov, Sodin, and Vol’berg [NSV02].

Regarding the Majority Is Stablest Theorem (conjectured in Khot, Kindler, Mossel, and O’Donnell [KKMO04] and proved originally in Mossel, O’Donnell, and Oleszkiewicz [MOO05a]), it can be added that additional motivation for the conjecture came from Kalai [Kal02]. The fact that (SDP) is an efficiently computable relaxation for the Max-Cut problem dates back to the 1990 work of Delorme and Poljak [DP93]; however, they were unable to give an analysis relating its value to the optimum cut value. In fact, they conjectured that the case of the $5$-cycle from Example 72 had the worst ratio of $\mathrm{Opt}(G)$ to $\mathrm{SDPOpt}(G)$. Goemans and Williamson [GW94] were the first to give a sharp analysis of the SDP (Theorem 71), at least for $\theta \geq \theta^*$. Feige and Schechtman [FS02] showed an optimal integrality gap for the SDP for all values $\theta \geq \theta^*$ (in particular, showing an integrality gap ratio of $c_{\text{GW}}$); interestingly, their construction essentially involved proving Borell’s Isoperimetric Inequality (though they did it on the sphere rather than in Gaussian space). Both before and after the Khot et al. [KKMO04] UG-hardness result for Max-Cut there was a long line of work [Kar99,Zwi99,AS00,ASZ02,CW04,KV05,FL06,KO06] devoted to improving the known approximation algorithms and UG-hardness results, in particular for $\theta < \theta^*$. This culminated in the results from O’Donnell and Wu [OW08] (mentioned in Remark 74), which showed explicit matching $(\alpha,\beta)$-approximation algorithms, integrality gaps, and UG-hardness results for all $\frac12 < \beta < 1$. The fact that the best integrality gaps matched the best UG-hardness results proved not to be a coincidence; in contemporaneous work, Raghavendra [Rag08] showed that for any CSP, any SDP integrality gap could be turned into a matching Dictator-vs.-No-Notables test. This implies the existence of matching efficient $(\alpha,\beta)$-approximation algorithms and UG-hardness results for every CSP and every $\beta$. See Raghavendra’s thesis [Rag09] for full details of his earlier publication [Rag08] (including some Invariance Principle extensions building further on Mossel [Mos10]); see also Austrin’s work [Aus07,Aus07a] for precursors to the Raghavendra theory.

Exercise 31 concerns a problem introduced by Talagrand [Tal89]. Talagrand offers a $1000 prize [Tal06a] for a solution to the following Boolean version of the problem: Show that for any fixed $0 < \rho < 1$ and for $f : \{-1,1\}^n \to {\mathbb R}^{\geq 0}$ with $\mathop{\bf E}[f] = 1$ it holds that $\mathop{\bf Pr}[\mathrm{T}_\rho f > t] = o(1/t)$ as $t \to \infty$. (The rate of decay may depend on $\rho$ but not, of course, on $n$; in fact, a bound of the form $O(\frac{1}{t \sqrt{\log t}})$ is expected.) The result outlined in Exercise 31 (obtained together with James Lee) is for the very special case of $1$-dimensional Gaussian space; Ball, Barthe, Bednorz, Oleszkiewicz, and Wolff [BBB+13] obtained the same result and also showed a bound of $O(\frac{\log \log t}{t \sqrt{\log t}})$ for $d$-dimensional Gaussian space (with the constant in the $O(\cdot)$ depending on $d$).

The Multifunction Invariance Principle (Exercise 48 and its special case Exercise 46) are from Mossel [Mos10]; the version for general product spaces (Exercise 49) is from Mossel, O’Donnell, and Oleszkiewicz [MOO10].

1 comment to Chapter 11 notes

  • Andy D

    Thanks for writing these excellent chapter notes. I wish more math textbooks would attempt this level of historical detail.

Leave a Reply




You can use most standard LaTeX, as well as these HTML tags

<a href="" title=""> <abbr title=""> <acronym title=""> <b> <blockquote cite=""> <cite> <code> <del datetime=""> <em> <i> <q cite=""> <strike> <strong>