A small warning: On the download page, I am using a “Google form” to [...]]]>
A small warning: On the download page, I am using a “Google form” to ask for your email address, to which the PDF will be sent. I promise not to send you any other emails; just the one containing the attached PDF. I would also like to warn that I think Google allows at most 100 emails per day, so there is some small chance that we might hit the limit today. If that happens, please try again tomorrow! Finally, the server hosting this website has been a bit flaky lately; I hope it will be fine in the future.
Many thanks to Cambridge University Press for agreeing (several years ago!) to allow the PDF to be freely available. Of course, I’m also very thankful to the heroes who sent in comments and bug-fixes to the blog draft of the book. Please feel free to continue doing so — I’m happy to keep making corrections.
Regarding the differences between the printed book and the electronic version: The printed book is “Version 1.00″ and the PDF is “Version 1.01″. The main difference is that about 40 small typos/bugs have been fixed (nothing major; in almost all cases you can easily guess the correct thing once you notice the mistake). I would also like to add that the numbering of the theorems/definitions/exercises/etc. is identical between the printed book and the PDF, so if you would like to cite something, it doesn’t matter which version you cite. The page numbering is not the same, though, so please try not to cite things by page number.
Once more, thanks again to all the readers of the blog; I hope you enjoy the book!
Added later: We hit 100 emails; however, it appears that the downloads are still working. But in any case, if you have trouble, please try again tomorrow.
]]>In July I was at the (First Annual?) Swedish Summer School in Computer Science. This was wonderfully organized by KTH faculty Per Austrin, Johan Håstad, and Jakob Nordström, and took place [...]]]>
In July I was at the (First Annual?) Swedish Summer School in Computer Science. This was wonderfully organized by KTH faculty Per Austrin, Johan Håstad, and Jakob Nordström, and took place on the scenic island of Djurhamn in the Stockholm archipelago. Boaz Barak and I alternated lectures (he on Sums of Squares) to 50 or so highly motivated graduate students. I was very impressed that many of them worked quite hard on the homework outside of lecture times! Some photos by Talya Abram can be found here, including one of me holding an exquisite table linen gift, with the De–Mossel–Neeman differential equation in the background.
The second summer school (just finished) was held at the Nesin Mathematical Village, an idyllic site in the hills near the Aegean coast of Turkey. The village operates year-round, but its main business is in the summer, when a hundred or so high school, college, and graduate math students come each week for courses. The spirit of the place is charming, and the math village itself is quite beautiful, as you can see from these photos. I lectured here, though I wish I got to lecture here.
By the way, if you ever teach a class on Analysis of Boolean Functions yourself, please drop me a line; I like to keep track of links to them. (So far I know of 31 such courses since 2002.)
]]> We describe a web of connections between the following [...]]]>
We describe a web of connections between the following topics: the mathematical theory of voting and social choice; the computational complexity of the Maximum Cut problem; the Gaussian Isoperimetric Inequality and Borell’s generalization thereof; the Hypercontractive Inequality of Bonami; and, the analysis of Boolean functions. A major theme is the technique of reducing inequalities about Gaussian functions to inequalities about Boolean functions $f : \{-1,1\}^n \to \{-1,1\}$, and then using induction on $n$ to further reduce to inequalities about functions $f : \{-1,1\} \to \{-1,1\}$. We especially highlight De, Mossel, and Neeman’s recent use of this technique to prove the Majority Is Stablest Theorem and Borell’s Isoperimetric Inequality simultaneously.
The article actually includes a few ideas that don’t explicitly appear in the book, so please take a look if you’re interested. I hope the article will make a nice fit for the congress; it has a few connections (some major, some minor) with work being discussed by several other speakers here, including Boaz Barak, Andrei Bulatov, Sourav Chatterjee, Julia Chuzhoy, Alessio Figalli, Monique Laurent, Michel Ledoux, and (last but certainly not least) our newest Nevanlinna Prize winner, Subhash Khot. (Congratulations Subhash!) Indeed, my talk on Saturday will mostly center around a very famous conjecture made by Subhash — I refer, of course, to the Majority Is Stablest Conjecture.
PS: In case you happen to actually be at the congress, you can buy a “special ICM edition” of the book for $30 (or, even cheaper, ₩30,000) at the Cambridge University Press booth.
]]>One small note: In the relatively near future I will post here an electronic version of the book. It will actually be version 1.01, with version 1.00 being the [...]]]>
One small note: In the relatively near future I will post here an electronic version of the book. It will actually be version 1.01, with version 1.00 being the printed book. The electronic version will include fixes for typos found subsequent to the book’s printing.
Due to laziness, however, I will stop incorporating fixes into the old blog posts. So those are now basically deprecated. Feel free to take a look at them till the PDF comes out, of course. But this means there’s a chance that when you find a typo or mistake on the blog pages, I might respond with “Thanks — someone else found that too, and it’s already been corrected.”
Once again, thanks so much to all of you for the corrections!
]]>There will be at least one more post on this blog; I will release a free PDF of the book in its entirety some time around September. Watch this space!
]]>As mentioned in the notes on Chapter 9, the Gaussian Hypercontractivity Theorem is originally due to Nelson [Nel66] and now has many known proofs. The idea behind the proof we presented — first proving the Boolean hypercontractivity result and then deducing the Gaussian case by the Central Limit Theorem — is due to Gross [Gro75] (see also Trotter [Tro58]). Gross actually used the idea to prove his Gaussian Log-Sobolev Inequality, and thereby deduced the Gaussian Hypercontractivity Theorem. Direct proofs of the Gaussian Hypercontractivity Theorem have been given by Neveu [Nev76] (using stochastic calculus), Brascamp and Lieb [BL76a] (using rearrangement [BL76a]), and Ledoux [Led13] (using a variation on Exercises 26–29); direct proofs of the Gaussian Log-Sobolev Inequality have been given by Adams and Clarke [AC79], by Bakry and Emery [BE85a], and by Ledoux [Led92], the latter two using semigroup techniques. Bakry’s survey [Bak94] on these topics is also recommended.
The Gaussian Isoperimetric Inequality was first proved independently by Borell [Bor75] and by Sudakov and Tsirel’son [ST78]. Both works derived the result by taking the isoperimetric inequality on the sphere (due to Lévy [Lev22] and Schmidt [Sch48], see also Figiel, Lindenstrauss, and Milman [FLM77]) and then taking “Poincaré’s limit” — i.e., viewing Gaussian space as a projection of the sphere of radius $\sqrt{n}$ in $n$ dimensions, with $n \to \infty$ (see Lévy [Lev22], McKean [McK73], and Diaconis and Freedman [DF87]). Ehrhard [Ehr83] gave a different proof using a symmetrization argument intrinsic to Gaussian space. This may be compared to the alternate proof of the spherical isoperimetric inequality [Ben84] based on the “two-point symmetrization” of Baernstein and Taylor [BT76] (analogous to Riesz rearrangement in Euclidean space and to the polarization operation from Exercise 2.45).
To carefully define Gaussian surface area for a broad class of sets requires venturing into the study of geometric measure theory and functions of bounded variation. For a clear and comprehensive development in the Euclidean setting (including the remark in Exercise 15), see the book by Ambrosio, Fusco, and Pallara [AFP00]. There’s not much difference between the Euclidean and finite-dimensional Gaussian settings; research on Gaussian perimeter tends to focus on the trickier infinite-dimensional case. For a thorough development of surface area in this latter setting (which of course includes finite-dimensional Gaussian space as a special case) see the work of Ambrosio, Miranda, Maniglia, and Pallara [AMMP10]; in particular, Theorem 4.1 in that work gives several additional equivalent definitions for $\text{surf}_\gamma$ besides those in Definition 48. Regarding the fact that $\mathbf{RS}_A’(0^+)$ is an equivalent definition, the Euclidean analogue of this statement was proven in Miranda et al. [MPPP07] and the statement itself follows similarly [Mir13] using Ambrosio et al. [AFR13]. (Our heuristic justification of 11.4.(2) is similar to the one given by Kane [Kan11a].) Additional related results can be found in Hino [Hin10] (which includes the remark about convex sets at the end of Definition 48), Ambrosio and Figalli [AF11], Miranda et al. [MNP12], and Ambrosio et al. [AFR13].
The inequality of Theorem 51 is explicit in Ledoux [Led94] (see also the excellent survey [Led96]); he used it to deduce the Gaussian Isoperimetric Inequality. He also noted that it’s essentially deducible from an earlier inequality of Pisier and Maurey [Pis86] (Theorem 2.2). Theorem 43, which expresses the subadditivity of rotation sensitivity, can be viewed as a discretization of the Pisier–Maurey inequality. This theorem appeared in work of Kindler and O’Donnell [KO12], which also made the observations about the volume-$\tfrac{1}{2}$ case of Borell’s Isoperimetric Theorem at the end of Section 3 and in Remark 75.
Bobkov’s Inequality [Bob97] in the special case of Gaussian space had already been implicitly established by Ehrhard [Ehr84]; the striking novelty of Bobkov’s work (partially inspired by Talagrand [Tal93]) was his reduction to the two-point Boolean inequality. The proof of this inequality which we presented is, as mentioned a discretization of the stochastic calculus proof of Barthe and Maurey [BM00a]. (In turn, they were extending the stochastic calculus proof of Bobkov’s Inequality in the Gaussian setting due to Capitaine, Hsu, and Ledoux [CHL97].) The idea that it’s enough to show that Claim 54 is “nearly true” by computing two derivatives — as opposed to showing it’s exactly true by computing four derivatives — was communicated to the author by Yuval Peres. Following Bobkov’s paper, Bakry and Ledoux [BL96b] established Theorem 55 in very general infinite-dimensional settings including Gaussian space; Ledoux [Led98] further pointed out that the Gaussian version of Bobkov’s Inequality has a very short and direct semigroup-based proof. See also Bobkov and Götze [BG99] and Tillich and Zémor [TZ00] for results similar to Bobkov’s Inequality in other discrete settings. Borell’s Isoperimetric Theorem is from Borell [Bor85]. Borell’s proof used “Ehrhard symmetrization” and actually gave much stronger results — e.g., that if $f, g \in L^2({\mathbb R}^n, \gamma)$ are nonnegative and $q \geq 1$, then $\langle (\mathrm{U}_\rho f)^q, g\rangle$ can only increase under simultaneous Ehrhard symmetrization of $f$ and $g$. There are at least four other known proofs of the basic Borell Isoperimetric Theorem. Beckner [Bec92] observed that the analogous isoperimetric theorem on the sphere follows from two-point symmetrization; this yields the Gaussian result via Poincaré’s limit (for details, see Carlen and Loss [CL90]). (This proof is perhaps the conceptually simplest one, though carrying out all the technical details is a chore.) Mossel and Neeman [MN12] gave the proof based on semigroup methods outlined in Exercises 26–29, and later together with De [DMN12] gave a “Bobkov-style” Boolean proof (see Exercise 30). Finally, Eldan [Eld13] gave a proof using stochastic calculus.
As mentioned in Section 5 there are several known ways to prove the Berry–Esseen Theorem. Aside from the original method (characteristic functions), there is also Stein’s Method [Ste72,Ste86a]; see also, e.g., [Bol84,BH84,CGS11]. The Replacement Method approach we presented originates in the work of Lindeberg [Lin22]. The mollification techniques used (e.g., those in Exercise 40) are standard. The Invariance Principle as presented in Section 6 is from Mossel, O’Donnell, and Oleszkiewicz [MOO10]. Further extensions (e.g., Exercise 48) appear in the work of Mossel [Mos10]. In fact the Invariance Principle dates back to the 1971 work of Rotar’ [Rot73,Rot74]; therein he essentially proved the Invariance Principle for degree-$2$ multilinear polynomials (even employing the term “influence” as we do for the quantity in Definition 63). Earlier work on extending the Central Limit Theorem to higher-degree polynomials had focused on obtaining sufficient conditions for polynomials (especially quadratics) to have a Gaussian limit distribution; this is the subject of U-statistics. Rotar’ emphasized the idea of invariance and of allowing any (quadratic) polynomial with low influences. Rotar’ also credited Girko [Gir73] with related results in the case of positive definite quadratic forms. In 1975, Rotar’ [Rot75] generalized his results to handle multilinear polynomials of any constant degree, and also random vectors (as in Exercise 48). (Rotar’ also gave further refinements in 1979 [Rot79].)
The difference between the results of Rotar’ [Rot75] and Mossel et al. [MOO10] comes in the treatment of the error bounds. It’s somewhat difficult to extract simple-to-state error bounds from Rotar’ [Rot75], as the error there is presented as a sum over $i \in [n]$ of expressions $\mathop{\bf E}[F({\boldsymbol{x}}) \boldsymbol{1}_{|F({\boldsymbol{x}})| > u_i}]$, where $u_i$ involves $\mathbf{Inf}_i[F]$. (Partly this is so as to generalize the statement of the Lindeberg CLT.) Nevertheless, the work of Rotar’ implies a Lévy distance bound as in Corollary 69, with some inexplicit function $o_\epsilon(1)$ in place of $(1/\rho)^{O(k)} \epsilon^{1/8}$. By contrast, the work of Mossel et al. [MOO10] shows that a straightforward combination of the Replacement Method and hypercontractivity yields good, explicit error bounds. Regarding the Carbery–Wright Theorem [CW01], an alternative exposition appears in Nazarov, Sodin, and Vol’berg [NSV02].
Regarding the Majority Is Stablest Theorem (conjectured in Khot, Kindler, Mossel, and O’Donnell [KKMO04] and proved originally in Mossel, O’Donnell, and Oleszkiewicz [MOO05a]), it can be added that additional motivation for the conjecture came from Kalai [Kal02]. The fact that (SDP) is an efficiently computable relaxation for the Max-Cut problem dates back to the 1990 work of Delorme and Poljak [DP93]; however, they were unable to give an analysis relating its value to the optimum cut value. In fact, they conjectured that the case of the $5$-cycle from Example 72 had the worst ratio of $\mathrm{Opt}(G)$ to $\mathrm{SDPOpt}(G)$. Goemans and Williamson [GW94] were the first to give a sharp analysis of the SDP (Theorem 71), at least for $\theta \geq \theta^*$. Feige and Schechtman [FS02] showed an optimal integrality gap for the SDP for all values $\theta \geq \theta^*$ (in particular, showing an integrality gap ratio of $c_{\text{GW}}$); interestingly, their construction essentially involved proving Borell’s Isoperimetric Inequality (though they did it on the sphere rather than in Gaussian space). Both before and after the Khot et al. [KKMO04] UG-hardness result for Max-Cut there was a long line of work [Kar99,Zwi99,AS00,ASZ02,CW04,KV05,FL06,KO06] devoted to improving the known approximation algorithms and UG-hardness results, in particular for $\theta < \theta^*$. This culminated in the results from O’Donnell and Wu [OW08] (mentioned in Remark 74), which showed explicit matching $(\alpha,\beta)$-approximation algorithms, integrality gaps, and UG-hardness results for all $\frac12 < \beta < 1$. The fact that the best integrality gaps matched the best UG-hardness results proved not to be a coincidence; in contemporaneous work, Raghavendra [Rag08] showed that for any CSP, any SDP integrality gap could be turned into a matching Dictator-vs.-No-Notables test. This implies the existence of matching efficient $(\alpha,\beta)$-approximation algorithms and UG-hardness results for every CSP and every $\beta$. See Raghavendra’s thesis [Rag09] for full details of his earlier publication [Rag08] (including some Invariance Principle extensions building further on Mossel [Mos10]); see also Austrin’s work [Aus07,Aus07a] for precursors to the Raghavendra theory.
Exercise 31 concerns a problem introduced by Talagrand [Tal89]. Talagrand offers a $1000 prize [Tal06a] for a solution to the following Boolean version of the problem: Show that for any fixed $0 < \rho < 1$ and for $f : \{-1,1\}^n \to {\mathbb R}^{\geq 0}$ with $\mathop{\bf E}[f] = 1$ it holds that $\mathop{\bf Pr}[\mathrm{T}_\rho f > t] = o(1/t)$ as $t \to \infty$. (The rate of decay may depend on $\rho$ but not, of course, on $n$; in fact, a bound of the form $O(\frac{1}{t \sqrt{\log t}})$ is expected.) The result outlined in Exercise 31 (obtained together with James Lee) is for the very special case of $1$-dimensional Gaussian space; Ball, Barthe, Bednorz, Oleszkiewicz, and Wolff [BBB+13] obtained the same result and also showed a bound of $O(\frac{\log \log t}{t \sqrt{\log t}})$ for $d$-dimensional Gaussian space (with the constant in the $O(\cdot)$ depending on $d$).
The Multifunction Invariance Principle (Exercise 48 and its special case Exercise 46) are from Mossel [Mos10]; the version for general product spaces (Exercise 49) is from Mossel, O’Donnell, and Oleszkiewicz [MOO10].
]]>Theorem 77 For each $\rho \in (0,1)$ and $\epsilon > 0$, there exists a constant $C_{\rho, \epsilon}$ such that the following holds: If $f, g : \{-1,1\}^n \to [\epsilon, 1-\epsilon]$, then \[ \mathop{\bf E}_{\substack{({\boldsymbol{x}}, {\boldsymbol{x}}') \\ \text{$\rho$-correlated}}}[\Lambda_\rho(f({\boldsymbol{x}}), g({\boldsymbol{x}}'))] \leq \Lambda_\rho(\mathop{\bf E}[f], \mathop{\bf E}[g]) + C_{\rho, \epsilon}\cdot (\Delta_n[f] + \Delta_n[g]). \] Here we using the following inductive notation: $\Delta_1[f] = \mathop{\bf E}[|f - \mathop{\bf E}[f]|^3]$, and \[ \Delta_n[f] = \mathop{\bf E}_{{\boldsymbol{x}}_n \sim \{-1,1\}}\left[\Delta_{n-1}[f_{ \mid {\boldsymbol{x}}_n}]\right] + \Delta_1[f^{\subseteq \{n\}}]. \]
Proposition 78 Let $\psi : {\mathbb R}^d \to {\mathbb R}$ be $c$-Lipschitz. Then for any $\eta > 0$ there exists $\widetilde{\psi}_\eta : {\mathbb R}^d \to {\mathbb R}$ satisfying $\|\psi – \widetilde{\psi}_\eta\|_\infty \leq c \sqrt{d} \eta$ and $\|\partial^\beta \widetilde{\psi}_\eta\|_\infty \leq C_{|\beta|} c \sqrt{d}/\eta^{|\beta|-1}$ for each multi-index $\beta \in {\mathbb N}^{d}$ with $|\beta| = \sum_i \beta_i \geq 1$, where $C_k$ is a constant depending only on $k$.
Invariance Principle for Sums of Random Vectors Let $\vec{\boldsymbol{X}}_1, \dots, \vec{\boldsymbol{X}}_n$, $\vec{\boldsymbol{Y}}_1, \dots, \vec{\boldsymbol{Y}}_n$ be independent ${\mathbb R}^d$-valued random variables with matching means and covariance matrices; i.e., $\mathop{\bf E}[\vec{\boldsymbol{X}}_t] = \mathop{\bf E}[\vec{\boldsymbol{Y}}_t]$ and $\mathop{\bf Cov}[\vec{\boldsymbol{X}}_t] = \mathop{\bf Cov}[\vec{\boldsymbol{Y}}_t]$ for all $t \in [n]$. (Note that the $d$ individual components of a particular $\vec{\boldsymbol{X}}_t$ or $\vec{\boldsymbol{Y}}_t$ are not required to be independent.) Write $\vec{\boldsymbol{S}}_X = \sum_{t=1}^n \vec{\boldsymbol{X}}_t$ and $\vec{\boldsymbol{S}}_Y = \sum_{t=1}^n \vec{\boldsymbol{Y}}_t$. Then for any $\mathcal{C}^3$ function $\psi : {\mathbb R}^d \to {\mathbb R}$ satisfying $\|\partial^\beta \psi\|_\infty \leq C$ for all $|\beta| = 3$, \[ \left| \mathop{\bf E}[\psi(\vec{\boldsymbol{S}}_X)] – \mathop{\bf E}[\psi(\vec{\boldsymbol{S}}_Y)] \right| \leq C \gamma_{\vec{X}\vec{Y}}, \] where \[ \gamma_{\vec{X}\vec{Y}} = \sum_{\substack{\beta \in {\mathbb N}^d\\|\beta| = 3}} \frac{1}{\beta!} \sum_{t=1}^n \bigl(\mathop{\bf E}[|\vec{\boldsymbol{X}}_t^\beta|] + \mathop{\bf E}[|\vec{\boldsymbol{Y}}_t^\beta|]\bigr). \]
Multifunction Invariance Principle Let $F^{(1)}$, \dots, $F^{(d)}$ be formal $n$-variate multilinear polynomials each of degree at most $k \in {\mathbb N}$. Let $\vec{{\boldsymbol{x}}}_1, \dots, \vec{{\boldsymbol{x}}}_n$ and $\vec{\boldsymbol{y}}_1, \dots, \vec{\boldsymbol{y}}_n$ be independent ${\mathbb R}^d$-valued random variables such that $\mathop{\bf E}[\vec{{\boldsymbol{x}}}_t] = \mathop{\bf E}[\vec{\boldsymbol{y}}_t] = 0$ and $M_t = \mathop{\bf Cov}[\vec{{\boldsymbol{x}}}_t] = \mathop{\bf Cov}[\vec{\boldsymbol{y}}_t]$ for each $t \in [n]$. Assume each $M_t$ has all its diagonal entries equal to $1$ (i.e., each of the $d$ components of $\vec{{\boldsymbol{x}}}_t$ has variance $1$, and similarly for $\vec{\boldsymbol{y}}_t$). Further assume each component random variable $\vec{{\boldsymbol{x}}}_t^{(j)}$ and $\vec{\boldsymbol{y}}_t^{(j)}$ is $(2,3,\rho)$-hypercontractive ($t \in [n]$, $j \in [d]$). Then for any $\mathcal{C}^3$ function $\psi : {\mathbb R}^d \to {\mathbb R}$ satisfying $\|\partial^\beta \psi\|_\infty \leq C$ for all $|\beta| = 3$, \[ \left|\mathop{\bf E}[\psi(\vec{F}(\vec{{\boldsymbol{x}}}))] – \mathop{\bf E}[\psi(\vec{F}(\vec{\boldsymbol{y}}))]\right| \leq \tfrac{Cd^2}{3} \cdot (1/\rho)^{3k} \cdot \sum_{t=1}^n \sum_{j=1}^d \mathbf{Inf}_t[F^{(j)}]^{3/2}. \] Here we are using the following notation: If $\vec{\boldsymbol{z}} = (\vec{\boldsymbol{z}}_1, \dots, \vec{\boldsymbol{z}}_n)$ is a sequence of ${\mathbb R}^d$-valued random variables, $\vec{F}(\vec{\boldsymbol{z}})$ denotes the vector in ${\mathbb R}^d$ whose $j$th component is $F^{(j)}(\vec{\boldsymbol{z}}_1^{(j)}, \dots, \vec{\boldsymbol{z}}_n^{(j)})$.
(Hint: Combine the proofs of the Basic Invariance Principle and the Invariance Principle for Sums of Random Vectors, Exercise 46. The only challenging part should be notation.)
Invariance Principle in general product spaces
Let $(\Omega, \pi)$ be a finite probability space, $|\Omega| = m \geq 2$, in which every outcome has probability at least $\lambda$. Suppose $f \in L^2(\Omega^n, \pi^{\otimes n})$ has degree at most $k$; thus, fixing some Fourier basis $\phi_0, \dots, \phi_{m-1}$ for $L^2(\Omega, \pi)$, we have \[ f = \sum_{\substack{\alpha \in {\mathbb N}^n_{<m} \\ \#\alpha \leq k}} \widehat{f}(\alpha)\phi_\alpha. \] Introduce indeterminates $x = (x_{i,j})_{i \in [n], j \in [m-1]}$ and let $F$ be the formal $(m-1)n$-variate polynomial of degree at most $k$ defined by \[ F(x) = \sum_{\#\alpha \leq k} \widehat{f}(\alpha) \prod_{i \in \mathrm{supp}(\alpha)} x_{i, \alpha_i}. \] Then for any $\psi : {\mathbb R} \to {\mathbb R}$ that is $\mathcal{C}^3$ and satisfies $\|\psi^{\prime \prime \prime}\|_\infty \leq C$ we have \[ \left|\mathop{\bf E}_{{\boldsymbol{x}} \sim \{-1,1\}^{(m-1)n}}[\psi(F({\boldsymbol{x}}))] – \mathop{\bf E}_{\boldsymbol{\omega} \sim \pi^{\otimes n}}[\psi(f(\boldsymbol{\omega}))]\right| \leq \tfrac{C}{3} \cdot (2\sqrt{2/\lambda})^k \cdot \sum_{i=1}^n \mathbf{Inf}_i[f]^{3/2}. \]
(Hint: For $0 \leq t \leq n$, define the function $h_t \in L^2(\Omega^t \times \{-1,1\}^{(m-1)(n-t)}, \pi^{\otimes t} \otimes \pi_{1/2}^{\otimes (m-1)(n-t)})$ via \[ h_t(\boldsymbol{\omega}_1, \dots, \boldsymbol{\omega}_t, {\boldsymbol{x}}_{t+1,1}, \dots, {\boldsymbol{x}}_{n,m-1}) = \sum_{\#\alpha \leq k} \widehat{f}(\alpha) \prod_{\substack{i \in \mathrm{supp}(\alpha) \\ i \leq t}} \phi_{\alpha_i}(\boldsymbol{\omega}_i) \prod_{\substack{i \in \mathrm{supp}(\alpha) \\ i > t}} {\boldsymbol{x}}_{i,\alpha_i}. \] Express \[ h_{t} = \mathrm{E}_t h_{t} + \mathrm{L}_t h_{t} = \mathrm{E}_t h_t + \sum_{j = 1}^m D_j \cdot \phi_j(\boldsymbol{\omega}_t) \] where \[ D_j = \sum_{\alpha : \alpha_t = j} \widehat{f}(\alpha) \prod_{\substack{i \in \mathrm{supp}(\alpha) \\ i < t}} \phi_{\alpha_i}(\boldsymbol{\omega}_i) \prod_{\substack{i \in \mathrm{supp}(\alpha) \\ i > t}} {\boldsymbol{x}}_{i,\alpha_i}, \] and note that $h_{t-1} = \mathrm{E}_t h_t + \sum_{j = 1}^m D_j \cdot {\boldsymbol{x}}_{t,j}$.)
Theorem 79 Let $(\Omega, \pi)$ be a finite probability space in which each outcome has probability at least $\lambda$. Let $f \in L^2(\Omega^n, \pi^{\otimes n})$ have range $[0,1]$. Suppose that $f$ has no $(\epsilon, \frac{1}{\log(1/\epsilon)})$-notable coordinates. Then for any ${0 \leq \rho < 1}$, \[ \mathbf{Stab}_\rho[f] \leq \Lambda_\rho(\mathop{\bf E}[f]) + O\bigl(\tfrac{\log \log (1/\epsilon)}{\log(1/\epsilon)}\bigr) \cdot \tfrac{\log(1/\lambda)}{1-\rho}. \]
(Hint: Naturally, you’ll need Exercise 49(b).)
Remark: It can be shown that the $\mathcal{C}^3$ hypothesis in Propositions 26 and 28 is not necessary (provided the derivatives are interpreted in the distributional sense); see, e.g., Bogachev [Bog98] (Chapter 1) for more details.
Definition 76 For $\rho \in [-1,1]$ we define the Gaussian quadrant probability function $\Lambda_\rho : [0,1]^2 \to [0,1]$ by \[ \Lambda_\rho(\alpha,\beta) = \mathop{\bf Pr}_{\substack{(\boldsymbol{z}, \boldsymbol{z}') \text{ $\rho$-correlated} \\ \text{standard Gaussians}}}[\boldsymbol{z} \leq t, \boldsymbol{z}' \leq t'], \] where $t$ and $t’$ are defined by $\Phi(t) = \alpha$, $\Phi(t’) = \beta$. This is a slight reparametrization of the bivariate Gaussian cdf. We also use the shorthand notation \[ \Lambda_\rho(\alpha) = \Lambda_\rho(\alpha, \alpha), \] which we encountered in Borell’s Isoperimetric Theorem (and also in Exercises 5.33 and 9.24, with a different, but equivalent, definition).
Two-Set Borell Isoperimetric Theorem Fix $\rho \in (0,1)$ and $\alpha, \beta \in [0,1]$. Then for any $A, B \subseteq {\mathbb R}^n$ with $\mathrm{vol}_\gamma(A) = \alpha$, $\mathrm{vol}_\gamma(B) = \beta$, \begin{equation} \label{eqn:two-set-borell1} \mathop{\bf Pr}_{\substack{(\boldsymbol{z}, \boldsymbol{z}’) \text{ $\rho$-correlated} \\ \text{$n$-dimensional Gaussians}}}[\boldsymbol{z} \in A, \boldsymbol{z}' \in B] \leq \Lambda_\rho(\alpha,\beta). \end{equation}
By definition of $\Lambda_\rho(\alpha,\beta)$, equality holds if $A$ and $B$ are parallel halfspaces. Taking $\beta = \alpha$ and $B = A$ in this theorem gives Borell’s Isoperimetric Theorem as stated in Section 3 (in the case of range $\{0,1\}$, at least, which is equivalent by Exercise 25). It’s quite natural to guess that parallel halfspaces should maximize the “joint Gaussian noise stability” quantity on the left of \eqref{eqn:two-set-borell1}, especially in light of Remark 2 from Chapter 10.1 concerning the analogous Generalized Small-Set Expansion Theorem. Just as our proof of the Small-Set Expansion Theorem passed through the Two-Function Hypercontracitivity Theorem to facilitate induction, so too does the Mossel–Neeman proof pass through the following “two-function version” of Borell’s Isoperimetric Theorem:
Two-Function Borell Isoperimetric Theorem Fix $\rho \in (0,1)$ and let $f, g \in L^2({\mathbb R}^n, \gamma)$ have range $[0,1]$. Then \[ \mathop{\bf E}_{\substack{(\boldsymbol{z}, \boldsymbol{z}') \text{ $\rho$-correlated} \\ \text{$n$-dimensional Gaussians}}}[\Lambda_\rho(f(\boldsymbol{z}),g(\boldsymbol{z}'))] \leq \Lambda_\rho\left(\mathop{\bf E}[f], \mathop{\bf E}[g]\right). \]