## Simons Symposium 2014 — Day 1

Li-Yang here.

Avi Wigderson kicked off this year’s symposium with a talk describing recent joint work with Dmitry Gavinsky, Or Meir, and Omri Weinstein attacking one of the central open problems in computational complexity: does ${\mathsf P} = \mathsf{NC}^1$, or in words, can every sequential computation be efficiently parallelized? For a Boolean function $f:\{0,1\}^n \to\{0,1\}$, let $S(f)$ denote the minimum size of any Boolean circuit computing $f$, and $L(f)$ the minimum size of any Boolean formula computing $f$. Certainly $S(f) \le L(f)$ for all $f$, and we know of examples showing that $S(f)$ can be polynomially smaller than $L(f)$ — the parity function, for example, has $S(\mathsf{PAR}_n) = O(n)$ whereas $L(\mathsf{PAR}_n) = \Omega(n^2)$. The $\mathsf{P}$ versus $\mathsf{NC}^1$ problem seeks an explicit function $f$ witnessing a super-polynomial separation between these two quantities.
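To make the quadratic formula-size upper bound for parity concrete, here is a small Python sketch (my own illustration, not from the talk) that builds a formula for $\mathsf{PAR}_n$ recursively via $p \oplus q = (p \wedge \bar q) \vee (\bar p \wedge q)$. Since each half is used twice (once negated), the leaf count satisfies $L(n) = 4\,L(n/2)$, giving $n^2$ leaves when $n$ is a power of $2$.

```python
from itertools import product

def build(vars, neg):
    """Formula computing PARITY(vars) XOR neg, as nested tuples:
    ("lit", i, neg) for a (possibly negated) variable, or ("or"/"and", f, g)."""
    if len(vars) == 1:
        return ("lit", vars[0], neg)
    L, R = vars[: len(vars) // 2], vars[len(vars) // 2 :]
    if neg == 0:   # p XOR q = (p AND ~q) OR (~p AND q)
        return ("or", ("and", build(L, 0), build(R, 1)),
                      ("and", build(L, 1), build(R, 0)))
    else:          # NOT(p XOR q) = (p AND q) OR (~p AND ~q)
        return ("or", ("and", build(L, 0), build(R, 0)),
                      ("and", build(L, 1), build(R, 1)))

def leaves(f):
    return 1 if f[0] == "lit" else leaves(f[1]) + leaves(f[2])

def evaluate(f, x):
    if f[0] == "lit":
        return x[f[1]] ^ f[2]
    v1, v2 = evaluate(f[1], x), evaluate(f[2], x)
    return (v1 | v2) if f[0] == "or" else (v1 & v2)

f4 = build(list(range(4)), 0)
print(leaves(f4))   # 16 = n^2 leaves for n = 4
assert all(evaluate(f4, x) == sum(x) % 2 for x in product([0, 1], repeat=4))
```

The halving recursion (rather than peeling off one variable at a time) is what keeps the size at $n^2$; a matching $\Omega(n^2)$ lower bound for parity is given by Khrapchenko’s method.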

The work of Karchmer, Raz, and Wigderson (KRW) proposed the following approach to the problem. Let $D(f)$ denote the minimum depth of any formula computing $f$ (and recall that $D(f) = O(\log L(f))$ by Spira’s depth-reduction theorem).  Given two Boolean functions $g:\{0,1\}^m \to\{0,1\}$ and $f:\{0,1\}^n\to\{0,1\}$, we write $g\circ f : \{0,1\}^{mn}\to\{0,1\}$ to denote their (disjoint) composition $(g\circ f)(x^1,\ldots,x^m) = g(f(x^1),\ldots,f(x^m))$. Clearly $D(g \circ f) \le D(g) + D(f)$: to compute $g\circ f$ with a formula, simply stick on a formula for $f$ at every leaf in a formula computing $g$. The KRW conjecture states that this naive construction is essentially the best one can do, i.e. that $D(g \circ f) \approx D(g) + D(f)$; equivalently, $C(R_{g\circ f}) \approx C(R_g) + C(R_f)$, where $R_g$ and $R_f$ are the communication problems corresponding to the Karchmer-Wigderson relations of $g$ and $f$ respectively, and $C(\cdot)$ denotes deterministic communication complexity. KRW showed that proving this conjecture would separate $\mathsf{P}$ from $\mathsf{NC}^1$ (notably, this approach appears to circumvent the Razborov-Rudich natural proofs barrier). As a first step, they proposed proving the conjecture in the case when $f = U_n$ and $g=U_m$, where $U_k$ is the $k$-ary Universal Relation, which defines the following communication problem: Alice receives $x\in \{0,1\}^k$ and Bob $y\in \{0,1\}^k$ with the sole guarantee that $x\ne y$, and their goal is to find a coordinate $i\in [k]$ such that $x_i \ne y_i$. (It is easy to see that every Karchmer-Wigderson relation reduces to the Universal Relation, hence its name.)
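As a concrete illustration of the Universal Relation (this sketch is mine, not from the talk), here is the trivial deterministic protocol: Alice sends her entire string, after which Bob, who now sees both strings, announces a differing coordinate, for roughly $k + \log k$ bits of communication.

```python
from math import ceil, log2

def universal_relation(x, y):
    """Trivial deterministic protocol for U_k.

    Alice sends her entire string x (k bits); Bob replies with some
    coordinate where the strings differ (ceil(log2 k) bits).
    Requires the promise x != y."""
    assert x != y and len(x) == len(y)
    k = len(x)
    i = next(j for j in range(k) if x[j] != y[j])   # Bob's reply
    bits_communicated = k + ceil(log2(k))
    return i, bits_communicated

i, cost = universal_relation([0, 1, 1, 0], [0, 1, 0, 0])
assert i == 2 and cost == 4 + 2
```

A Karchmer-Wigderson relation $R_f$ restricts the promise further: Alice’s $x$ is a $1$-input of $f$ and Bob’s $y$ is a $0$-input, so in particular $x \ne y$ — which is exactly why every $R_f$ reduces to $U_k$.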

This first step was taken by Edmonds, Impagliazzo, Rudich, and Sgall; Håstad and Wigderson subsequently gave an alternative proof. Using tools from interactive information complexity, Avi and his co-authors take the natural next step: they establish the KRW conjecture in the case when $f = U_n$ and $g$ is an arbitrary function.  The case when $g = U_m$ and $f$ is an arbitrary function remains open: while superficially similar, Avi points out that this case appears to be of a significantly different nature, and he proposes it as a good next step to pursue.

For more details here’s a video of Or giving a talk on their results.

Next up were back-to-back talks by Rafał Latała and Witold Bednorz on their recent work on the suprema of Bernoulli processes, the main result of which is an affirmative answer to Talagrand’s \$5000 Bernoulli Conjecture (among the list of open problems from the 2012 Symposium). The problem can be stated as follows: let $T\subseteq \ell^2$ be a collection of vectors, and consider the quantity $b(T) := \mathop{\bf E} \left[\sup_{t\in T} \sum_{i =1}^\infty \boldsymbol{\epsilon}_i t_i \right],$ where $\boldsymbol{\epsilon}_1,\boldsymbol{\epsilon}_2,\ldots$ are i.i.d. uniform $\pm 1$ random variables. The goal is to obtain tight upper and lower bounds on $b(T)$; more precisely, to understand its relationship with the geometry of $T$. As is often the case with questions concerning Bernoulli random variables, it is natural to first consider the Gaussian setting, where we can define the analogous quantity $g(T) := \mathop{\bf E}\left[ \sup_{t\in T} \sum_{i =1}^\infty \boldsymbol{g}_i t_i \right],$ where the $\boldsymbol{g}_i$’s are i.i.d. standard Gaussians. The generic chaining theory of Michel Talagrand (building on results of Fernique and Dudley, the culmination of a body of work dating back to Kolmogorov in the 1930s) shows that $g(T)$ is characterized, up to universal constant factors, by the quantity $\gamma_2(T,\|\cdot \|_2) := \inf \sup_{t \in T} \sum_{n=0}^\infty 2^{n/2} \mathrm{diam}(\mathcal{A}_n(t)),$ where the infimum is taken over all increasing sequences of partitions $\{\mathcal{A}_n\}_{n=0}^\infty$ of $T$ satisfying $\mathcal{A}_0 = \{T\}$ and $|\mathcal{A}_n| \le 2^{2^n}$, and $\mathcal{A}_n(t)$ denotes the unique set in the partition $\mathcal{A}_n$ that contains $t$. With Talagrand’s characterization of $g(T)$ in mind we now revisit the Bernoulli setting. An easy application of Jensen’s inequality yields the relationship $b(T) \le \sqrt{\pi/2}\, g(T)$, and an even simpler “$\ell_1$-bound” yields $b(T) \le \sup_{t\in T} \| t \|_1$.
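For intuition, $b(T)$ can be computed exactly for tiny examples by enumerating all $2^n$ sign vectors; the following sketch (a direct illustration of the definition, with a made-up two-point $T$) also checks it against the trivial $\ell_1$ bound just mentioned.

```python
from itertools import product

def b_exact(T):
    """b(T) = E[ sup_{t in T} <t, eps> ] over uniform signs eps in {-1,1}^n,
    computed by brute-force enumeration (feasible only for small n)."""
    n = len(T[0])
    return sum(
        max(sum(e * ti for e, ti in zip(eps, t)) for t in T)
        for eps in product([1, -1], repeat=n)
    ) / 2 ** n

T = [(1.0, 0.0), (0.0, 1.0)]   # T = {e_1, e_2} in R^2
print(b_exact(T))              # E[max(eps_1, eps_2)] = 0.5
l1_bound = max(sum(abs(ti) for ti in t) for t in T)
assert b_exact(T) <= l1_bound  # the "ell_1-bound"
```

The question of Peres and Mossel mentioned below — whether $b(T)$ can be computed or approximated *efficiently* — is precisely about escaping this exponential enumeration.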
Together these give us the upper bound $b(T) \le \inf\{ \sup_{t\in T_1} \|t\|_1 + g(T_2) \colon T \subseteq T_1 + T_2\}.$ Talagrand’s Bernoulli Conjecture essentially asks: “Can we reverse this inequality?”, and Rafał and Witek’s work answers this in the affirmative. Their main theorem states the following: for all $T$ such that $b(T) < \infty$, there exist $T_1$ and $T_2$ such that $T \subseteq T_1 + T_2$, $\sup_{t\in T_1}\| t \|_1 \le O(b(T))$, and $g(T_2) = O(b(T))$. Rafał’s talk ended with the following intriguing question from Yuval Peres and Elchanan Mossel: is there an efficient algorithm for computing or approximating $b(T)$? (This is very closely related to Raghu Meka’s talk; see below.) Witek spoke about several potential extensions of the Bernoulli Theorem which remain open. Here I will highlight two. The first is a strict generalization, conjectured by Kwapień, to the Banach space setting. Let $(B, \| \cdot \|)$ be a normed space, and let $u_1, u_2, \dots \in B$ be such that $\sum_i u_i \boldsymbol{\epsilon}_i$ converges almost surely. (Here the $\boldsymbol{\epsilon}_i$’s represent independent uniform $\pm 1$ random variables.) Is it possible to write each $u_i$ as $u_i = v_i + w_i$ such that $\mathop{\bf E}\left[\left\|\sum_i u_i \boldsymbol{\epsilon}_i\right\|\right] \gtrsim \sup_{\delta_i \in \{-1,1\}} \left\|\sum_i v_i \delta_i\right\|, \ \ \mathop{\bf E}\left[\left\|\sum_i u_i \boldsymbol{\epsilon}_i\right\|\right] \gtrsim \mathop{\bf E}\left[\left\|\sum_i w_i \boldsymbol{g}_i\right\|\right]?$ (Here as usual the $\boldsymbol{g}_i$’s represent independent standard Gaussians, and $\gtrsim$ means inequality up to a universal constant.) In the case $B = \ell^\infty$ this follows from Rafał and Witek’s proof of the Bernoulli Conjecture; however, this potential generalization does not seem so easy because their proof does not give much control over the “$T_1$” and “$T_2$” sets. Another potential generalization of the Bernoulli Theorem is to $p$-biased $\pm 1$ random variables.
More specifically, let $p > 0$ be small and let $\boldsymbol{\delta} = (\boldsymbol{\delta}_1,\dots, \boldsymbol{\delta}_n)$ be a sequence of independent random variables which are $1-p$ with probability $p$ and $-p$ with probability $1-p$. Let $T$ be a finite collection of vectors in ${\mathbb R}^n$. As usual, we are interested in finding deterministic quantities which bound $b_p(T) := \mathop{\bf E}\left[\max_{t \in T} |\langle t, \boldsymbol{\delta}\rangle| \right]$ up to a constant factor not depending on $n$ or on $p$. As in the Bernoulli Conjecture, one can give upper bounds based on: (a) trivial $\ell_1$ norm considerations; (b) chaining (roughly speaking, the union bound with clever clustering of vectors). In the Bernoulli Conjecture case (i.e., $p = 1/2$), the chaining bound relies on Chernoff bounds, i.e., the subgaussian property of uniform $\pm 1$ random vectors. In this $p$-biased case one has to be a little more careful, using Bernstein bounds. In any case, Talagrand has again conjectured that these are the “only” two ways to get an upper bound. Precisely, the conjecture boils down to showing the following: one can find vector sets $T_1, T_2$ with $T \subseteq T_1 + T_2$, and $b_p(T) \gtrsim \mathop{\bf E}\left[\max_{t \in T} \sum_i |t_i| \boldsymbol{\delta}_i\right],\ \ \sqrt{p} \cdot \gamma_2(T_1, \| \cdot \|_2), \ \ \gamma_1(T_2, d_{\infty}),$ where $\gamma_2$ is as before, and the $\gamma_1$ quantity is like $\gamma_2$ except that the diameter is measured with respect to the $\infty$-norm and the coefficient on the contribution from the $j$th term is $2^j$ rather than $2^{j/2}$. (And, it should be stressed again, the constant hidden by the $\gtrsim$ should not depend on $p$ or $n$.) For more details here’s a video of Rafał giving a longer version of his talk; see also this great series of blog posts by James Lee.

The next speaker was Yuval Filmus, who spoke about the Erdős-Ko-Rado (EKR) theorem, stability results, and association schemes. The classical EKR theorem from extremal combinatorics says the following: for all $n \ge 2k$, if $\mathcal{F} \subseteq {{[n]}\choose k}$ is an intersecting family then $|\mathcal{F}| \le {{n-1}\choose {k-1}}$. (A family $\mathcal{F}$ is intersecting if $A \cap B \ne \emptyset$ for all $A,B \in \mathcal{F}$.) The extremal family is perhaps the first that comes to mind: take an arbitrary element of the universe $[n]$, say $1$, and let $\mathcal{F}$ be the family of all ${{n-1}\choose {k-1}}$ subsets that include $1$; in fact, for $n > 2k$ these are the only families attaining the bound of ${{n-1}\choose {k-1}}$. Let’s call such a family $\mathcal{F}$ a star. It is natural to wonder if a “stability version” of the EKR theorem is true: if $|\mathcal{F}| \ge (1-\epsilon){{n-1}\choose {k-1}}$, must there exist some star $\mathcal{S}$ such that $|\mathcal{F} \triangle \mathcal{S}| \le \epsilon |\mathcal{F}|$? There are by now several different proofs of the EKR theorem; however many of these, like the original shifting-based proof of EKR, do not appear to be amenable to obtaining stability results.
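The EKR bound is small enough to verify exhaustively for, say, $n = 5$ and $k = 2$: the sketch below (my own check, not from the talk) enumerates all $2^{10}$ families of $2$-subsets of $[5]$ and confirms that the maximum intersecting families are exactly the $5$ stars, each of size $\binom{4}{1} = 4$.

```python
from itertools import combinations
from math import comb

n, k = 5, 2
sets = [frozenset(c) for c in combinations(range(n), k)]

def intersecting(fam):
    return all(a & b for a, b in combinations(fam, 2))

best_size, maximum_families = 0, []
for mask in range(1 << len(sets)):   # all 2^C(5,2) = 1024 families
    fam = [s for i, s in enumerate(sets) if (mask >> i) & 1]
    if not intersecting(fam):
        continue
    if len(fam) > best_size:
        best_size, maximum_families = len(fam), [fam]
    elif len(fam) == best_size:
        maximum_families.append(fam)

print(best_size)                     # 4 = C(n-1, k-1), matching EKR
assert best_size == comb(n - 1, k - 1)
# every maximum family is a star: its sets share a common element
assert all(frozenset.intersection(*fam) for fam in maximum_families)
```

Note that $n = 5 > 2k$ here, so uniqueness of the stars is expected; at $n = 2k$ other families (e.g. one set from each complementary pair) also attain the bound.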
An analytic approach introduced by Ehud Friedgut has the key advantage that it automatically yields stability, showing that families of cardinality close to the extremal bound of ${{n-1}\choose {k-1}}$ must in fact themselves be close to a star. Roughly speaking, Ehud accomplishes this by considering the Fourier expansion of $\mathbf{1}_{\mathcal{F}}: {{[n]}\choose k} \to\{0,1\}$, and showing that its Fourier mass is concentrated on the first two levels. This brings to mind the Friedgut-Kalai-Naor theorem, which says that Boolean functions $f:\{0,1\}^n\to\{0,1\}$ with most of their Fourier mass concentrated on the first two levels are close to dictators (or more precisely, $1$-juntas) — we note that the indicator of an extremal family is in fact a dictator, since $\mathbf{1}_{\mathcal{F}}(S) = 1$ iff the distinguished element ($1$ in the case of our example above) is in $S$. Of course, to carry out this proof strategy one would need an FKN-type theorem for Boolean(-valued) functions ${{[n]}\choose k}\to\{0,1\}$ over “slices of the hypercube”; in other words, functions over the Johnson scheme. This falls into a line of work extending classical Fourier-analytic theorems concerning Boolean-valued functions over the discrete cube $\{-1,1\}^n$ to non-independent domains. Yuval ended his talk with a wish-list of six such theorems he would like to see extended to various association schemes: the KKL theorem, Friedgut’s junta theorem, the FKN theorem, the Kindler-Safra theorem and the closely-related Bourgain’s theorem, and finally the invariance principle. Analogues of the first two for the Johnson scheme have been established by O’Donnell and Wimmer, and by Wimmer, respectively. For more on related spectral techniques in extremal combinatorics, check out Yuval’s thesis.

The final speaker of the day was Raghu Meka, who presented a PTAS for computing the supremum of Gaussian processes (answering an open problem of Ding and Lee).
More precisely, he constructs a deterministic algorithm which, given a finite set of vectors $T \subseteq \mathbb{R}^d$ and $\epsilon > 0$, computes a $(1+\epsilon)$-factor approximation to the quantity $\mathbf{E}_{\mathbf{X} \leftarrow \mathcal{N}^d}[\sup_{t \in T} |\langle t, \mathbf{X}\rangle|]$ in time $d^{O(1)}\cdot |T|^{O_\epsilon(1)}$ (here $\mathcal{N}$ is the standard Gaussian). Gaussian processes play a central role in the study of stochastic processes, functional analysis, convex geometry, and machine learning (among others); we elaborate on one specific application which is of particular relevance to Raghu’s result. Given a finite connected graph $G = (V,E)$, we may associate with it the following fundamental parameter: the cover time of $G$, denoted $\mathrm{cov}(G)$, is the maximum, over all vertices $v \in V$, of the expected time it takes for a simple random walk starting at $v$ to visit every other vertex of $G$. (A simple random walk is one which, at every step, moves to a uniformly random neighbor.) Aldous and Fill asked the natural question: “Can we approximate $\mathrm{cov}(G)$ deterministically?” (Note that there is a simple randomized algorithm: simulate a random walk a few times, and output the median time required to visit all vertices.) The first result in this direction was due to Kahn, Kim, Lovász, and Vu, who gave an $O((\log\log n)^2)$-factor approximation algorithm; this was followed by the work of Feige and Zeitouni, who gave an FPTAS for trees. In recent exciting work, Ding, Lee, and Peres significantly improved this state of affairs with a deterministic polynomial-time $O(1)$-factor approximation algorithm. They accomplish this by establishing a connection between the cover time of $G$ and an associated Gaussian process on $G$, known as the Gaussian free field on $G$; more precisely, they show that the supremum of the Gaussian free field on $G$ characterizes, up to universal constant factors, the cover time of $G$.
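To make $\mathrm{cov}(G)$ concrete, here is a sketch of mine (exponential in $|V|$, so only for tiny graphs — the algorithms discussed here are of course far cleverer) that computes the cover time exactly: for each set $S$ of already-visited vertices, processed from largest to smallest, it solves the linear system satisfied by the expected remaining time $E[v, S]$.

```python
import numpy as np

def cover_time(adj):
    """Exact cover time of a small connected graph given as {v: [neighbors]}.
    State (v, S): the walk is at v, and the vertices in bitmask S
    have already been visited (v is always in S)."""
    n = len(adj)
    full = (1 << n) - 1
    E = {(v, full): 0.0 for v in range(n)}   # everything visited: done
    for S in sorted(range(1, full), key=lambda s: -bin(s).count("1")):
        verts = [v for v in range(n) if (S >> v) & 1]
        idx = {v: i for i, v in enumerate(verts)}
        A, b = np.eye(len(verts)), np.ones(len(verts))
        for v in verts:
            p = 1.0 / len(adj[v])
            for u in adj[v]:
                if (S >> u) & 1:
                    A[idx[v], idx[u]] -= p                  # same system
                else:
                    b[idx[v]] += p * E[(u, S | 1 << u)]     # larger S: solved
        x = np.linalg.solve(A, b)
        for v in verts:
            E[(v, S)] = x[idx[v]]
    return max(E[(v, 1 << v)] for v in range(n))

triangle = {0: [1, 2], 1: [0, 2], 2: [0, 1]}
print(cover_time(triangle))   # 3.0: one step, then expected 2 to hit the last vertex
```

The state space has size $n \cdot 2^n$, which is exactly why a deterministic *polynomial-time* approximation, as in the results above, is interesting.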
With this connection in hand they then give a dynamic programming-based algorithm for computing $\gamma_2(\cdot,\|\cdot \|_2)$, which by Talagrand’s characterization yields an $O(1)$-factor approximation to the supremum of Gaussian processes. Motivated by this application, Ding and Lee asked if there is a PTAS for computing the supremum of Gaussian processes, and Raghu’s work answers this question in the affirmative (with a “direct” algorithm, one that does not incur the $O(1)$-factor loss that is likely to be inherent in any approach that goes through Talagrand’s characterization). Raghu’s algorithm can be viewed as proceeding in two modular steps, the first of which is dimension reduction. Using the standard Johnson-Lindenstrauss (JL) lemma (or rather, a derandomized version of the lemma), we first project $T$ onto a space of dimension roughly $k=O(\log |T|)$. In the analysis of this first step, the JL lemma ensures that the distances between pairs of points are preserved, while Slepian’s lemma allows us to relate the supremum of the original Gaussian process to the supremum of the projected Gaussian process. As one would expect, the next step involves solving the problem in the projected space (with an algorithm that is allowed to run in time exponential in the dimension $k$); this is accomplished via the construction of explicit $\epsilon$-nets for convex functions in Gaussian space. A naive rounding-based approach yields a net of size $k^{O(k)}$, and Dadush and Vempala give a construction achieving size $(\log k)^{O(k)}$. Using a classical comparison lemma called Kanter’s lemma to lift a simple net for the univariate setting to the multivariate setting, Raghu constructs an optimal $\epsilon$-net of size $(1/\epsilon)^{O(k)}$. Raghu concluded with two open problems.
The first is a conjecture of Ding, Lee, and Peres, which states that the supremum of the Gaussian free field on a graph characterizes its cover time asymptotically; if true, this, coupled with Raghu’s result, would give us a PTAS for computing the cover time of all graphs. The second is whether an analogue of Slepian’s lemma holds for graphs. Given a graph $G = (V,E)$, for any $u,v\in V$ we write $\mathrm{hit}_G(u,v)$ to denote the expected time it takes for a random walk starting at $u$ to hit $v$. Let $G, H$ be two graphs on the same vertex set $V$, and suppose $\mathrm{hit}_G(u,v)\le \mathrm{hit}_H(u,v)$ for all $u,v\in V$. Does this imply that $\mathrm{cov}(G) \le \mathrm{cov}(H)$? For more details, here’s a video of Raghu’s talk at FOCS 2012; see also this blog post by James Lee for more on the nice connection between cover times and the Gaussian free field. And finally, here are a few pictures from the first day!

## Simons Symposium 2014 — Discrete Analysis: Beyond the Boolean Cube

I’m pleased to announce that this week we’ll be reporting on the 2014 Simons Symposium — Discrete Analysis: Beyond the Boolean Cube. This is the second of three biennial symposia on Analysis of Boolean Functions, sponsored by the Simons Foundation. You may remember our reports on the 2012 edition, which took place in Caneel Bay, US Virgin Islands. This year we’re lucky to be holding the symposium in Rio Grande, Puerto Rico. I’m also happy to report that we will have guest blogging by symposium attendee Li-Yang Tan. This year’s talk lineup looks quite diverse, with topics ranging from the Bernoulli Conjecture to Fourier analysis on the symmetric group, to additive number theory. Stay tuned!

## §11.3: Borell’s Isoperimetric Theorem

If we believe that the Majority Is Stablest Theorem should be true, then we also have to believe in its “Gaussian special case”. Let’s see what this Gaussian special case is.
Suppose $f : {\mathbb R}^n \to [-1,1]$ is a “nice” function (smooth, say, with all derivatives bounded) having $\mathop{\bf E}[f] = 0$. You’re encouraged to think of $f$ as (a smooth approximation to) the indicator $\pm 1_A$ of some set $A \subseteq {\mathbb R}^n$ of Gaussian volume $\mathrm{vol}_\gamma(A) = \tfrac{1}{2}$. Now consider the Boolean function $g : \{-1,1\}^{nM} \to [-1,1]$ defined by $g = f \circ \text{BitsToGaussians}_{M}^{n}.$ Using the multidimensional Central Limit Theorem, for any $\rho \in (0,1)$ we should have $\mathbf{Stab}_\rho[g] \xrightarrow{M \to \infty} \mathbf{Stab}_\rho[f],$ where on the left we have Boolean noise stability and on the right we have Gaussian noise stability. Using $\mathop{\bf E}[g] \to \mathop{\bf E}[f] = 0$, the Majority Is Stablest Theorem would tell us that $\mathbf{Stab}_\rho[g] \leq 1 - \tfrac{2}{\pi}\arccos \rho + o_\epsilon(1),$ where $\epsilon = \mathbf{MaxInf}[g]$. But $\epsilon = \epsilon(M) \to 0$ as $M \to \infty$. Thus we should simply have the Gaussian noise stability bound $$\label{eqn:vol12-Borell} \mathbf{Stab}_\rho[f] \leq 1 - \tfrac{2}{\pi}\arccos \rho.$$ (By a standard approximation argument this extends from “nice” $f : {\mathbb R}^n \to [-1,1]$ with $\mathop{\bf E}[f] = 0$ to any measurable $f : {\mathbb R}^n \to [-1,1]$ with $\mathop{\bf E}[f] = 0$.) Note that the upper bound \eqref{eqn:vol12-Borell} is achieved when $f$ is the $\pm 1$-indicator of any halfspace through the origin; see Corollary 20. (Note also that if $n = 1$ and $f = \mathrm{sgn}$, then the function $g$ is simply $\mathrm{Maj}_M$.) The “isoperimetric inequality” \eqref{eqn:vol12-Borell} is indeed true, and is a special case of a theorem first proved by Borell [Bor85].

Borell’s Isoperimetric Theorem (volume-${\frac{\mathbf 1}{\mathbf 2}}$ case) Fix $\rho \in (0,1)$.
Then for any $f \in L^2({\mathbb R}^n,\gamma)$ with range $[-1,1]$ and $\mathop{\bf E}[f] = 0$, $\mathbf{Stab}_{\rho}[f] \leq 1 - \tfrac{2}{\pi}\arccos \rho,$ with equality if $f$ is the $\pm 1$-indicator of any halfspace through the origin.

Remark 41 In Borell’s Isoperimetric Theorem, nothing is lost by restricting attention to functions with range $\{-1,1\}$, i.e., by considering only $f = \pm 1_A$ for $A \subseteq {\mathbb R}^n$. This is because the case of range $[-1,1]$ follows straightforwardly from the case of range $\{-1,1\}$, essentially because $\sqrt{\mathbf{Stab}_\rho[f]} = \|\mathrm{U}_{\sqrt{\rho}} f\|_2$ is a convex functional of $f$; see the exercises.

More generally, Borell showed that for any fixed volume $\alpha \in [0,1]$, the maximum Gaussian noise stability of a set of volume $\alpha$ is no greater than that of a halfspace of volume $\alpha$. We state here the more general theorem, using range $\{0,1\}$ rather than range $\{-1,1\}$ for future notational convenience (and with Remark 41 applying equally):

Borell’s Isoperimetric Theorem Fix $\rho \in (0,1)$. Then for any $f \in L^2({\mathbb R}^n, \gamma)$ with range $[0,1]$ and $\mathop{\bf E}[f] = \alpha$, $\mathbf{Stab}_{\rho}[f] \leq \Lambda_\rho(\alpha).$ Here $\Lambda_\rho(\alpha)$ is the Gaussian quadrant probability function, discussed in Exercise 5.33, and equal to $\mathbf{Stab}_\rho[1_H]$ for any (every) halfspace $H \subseteq {\mathbb R}^n$ having Gaussian volume $\mathrm{vol}_\gamma(H) = \alpha$.

We’ve seen that the volume-$\tfrac{1}{2}$ case of Borell’s Isoperimetric Theorem is a special case of the Majority Is Stablest Theorem, and similarly, the general version of Borell’s theorem is a special case of the General-Volume Majority Is Stablest Theorem mentioned at the beginning of the chapter. As a consequence, proving Borell’s Isoperimetric Theorem is a prerequisite for proving the General-Volume Majority Is Stablest Theorem. In fact, our proof in Section 7 of the latter will be a reduction to the former.
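As a numerical illustration of the limiting argument above (a Monte Carlo sketch of mine, not from the text): taking $n = 1$ and $f = \mathrm{sgn}$, so that $g = \mathrm{Maj}_M$, the Boolean noise stability should already be close to the Gaussian bound $1 - \tfrac{2}{\pi}\arccos\rho$ for moderate $M$; with $\rho = 1/2$ that bound is exactly $1/3$.

```python
import numpy as np

rng = np.random.default_rng(0)
M, rho, N = 201, 0.5, 40000   # M bits per Gaussian, N Monte Carlo samples

# x uniform on {-1,1}^M; each y_i independently equals x_i with
# probability (1+rho)/2 and -x_i otherwise, so E[x_i y_i] = rho.
x = rng.integers(0, 2, size=(N, M), dtype=np.int8) * 2 - 1
flip = rng.random((N, M)) < (1 - rho) / 2
y = np.where(flip, -x, x)

# Stab_rho[Maj_M] = E[Maj_M(x) Maj_M(y)]  (M odd, so no ties)
stab = np.mean(np.sign(x.sum(axis=1)) * np.sign(y.sum(axis=1)))
print(stab)   # should be near 1 - (2/pi) * arccos(1/2) = 1/3
```

The discrepancy from $1/3$ combines the Monte Carlo sampling error with the finite-$M$ error in the Central Limit Theorem; both shrink as $N$ and $M$ grow.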
The proof of Borell’s Isoperimetric Theorem itself is not too hard; one of five known proofs, the one due to Mossel and Neeman [MN12], is outlined in the exercises. If our main goal is just to prove the basic Majority Is Stablest Theorem, then we only need the volume-$\tfrac{1}{2}$ case of Borell’s Isoperimetric Inequality. Luckily, there’s a very simple proof of this volume-$\tfrac{1}{2}$ case for “many” values of $\rho$, as we will now explain. Let’s first slightly rephrase the statement of Borell’s Isoperimetric Theorem in the volume-$\tfrac{1}{2}$ case. By Remark 41 we can restrict attention to sets; then the theorem asserts that among sets of Gaussian volume $\tfrac{1}{2}$, halfspaces through the origin have maximal noise stability, for each positive value of $\rho$. Equivalently, halfspaces through the origin have minimal noise sensitivity under correlation $\cos \theta$, for $\theta \in (0,\frac{\pi}{2})$. The formula for this minimal noise sensitivity was given as inequality (2) in our proof of Sheppard’s Formula. Thus we have:

Equivalent statement of the volume-${\frac{\mathbf 1}{\mathbf 2}}$ Borell Isoperimetric Theorem Fix $\theta \in (0,\frac{\pi}{2})$. Then for any $A \subset {\mathbb R}^n$ with $\mathrm{vol}_\gamma(A) = \tfrac{1}{2}$, $\mathop{\bf Pr}_{\substack{(\boldsymbol{z}, \boldsymbol{z}') \\ \cos\theta\text{-correlated}}}[1_A(\boldsymbol{z}) \neq 1_A(\boldsymbol{z}')] \geq \tfrac{\theta}{\pi},$ with equality if $A$ is any halfspace through the origin.

In the remainder of this section we’ll show how to prove this formulation of the theorem whenever $\theta = \frac{\pi}{2\ell}$, where $\ell$ is a positive integer. This gives the volume-$\tfrac{1}{2}$ case of Borell’s Isoperimetric Inequality for all $\rho$ of the form $\cos\frac{\pi}{2\ell}$, $\ell \in {\mathbb N}^+$; in particular, for an infinite sequence of $\rho$’s tending to $1$.
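Here is a quick Monte Carlo sanity check of the equality case (my own sketch): for the halfspace $A = \{x : x_1 \ge 0\}$, Sheppard’s Formula says the disagreement probability under $\cos\theta$-correlation is exactly $\theta/\pi$.

```python
import numpy as np

rng = np.random.default_rng(1)

def noise_sens_halfspace(theta, samples=200_000):
    """Pr[1_A(z) != 1_A(z')] for cos(theta)-correlated Gaussian pairs,
    with A = {x in R^2 : x_1 >= 0}.  We realize the correlated pair as
    z' = cos(theta) z + sin(theta) g with g an independent Gaussian."""
    z = rng.standard_normal((samples, 2))
    g = rng.standard_normal((samples, 2))
    zp = np.cos(theta) * z + np.sin(theta) * g
    return np.mean((z[:, 0] >= 0) != (zp[:, 0] >= 0))

est = noise_sens_halfspace(np.pi / 4)
print(est)   # Sheppard: theta/pi = 0.25
```

The same estimator at $\theta = \pi/2$ returns roughly $1/2$ ($0$-correlated Gaussians are independent, and $A$ has volume $\tfrac12$), matching the fact that for halfspaces the sensitivity is exactly linear in $\theta$.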
To prove the theorem for these values of $\theta$, it’s convenient to introduce notation for the following noise sensitivity variant:

Definition 42 For $A \subseteq {\mathbb R}^n$ and $\delta \in {\mathbb R}$ (usually $\delta \in [0,\pi]$) we write $\mathbf{RS}_A(\delta)$ for the rotation sensitivity of $A$ at $\delta$, defined by $\mathbf{RS}_A(\delta) = \mathop{\bf Pr}_{\substack{(\boldsymbol{z}, \boldsymbol{z}') \\ \cos \delta\text{-correlated}}}[1_A(\boldsymbol{z}) \neq 1_A(\boldsymbol{z}')].$

The key property of this definition is the following:

Theorem 43 For any $A \subseteq {\mathbb R}^n$ the function $\mathbf{RS}_A(\delta)$ is subadditive; i.e., $\mathbf{RS}_A(\delta_1 + \cdots + \delta_\ell) \leq \mathbf{RS}_A(\delta_1) + \cdots + \mathbf{RS}_A(\delta_\ell).$ In particular, for any $\delta \in {\mathbb R}$ and $\ell \in {\mathbb N}^+$, $\mathbf{RS}_A(\delta) \leq \ell \cdot \mathbf{RS}_A(\delta/\ell).$

Proof: Let $\boldsymbol{g}, \boldsymbol{g}' \sim \mathrm{N}(0,1)^n$ be drawn independently and define $\boldsymbol{z}(\theta) = (\cos\theta)\boldsymbol{g} + (\sin\theta)\boldsymbol{g}'$. Geometrically, as $\theta$ goes from $0$ to $\frac{\pi}{2}$ the random vectors $\boldsymbol{z}(\theta)$ trace from $\boldsymbol{g}$ to $\boldsymbol{g}'$ along the origin-centered ellipse passing through these two points. The random vectors $\boldsymbol{z}(\theta)$ are jointly normal, with each individually distributed as $\mathrm{N}(0,1)^n$. Further, for each fixed $\theta, \theta' \in {\mathbb R}$ the pair $(\boldsymbol{z}(\theta), \boldsymbol{z}(\theta'))$ constitutes $\rho$-correlated Gaussians with $\rho = \cos\theta \cos\theta' + \sin\theta \sin\theta' = \cos(\theta' - \theta).$ Now consider the sequence $\theta_0, \dots, \theta_\ell$ defined by the partial sums of the $\delta_i$’s, i.e., $\theta_j = \sum_{i = 1}^j \delta_i$.
We get that $\boldsymbol{z}(\theta_0)$ and $\boldsymbol{z}(\theta_\ell)$ are $\cos(\delta_1 + \cdots + \delta_\ell)$-correlated, and that $\boldsymbol{z}(\theta_{j-1})$ and $\boldsymbol{z}(\theta_{j})$ are $\cos\delta_j$-correlated for each $j \in [\ell]$. Thus \begin{align} \mathbf{RS}_A(\delta_1 + \cdots + \delta_\ell) &= \mathop{\bf Pr}[1_A(\boldsymbol{z}(\theta_0)) \neq 1_A(\boldsymbol{z}(\theta_\ell))] \nonumber\\ &\leq \sum_{j=1}^\ell \mathop{\bf Pr}[1_A(\boldsymbol{z}(\theta_{j})) \neq 1_A(\boldsymbol{z}(\theta_{j-1}))] = \sum_{j=1}^\ell \mathbf{RS}_A(\delta_j), \label{eqn:ellipse-union-bound} \end{align} where the inequality is the union bound. $\Box$

With this subadditivity result in hand, it’s indeed easy to prove the equivalent statement of the volume-$\tfrac{1}{2}$ Borell Isoperimetric Theorem for any $\theta \in \{\frac{\pi}{4}, \frac{\pi}{6}, \frac{\pi}{8}, \frac{\pi}{10}, \dots\}$. As we’ll see in Section 7, the case of $\theta = \frac{\pi}{4}$ can be used to give an excellent UG-hardness result for the Max-Cut CSP.

Corollary 44 The equivalent statement of the volume-$\tfrac{1}{2}$ Borell Isoperimetric Theorem holds whenever $\theta = \frac{\pi}{2\ell}$ for $\ell \in {\mathbb N}^+$.

Proof: The exact statement we need to show is $\mathbf{RS}_A(\frac{\pi}{2\ell}) \geq \frac{1}{2\ell}$. This follows by taking $\delta = \frac{\pi}{2}$ in Theorem 43 because $\mathbf{RS}_A(\tfrac{\pi}{2}) = \mathop{\bf Pr}_{\substack{(\boldsymbol{z}, \boldsymbol{z}') \\ 0\text{-correlated}}}[1_A(\boldsymbol{z}) \neq 1_A(\boldsymbol{z}')] = \tfrac12,$ using that $0$-correlated Gaussians are independent and that $\mathrm{vol}_\gamma(A) = \frac12$. $\Box$

Remark 45 Although Sheppard’s Formula already tells us that equality holds in this corollary when $A$ is a halfspace through the origin, it’s also not hard to derive this directly from the proof.
The only inequality in the proof, \eqref{eqn:ellipse-union-bound}, is an equality when $A$ is a halfspace through the origin, because the elliptical arc can only cross such a halfspace $0$ or $1$ times.

Remark 46 Suppose that $A \subseteq {\mathbb R}^n$ not only has volume $\tfrac{1}{2}$, but also has the property that $x \in A$ if and only if $-x \not\in A$; in other words, the $\pm 1$-indicator of $A$ is an odd function. (In both statements, we allow a set of measure $0$ to be ignored.) An example set with this property is any halfspace through the origin. Then $\mathbf{RS}_A(\pi) = 1$, and hence we can establish Corollary 44 more generally for any $\theta \in \{\frac{\pi}{1}, \frac{\pi}{2}, \frac{\pi}{3}, \frac{\pi}{4}, \frac{\pi}{5}, \dots\}$ by taking $\delta = \pi$ in the proof.

## §11.2: Hermite polynomials

Having defined the basic operators of importance for functions on Gaussian space, it’s useful to also develop the analogue of the Fourier expansion. Continue reading §11.2: Hermite polynomials

## §11.1: Gaussian space and the Gaussian noise operator

We begin with a few definitions concerning Gaussian space. Continue reading §11.1: Gaussian space and the Gaussian noise operator

## The book is available for pre-order

I’m happy to announce that the book is very nearly completed. In fact, you can pre-order a copy now, either directly from Cambridge University Press, or from Amazon (currently with a 10% discount). If all goes well, the book will become physically available at the end of May. Fairly soon thereafter I will also make a pdf “final draft” version freely available here at the website. In the meantime, I will continue to serialize the final chapter (Chapter 11) as usual over the next month. In fact, I had to trim and glue together the planned Chapters 11 and 12. Also, as mentioned earlier, I dropped a chapter on Additive Combinatorics that would have gone between Chapters 8 and 9, although its “highlight”, Sanders’s Theorem, is here on the blog.
Oh well, I had to wrap this project up at some point; at least you’ll have an idea of what will be in the 2nd Edition.

## Chapter 11: Gaussian space and Invariance Principles

The final destination of this chapter is a proof of the following theorem due to Mossel, O’Donnell, and Oleszkiewicz [MOO05a,MOO10], first mentioned in Chapter 5.25:

Majority Is Stablest Theorem Fix $\rho \in (0,1)$. Let $f : \{-1,1\}^n \to [-1,1]$ have $\mathop{\bf E}[f] = 0$. Then, assuming $\mathbf{MaxInf}[f] \leq \epsilon$, or more generally that $f$ has no $(\epsilon,\epsilon)$-notable coordinates, $\mathbf{Stab}_\rho[f] \leq 1 - \tfrac{2}{\pi} \arccos \rho + o_\epsilon(1).$

## Chapter 10 notes

## Chapter 10 exercises

## §10.5: Highlight: General sharp threshold theorems

In Chapter 8.4 we described the problem of “threshold phenomena” for monotone functions $f : \{-1,1\}^n \to \{-1,1\}$.
Continue reading §10.5: Highlight: General sharp threshold theorems