Simons Symposium 2014 — Day 2

The second day began with Tom Sanders speaking about the Bourgain-Green sumset problem in additive combinatorics, including some of his own work on the problem.

Tom started his talk with the celebrated theorem of Green and Tao establishing the existence of arbitrarily long arithmetic progressions (APs) in the primes. At the heart of Green and Tao’s proof is Szemerédi’s theorem, which states the following: Let $A \subseteq \{1,\ldots,N\}$ be a subset of density $\alpha > 0$. Then either $A$ contains a $k$-term AP or $\alpha = o_k(1)$ — in words, every subset of the integers of sufficiently large density contains a long AP. (Pinning down the exact rate at which the term $o_k(1)$ tends to zero is an important open problem.)  We may consider analogues of the Green-Tao and Szemerédi theorems in the Boolean hypercube $\{-1,1\}^n$, or rather $\mathbb{F}_2^n$ to be more precise, where APs “become” affine subspaces (throughout we will associate $|\mathbb{F}_2^n| = 2^n$ with $N$ in this correspondence).  As it turns out many arguments in additive combinatorics that apply in $[N]$ are modeled slightly more cleanly in $\mathbb{F}_2^n$; Tom mentions that this may be more a statement about the methods than the questions, since $\mathbb{F}_2^n$ naturally constitutes a group whereas this is not exactly true for $[N]$. Anyway, with this correspondence in mind we may wonder about an analogue of Szemerédi’s theorem in $\mathbb{F}_2^n$, and indeed, we have the following theorem: Let $A\subseteq \mathbb{F}_2^n$ be a subset of density $\alpha$. Then $A$ contains a subspace $V$ of dimension at least $(\log N)/\log(1/\alpha)$. The proof proceeds by repeated applications of Cauchy-Schwarz.

And now as Tom promised in the title for his talk, we next consider sumsets in $\mathbb{F}_2^n$; intuitively, taking the sumset of $A$ with itself should enrich the additive structure of $A$. (For any two subsets $A,B\subseteq \mathbb{F}_2^n$, the sumset of $A$ and $B$, denoted $A+B$, is simply the set $\{a + b \colon a \in A, b\in B\}$.)  Consider a subset $A\subseteq \mathbb{F}_2^n$ of density $\alpha$. By “Szemerédi’s theorem for $\mathbb{F}_2^n$” above, we know that $A$ contains a large subspace, one of dimension at least $(\log N)/(\log(1/\alpha))$. The sumset problem of Bourgain and Green asks, “How about the largest subspace in $A+A$?”  It is not hard to see (via the inclusion-exclusion principle) that if $\alpha > 1/2$ then $A+A = \mathbb{F}_2^n$, or to phrase it in a somewhat non-standard manner, $A+A$ contains a subspace of codimension $0$. Certainly this is tight, since if $\alpha = 1/2$ then $A$ could itself be a subspace of codimension $1$, in which case $A+A = A$.

In the remainder of his talk Tom surveyed some of the known results for $\alpha < 1/2$, beginning with the following theorem of his: Suppose $\alpha > 1/2-1/(1000\sqrt{n})$. Then $A+A$ contains a subspace of codimension $1$. And this is tight by a result of Green and Ruzsa: for all $\epsilon > 0$ there exists $A = A(\epsilon) \subseteq \mathbb{F}_2^n$ of density $\alpha > 1/2-\epsilon$ such that if $V \subseteq A + A$ is a subspace, then the codimension of $V$ is $\Omega(\epsilon\sqrt{n})$. (As you may have guessed the set $A$ achieving this bound is one of the usual suspects in the analysis of Boolean functions: the Hamming ball.) And finally, complementing this last result of Green and Ruzsa we have the following theorem due to Tom: Suppose $\alpha > 1/2-\epsilon$. Then $A+A$ contains a subspace of of codimension $O(n/\log(1/\epsilon))$.  The proof of this proceeds via a density-increment-type argument reminiscent of Meshulam’s Fourier-analytic proof of an upper bound on the size of $3$-AP-free subsets of $\mathbb{F}_3^n$ (i.e. the “cap set problem”).

Tom concluded with two open problems, the first of which appears in his paper: Let $\alpha > 1/2-c/\sqrt{n}$, and show that $A+A$ contains a subspace of codimension $O(c^2)$ (or any bound that depends only on $c$). The second is a coloring-type problem: Suppose we partition $\mathbb{F}_2^n$ into $k$ color classes $A_1,\ldots,A_k$. Show that there exists an index $i$ such that $A_i + A_i$ contains a subspace of codimension $O_k(1)$.  The special case when $k=3$ is open, as is the relaxation “there exist indices $i$ and $j$ such that $A_i + A_j$ contains a subspace of codimension $O_k(1)$”.

For more details, see the “Subspaces in sumsets” open problem from 2012 Symposium’s open problem list and references therein.

Next up was Thomas Vidick, who spoke about joint work with Assaf Naor and Oded Regev about non-commutative extensions of Grothendieck’s inequality. Given a matrix $A \in \mathbb{R}^{n\times n}$, consider the optimization problem of determining the quantity $\mathrm{opt}(A)$ where

\[ \mathrm{opt}(A) := \max_{x_i,y_j \in \{\pm 1\}} \sum_{i,j=1}^n A_{ij} x_i y_j. \]

Unsurprisingly computing $\mathrm{opt}(A)$ is NP-hard in general (it can encode $\mathsf{Max Cut}$, for example), and so we consider the relaxation

\[ \mathrm{opt}(A) \le \mathop{\max_{u_i,v_j \in \mathbb{R}^n}}_{\|u_i\|=\|v_j\|=1} \sum_{i,j=1}^n A_{ij} u_i \cdot v_j, \]

which we may then solve efficiently via semidefinite programming.  The natural question is, how good of an approximation is this to $\mathrm{opt}(A)$? Grothendieck’s inequality provides an answer, showing that there exists a universal constant $K_G \le 1.78$ (the Grothendieck constant) such that the maximum on the right-hand size is at most $K_G \mathrm{opt}(A)$. (The precise value of $K_G$ is yet to be determined: Krivine in 1979 proved an upper bound of $K_G \le \pi/(2\ln(1+\sqrt{2})) = 1.7822\ldots$ and conjectured that this is in fact the right answer; Krivine’s conjecture was recently disproved by Braverman, Makarychev, Makarychev, and Naor.) Grothendieck’s inequality, fundamental in the study of tensor norms, was imported into TCS by the work of Alon and Naor, who gave an algorithmic proof of Grothendieck’s inequality, at the heart of which is a rounding scheme that turns vector solutions $u_i,v_j\in \mathbb{R}^n$ to the SDP into signs $x_i, y_j \in \{-1,1\}$ that come close to attaining $\mathrm{opt}(A)$.

Grothendieck’s inequality has the following non-commutative generalization due to Pisier and Haagerup: there exists a universal constant $K_H \le 2\sqrt{2}$ such that for all $M \in \mathbb{R}^{n\times n\times n\times n}$,

\[ \mathrm{opt}(M) := \mathop{\max_{X,Y\in \mathbb{R}^{n\times n}}}_{XX^T = YY^T = \mathrm{Id}} \sum_{i,j,k,\ell} M_{ijk\ell} X_{ij} Y_{k\ell} \le  \mathop{\max_{\vec{U}\vec{U^T} = \vec{U^T} \vec{U} = \mathrm{Id}}}_{\vec{V}\vec{V^T} = \vec{V^T} \vec{V} = \mathrm{Id}} \sum_{i,j,k,\ell} M_{ijk\ell} \vec{U}_{ij}\cdot \vec{V}_{k\ell} \le K_H\mathrm{opt}(M) \]

where the second maximum is taken over all $n\times n$ vector-valued matrices $\vec{U}$ and $\vec{V}$ with entries in $\mathbb{R}^n$, with multiplication defined to be $(\vec{U}\vec{V})_{ij} = \sum_k \vec{U}_{ik}\cdot \vec{V}_{kj}$, i.e. just like regular matrix multiplication, except that we take the inner product of the corresponding vector-valued matrix entries instead of multiplying real-valued entries. A few remarks. First, we note that if $M$ is diagonal in the sense that there is some $A\in \mathbb{R}^{n\times n}$ such that $M_{ijk\ell} = A_{ik}$ if $i=j$ and $k=\ell$, and $0$ otherwise, then $\mathrm{opt}(M) = \mathrm{opt}(A)$, so this is indeed a generalization of Grothendiek’s inequality. Second, it seems natural to wonder if the constraint that $\vec{U^T}\vec{U} = \mathrm{Id}$ is actually necessary given that we are already enforcing that $\vec{U}\vec{U^T}=\mathrm{Id}$ (and likewise for $V$). These two additional constraints are in fact necessary; without them this second maximum would not be within a universal constant of $\mathrm{opt}(M)$.  Thomas’s main result in his work with Oded and Assaf Naor is an analogue of Alon and Naor’s algorithmic proof of Grothendieck’s inequality in this non-commutative setting, giving an efficient scheme for rounding vector-valued orthogonal matrices into regular orthogonal matrices that preserves the objective value.  In slightly more detail, given vector-valued matrices that have objective value within $(1-\epsilon)$ of the SDP optimum, they give an algorithm that runs in time polynomial in $n$ and $1/\epsilon$ and finds two orthogonal matrices $X, Y \in \mathbb{R}^{n\times n}$ with objective value

\[ \sum_{i,j,k,\ell} M_{ijk\ell} X_{ij} Y_{k\ell} \ge \Big(\frac1{2\sqrt{2}}-2\epsilon\Big) \mathrm{opt}(M). \]

For more details, check out Thomas’s blog post on this topic. Also look for Oded’s talk on the same topic in the list of videos from the 2012 edition, and this video of Assaf Naor for the two-hour version. There’s also this 87-page survey devoted to Grothendieck’s inequality.

Next up was Oded Regev, who spoke about ongoing work with Shachar Lovett and Raghu Meka on discrepancy, a beautiful area which has seen much exciting progress in recent years. Oded’s talk was based on a series of three superb blog posts on the topic by Raghu, so I will only mention here a “vector generalization” of Spencer’s Six Standard Deviations Suffice theorem which does not appear in those posts: For all $n\times n$ matrices $A$ of unit vectors, there exists $\epsilon_1,\ldots,\epsilon_n\in \{-1,1\}$ such that $\|\sum_{j=1}^n\epsilon_j A_{ij} \|_2 \le O(\sqrt{n})$ for all $i\in [n]$. (Oded attributed this generalization to Krzysztof Oleszkiewicz.)

Next I (Li-Yang) spoke about recent work with Rocco Servedio in property testing, giving a polynomial lower bound on the query complexity of testers for the monotonicity of Boolean functions. In this problem we are given as input a distance parameter $\epsilon > 0$ and black-box query access to an unknown Boolean function $f$, and using as few queries as possible, we would like to determine with high confidence whether $f$ is monotone or $\epsilon$-far from monotone. This problem was introduced in a 1998 paper of Goldreich et al., who proposed a non-adaptive tester and proved an $O_\epsilon(n^2)$ upper bound on its query complexity, subsequently improved to $O(n/\epsilon)$ in the journal version. Fischer et al. established the first lower bounds shortly after, showing the existence of a constant distance parameter $\epsilon_0 > 0$ such that $\Omega(\log n)$ queries are necessary for any non-adaptive tester. (This directly implies an $\Omega(\log\log n)$ lower bound for adaptive testers.) These upper and lower bounds were the best known for more than a decade, until a recent beautiful paper of Chakrabarty and Sheshadhri improved on the linear upper bound of Goldreich et al. with an $\tilde{O}_\epsilon(n^{7/8})$-query non-adaptive tester.

Our main result is an improvement of the lower bounds of Fischer et al.: we prove the existence of a universal constant $\epsilon_0 > 0$ such that any non-adaptive algorithm for testing whether an unknown $f$ is monotone versus $\epsilon_0$-far from monotone requires $\tilde{\Omega}(n^{1/5})$ many queries.  We build on and extend techniques introduced by Blais and O’Donnell, applying multidimensional central limit theorems (CLTs) to reason about testers for certain properties; the main technical ingredient in our case is a recent multidimensional CLT of Valiant and Valiant for Wasserstein distance.

The last scheduled talk was by Ehud Friedgut, who spoke about a series of works with David Ellis, Yuval Filmus, and Haran Pilpel, codenamed [EFP], [EFF1], [EFF2], and [EFF3]. The general theme of these works is solving extremal problems in the symmetric group $S_n$ by Fourier-analytic (rather than combinatorial) means, which often has the benefit that one can upgrade them to “stability results”. Ehud gave a brief review of Fourier analysis over $S_n$. For those who haven’t seen it before, the quick-and-dirtiest explanation is that the Fourier coefficients $\widehat{f}(\lambda)$ of a function $f : S_n \to \mathbb{C}$ are matrices, indexed by partitions $\lambda$ of $n$; i.e., sequences of positive integers $\lambda_1 \geq \lambda_2 \geq \cdots$ which add up to $n$. As in the more familiar case of functions $f : \mathbb{F}_2^n \to \mathbb{R}$, the partitions are stratified into “levels”; the “$k$th level” corresponds to those partitions with $\lambda_1 \geq n – k$. This also coincides with the functions which are linear combinations of “$k$-cosets”; i.e., functions of the form $1[\sigma(i_1) = j_1, \dots, \sigma(i_k) = j_k]$ for some $i_1, \dots, i_k, j_1, \dots, j_k \in [n]$.

Friedgut and coauthors used Fourier-analytic methods to prove resolve several longstanding extremal combinatorics problems; e.g., the 34-year-old Deza-Frankl Conjecture, which states that if $A$ is a “$t$-intersecting” subset $S_n$ (meaning any two elements in $A$ agree about the image of at least $t$ points in $[n]$) then $|A| \leq (n-t)!$, with equality iff $A$ is a $t$-coset. Furthermore, by virtue of using spectral rather than combinatorial techniques, they were able to provide associated “stability” results. These often came by generalizing classic results in the Boolean hypercube setting (the Nisan-Szegedy Theorem, the FKN Theorem, the Kindler-Safra Theorem, …) to the $S_n$ setting. As just one example, [EFF2] showed the analogue of the FKN Theorem: if $f : S_n \to \{0,1\}$ is a balanced function with all but $\epsilon$ of its Fourier mass on “level $0$ and level $1$” then $f$ must be $\epsilon^{\Omega(1)}$-close to a union of disjoint $1$-cosets.

Finally, some more photos…

2014-03-11 09.42.04

2014-03-11 10.08.46

2014-03-11 18.19.14

5 comments to Simons Symposium 2014 — Day 2

  • kodlu

    Many thanks for blogging from the conference.

    At the end of paragraph one is the statement: Let A be a subset of $F_2^n$ of density α. Then A contains a subspace V of dimension at least (logN)/log(1/α). The proof proceeds by repeated applications of Cauchy-Schwarz.

    Is V a genuine subspace, or can it be a translate of one? i.e., can we have V = a + W where a is a nonzero vector in $F_2^n$ and where W is a genuine subspace (containing 0)?

  • In one of the equations in the middle (after the paragraph starting with “Grothendieck’s inequality has the following non-commutative generalization [..]“), shouldn’t the $A_{ijk\ell}$’s be $M_{ijk\ell}$’s?

Leave a Reply




You can use most standard LaTeX, as well as these HTML tags

<a href="" title=""> <abbr title=""> <acronym title=""> <b> <blockquote cite=""> <cite> <code> <del datetime=""> <em> <i> <q cite=""> <strike> <strong>