Additive Combinatorics and the Polynomial Freiman–Rusza Conjecture

I originally planned for Chapter 9 of the book to be about additive combinatorics. However in the interests of completing the book before I die of old age, I am planning to make some cuts. It now looks like Chapter 9 (being a rather standalone topic in the context of the book) will hit the cutting room floor. Drafts of some parts of it are already written, and I decided to publish here a “deleted scene” — the Quasipolynomial Freiman–Ruzsa Theorem highlight. It seemed timely to me in light of several other very nice surveys/expositions/extensions that have appeared recently. Shachar Lovett has posted one survey here, and Eli Ben-Sasson, Noga Ron-Zewi, Madhur Tulsiani, and Julia Wolf have just posted a paper on the topic. Also, Sanders himself has a 1.5-page proof of his result in ${\mathbb F}_2^n$ in a manuscript called On the Bogolyubov–Ruzsa Lemma in ${\mathbb F}_2^n$; I don’t know if it’s appeared anywhere, but I’m sure he’ll send you a copy if you ask nicely :)

Below is the draft of what was to appear in the book on this topic. I would like to emphasize that its interpretation/proof of the Croot–Sisask result is based on joint discussions with Eric Blais and Shachar Lovett.


In this section we give Sanders’s proof [San10,San10b] of the following “almost-polynomial Bogolyubov conjecture”:

Sanders’s Theorem Let $A \subseteq {\mathbb F}_2^n$ have volume $0 < \alpha \leq 1/2$. Then $A+A$ contains $99$% of an affine subspace of codimension $O(\log^{4}(1/\alpha))$.

The proof of Sanders’s Theorem has two key components. One is a Fourier component — Chang’s Lemma, a slight generalization of the Level-$1$ Inequality. The other is a purely probabilistic result due to Croot and Sisask [CS10]. Let’s first give Chang’s Lemma:

Chang’s Lemma Let $A \subseteq {\mathbb F}_2^n$ be a set of volume $\alpha$. Let $\Gamma = \{\gamma \in {\widehat{{\mathbb F}_2^n}} : |\widehat{\varphi_A}(\gamma)| \geq \epsilon\}$. Then $\dim(\mathrm{span}(\Gamma)) \leq 2\ln(1/\alpha)/\epsilon^2$.

Proof: Let $\Gamma’$ be a maximal linear independent subset of $\Gamma$. By applying a suitable invertible linear transformation to the domain ${\mathbb F}_2^n$ we may assume that $\Gamma’ = \{e_1, e_2, \dots, e_d\}$. Note that $d = \dim(\mathrm{span}(\Gamma))$. Now the Level-$1$ Inequality implies that \begin{multline*} 2\alpha^2\ln(1/\alpha) \geq \mathbf{W}^{1}[1_A] = \alpha^2 \mathbf{W}^{1}[\varphi_A] \\ \Rightarrow\quad 2\ln(1/\alpha) \geq \mathbf{W}^{1}[\varphi_A] \geq \widehat{\varphi_A}(e_1)^2 + \cdots + \widehat{\varphi_A}(e_d)^2 \geq d\epsilon^2; \end{multline*} hence $d \leq 2\ln(1/\alpha)/\epsilon^2$ as claimed. $\Box$

The second key ingredient, due to Croot and Sisask [CS10], is concerned with the “smoothing” effect of convolution by $\varphi_A$ whenever $A$ is a “large” set. Roughly speaking the result says that for a fixed function $f$, the function $\varphi_A * f$ is sufficiently smoothed that it must be roughly the same as $\varphi_{z+A} * f$ for many translates $z$. Here we present a variant of the Croot–Sisask result:

Theorem 1 Fix $0 < \alpha, \epsilon, \delta \leq 1/2$. Let $\mathcal{F}$ be a collection of functions $f : {\mathbb F}_2^n \to [-1,1]$. Let $\Phi$ be a collection of probability density functions on ${\mathbb F}_2^n$, where each $\varphi \in \Phi$ satisfies $\|\varphi\|_\infty \leq 1/\alpha$. Then there is a subcollection $\Phi’ \subseteq \Phi$ with $|\Phi’|/|\Phi| \geq \alpha^{O(\log(1/\delta)/\epsilon^2)}$ such that for all $\varphi, \psi \in \Phi’$, \[ | \mathop{\bf E}_{\boldsymbol{y} \sim \varphi}[f(\boldsymbol{y})] – \mathop{\bf E}_{\boldsymbol{y} \sim \psi}[f(\boldsymbol{y})]| \leq \epsilon \] holds for at least a $1-\delta$ fraction of the $f \in \mathcal{F}$.


Proof: We give the proof assuming that the collection $\Phi$ consists of uniform densities $\varphi_B$ over sets $B$ of volume at least $\alpha$. The extension to general densities of “high min-entropy” is left for the exercises.

First, suppose we fix some $\varphi_B \in \Phi$ and some $f \in \mathcal{F}$. Now imagine we choose $m$ uniformly random elements from $B$ and let $\boldsymbol{S}$ be the (multi-)set thus formed. A simple Chernoff bound tells us that the average of $f$ over $\boldsymbol{S}$ is likely to be close to the average of $f$ over $B$. More precisely, for a certain $m = O(\log(1/\delta)/\epsilon^2)$ we have that with probability at least $1-\delta$ over the choice $\boldsymbol{S} = S$, \[ |\mathop{\bf E}_{\boldsymbol{y} \sim S}[f(\boldsymbol{y})] – \mathop{\bf E}_{\boldsymbol{y} \sim B}[f(\boldsymbol{y})]| \leq \epsilon. \] Let’s say that $S$ is a “good $B$-sampler” for $f$ when this event occurs. Our observations show that when $\boldsymbol{S}$ is randomly chosen, the expected fraction of $f \in \mathcal{F}$ for which it is a bad $B$-sampler is at most $\delta$. Thus by Markov’s inequality, \[ \mathop{\bf Pr}_{\substack{\boldsymbol{S} \text{ a random} \\ \text{$m$-multiset of $B$}}}[\boldsymbol{S} \text{ is a bad $B$-sampler for more than a $2\delta$ fraction of $f \in \mathcal{F}$}] \leq \frac12. \] Let’s say that $S$ is a “good $B$-sampler for $\mathcal{F}$” when the event above does not occur. Thus \[ \mathop{\bf Pr}_{\substack{\boldsymbol{S} \text{ a random} \\ \text{$m$-multiset of $B$}}}[\boldsymbol{S} \text{ is a good $B$-sampler for $\mathcal{F}$}] \geq \frac12. \]

Next, let $\boldsymbol{T}$ be a random (multi-)set formed by choosing $m$ uniformly random elements of ${\mathbb F}_2^n$. Consider the event $E_B$ that $\boldsymbol{T} \subseteq B$. Since $B$ has volume at least $\alpha$ we have $\mathop{\bf Pr}[E_B] \geq \alpha^m$. Further, conditioned on $E_B$ the distribution of $\boldsymbol{T}$ is precisely that of the random multiset $\boldsymbol{S}$ we just analyzed. Thus we may conclude that for each fixed $\varphi_B \in \Phi$, \[ \mathop{\bf Pr}_{\boldsymbol{T}}[\boldsymbol{T} \text{ is a good $B$-sampler for $\mathcal{F}$}] \geq \frac12 \alpha^m. \] It follows immediately that there must exist some $T$ which is a good $B$-sampler for $\mathcal{F}$ for at least a $\frac12 \alpha^m = \alpha^{O(\log(1/\delta)/\epsilon^2)}$ fraction of $\varphi_B \in \Phi$. These $\varphi_B$’s will constitute our subcollection $\Phi’$. What we have just shown is that for any $\varphi \in \Phi’$, \[ | \mathop{\bf E}_{\boldsymbol{y} \sim T}[f(\boldsymbol{y})] – \mathop{\bf E}_{\boldsymbol{y} \sim \varphi}[f(\boldsymbol{y})]| \leq \epsilon \quad \text{for all but at most a $2\delta$ fraction of $f \in \mathcal{F}$}. \] Using the triangle inequality we may conclude that for any $\varphi, \psi \in \Phi’$, \[ | \mathop{\bf E}_{\boldsymbol{y} \sim \varphi}[f(\boldsymbol{y})] – \mathop{\bf E}_{\boldsymbol{y} \sim \psi}[f(\boldsymbol{y})]| \leq 2\epsilon \quad \text{for all but at most a $4\delta$ fraction of $f \in \mathcal{F}$}. \] This completes the proof (after adjusting the constants on $\epsilon$ and $\delta$). $\Box$

An almost immediate corollary is that if $A$ is a large set then there is a somewhat large set $Z$ such that $\varphi_{z+A} * f$ is “close” to $\varphi_{A} * f$ for all $z \in Z$. Let’s first define the appropriate notion of closeness, then give the corollary.

Definition 2 Let $f, g : {\mathbb F}_2^n \to {\mathbb R}$. We write $\displaystyle f \mathop{\approx}^{\epsilon,\delta} g$ to denote that $|f(x) – g(x)| \leq \epsilon$ for all but at most a $\delta$ fraction of $x \in {\mathbb F}_2^n$.

Corollary 3 Fix $0 < \epsilon, \delta \leq 1/2$. Let $f : {\mathbb F}_2^n \to [-1,1]$ and let $A \subseteq {\mathbb F}_2^n$ have volume $0 < \alpha \leq 1/2$. Then there exists a set $Z \subseteq {\mathbb F}_2^n$ of volume at least $\alpha^{O(\log(1/\delta)/\epsilon^2)}$ such that \[ \varphi_{z+A} * f \ \ \mathop{\approx}^{\epsilon, \delta} \ \ \varphi_{A} * f \] for all $z \in Z$.


Proof: Apply Theorem 1 with $\Phi = \{\varphi_{z+A} : z \in {\mathbb F}_2^n\}$ and $\mathcal{F} = \{f^{+t} : t \in {\mathbb F}_2^n\}$. The resulting subcollection $\Phi’$ is naturally associated with a set of translates $Z’$ of the claimed volume, and the theorem tells us that for all $z, z’ \in Z’$, \[ | \langle \varphi_{z+A}, f^{+t} \rangle - \langle \varphi_{z'+A}, f^{+t} \rangle| \leq \epsilon \] holds for all but at most a $\delta$ fraction of $t \in {\mathbb F}_2^n$. But $\langle \varphi, f^{+t} \rangle$ is simply $\varphi * f(t)$ and hence we have \[ \varphi_{z+A} * f \ \ \mathop{\approx}^{\epsilon, \delta} \ \ \varphi_{z' + A} * f \] for all $z, z’ \in Z’$. Finally, fix any $z_0 \in Z’$. We know for any $z \in Z’$ that \begin{align*} \varphi_{z+A} * f \ \ \mathop{\approx}^{\epsilon, \delta} \ \ \varphi_{z_0 + A} * f \quad&\Rightarrow\quad (\varphi_{z+A} * f)^{+z_0} \ \ \mathop{\approx}^{\epsilon, \delta} \ \ (\varphi_{z_0 + A} * f)^{+z_0} \\ &\Rightarrow\quad \varphi_{z_0 + z+A} * f \ \ \mathop{\approx}^{\epsilon, \delta} \ \ \varphi_{A} * f, \end{align*} the first implication holding because the relation $\displaystyle f \mathop{\approx}^{\epsilon,\delta} g$ is clearly preserved under translation. Taking $Z = z_0 + Z’$ completes the proof. $\Box$

By a simple “triangle inequality” argument (see the exercises), we can further deduce:

Corollary 4 Let $k \in {\mathbb N}^+$. In the setting of Corollary 3, for any $w \in kZ$, \[ \varphi_{w+A} * f \ \ \mathop{\approx}^{k\epsilon, k\delta} \ \ \varphi_{A} * f. \]

This is (essentially) the result of Croot and Sisask that Sanders uses in his proof. Let’s now give that proof.


Proof: Given $A \subseteq {\mathbb F}_2^n$ of volume at least $\alpha$, let $Z$ be as in Corollaries 3, 4 (we will specify $\epsilon$, $\delta$, and $k$ later). Thus for all $w \in kZ$, \[ \varphi_{w+A} * f \ \ \mathop{\approx}^{k\epsilon, k\delta} \ \ \varphi_{A} * f. \] The first observation in the proof is that if we convolve these functions with another probability density $\varphi$ having small $\|\varphi\|_\infty$, we can dispense the “$\delta$” part of the closeness. Specifically (see the exercises) we get \[ |\langle \varphi_A, \varphi_{w+A} * f \rangle - \langle \varphi_A, \varphi_{A} * f \rangle| \leq k\epsilon + 2k\delta\|\varphi_A\|_\infty \leq k(\epsilon + 2\delta/\alpha) \] for all $w \in kZ$. Now we select $\epsilon$ and $\delta$ in terms of $k$: we’ll take $\delta = \epsilon \alpha$ and $\epsilon = .005/(3k)$. Then \[ |\langle \varphi_A, \varphi_{w+A} * f \rangle - \langle \varphi_A, \varphi_{A} * f \rangle| \leq .005 \ \ \text{for all $w \in kZ$,} \] where $Z$ has volume at least $\alpha^{O(k^2 \log(k/\alpha))}$. Next, we select $f = 1_{A+A}$. Obviously, \[ \langle \varphi_A, \varphi_{A} * 1_{A+A} \rangle = \langle \varphi_A * \varphi_A, 1_{A+A} \rangle = \mathop{\bf E}_{\substack{\boldsymbol{a} \sim A \\ \boldsymbol{a}' \sim A}}[1_{A+A}(\boldsymbol{a}+\boldsymbol{a}')] = 1; \] hence we deduce \[ \langle \varphi_A, \varphi_{w+A} * 1_{A+A} \rangle = \langle \varphi_{w+A} * \varphi_A, 1_{A+A} \rangle \geq .995 \ \ \text{for all $w \in kZ$}. \] In particular, the above holds (in expectation) if $\boldsymbol{w}$ is the sum of $k$ independently chosen uniformly random elements of $Z$. Thus \begin{equation} \label{eqn:sanders-key} \langle \varphi_{Z}^{* k} * \varphi_{A} * \varphi_A, 1_{A+A} \rangle \geq .995. \end{equation} We now would like to replace $Z$ with a subspace of ${\mathbb F}_2^n$ of roughly the same cardinality. To that end, define $\Gamma = \{\gamma \in {\widehat{{\mathbb F}_2^n}} : |\widehat{\varphi_Z}(\gamma)| \geq 1/2\}$ and define $H = \mathrm{span}(\Gamma)$. By Chang’s Lemma we have $\dim(H) \leq O(k^2 \log(k/\alpha) \log(1/\alpha))$. We will eventually choose $k = \Theta(\log(1/\alpha))$, yielding $\dim(H) \leq O(\log^4(1/\alpha))$. To complete the proof we will show that $A+A$ contains $99$% of the points in a translate of $H^\perp$, which indeed has codimension $O(\log^4(1/\alpha))$. We claim this follows from \begin{equation} \label{eqn:sanders-goal} |\langle \varphi_{Z}^{* k} * \varphi_{A} * \varphi_A, 1_{A+A} \rangle – \langle \varphi_{H^\perp} * \varphi_{Z}^{* k} * \varphi_{A} * \varphi_A, 1_{A+A} \rangle| \leq .005. \end{equation} To see this claim, combine \eqref{eqn:sanders-goal} with \eqref{eqn:sanders-key} obtaining \begin{align*} .99 &\leq \langle \varphi_{H^\perp} * \varphi_{Z}^{* k} * \varphi_{A} * \varphi_A, 1_{A+A} \rangle \\ &= \mathop{\bf E}_{\boldsymbol{h} \sim H^\perp} \ \ \mathop{\bf E}_{\boldsymbol{z}^{(1)}, \dots, \boldsymbol{z}^{(k)}\sim Z} \ \ \mathop{\bf E}_{\boldsymbol{a} \sim A}\ \ \mathop{\bf E}_{\boldsymbol{a}’ \sim A}\ \ [1_{A+A}(\boldsymbol{h} + \boldsymbol{z}^{(1)} + \cdots + \boldsymbol{z}^{(k)} + \boldsymbol{a} + \boldsymbol{a}')]. \end{align*} It follows that there exists a fixed outcome $y$ for $\boldsymbol{z}^{(1)} + \cdots + \boldsymbol{z}^{(k)} + \boldsymbol{a} + \boldsymbol{a}’$ such that $\mathop{\bf E}_{\boldsymbol{h} \sim H^\perp}[1_{A+A}(\boldsymbol{h} + y)] = \mathop{\bf Pr}_{\boldsymbol{h} \sim H^\perp}[\boldsymbol{h} + y \in A+A] \geq .99$. I.e., $A+A$ contains $99$% of the elements of $y + H^\perp$.

It therefore remains to verify \eqref{eqn:sanders-goal}, assuming $k = \Theta(\log(1/\alpha))$. Applying the Fourier formula for convolution, the left-hand side of \eqref{eqn:sanders-goal} is \begin{align*} &\phantom{=} \Bigl| \sum_{\gamma \in {\widehat{{\mathbb F}_2^n}}} \widehat{\varphi_Z}(\gamma)^k \widehat{\varphi_A}(\gamma)^2 \widehat{1_{A+A}}(\gamma) – \sum_{\gamma \in {\widehat{{\mathbb F}_2^n}}} \widehat{\varphi_{H^\perp}}(\gamma) \widehat{\varphi_Z}(\gamma)^k \widehat{\varphi_A}(\gamma)^2 \widehat{1_{A+A}}(\gamma)\Bigr| \\ &= \Bigl| \sum_{\gamma \not \in H} \widehat{\varphi_Z}(\gamma)^k \widehat{\varphi_A}(\gamma)^2 \widehat{1_{A+A}}(\gamma)\Bigr| \tag*{(by Proposition 3.11)} \\ &\leq \sum_{\gamma \not \in H} |\widehat{\varphi_Z}(\gamma)|^k \widehat{\varphi_A}(\gamma)^2 |\widehat{1_{A+A}}(\gamma)| \\ &< \sum_{\gamma \not \in H} (1/2)^k \widehat{\varphi_A}(\gamma)^2 \tag*{(by definition of $H$ and $|\widehat{1_{A+A}}(\gamma)| \leq 1$)} \\ &\leq .005 \alpha \cdot \sum_{\gamma \not \in H} \widehat{\varphi_A}(\gamma)^2 \tag*{(by taking $k = \Theta(\log(1/\alpha))$ )} \\ &\leq .005 \alpha \cdot \|\varphi_A\|_2^2 = .005. \end{align*} This completes the proof of \eqref{eqn:sanders-goal} and hence Sanders’s Theorem. $\Box$

11 comments to Additive Combinatorics and the Polynomial Freiman–Rusza Conjecture

Leave a Reply

  

  

  

You can use most standard LaTeX, as well as these HTML tags

<a href="" title=""> <abbr title=""> <acronym title=""> <b> <blockquote cite=""> <cite> <code> <del datetime=""> <em> <i> <q cite=""> <strike> <strong>