# Additive Combinatorics and the Polynomial Freiman–Rusza Conjecture

I originally planned for Chapter 9 of the book to be about additive combinatorics. However in the interests of completing the book before I die of old age, I am planning to make some cuts. It now looks like Chapter 9 (being a rather standalone topic in the context of the book) will hit the cutting room floor. Drafts of some parts of it are already written, and I decided to publish here a “deleted scene” — the Quasipolynomial Freiman–Ruzsa Theorem highlight. It seemed timely to me in light of several other very nice surveys/expositions/extensions that have appeared recently. Shachar Lovett has posted one survey here, and Eli Ben-Sasson, Noga Ron-Zewi, Madhur Tulsiani, and Julia Wolf have just posted a paper on the topic. Also, Sanders himself has a 1.5-page proof of his result in ${\mathbb F}_2^n$ in a manuscript called On the Bogolyubov–Ruzsa Lemma in ${\mathbb F}_2^n$; I don’t know if it’s appeared anywhere, but I’m sure he’ll send you a copy if you ask nicely

Below is the draft of what was to appear in the book on this topic. I would like to emphasize that its interpretation/proof of the Croot–Sisask result is based on joint discussions with Eric Blais and Shachar Lovett.

In this section we give Sanders’s proof [San10,San10b] of the following “almost-polynomial Bogolyubov conjecture”:

Sanders’s Theorem Let $A \subseteq {\mathbb F}_2^n$ have volume $0 < \alpha \leq 1/2$. Then $A+A$ contains $99$% of an affine subspace of codimension $O(\log^{4}(1/\alpha))$.

The proof of Sanders’s Theorem has two key components. One is a Fourier component — Chang’s Lemma, a slight generalization of the Level-$1$ Inequality. The other is a purely probabilistic result due to Croot and Sisask [CS10]. Let’s first give Chang’s Lemma:

Chang’s Lemma Let $A \subseteq {\mathbb F}_2^n$ be a set of volume $\alpha$. Let $\Gamma = \{\gamma \in {\widehat{{\mathbb F}_2^n}} : |\widehat{\varphi_A}(\gamma)| \geq \epsilon\}$. Then $\dim(\mathrm{span}(\Gamma)) \leq 2\ln(1/\alpha)/\epsilon^2$.

Proof: Let $\Gamma’$ be a maximal linear independent subset of $\Gamma$. By applying a suitable invertible linear transformation to the domain ${\mathbb F}_2^n$ we may assume that $\Gamma’ = \{e_1, e_2, \dots, e_d\}$. Note that $d = \dim(\mathrm{span}(\Gamma))$. Now the Level-$1$ Inequality implies that \begin{multline*} 2\alpha^2\ln(1/\alpha) \geq \mathbf{W}^{1}[1_A] = \alpha^2 \mathbf{W}^{1}[\varphi_A] \\ \Rightarrow\quad 2\ln(1/\alpha) \geq \mathbf{W}^{1}[\varphi_A] \geq \widehat{\varphi_A}(e_1)^2 + \cdots + \widehat{\varphi_A}(e_d)^2 \geq d\epsilon^2; \end{multline*} hence $d \leq 2\ln(1/\alpha)/\epsilon^2$ as claimed. $\Box$

The second key ingredient, due to Croot and Sisask [CS10], is concerned with the “smoothing” effect of convolution by $\varphi_A$ whenever $A$ is a “large” set. Roughly speaking the result says that for a fixed function $f$, the function $\varphi_A * f$ is sufficiently smoothed that it must be roughly the same as $\varphi_{z+A} * f$ for many translates $z$. Here we present a variant of the Croot–Sisask result:

Theorem 1 Fix $0 < \alpha, \epsilon, \delta \leq 1/2$. Let $\mathcal{F}$ be a collection of functions $f : {\mathbb F}_2^n \to [-1,1]$. Let $\Phi$ be a collection of probability density functions on ${\mathbb F}_2^n$, where each $\varphi \in \Phi$ satisfies $\|\varphi\|_\infty \leq 1/\alpha$. Then there is a subcollection $\Phi’ \subseteq \Phi$ with $|\Phi’|/|\Phi| \geq \alpha^{O(\log(1/\delta)/\epsilon^2)}$ such that for all $\varphi, \psi \in \Phi’$, $| \mathop{\bf E}_{\boldsymbol{y} \sim \varphi}[f(\boldsymbol{y})] – \mathop{\bf E}_{\boldsymbol{y} \sim \psi}[f(\boldsymbol{y})]| \leq \epsilon$ holds for at least a $1-\delta$ fraction of the $f \in \mathcal{F}$.

Proof: We give the proof assuming that the collection $\Phi$ consists of uniform densities $\varphi_B$ over sets $B$ of volume at least $\alpha$. The extension to general densities of “high min-entropy” is left for the exercises.

First, suppose we fix some $\varphi_B \in \Phi$ and some $f \in \mathcal{F}$. Now imagine we choose $m$ uniformly random elements from $B$ and let $\boldsymbol{S}$ be the (multi-)set thus formed. A simple Chernoff bound tells us that the average of $f$ over $\boldsymbol{S}$ is likely to be close to the average of $f$ over $B$. More precisely, for a certain $m = O(\log(1/\delta)/\epsilon^2)$ we have that with probability at least $1-\delta$ over the choice $\boldsymbol{S} = S$, $|\mathop{\bf E}_{\boldsymbol{y} \sim S}[f(\boldsymbol{y})] – \mathop{\bf E}_{\boldsymbol{y} \sim B}[f(\boldsymbol{y})]| \leq \epsilon.$ Let’s say that $S$ is a “good $B$-sampler” for $f$ when this event occurs. Our observations show that when $\boldsymbol{S}$ is randomly chosen, the expected fraction of $f \in \mathcal{F}$ for which it is a bad $B$-sampler is at most $\delta$. Thus by Markov’s inequality, $\mathop{\bf Pr}_{\substack{\boldsymbol{S} \text{ a random} \\ \text{m-multiset of B}}}[\boldsymbol{S} \text{ is a bad B-sampler for more than a 2\delta fraction of f \in \mathcal{F}}] \leq \frac12.$ Let’s say that $S$ is a “good $B$-sampler for $\mathcal{F}$” when the event above does not occur. Thus $\mathop{\bf Pr}_{\substack{\boldsymbol{S} \text{ a random} \\ \text{m-multiset of B}}}[\boldsymbol{S} \text{ is a good B-sampler for \mathcal{F}}] \geq \frac12.$

Next, let $\boldsymbol{T}$ be a random (multi-)set formed by choosing $m$ uniformly random elements of ${\mathbb F}_2^n$. Consider the event $E_B$ that $\boldsymbol{T} \subseteq B$. Since $B$ has volume at least $\alpha$ we have $\mathop{\bf Pr}[E_B] \geq \alpha^m$. Further, conditioned on $E_B$ the distribution of $\boldsymbol{T}$ is precisely that of the random multiset $\boldsymbol{S}$ we just analyzed. Thus we may conclude that for each fixed $\varphi_B \in \Phi$, $\mathop{\bf Pr}_{\boldsymbol{T}}[\boldsymbol{T} \text{ is a good B-sampler for \mathcal{F}}] \geq \frac12 \alpha^m.$ It follows immediately that there must exist some $T$ which is a good $B$-sampler for $\mathcal{F}$ for at least a $\frac12 \alpha^m = \alpha^{O(\log(1/\delta)/\epsilon^2)}$ fraction of $\varphi_B \in \Phi$. These $\varphi_B$’s will constitute our subcollection $\Phi’$. What we have just shown is that for any $\varphi \in \Phi’$, $| \mathop{\bf E}_{\boldsymbol{y} \sim T}[f(\boldsymbol{y})] – \mathop{\bf E}_{\boldsymbol{y} \sim \varphi}[f(\boldsymbol{y})]| \leq \epsilon \quad \text{for all but at most a 2\delta fraction of f \in \mathcal{F}}.$ Using the triangle inequality we may conclude that for any $\varphi, \psi \in \Phi’$, $| \mathop{\bf E}_{\boldsymbol{y} \sim \varphi}[f(\boldsymbol{y})] – \mathop{\bf E}_{\boldsymbol{y} \sim \psi}[f(\boldsymbol{y})]| \leq 2\epsilon \quad \text{for all but at most a 4\delta fraction of f \in \mathcal{F}}.$ This completes the proof (after adjusting the constants on $\epsilon$ and $\delta$). $\Box$

An almost immediate corollary is that if $A$ is a large set then there is a somewhat large set $Z$ such that $\varphi_{z+A} * f$ is “close” to $\varphi_{A} * f$ for all $z \in Z$. Let’s first define the appropriate notion of closeness, then give the corollary.

Definition 2 Let $f, g : {\mathbb F}_2^n \to {\mathbb R}$. We write $\displaystyle f \mathop{\approx}^{\epsilon,\delta} g$ to denote that $|f(x) – g(x)| \leq \epsilon$ for all but at most a $\delta$ fraction of $x \in {\mathbb F}_2^n$.

Corollary 3 Fix $0 < \epsilon, \delta \leq 1/2$. Let $f : {\mathbb F}_2^n \to [-1,1]$ and let $A \subseteq {\mathbb F}_2^n$ have volume $0 < \alpha \leq 1/2$. Then there exists a set $Z \subseteq {\mathbb F}_2^n$ of volume at least $\alpha^{O(\log(1/\delta)/\epsilon^2)}$ such that $\varphi_{z+A} * f \ \ \mathop{\approx}^{\epsilon, \delta} \ \ \varphi_{A} * f$ for all $z \in Z$.

Proof: Apply Theorem 1 with $\Phi = \{\varphi_{z+A} : z \in {\mathbb F}_2^n\}$ and $\mathcal{F} = \{f^{+t} : t \in {\mathbb F}_2^n\}$. The resulting subcollection $\Phi’$ is naturally associated with a set of translates $Z’$ of the claimed volume, and the theorem tells us that for all $z, z’ \in Z’$, $| \langle \varphi_{z+A}, f^{+t} \rangle - \langle \varphi_{z'+A}, f^{+t} \rangle| \leq \epsilon$ holds for all but at most a $\delta$ fraction of $t \in {\mathbb F}_2^n$. But $\langle \varphi, f^{+t} \rangle$ is simply $\varphi * f(t)$ and hence we have $\varphi_{z+A} * f \ \ \mathop{\approx}^{\epsilon, \delta} \ \ \varphi_{z' + A} * f$ for all $z, z’ \in Z’$. Finally, fix any $z_0 \in Z’$. We know for any $z \in Z’$ that \begin{align*} \varphi_{z+A} * f \ \ \mathop{\approx}^{\epsilon, \delta} \ \ \varphi_{z_0 + A} * f \quad&\Rightarrow\quad (\varphi_{z+A} * f)^{+z_0} \ \ \mathop{\approx}^{\epsilon, \delta} \ \ (\varphi_{z_0 + A} * f)^{+z_0} \\ &\Rightarrow\quad \varphi_{z_0 + z+A} * f \ \ \mathop{\approx}^{\epsilon, \delta} \ \ \varphi_{A} * f, \end{align*} the first implication holding because the relation $\displaystyle f \mathop{\approx}^{\epsilon,\delta} g$ is clearly preserved under translation. Taking $Z = z_0 + Z’$ completes the proof. $\Box$

By a simple “triangle inequality” argument (see the exercises), we can further deduce:

Corollary 4 Let $k \in {\mathbb N}^+$. In the setting of Corollary 3, for any $w \in kZ$, $\varphi_{w+A} * f \ \ \mathop{\approx}^{k\epsilon, k\delta} \ \ \varphi_{A} * f.$

This is (essentially) the result of Croot and Sisask that Sanders uses in his proof. Let’s now give that proof.

Proof: Given $A \subseteq {\mathbb F}_2^n$ of volume at least $\alpha$, let $Z$ be as in Corollaries 3, 4 (we will specify $\epsilon$, $\delta$, and $k$ later). Thus for all $w \in kZ$, $\varphi_{w+A} * f \ \ \mathop{\approx}^{k\epsilon, k\delta} \ \ \varphi_{A} * f.$ The first observation in the proof is that if we convolve these functions with another probability density $\varphi$ having small $\|\varphi\|_\infty$, we can dispense the “$\delta$” part of the closeness. Specifically (see the exercises) we get $|\langle \varphi_A, \varphi_{w+A} * f \rangle - \langle \varphi_A, \varphi_{A} * f \rangle| \leq k\epsilon + 2k\delta\|\varphi_A\|_\infty \leq k(\epsilon + 2\delta/\alpha)$ for all $w \in kZ$. Now we select $\epsilon$ and $\delta$ in terms of $k$: we’ll take $\delta = \epsilon \alpha$ and $\epsilon = .005/(3k)$. Then $|\langle \varphi_A, \varphi_{w+A} * f \rangle - \langle \varphi_A, \varphi_{A} * f \rangle| \leq .005 \ \ \text{for all w \in kZ,}$ where $Z$ has volume at least $\alpha^{O(k^2 \log(k/\alpha))}$. Next, we select $f = 1_{A+A}$. Obviously, $\langle \varphi_A, \varphi_{A} * 1_{A+A} \rangle = \langle \varphi_A * \varphi_A, 1_{A+A} \rangle = \mathop{\bf E}_{\substack{\boldsymbol{a} \sim A \\ \boldsymbol{a}' \sim A}}[1_{A+A}(\boldsymbol{a}+\boldsymbol{a}')] = 1;$ hence we deduce $\langle \varphi_A, \varphi_{w+A} * 1_{A+A} \rangle = \langle \varphi_{w+A} * \varphi_A, 1_{A+A} \rangle \geq .995 \ \ \text{for all w \in kZ}.$ In particular, the above holds (in expectation) if $\boldsymbol{w}$ is the sum of $k$ independently chosen uniformly random elements of $Z$. Thus $$\label{eqn:sanders-key} \langle \varphi_{Z}^{* k} * \varphi_{A} * \varphi_A, 1_{A+A} \rangle \geq .995.$$ We now would like to replace $Z$ with a subspace of ${\mathbb F}_2^n$ of roughly the same cardinality. To that end, define $\Gamma = \{\gamma \in {\widehat{{\mathbb F}_2^n}} : |\widehat{\varphi_Z}(\gamma)| \geq 1/2\}$ and define $H = \mathrm{span}(\Gamma)$. By Chang’s Lemma we have $\dim(H) \leq O(k^2 \log(k/\alpha) \log(1/\alpha))$. We will eventually choose $k = \Theta(\log(1/\alpha))$, yielding $\dim(H) \leq O(\log^4(1/\alpha))$. To complete the proof we will show that $A+A$ contains $99$% of the points in a translate of $H^\perp$, which indeed has codimension $O(\log^4(1/\alpha))$. We claim this follows from $$\label{eqn:sanders-goal} |\langle \varphi_{Z}^{* k} * \varphi_{A} * \varphi_A, 1_{A+A} \rangle – \langle \varphi_{H^\perp} * \varphi_{Z}^{* k} * \varphi_{A} * \varphi_A, 1_{A+A} \rangle| \leq .005.$$ To see this claim, combine \eqref{eqn:sanders-goal} with \eqref{eqn:sanders-key} obtaining \begin{align*} .99 &\leq \langle \varphi_{H^\perp} * \varphi_{Z}^{* k} * \varphi_{A} * \varphi_A, 1_{A+A} \rangle \\ &= \mathop{\bf E}_{\boldsymbol{h} \sim H^\perp} \ \ \mathop{\bf E}_{\boldsymbol{z}^{(1)}, \dots, \boldsymbol{z}^{(k)}\sim Z} \ \ \mathop{\bf E}_{\boldsymbol{a} \sim A}\ \ \mathop{\bf E}_{\boldsymbol{a}’ \sim A}\ \ [1_{A+A}(\boldsymbol{h} + \boldsymbol{z}^{(1)} + \cdots + \boldsymbol{z}^{(k)} + \boldsymbol{a} + \boldsymbol{a}')]. \end{align*} It follows that there exists a fixed outcome $y$ for $\boldsymbol{z}^{(1)} + \cdots + \boldsymbol{z}^{(k)} + \boldsymbol{a} + \boldsymbol{a}’$ such that $\mathop{\bf E}_{\boldsymbol{h} \sim H^\perp}[1_{A+A}(\boldsymbol{h} + y)] = \mathop{\bf Pr}_{\boldsymbol{h} \sim H^\perp}[\boldsymbol{h} + y \in A+A] \geq .99$. I.e., $A+A$ contains $99$% of the elements of $y + H^\perp$.

It therefore remains to verify \eqref{eqn:sanders-goal}, assuming $k = \Theta(\log(1/\alpha))$. Applying the Fourier formula for convolution, the left-hand side of \eqref{eqn:sanders-goal} is \begin{align*} &\phantom{=} \Bigl| \sum_{\gamma \in {\widehat{{\mathbb F}_2^n}}} \widehat{\varphi_Z}(\gamma)^k \widehat{\varphi_A}(\gamma)^2 \widehat{1_{A+A}}(\gamma) – \sum_{\gamma \in {\widehat{{\mathbb F}_2^n}}} \widehat{\varphi_{H^\perp}}(\gamma) \widehat{\varphi_Z}(\gamma)^k \widehat{\varphi_A}(\gamma)^2 \widehat{1_{A+A}}(\gamma)\Bigr| \\ &= \Bigl| \sum_{\gamma \not \in H} \widehat{\varphi_Z}(\gamma)^k \widehat{\varphi_A}(\gamma)^2 \widehat{1_{A+A}}(\gamma)\Bigr| \tag*{(by Proposition 3.11)} \\ &\leq \sum_{\gamma \not \in H} |\widehat{\varphi_Z}(\gamma)|^k \widehat{\varphi_A}(\gamma)^2 |\widehat{1_{A+A}}(\gamma)| \\ &< \sum_{\gamma \not \in H} (1/2)^k \widehat{\varphi_A}(\gamma)^2 \tag*{(by definition of $H$ and $|\widehat{1_{A+A}}(\gamma)| \leq 1$)} \\ &\leq .005 \alpha \cdot \sum_{\gamma \not \in H} \widehat{\varphi_A}(\gamma)^2 \tag*{(by taking $k = \Theta(\log(1/\alpha))$ )} \\ &\leq .005 \alpha \cdot \|\varphi_A\|_2^2 = .005. \end{align*} This completes the proof of \eqref{eqn:sanders-goal} and hence Sanders’s Theorem. $\Box$

• Sankardeep

Both the links are to the same paper i.e to that of Lovett’s.

• Thanks, fixed.

• Sankardeep

very minor one in the 2nd line of first paragraph,”I am planning to makes some cuts”–>” I am planning to make some cuts”..

• Sankardeep

• Ryan O'Donnell

No, I had fixed it silently Thanks!

• Pravesh Kothari

I think Sanders’s manuscript is available on the arxiv now: http://arxiv.org/pdf/1011.0107v2.pdf.

• Ah, this is the full paper, but he once sent me a 1.25-page manuscript that just did the result in ${\mathbb F}_2^n$. I’m not sure that’s appeared anywhere.

• Deepak

I think it basically appears as the appendix of that paper.

• Oh yeah, so it seems. Thanks Deepak and Pravesh!

• igors

The equation after “we can dispense the $\delta$ part”, there is $\phi_A$ instead of $\varphi_A$.

• Thank you!