I originally planned for Chapter 9 of the book to be about additive combinatorics. However in the interests of completing the book before I die of old age, I am planning to make some cuts. It now looks like Chapter 9 (being a rather standalone topic in the context of the book) will hit the cutting room floor. Drafts of some parts of it are already written, and I decided to publish here a “deleted scene” — the Quasipolynomial Freiman–Ruzsa Theorem highlight. It seemed timely to me in light of several other very nice surveys/expositions/extensions that have appeared recently. Shachar Lovett has posted one survey here, and Eli Ben-Sasson, Noga Ron-Zewi, Madhur Tulsiani, and Julia Wolf have just posted a paper on the topic. Also, Sanders himself has a 1.5-page proof of his result in ${\mathbb F}_2^n$ in a manuscript called *On the Bogolyubov–Ruzsa Lemma in* ${\mathbb F}_2^n$; I don’t know if it’s appeared anywhere, but I’m sure he’ll send you a copy if you ask nicely

Below is the draft of what was to appear in the book on this topic. I would like to emphasize that its interpretation/proof of the Croot–Sisask result is based on joint discussions with Eric Blais and Shachar Lovett.

In this section we give Sanders’s proof **[San10,San10b]** of the following “almost-polynomial Bogolyubov conjecture”:

Sanders’s TheoremLet $A \subseteq {\mathbb F}_2^n$ have volume $0 < \alpha \leq 1/2$. Then $A+A$ contains $99$% of an affine subspace of codimension $O(\log^{4}(1/\alpha))$.

The proof of Sanders’s Theorem has two key components. One is a Fourier component — Chang’s Lemma, a slight generalization of the Level-$1$ Inequality. The other is a purely probabilistic result due to Croot and Sisask **[CS10]**. Let’s first give Chang’s Lemma:

Chang’s LemmaLet $A \subseteq {\mathbb F}_2^n$ be a set of volume $\alpha$. Let $\Gamma = \{\gamma \in {\widehat{{\mathbb F}_2^n}} : |\widehat{\varphi_A}(\gamma)| \geq \epsilon\}$. Then $\dim(\mathrm{span}(\Gamma)) \leq 2\ln(1/\alpha)/\epsilon^2$.

*Proof:* Let $\Gamma’$ be a maximal linear independent subset of $\Gamma$. By applying a suitable invertible linear transformation to the domain ${\mathbb F}_2^n$ we may assume that $\Gamma’ = \{e_1, e_2, \dots, e_d\}$. Note that $d = \dim(\mathrm{span}(\Gamma))$. Now the Level-$1$ Inequality implies that \begin{multline*} 2\alpha^2\ln(1/\alpha) \geq \mathbf{W}^{1}[1_A] = \alpha^2 \mathbf{W}^{1}[\varphi_A] \\ \Rightarrow\quad 2\ln(1/\alpha) \geq \mathbf{W}^{1}[\varphi_A] \geq \widehat{\varphi_A}(e_1)^2 + \cdots + \widehat{\varphi_A}(e_d)^2 \geq d\epsilon^2; \end{multline*} hence $d \leq 2\ln(1/\alpha)/\epsilon^2$ as claimed. $\Box$

The second key ingredient, due to Croot and Sisask **[CS10]**, is concerned with the “smoothing” effect of convolution by $\varphi_A$ whenever $A$ is a “large” set. Roughly speaking the result says that for a fixed function $f$, the function $\varphi_A * f$ is sufficiently smoothed that it must be roughly the same as $\varphi_{z+A} * f$ for many translates $z$. Here we present a variant of the Croot–Sisask result:

Theorem 1Fix $0 < \alpha, \epsilon, \delta \leq 1/2$. Let $\mathcal{F}$ be a collection of functions $f : {\mathbb F}_2^n \to [-1,1]$. Let $\Phi$ be a collection of probability density functions on ${\mathbb F}_2^n$, where each $\varphi \in \Phi$ satisfies $\|\varphi\|_\infty \leq 1/\alpha$. Then there is a subcollection $\Phi’ \subseteq \Phi$ with $|\Phi’|/|\Phi| \geq \alpha^{O(\log(1/\delta)/\epsilon^2)}$ such that for all $\varphi, \psi \in \Phi’$, \[ | \mathop{\bf E}_{\boldsymbol{y} \sim \varphi}[f(\boldsymbol{y})] – \mathop{\bf E}_{\boldsymbol{y} \sim \psi}[f(\boldsymbol{y})]| \leq \epsilon \] holds for at least a $1-\delta$ fraction of the $f \in \mathcal{F}$.

*Proof:* We give the proof assuming that the collection $\Phi$ consists of uniform densities $\varphi_B$ over sets $B$ of volume at least $\alpha$. The extension to general densities of “high min-entropy” is left for the exercises.

First, suppose we fix some $\varphi_B \in \Phi$ and some $f \in \mathcal{F}$. Now imagine we choose $m$ uniformly random elements from $B$ and let $\boldsymbol{S}$ be the (multi-)set thus formed. A simple Chernoff bound tells us that the average of $f$ over $\boldsymbol{S}$ is likely to be close to the average of $f$ over $B$. More precisely, for a certain $m = O(\log(1/\delta)/\epsilon^2)$ we have that with probability at least $1-\delta$ over the choice $\boldsymbol{S} = S$, \[ |\mathop{\bf E}_{\boldsymbol{y} \sim S}[f(\boldsymbol{y})] – \mathop{\bf E}_{\boldsymbol{y} \sim B}[f(\boldsymbol{y})]| \leq \epsilon. \] Let’s say that $S$ is a “good $B$-sampler” for $f$ when this event occurs. Our observations show that when $\boldsymbol{S}$ is randomly chosen, the expected fraction of $f \in \mathcal{F}$ for which it is a *bad* $B$-sampler is at most $\delta$. Thus by Markov’s inequality, \[ \mathop{\bf Pr}_{\substack{\boldsymbol{S} \text{ a random} \\ \text{$m$-multiset of $B$}}}[\boldsymbol{S} \text{ is a bad $B$-sampler for more than a $2\delta$ fraction of $f \in \mathcal{F}$}] \leq \frac12. \] Let’s say that $S$ is a “good $B$-sampler for $\mathcal{F}$” when the event above does *not* occur. Thus \[ \mathop{\bf Pr}_{\substack{\boldsymbol{S} \text{ a random} \\ \text{$m$-multiset of $B$}}}[\boldsymbol{S} \text{ is a good $B$-sampler for $\mathcal{F}$}] \geq \frac12. \]

Next, let $\boldsymbol{T}$ be a random (multi-)set formed by choosing $m$ uniformly random elements of ${\mathbb F}_2^n$. Consider the event $E_B$ that $\boldsymbol{T} \subseteq B$. Since $B$ has volume at least $\alpha$ we have $\mathop{\bf Pr}[E_B] \geq \alpha^m$. Further, conditioned on $E_B$ the distribution of $\boldsymbol{T}$ is precisely that of the random multiset $\boldsymbol{S}$ we just analyzed. Thus we may conclude that for each fixed $\varphi_B \in \Phi$, \[ \mathop{\bf Pr}_{\boldsymbol{T}}[\boldsymbol{T} \text{ is a good $B$-sampler for $\mathcal{F}$}] \geq \frac12 \alpha^m. \] It follows immediately that there must exist some $T$ which is a good $B$-sampler for $\mathcal{F}$ for at least a $\frac12 \alpha^m = \alpha^{O(\log(1/\delta)/\epsilon^2)}$ fraction of $\varphi_B \in \Phi$. These $\varphi_B$’s will constitute our subcollection $\Phi’$. What we have just shown is that for any $\varphi \in \Phi’$, \[ | \mathop{\bf E}_{\boldsymbol{y} \sim T}[f(\boldsymbol{y})] – \mathop{\bf E}_{\boldsymbol{y} \sim \varphi}[f(\boldsymbol{y})]| \leq \epsilon \quad \text{for all but at most a $2\delta$ fraction of $f \in \mathcal{F}$}. \] Using the triangle inequality we may conclude that for any $\varphi, \psi \in \Phi’$, \[ | \mathop{\bf E}_{\boldsymbol{y} \sim \varphi}[f(\boldsymbol{y})] – \mathop{\bf E}_{\boldsymbol{y} \sim \psi}[f(\boldsymbol{y})]| \leq 2\epsilon \quad \text{for all but at most a $4\delta$ fraction of $f \in \mathcal{F}$}. \] This completes the proof (after adjusting the constants on $\epsilon$ and $\delta$). $\Box$

An almost immediate corollary is that if $A$ is a large set then there is a somewhat large set $Z$ such that $\varphi_{z+A} * f$ is “close” to $\varphi_{A} * f$ for all $z \in Z$. Let’s first define the appropriate notion of closeness, then give the corollary.

Definition 2Let $f, g : {\mathbb F}_2^n \to {\mathbb R}$. We write $\displaystyle f \mathop{\approx}^{\epsilon,\delta} g$ to denote that $|f(x) – g(x)| \leq \epsilon$ for all but at most a $\delta$ fraction of $x \in {\mathbb F}_2^n$.

Corollary 3Fix $0 < \epsilon, \delta \leq 1/2$. Let $f : {\mathbb F}_2^n \to [-1,1]$ and let $A \subseteq {\mathbb F}_2^n$ have volume $0 < \alpha \leq 1/2$. Then there exists a set $Z \subseteq {\mathbb F}_2^n$ of volume at least $\alpha^{O(\log(1/\delta)/\epsilon^2)}$ such that \[ \varphi_{z+A} * f \ \ \mathop{\approx}^{\epsilon, \delta} \ \ \varphi_{A} * f \] for all $z \in Z$.

*Proof:* Apply Theorem 1 with $\Phi = \{\varphi_{z+A} : z \in {\mathbb F}_2^n\}$ and $\mathcal{F} = \{f^{+t} : t \in {\mathbb F}_2^n\}$. The resulting subcollection $\Phi’$ is naturally associated with a set of translates $Z’$ of the claimed volume, and the theorem tells us that for all $z, z’ \in Z’$, \[ | \langle \varphi_{z+A}, f^{+t} \rangle - \langle \varphi_{z'+A}, f^{+t} \rangle| \leq \epsilon \] holds for all but at most a $\delta$ fraction of $t \in {\mathbb F}_2^n$. But $\langle \varphi, f^{+t} \rangle$ is simply $\varphi * f(t)$ and hence we have \[ \varphi_{z+A} * f \ \ \mathop{\approx}^{\epsilon, \delta} \ \ \varphi_{z' + A} * f \] for all $z, z’ \in Z’$. Finally, fix any $z_0 \in Z’$. We know for any $z \in Z’$ that \begin{align*} \varphi_{z+A} * f \ \ \mathop{\approx}^{\epsilon, \delta} \ \ \varphi_{z_0 + A} * f \quad&\Rightarrow\quad (\varphi_{z+A} * f)^{+z_0} \ \ \mathop{\approx}^{\epsilon, \delta} \ \ (\varphi_{z_0 + A} * f)^{+z_0} \\ &\Rightarrow\quad \varphi_{z_0 + z+A} * f \ \ \mathop{\approx}^{\epsilon, \delta} \ \ \varphi_{A} * f, \end{align*} the first implication holding because the relation $\displaystyle f \mathop{\approx}^{\epsilon,\delta} g$ is clearly preserved under translation. Taking $Z = z_0 + Z’$ completes the proof. $\Box$

By a simple “triangle inequality” argument (see the exercises), we can further deduce:

Corollary 4Let $k \in {\mathbb N}^+$. In the setting of Corollary 3, for any $w \in kZ$, \[ \varphi_{w+A} * f \ \ \mathop{\approx}^{k\epsilon, k\delta} \ \ \varphi_{A} * f. \]

This is (essentially) the result of Croot and Sisask that Sanders uses in his proof. Let’s now give that proof.

*Proof:* Given $A \subseteq {\mathbb F}_2^n$ of volume at least $\alpha$, let $Z$ be as in Corollaries 3, 4 (we will specify $\epsilon$, $\delta$, and $k$ later). Thus for all $w \in kZ$, \[ \varphi_{w+A} * f \ \ \mathop{\approx}^{k\epsilon, k\delta} \ \ \varphi_{A} * f. \] The first observation in the proof is that if we convolve these functions with *another* probability density $\varphi$ having small $\|\varphi\|_\infty$, we can dispense the “$\delta$” part of the closeness. Specifically (see the exercises) we get \[ |\langle \varphi_A, \varphi_{w+A} * f \rangle - \langle \varphi_A, \varphi_{A} * f \rangle| \leq k\epsilon + 2k\delta\|\varphi_A\|_\infty \leq k(\epsilon + 2\delta/\alpha) \] for all $w \in kZ$. Now we select $\epsilon$ and $\delta$ in terms of $k$: we’ll take $\delta = \epsilon \alpha$ and $\epsilon = .005/(3k)$. Then \[ |\langle \varphi_A, \varphi_{w+A} * f \rangle - \langle \varphi_A, \varphi_{A} * f \rangle| \leq .005 \ \ \text{for all $w \in kZ$,} \] where $Z$ has volume at least $\alpha^{O(k^2 \log(k/\alpha))}$. Next, we select $f = 1_{A+A}$. Obviously, \[ \langle \varphi_A, \varphi_{A} * 1_{A+A} \rangle = \langle \varphi_A * \varphi_A, 1_{A+A} \rangle = \mathop{\bf E}_{\substack{\boldsymbol{a} \sim A \\ \boldsymbol{a}' \sim A}}[1_{A+A}(\boldsymbol{a}+\boldsymbol{a}')] = 1; \] hence we deduce \[ \langle \varphi_A, \varphi_{w+A} * 1_{A+A} \rangle = \langle \varphi_{w+A} * \varphi_A, 1_{A+A} \rangle \geq .995 \ \ \text{for all $w \in kZ$}. \] In particular, the above holds (in expectation) if $\boldsymbol{w}$ is the sum of $k$ independently chosen uniformly random elements of $Z$. Thus \begin{equation} \label{eqn:sanders-key} \langle \varphi_{Z}^{* k} * \varphi_{A} * \varphi_A, 1_{A+A} \rangle \geq .995. \end{equation} We now would like to replace $Z$ with a subspace of ${\mathbb F}_2^n$ of roughly the same cardinality. To that end, define $\Gamma = \{\gamma \in {\widehat{{\mathbb F}_2^n}} : |\widehat{\varphi_Z}(\gamma)| \geq 1/2\}$ and define $H = \mathrm{span}(\Gamma)$. By Chang’s Lemma we have $\dim(H) \leq O(k^2 \log(k/\alpha) \log(1/\alpha))$. We will eventually choose $k = \Theta(\log(1/\alpha))$, yielding $\dim(H) \leq O(\log^4(1/\alpha))$. To complete the proof we will show that $A+A$ contains $99$% of the points in a translate of $H^\perp$, which indeed has codimension $O(\log^4(1/\alpha))$. We claim this follows from \begin{equation} \label{eqn:sanders-goal} |\langle \varphi_{Z}^{* k} * \varphi_{A} * \varphi_A, 1_{A+A} \rangle – \langle \varphi_{H^\perp} * \varphi_{Z}^{* k} * \varphi_{A} * \varphi_A, 1_{A+A} \rangle| \leq .005. \end{equation} To see this claim, combine \eqref{eqn:sanders-goal} with \eqref{eqn:sanders-key} obtaining \begin{align*} .99 &\leq \langle \varphi_{H^\perp} * \varphi_{Z}^{* k} * \varphi_{A} * \varphi_A, 1_{A+A} \rangle \\ &= \mathop{\bf E}_{\boldsymbol{h} \sim H^\perp} \ \ \mathop{\bf E}_{\boldsymbol{z}^{(1)}, \dots, \boldsymbol{z}^{(k)}\sim Z} \ \ \mathop{\bf E}_{\boldsymbol{a} \sim A}\ \ \mathop{\bf E}_{\boldsymbol{a}’ \sim A}\ \ [1_{A+A}(\boldsymbol{h} + \boldsymbol{z}^{(1)} + \cdots + \boldsymbol{z}^{(k)} + \boldsymbol{a} + \boldsymbol{a}')]. \end{align*} It follows that there exists a fixed outcome $y$ for $\boldsymbol{z}^{(1)} + \cdots + \boldsymbol{z}^{(k)} + \boldsymbol{a} + \boldsymbol{a}’$ such that $\mathop{\bf E}_{\boldsymbol{h} \sim H^\perp}[1_{A+A}(\boldsymbol{h} + y)] = \mathop{\bf Pr}_{\boldsymbol{h} \sim H^\perp}[\boldsymbol{h} + y \in A+A] \geq .99$. I.e., $A+A$ contains $99$% of the elements of $y + H^\perp$.

It therefore remains to verify \eqref{eqn:sanders-goal}, assuming $k = \Theta(\log(1/\alpha))$. Applying the Fourier formula for convolution, the left-hand side of \eqref{eqn:sanders-goal} is \begin{align*} &\phantom{=} \Bigl| \sum_{\gamma \in {\widehat{{\mathbb F}_2^n}}} \widehat{\varphi_Z}(\gamma)^k \widehat{\varphi_A}(\gamma)^2 \widehat{1_{A+A}}(\gamma) – \sum_{\gamma \in {\widehat{{\mathbb F}_2^n}}} \widehat{\varphi_{H^\perp}}(\gamma) \widehat{\varphi_Z}(\gamma)^k \widehat{\varphi_A}(\gamma)^2 \widehat{1_{A+A}}(\gamma)\Bigr| \\ &= \Bigl| \sum_{\gamma \not \in H} \widehat{\varphi_Z}(\gamma)^k \widehat{\varphi_A}(\gamma)^2 \widehat{1_{A+A}}(\gamma)\Bigr| \tag*{(by Proposition 3.11)} \\ &\leq \sum_{\gamma \not \in H} |\widehat{\varphi_Z}(\gamma)|^k \widehat{\varphi_A}(\gamma)^2 |\widehat{1_{A+A}}(\gamma)| \\ &< \sum_{\gamma \not \in H} (1/2)^k \widehat{\varphi_A}(\gamma)^2 \tag*{(by definition of $H$ and $|\widehat{1_{A+A}}(\gamma)| \leq 1$)} \\ &\leq .005 \alpha \cdot \sum_{\gamma \not \in H} \widehat{\varphi_A}(\gamma)^2 \tag*{(by taking $k = \Theta(\log(1/\alpha))$ )} \\ &\leq .005 \alpha \cdot \|\varphi_A\|_2^2 = .005. \end{align*} This completes the proof of \eqref{eqn:sanders-goal} and hence Sanders’s Theorem. $\Box$

Both the links are to the same paper i.e to that of Lovett’s.

Thanks, fixed.

very minor one in the 2nd line of first paragraph,”I am planning to makes some cuts”–>” I am planning to make some cuts”..

oops,my bad..

No, I had fixed it silently Thanks!

I think Sanders’s manuscript is available on the arxiv now: http://arxiv.org/pdf/1011.0107v2.pdf.

Ah, this is the full paper, but he once sent me a 1.25-page manuscript that just did the result in ${\mathbb F}_2^n$. I’m not sure that’s appeared anywhere.

I think it basically appears as the appendix of that paper.

Oh yeah, so it seems. Thanks Deepak and Pravesh!

The equation after “we can dispense the $\delta$ part”, there is $\phi_A$ instead of $\varphi_A$.

Thank you!