In linear algebra there are two equivalent definitions of what it means for a function to be linear:

Definition 29A function $f : {\mathbb F}_2^n \to {\mathbb F}_2$ islinearif either of the following equivalent conditions hold:

In an exercise you are asked to verify that the conditions are indeed equivalent. If we encode the output of $f$ by $\pm 1 \in {\mathbb R}$ in the usual way then the “linear” functions $f : {\mathbb F}_2^n \to \{-1,1\}$ are precisely the $2^n$ parity functions $(\chi_S)_{S \subseteq [n]}$.

Let’s think of what it might mean for a function $f : {\mathbb F}_2^n \to {\mathbb F}_2$ to be *approximately* linear. Definition 29 suggests two possibilities:

- $f(x+y) = f(x) + f(y)$ for
*most*pairs $x, y \in {\mathbb F}_2^n$; - there is some $S \subseteq [n]$ such that $f(x) = \sum_{i \in S} x_i$ for
*most*$x \in {\mathbb F}_2^n$.

Are these equivalent? The proof of (2) $\Rightarrow$ (1) in Definition 29 is “robust”: it easily extends to show (II) $\Rightarrow$ (I) (see the exercises). But the natural proof of (1) $\Rightarrow$ (2) in Definition 29 does not have this robustness property. The goal of this section is to show that (I) $\Rightarrow$ (II) nevertheless holds.

Motivation for this problem comes from an area of theoretical computer science called *property testing* which we will discuss in more detail in Chapter 7. Imagine that you have “black-box” access to a function $f : {\mathbb F}_2^n \to {\mathbb F}_2$, meaning that the function $f$ is unknown to you but you can “query” its value on inputs $x \in {\mathbb F}_2^n$ of your choosing. The function $f$ is “supposed” to be a linear function and you would like to try to verify this.

The only way you can be certain $f$ is indeed a linear function is to query its value on all $2^n$ inputs; unfortunately, this is very expensive. The idea behind “property testing” is to try to verify that $f$ has a certain property — in this case, linearity — by querying its value on just a few random inputs. In exchange for efficiency, we need to be willing to only approximately verify the property.

Definition 30If $f$ and $g$ are boolean-valued functions we say they are$\epsilon$-closeif $\mathrm{dist}(f,g) \leq \epsilon$. If $\mathcal{P}$ is a property of $n$-bit boolean functions, we say that $f$ is$\epsilon$-closeto $\mathcal{P}$ if $f$ is $\epsilon$-close to some $g$ satisfying $\mathcal{P}$.

In particular, in property testing we take property (II) above to be the notion of “approximately linear”: we say $f$ is $\epsilon$-close to being linear if $\mathrm{dist}(f,g) \leq \epsilon$ for some truly linear $g(x) = \sum_{i \in S} x_i$.

Blum, Luby, and Rubinfeld **[BLR90]** showed that indeed (I) $\Rightarrow$ (II) holds, giving the following “test” for the property of linearity that makes just $3$ queries:

BLR TestGiven query access to $f : {\mathbb F}_2^n \to {\mathbb F}_2$:

- Choose ${\boldsymbol{x}} \sim {\mathbb F}_2^n$ and $\boldsymbol{y} \sim {\mathbb F}_2^n$ independently.
- Query $f$ at ${\boldsymbol{x}}$, $\boldsymbol{y}$, and ${\boldsymbol{x}} + \boldsymbol{y}$.
- “Accept” if $f({\boldsymbol{x}}) + f(\boldsymbol{y}) = f({\boldsymbol{x}} + \boldsymbol{y})$.

We now show that if the BLR Test accepts $f$ with high probability then $f$ is close to being linear. The proof works by directly relating the acceptance probability to the quantity $\sum_{S} \widehat{f}(S)^3$; see equation \eqref{eqn:BLR} below.

Theorem 31Suppose the BLR Test accepts $f : {\mathbb F}_2^n \to {\mathbb F}_2$ with probability $1 – \epsilon$. Then $f$ is $\epsilon$-close to being linear.

*Proof:* In order to use the Fourier transform we encode $f$’s output by $\pm 1 \in {\mathbb R}$; thus the acceptance condition of the BLR Test becomes $f({\boldsymbol{x}})f(\boldsymbol{y}) = f({\boldsymbol{x}} + \boldsymbol{y})$. Since \[ \tfrac{1}{2} + \tfrac{1}{2} f({\boldsymbol{x}})f(\boldsymbol{y})f({\boldsymbol{x}}+\boldsymbol{y}) = \begin{cases} 1 & \text{if $f({\boldsymbol{x}})f(\boldsymbol{y}) = f({\boldsymbol{x}} + \boldsymbol{y})$}, \\ 0 & \text{if $f({\boldsymbol{x}})f(\boldsymbol{y}) \neq f({\boldsymbol{x}} + \boldsymbol{y})$,} \end{cases} \] we conclude \begin{align*} 1 – \epsilon = \mathop{\bf Pr}[\text{BLR accepts $f$}] &= \mathop{\bf E}_{{\boldsymbol{x}}, \boldsymbol{y}}[\tfrac{1}{2} + \tfrac{1}{2} f({\boldsymbol{x}})f(\boldsymbol{y})f({\boldsymbol{x}}+\boldsymbol{y})] \\ &= \tfrac{1}{2} + \tfrac{1}{2} \mathop{\bf E}_{{\boldsymbol{x}}}[f({\boldsymbol{x}}) \cdot \mathop{\bf E}_{\boldsymbol{y}}[f(\boldsymbol{y})f({\boldsymbol{x}}+\boldsymbol{y})]]\\ &= \tfrac{1}{2} + \tfrac{1}{2} \mathop{\bf E}_{{\boldsymbol{x}}}[f({\boldsymbol{x}}) \cdot (f * f)({\boldsymbol{x}})] \tag{by definition}\\ &= \tfrac{1}{2} + \tfrac{1}{2} \sum_{S \subseteq [n]} \widehat{f}(S) \widehat{f * f}(S) \tag{Plancherel} \\ &= \tfrac{1}{2} + \tfrac{1}{2} \sum_{S \subseteq [n]} \widehat{f}(S)^3 \tag*{(Theorem 28).} \end{align*} We rearrange this equality and then continue: \begin{align} 1 – 2\epsilon &= \sum_{S \subseteq [n]} \widehat{f}(S)^3 \label{eqn:BLR}\\ &\leq \max_{S \subseteq [n]} \{\widehat{f}(S)\} \cdot \sum_{S \subseteq [n]} \widehat{f}(S)^2 \nonumber\\ &= \max_{S \subseteq [n]} \{\widehat{f}(S)\} \tag*{(Parseval).} \end{align} But $\widehat{f}(S) = \langle f, \chi_S \rangle = 1 – 2\mathrm{dist}(f, \chi_S)$ (Proposition 9). Hence there exists some $S^* \subseteq [n]$ such that $1 – 2\epsilon \leq 1 – 2\mathrm{dist}(f,\chi_{S*})$; i.e., $f$ is $\epsilon$-close to the linear function $\chi_{S*}$. $\Box$

In fact, for small $\epsilon$ one can show that $f$ is more like $(\epsilon/3)$-close to linear, and this is sharp. See the exercises.

The BLR Test shows that given black-box access to $f : {\mathbb F}_2^n \to \{-1,1\}$, we can “test” whether $f$ is close to some linear function $\chi_S$ using just $3$ queries. The test does not reveal *which* linear function $\chi_S$ is close to (indeed, determining this takes at least $n$ queries; see the exercises). Nevertheless, we can still determine the value of $\chi_S(x)$ with high probability for *every* $x \in {\mathbb F}_2^n$ of our choosing using just $2$ queries. This property is called *local correctability* of linear functions.

Proposition 32Suppose $f : {\mathbb F}_2^n \to \{-1,1\}$ is $\epsilon$-close to the linear function $\chi_S$. Then for every $x \in {\mathbb F}_2^n$, the following algorithm outputs $\chi_S(x)$ with probability at least $1-2\epsilon$:

- Choose $\boldsymbol{y} \sim {\mathbb F}_2^n$.
- Query $f$ at $\boldsymbol{y}$ and $x+\boldsymbol{y}$.
- Output $f(\boldsymbol{y})f(x+\boldsymbol{y})$.

We emphasize the order of quantifiers here: if we just output $f(x)$ then this will equal $\chi_S(x)$ for *most* $x$; however, the above “local correcting” algorithm determines $\chi_S(x)$ (with high probability) for *every* $x$.

*Proof:* Since $\boldsymbol{y}$ and $x+\boldsymbol{y}$ are both uniformly distributed on ${\mathbb F}_2^n$ (though not independently) we have $\mathop{\bf Pr}[f(\boldsymbol{y}) \neq \chi_S(\boldsymbol{y})] \leq \epsilon$ and $\mathop{\bf Pr}[f(x+\boldsymbol{y}) \neq \chi_S(x+\boldsymbol{y})] \leq \epsilon$ by assumption. By the union bound, the probability of either event occurring is at most $2\epsilon$; when neither occurs, \[ f(\boldsymbol{y}) f(x+\boldsymbol{y}) = \chi_S(\boldsymbol{y}) \chi_S(x + \boldsymbol{y}) = \chi_S(x) \] as desired. $\Box$

In the lead-in to Prop 32, you have “…we can still still determine…”. Also, in Chrome $f($ renders in an unpleasant fashion, for what it’s worth. Thanks!

-Zach

Thanks! Yeah, I’m not too satisfied with how $\widehat{f}(S)$ renders in Chrome, either, but I guess we’ll have to live with it.

Does this fourier-analytic proof of soundness of the BLR test extend to testing linear maps F_2^n->F_2^m, for arbitrary n,m? I know that the majority-correction proof of BLR works for any groups.

Hi Michael, not as far as I am aware. Hopefully, I will be writing about the “1%” version of the $\mathbb{F}_2^n \to \mathbb{F}_2^m$ problem — i.e., Polynomial Freiman-Ruzsa — in a later chapter.

In I and II ,I think it is supposed to be

“for almost all” instead of “most” because

if we take $f\left(x\right)=\begin{cases}

1 & x\neq0,x_{1}=0\\

0 & \mbox{ otherwise}

\end{cases} then for most x ‘s f\left(x\right)=0x$ but f\left(x+y\right)=f\left(x\right)+f\left(y\right) only if one of x,y,x+y is zero.

In I and II ,I think it is supposed to be

“for almost all” instead of “most” because if we take $f\left(x\right)=\begin{cases}

1 & x\neq0,x_{1}=0\\

0 & \mbox{ otherwise}

\end{cases} $ then for most $x$ ‘s $f\left(x\right)=0x$ but $f\left(x+y\right)=f\left(x\right)+f\left(y\right)$ only if one of $x,y,x+y$ is zero.

I see what you mean. I guess by “most” I, informally, meant “almost all”. But you’re right, it’s probably clearly if I actually write that. Thanks.