The most obvious spectral property of a truly random function $\boldsymbol{f} : \{-1,1\}^n \to \{-1,1\}$ is that all of its Fourier coefficients are very small (as we saw in Exercise 5.8).

Let’s switch notation to $\boldsymbol{f} : \{-1,1\}^n \to \{0,1\}$; in this case $\widehat{\boldsymbol{f}}(\emptyset)$ will not be very small but rather very close to $1/2$. Generalizing:

Proposition 1 Let $n > 1$ and let $\boldsymbol{f} : \{-1,1\}^n \to \{0,1\}$ be a $p$-biased random function; i.e., each $\boldsymbol{f}(x)$ is $1$ with probability $p$ and $0$ with probability $1-p$, independently for all $x \in \{-1,1\}^n$. Then except with probability at most $2^{-n}$, all of the following hold: \[ |\widehat{\boldsymbol{f}}(\emptyset) - p| \leq 2\sqrt{n} 2^{-n/2}, \qquad \forall S \neq \emptyset \quad |\widehat{\boldsymbol{f}}(S)| \leq 2\sqrt{n} 2^{-n/2}. \]

*Proof:* We have $\widehat{\boldsymbol{f}}(S) = \sum_{x} \frac{1}{2^n} x^S \boldsymbol{f}(x)$, where the random variables $\boldsymbol{f}(x)$ are independent. If $S = \emptyset$ then the coefficients $\frac{1}{2^n} x^S$ sum to $1$ and the mean of $\widehat{\boldsymbol{f}}(S)$ is $p$; otherwise the coefficients sum to $0$ and the mean of $\widehat{\boldsymbol{f}}(S)$ is $0$. Either way, since each summand lies in an interval of length $2^{-n}$, the Hoeffding bound gives \[ \mathop{\bf Pr}[|\widehat{\boldsymbol{f}}(S) - \mathop{\bf E}[\widehat{\boldsymbol{f}}(S)]| \geq t] \leq 2\exp(-t^2 \cdot 2^{n+1}) \] for any $t > 0$. Selecting $t = \sqrt{n} 2^{-n/2}$, the above bound is $2\exp(-2n) \leq 4^{-n}$ (using $n > 1$). The result follows by taking a union bound over all $S \subseteq [n]$. $\Box$
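Proposition 1 is easy to check empirically. The following sketch (the choices $n = 12$, $p = 0.3$, the seed, and the helper name `wht` are all mine) draws one $p$-biased random function, computes its full Fourier spectrum with a fast Walsh–Hadamard transform, and checks both bounds for that draw; by the proposition the checks can fail only with tiny probability.

```python
import math
import random

def wht(vals):
    # Fast Walsh-Hadamard transform: entry S of the output is
    # sum_x chi_S(x) f(x), where bit i of an index records whether
    # i is in S (resp. whether x_i = -1).
    a = list(vals)
    h = 1
    while h < len(a):
        for i in range(0, len(a), 2 * h):
            for j in range(i, i + h):
                a[j], a[j + h] = a[j] + a[j + h], a[j] - a[j + h]
        h *= 2
    return a

random.seed(0)
n, p = 12, 0.3
f = [1.0 if random.random() < p else 0.0 for _ in range(2 ** n)]

coeffs = [s / 2 ** n for s in wht(f)]            # coeffs[S] = \hat{f}(S)
bound = 2 * math.sqrt(n) * 2 ** (-n / 2)

print(abs(coeffs[0] - p) <= bound)               # \hat{f}(emptyset) is near p
print(max(abs(c) for c in coeffs[1:]) <= bound)  # all other coefficients are small
```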

This proposition motivates the following basic notion of “pseudorandomness”:

Definition 2 A function $f : \{-1,1\}^n \to {\mathbb R}$ is $\epsilon$-regular (sometimes called $\epsilon$-uniform) if $|\widehat{f}(S)| \leq \epsilon$ for all $S \neq \emptyset$.

Remark 3 By Exercise 3.8, every function $f$ is $\epsilon$-regular for $\epsilon = \|f\|_1$. We are often concerned with $f : \{-1,1\}^n \to [-1,1]$, in which case we focus on $\epsilon \leq 1$.

Examples 4 Proposition 1 states that a random $p$-biased function is $(2\sqrt{n} 2^{-n/2})$-regular with very high probability. A function is $0$-regular if and only if it is constant (even though you might not think of a constant function as very “random”). If $A \subseteq {\mathbb F}_2^n$ is an affine subspace of codimension $k$ then $1_A$ is $2^{-k}$-regular (Proposition 3.12). For $n$ even the inner product mod $2$ function and the complete quadratic function, $\mathrm{IP}_{n}, \mathrm{CQ}_n : {\mathbb F}_2^{n} \to \{0,1\}$, are $2^{-n/2-1}$-regular (Exercise 1.1). On the other hand, the parity functions $\chi_S : \{-1,1\}^n \to \{-1,1\}$ are not $\epsilon$-regular for any $\epsilon < 1$ (except for $S = \emptyset$). By Exercise 5.22, $\mathrm{Maj}_n$ is $\tfrac{1}{\sqrt{n}}$-regular.
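Two of these examples can be confirmed by brute force. The sketch below (helper name and the choice $n = 5$ for speed are mine) computes Fourier coefficients directly from the definition $\widehat{f}(S) = \mathop{\bf E}_x[f(x)\,x^S]$ and checks that $\mathrm{Maj}_5$ is $\tfrac{1}{\sqrt{5}}$-regular while a parity is not $\epsilon$-regular for any $\epsilon < 1$.

```python
import math
from itertools import product

def fourier(f, n):
    # Brute-force Fourier coefficients fhat[S] = E_x[f(x) x^S],
    # with S given as a 0/1 indicator tuple.
    fhat = {}
    for S in product([0, 1], repeat=n):
        total = 0
        for x in product([-1, 1], repeat=n):
            chi = 1
            for i in range(n):
                if S[i]:
                    chi *= x[i]
            total += f(x) * chi
        fhat[S] = total / 2 ** n
    return fhat

n = 5
maj = fourier(lambda x: 1 if sum(x) > 0 else -1, n)
maj_max = max(abs(c) for S, c in maj.items() if any(S))
print(maj_max <= 1 / math.sqrt(n))   # Maj_5 is (1/sqrt 5)-regular

par = fourier(lambda x: x[0] * x[1] * x[2], n)   # chi_S for S = {1,2,3}
print(max(abs(c) for S, c in par.items() if any(S)))  # 1.0: not regular for eps < 1
```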

The notion of regularity can be particularly useful for probability density functions; in this case it is traditional to use an alternate name:

Definition 5 If $\varphi : {\mathbb F}_2^n \to {\mathbb R}^{\geq 0}$ is a probability density which is $\epsilon$-regular, we call it an $\epsilon$-biased density. Equivalently, $\varphi$ is $\epsilon$-biased if and only if $|\mathop{\bf E}_{{\boldsymbol{x}} \sim \varphi}[\chi_\gamma({\boldsymbol{x}})]| \leq \epsilon$ for all $\gamma \in {\widehat{{\mathbb F}_2^n}} \setminus \{0\}$; thus one can think of “$\epsilon$-biased” as meaning “at most $\epsilon$-biased on subspaces”. Note that the marginal of such a distribution on any set of coordinates $J \subseteq [n]$ is also $\epsilon$-biased. If $\varphi$ is $\varphi_A = 1_A/\mathop{\bf E}[1_A]$ for some $A \subseteq {\mathbb F}_2^n$ we call $A$ an $\epsilon$-biased set.

Examples 6 For $\varphi$ a probability density we have $\|\varphi\|_1 = \mathop{\bf E}[\varphi] = 1$, so every density is $1$-biased. The density corresponding to the uniform distribution on ${\mathbb F}_2^n$, namely $\varphi \equiv 1$, is the only $0$-biased density. Densities corresponding to the uniform distribution on smaller affine subspaces are “maximally biased”: if $A \subseteq {\mathbb F}_2^n$ is an affine subspace of codimension less than $n$ then $\varphi_A$ is not $\epsilon$-biased for any $\epsilon < 1$ (Proposition 3.12 again). If $E = \{(0, \dots, 0), (1, \dots, 1)\}$ then $E$ is a $1/2$-biased set (an easy computation, see also Exercise 1.1(h)).

There is a “combinatorial” property of functions $f$ which is roughly equivalent to $\epsilon$-regularity. Recall from Exercise 1.28 that $\hat{\lVert} f \hat{\rVert}_4^4$ has an equivalent non-Fourier formula: $\mathop{\bf E}_{{\boldsymbol{x}},\boldsymbol{y},\boldsymbol{z}}[f({\boldsymbol{x}})f(\boldsymbol{y})f(\boldsymbol{z})f({\boldsymbol{x}}+\boldsymbol{y}+\boldsymbol{z})]$. We show (roughly speaking) that $f$ is regular if and only if this expectation is not much bigger than $\mathop{\bf E}[f]^4 = \mathop{\bf E}_{{\boldsymbol{x}},\boldsymbol{y},\boldsymbol{z},\boldsymbol{w}}[f({\boldsymbol{x}})f(\boldsymbol{y})f(\boldsymbol{z})f(\boldsymbol{w})]$:

Proposition 7 Let $f : {\mathbb F}_2^n \to {\mathbb R}$. Then

- If $f$ is $\epsilon$-regular then $\hat{\lVert} f \hat{\rVert}_4^4 - \mathop{\bf E}[f]^4 \leq \epsilon^2 \cdot \mathop{\bf Var}[f]$.
- If $f$ is not $\epsilon$-regular then $\hat{\lVert} f \hat{\rVert}_4^4 - \mathop{\bf E}[f]^4 \geq \epsilon^4$.

*Proof:* If $f$ is $\epsilon$-regular then \[ \hat{\lVert} f \hat{\rVert}_4^4 - \mathop{\bf E}[f]^4 = \sum_{S \neq \emptyset} \widehat{f}(S)^4 \leq \max_{S \neq \emptyset} \{\widehat{f}(S)^2\} \cdot \sum_{S \neq \emptyset} \widehat{f}(S)^2 \leq \epsilon^2 \cdot \mathop{\bf Var}[f]. \] On the other hand, if $f$ is not $\epsilon$-regular then $|\widehat{f}(T)| \geq \epsilon$ for some $T \neq \emptyset$; hence $\hat{\lVert} f \hat{\rVert}_4^4$ is at least $\widehat{f}(\emptyset)^4 + \widehat{f}(T)^4 \geq \mathop{\bf E}[f]^4 + \epsilon^4$. $\Box$
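Both the non-Fourier formula for $\hat{\lVert} f \hat{\rVert}_4^4$ and the first bullet of Proposition 7 can be sanity-checked by brute force. A sketch (the AND-style test function and $n = 3$ are arbitrary choices of mine):

```python
from itertools import product

n = 3
xs = list(product([-1, 1], repeat=n))
f = {x: (1 if x[0] == 1 and x[1] == 1 else -1) for x in xs}  # arbitrary test function

def fhat(S):
    # \hat{f}(S) = E_x[f(x) x^S], S as a 0/1 indicator tuple
    total = 0
    for x in xs:
        chi = 1
        for i in range(n):
            if S[i]:
                chi *= x[i]
        total += f[x] * chi
    return total / len(xs)

coeffs = {S: fhat(S) for S in product([0, 1], repeat=n)}
norm4 = sum(c ** 4 for c in coeffs.values())     # sum_S \hat{f}(S)^4

# Non-Fourier formula (Exercise 1.28): in multiplicative notation the
# F_2-sum x+y+z is the coordinatewise product.
triple = sum(f[x] * f[y] * f[z] * f[tuple(a * b * c for a, b, c in zip(x, y, z))]
             for x in xs for y in xs for z in xs) / len(xs) ** 3
print(abs(norm4 - triple) < 1e-9)                  # the two formulas agree

mean = coeffs[(0,) * n]
var = sum(c ** 2 for S, c in coeffs.items() if any(S))
eps = max(abs(c) for S, c in coeffs.items() if any(S))
print(norm4 - mean ** 4 <= eps ** 2 * var + 1e-9)  # first bullet of Proposition 7
```

For this particular $f$ the first bullet actually holds with equality, since all nonempty-set coefficients have the same magnitude.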

The condition of $\epsilon$-regularity — that *all* nonempty-set coefficients are small — is quite strong. As we saw when investigating the $\frac{2}{\pi}$ Theorem in Chapter 5.4, it’s also interesting to consider $f$ that merely have $|\widehat{f}(i)| \leq \epsilon$ for all $i \in [n]$; for monotone $f$ this is the same as saying $\mathbf{Inf}_i[f] \leq \epsilon$ for all $i$. This suggests two weaker possible notions of pseudorandomness: having all low-degree Fourier coefficients small, and having all influences small. We will consider both possibilities, starting with the second.

Now a randomly chosen $\boldsymbol{f} : \{-1,1\}^n \to \{-1,1\}$ will *not* have all of its influences small; in fact as we saw in Exercise 2.10, each $\mathbf{Inf}_i[\boldsymbol{f}]$ is $1/2$ in expectation. However for any $\delta > 0$ it will have all of its *$(1-\delta)$-stable* influences exponentially small (recall Definition 2.51). In the exercises you will show:

Fact 8 Fix $\delta \in [0,1]$ and let $\boldsymbol{f} : \{-1,1\}^n \to \{-1,1\}$ be a randomly chosen function. Then for any $i \in [n]$, \[ \mathop{\bf E}[\mathbf{Inf}_i^{(1-\delta)}[\boldsymbol{f}]] = \frac{(1-\delta/2)^n}{2-\delta}. \]
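For very small $n$ the expectation in Fact 8 can be verified exactly by enumerating all $2^{2^n}$ boolean functions. A sketch (with $n = 2$ and an arbitrary $\delta$; helper names are mine), using the Fourier formula $\mathbf{Inf}_i^{(\rho)}[f] = \sum_{S \ni i} \rho^{|S|-1} \widehat{f}(S)^2$ from Fact 2.52:

```python
from itertools import product

n, delta = 2, 0.3
rho = 1 - delta
xs = list(product([-1, 1], repeat=n))
subsets = list(product([0, 1], repeat=n))

def fhat(fvals, S):
    # fvals lists f(x) in the order of xs; S is a 0/1 indicator tuple
    total = 0
    for x, v in zip(xs, fvals):
        chi = 1
        for i in range(n):
            if S[i]:
                chi *= x[i]
        total += v * chi
    return total / len(xs)

def stable_inf(fvals, i):
    # Inf_i^{(rho)}[f] = sum over S containing i of rho^(|S|-1) fhat(S)^2
    return sum(rho ** (sum(S) - 1) * fhat(fvals, S) ** 2
               for S in subsets if S[i])

# average the first coordinate's stable influence over all 2^(2^n) = 16 functions
avg = sum(stable_inf(fvals, 0)
          for fvals in product([-1, 1], repeat=len(xs))) / 2 ** len(xs)
predicted = (1 - delta / 2) ** n / (2 - delta)
print(abs(avg - predicted) < 1e-9)
```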

This motivates a very important notion of pseudorandomness in the analysis of boolean functions: having all stable-influences small. Recalling the discussion surrounding Proposition 2.53, we can also describe this as having no “notable” coordinates.

Definition 9 We say that $f : \{-1,1\}^n \to {\mathbb R}$ has $(\epsilon,\delta)$-small stable influences, or no $(\epsilon,\delta)$-notable coordinates, if $\mathbf{Inf}_{i}^{(1-\delta)}[f] \leq \epsilon$ for each $i \in [n]$. This condition gets stronger as $\epsilon$ and $\delta$ decrease: when $\delta = 0$, meaning $\mathbf{Inf}_{i}[f] \leq \epsilon$ for all $i$, we simply say $f$ has $\epsilon$-small influences.

Examples 10 Besides random functions, important examples of boolean-valued functions with no notable coordinates are constants, majority, and large parities. Constant functions are the ultimate in this regard: they have $(0,0)$-small stable influences. (Indeed, constant functions are the only ones with $0$-small influences.) The $\mathrm{Maj}_n$ function has $\frac{1}{\sqrt{n}}$-small influences. To see the distinction between influences and stable influences, consider the parity functions $\chi_S$. Any parity function $\chi_S$ (with $S \neq \emptyset$) has at least one coordinate with maximal influence, $1$. But if $|S|$ is “large” then all of its *stable* influences will be small: we have $\mathbf{Inf}^{(1-\delta)}_i[\chi_S]$ equal to $(1-\delta)^{|S|-1}$ when $i \in S$ and equal to $0$ otherwise. I.e., $\chi_S$ has $((1-\delta)^{|S|-1}, \delta)$-small stable influences. In particular, $\chi_S$ has $(\epsilon, \delta)$-small stable influences whenever $|S| \geq \frac{\ln(e/\epsilon)}{\delta}$.

The prototypical example of a function $f : \{-1,1\}^n \to \{-1,1\}$ which does *not* have small stable influences is an unbiased $k$-junta. Such a function has $\mathop{\bf Var}[f] = 1$ and hence from Fact 2.52 the sum of its $(1-\delta)$-stable influences is at least $(1-\delta)^{k-1}$. Thus $\mathbf{Inf}^{(1-\delta)}_i[f] \geq (1-\delta)^{k-1}/k$ for at least one $i$; hence $f$ does *not* have $((1-\delta)^k/k, \delta)$-small stable influences for any $\delta \in (0,1)$. A somewhat different example is the function $f(x) = x_0 \mathrm{Maj}_n(x_1, \dots, x_n)$, which has $\mathbf{Inf}_0^{(1-\delta)}[f] \geq 1 - \sqrt{\delta}$; see the exercises.
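The parity computation in Examples 10 is easy to reproduce numerically. A sketch (brute force, with $n = 4$, an arbitrary $\delta$, and helper names of mine) checking that $\mathbf{Inf}_1^{(1-\delta)}[\chi_{[4]}] = (1-\delta)^{3}$, and that an $\mathrm{Maj}_3$ coordinate has ordinary influence at most $1/\sqrt{3}$:

```python
import math
from itertools import product

n, delta = 4, 0.25
rho = 1 - delta
xs = list(product([-1, 1], repeat=n))
subsets = list(product([0, 1], repeat=n))

def fourier(f):
    # Brute-force Fourier coefficients, S as a 0/1 indicator tuple
    fhat = {}
    for S in subsets:
        total = 0
        for x in xs:
            chi = 1
            for i in range(n):
                if S[i]:
                    chi *= x[i]
            total += f(x) * chi
        fhat[S] = total / len(xs)
    return fhat

def stable_inf(fhat, i):
    # Inf_i^{(rho)}[f] = sum over S containing i of rho^(|S|-1) fhat(S)^2
    return sum(rho ** (sum(S) - 1) * c ** 2 for S, c in fhat.items() if S[i])

par = fourier(lambda x: x[0] * x[1] * x[2] * x[3])   # chi_S with |S| = 4
print(abs(stable_inf(par, 0) - rho ** 3) < 1e-12)    # equals (1-delta)^{|S|-1}

maj = fourier(lambda x: 1 if x[0] + x[1] + x[2] > 0 else -1)  # Maj_3, ignores x_4
infl = sum(c ** 2 for S, c in maj.items() if S[0])   # ordinary influence of x_1
print(infl <= 1 / math.sqrt(3))                      # influence 1/2 <= 1/sqrt(3)
```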

Let’s return to considering the interesting condition that $|\widehat{f}(i)| \leq \epsilon$ for all $i \in [n]$. We will call this condition *$(\epsilon,1)$-regularity*. It is equivalent to saying that $f^{\leq 1}$ is $\epsilon$-regular, or that $f$ has at most $\epsilon$ “correlation” with every dictator: $|\langle f, \pm \chi_i \rangle| \leq \epsilon$ for all $i$. Our third notion of pseudorandomness extends this condition to higher degrees:

Definition 11 A function $f : \{-1,1\}^n \to {\mathbb R}$ is $(\epsilon,k)$-regular if $|\widehat{f}(S)| \leq \epsilon$ for all $0 < |S| \leq k$; equivalently, if $f^{\leq k}$ is $\epsilon$-regular. For $k = n$ (or $k = \infty$), this condition coincides with $\epsilon$-regularity. When $\varphi : {\mathbb F}_2^n \to {\mathbb R}^{\geq 0}$ is an $(\epsilon,k)$-regular probability density, it is more usual to call $\varphi$ (and the associated probability distribution) $(\epsilon,k)$-wise independent.

Below we give two alternate characterizations of $(\epsilon, k)$-regularity; however, they are fairly “rough” in the sense that they incur losses exponential in $k$. This can be acceptable if $k$ is thought of as a constant. The first characterization says, roughly, that $f$ is $(\epsilon, k)$-regular if and only if fixing at most $k$ input coordinates changes $f$’s mean by at most $O(\epsilon)$. The second is the condition that $f$ has $O(\epsilon)$ covariance with every $k$-junta.

Proposition 12 Let $f : \{-1,1\}^n \to {\mathbb R}$ and let $\epsilon \geq 0$, $k \in {\mathbb N}$.

- If $f$ is $(\epsilon,k)$-regular then any restriction of at most $k$ coordinates changes $f$’s mean by at most $2^{k} \epsilon$.
- If $f$ is not $(\epsilon,k)$-regular then some restriction to at most $k$ coordinates changes $f$’s mean by more than $\epsilon$.

Proposition 13 Let $f : \{-1,1\}^n \to {\mathbb R}$ and let $\epsilon \geq 0$, $k \in {\mathbb N}$.

- If $f$ is $(\epsilon,k)$-regular then $\mathop{\bf Cov}[f,h] \leq \hat{\lVert} h \hat{\rVert}_1 \epsilon$ for any $h : \{-1,1\}^n \to {\mathbb R}$ with $\deg(h) \leq k$. In particular, $\mathop{\bf Cov}[f,h] \leq 2^{k/2} \epsilon$ for any $k$-junta $h : \{-1,1\}^n \to \{-1,1\}$.
- If $f$ is not $(\epsilon,k)$-regular then $\mathop{\bf Cov}[f,h] > \epsilon$ for some $k$-junta $h : \{-1,1\}^n \to \{-1,1\}$.

We will prove Proposition 12, leaving the proof of Proposition 13 to the exercises.

*Proof:* For the first statement, suppose $f$ is $(\epsilon,k)$-regular and let $J \subseteq [n]$, $z \in \{-1,1\}^{J}$, where $|J| \leq k$. Then the statement holds because \[ \mathop{\bf E}[f_{\overline{J} \mid z}] = \widehat{f}(\emptyset) + \sum_{\emptyset \neq T \subseteq J} \widehat{f}(T)\,z^T \] (Exercise 1.16) and each of the at most $2^{k}$ terms $|\widehat{f}(T)\,z^T| = |\widehat{f}(T)|$ is at most $\epsilon$.

For the second statement, by subtracting a constant from $f$ we may assume without loss of generality that its mean is $0$. Suppose now that $|\widehat{f}(U)| > \epsilon$ where $0 < |U| \leq k$. Let $\boldsymbol{z} \sim \{-1,1\}^U$ be a random restriction to the coordinates $U$. By Corollary 3.22 we have \[ \mathop{\bf E}[\widehat{f_{\mid \boldsymbol{z}}}(\emptyset)^2] = \sum_{T \subseteq U} \widehat{f}(T)^2 \geq \widehat{f}(U)^2 > \epsilon^2. \] Thus there must exist a particular $z$ such that $\widehat{f_{\mid z}}(\emptyset)^2 > \epsilon^2$; i.e., $|\widehat{f_{\mid z}}(\emptyset)| > \epsilon$. This restriction changes $f$’s mean by more than $\epsilon$. $\Box$
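The mean-of-a-restriction formula from Exercise 1.16 that drives this proof can be checked directly. A sketch (with $n = 4$, an arbitrary threshold-type test function, and $J$ taken to be the first two coordinates, all my choices): for every $z$, the empirical mean of $f_{\overline{J} \mid z}$ matches $\widehat{f}(\emptyset) + \sum_{\emptyset \neq T \subseteq J} \widehat{f}(T)\,z^T$.

```python
from itertools import product

n = 4
xs = list(product([-1, 1], repeat=n))
f = {x: (1 if sum(x) >= 0 else -1) for x in xs}   # arbitrary test function

def fhat(S):
    # \hat{f}(S) = E_x[f(x) x^S], S as a 0/1 indicator tuple
    total = 0
    for x in xs:
        chi = 1
        for i in range(n):
            if S[i]:
                chi *= x[i]
        total += f[x] * chi
    return total / len(xs)

J = (0, 1)   # restrict the first two coordinates
checks = []
for z in product([-1, 1], repeat=len(J)):
    # empirical mean of f restricted so that x_j = z_j for each j in J
    rows = [x for x in xs if all(x[J[k]] == z[k] for k in range(len(J)))]
    emp = sum(f[x] for x in rows) / len(rows)
    # Fourier side: fhat(emptyset) plus sum over nonempty T inside J of fhat(T) z^T
    pred = 0.0
    for T in product([0, 1], repeat=n):
        if all(T[i] == 0 or i in J for i in range(n)):
            zT = 1
            for k in range(len(J)):
                if T[J[k]]:
                    zT *= z[k]
            pred += fhat(T) * zT
    checks.append(abs(emp - pred) < 1e-12)
print(all(checks))
```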

Taking $\epsilon = 0$ in the above two propositions we obtain:

Corollary 14 For $f : \{-1,1\}^n \to {\mathbb R}$, the following are equivalent:

1. $f$ is $(0, k)$-regular.
2. Every restriction of at most $k$ coordinates leaves $f$’s mean unchanged.
3. $\mathop{\bf Cov}[f,h] = 0$ for every $k$-junta $h : \{-1,1\}^n \to \{-1,1\}$.

If $f$ is a probability density, condition (3) is equivalent to $\mathop{\bf E}_{{\boldsymbol{x}} \sim f}[h({\boldsymbol{x}})] = \mathop{\bf E}[h]$ for every $k$-junta $h : \{-1,1\}^n \to \{-1,1\}$.

For such functions, additional terminology is used:

Definition 15 If $f : \{-1,1\}^n \to \{-1,1\}$ is $(0,k)$-regular, it is also called $k$th-order correlation immune. If $f$ is in addition unbiased then it is called $k$-resilient. Finally, if $\varphi : {\mathbb F}_2^n \to {\mathbb R}^{\geq 0}$ is a $(0,k)$-regular probability density then we call $\varphi$ (and the associated probability distribution) $k$-wise independent.

Examples 16 Any parity function $\chi_S : \{-1,1\}^n \to \{-1,1\}$ with $|S| = k+1$ is $k$-resilient. More generally, so is $\chi_S \cdot g$ for any $g : \{-1,1\}^n \to \{-1,1\}$ which does not depend on the coordinates in $S$. For a good example of a correlation immune function which is not resilient, consider $h : \{-1,1\}^{3m} \to \{-1,1\}$ defined by $h = \chi_{\{1, \dots, 2m\}} \wedge \chi_{\{m+1, \dots, 3m\}}$. This $h$ is not unbiased, being $\mathsf{True}$ on only a $1/4$-fraction of inputs. However, its bias does not change unless at least $2m$ input bits are fixed; hence $h$ is $(2m-1)$th-order correlation immune.
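The correlation-immunity claim for $h$ can be verified exhaustively for small $m$. A sketch with $m = 2$ (so $n = 6$), encoding $\mathsf{True}$ as $-1$ (my convention; the claim is insensitive to it): all Fourier coefficients at levels $1$ through $2m-1$ vanish exactly, while $\widehat{h}(\emptyset) = 1/2$ shows $h$ is biased and hence not resilient.

```python
from itertools import product

m = 2
n = 3 * m
xs = list(product([-1, 1], repeat=n))
A = range(2 * m)        # coordinates {1, ..., 2m}, 0-indexed
B = range(m, 3 * m)     # coordinates {m+1, ..., 3m}

def chi(S, x):
    out = 1
    for i in S:
        out *= x[i]
    return out

# h = chi_A AND chi_B, with -1 encoding True
h = {x: (-1 if chi(A, x) == -1 and chi(B, x) == -1 else 1) for x in xs}

def fhat(S):
    return sum(h[x] * chi(S, x) for x in xs) / len(xs)

low = [fhat([i for i in range(n) if T[i]])
       for T in product([0, 1], repeat=n) if 1 <= sum(T) <= 2 * m - 1]
print(max(abs(c) for c in low) == 0.0)   # (2m-1)th-order correlation immune
print(fhat([]))                           # mean 1/2: biased, so not resilient
```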

We conclude this section with a diagram indicating how our various notions of pseudorandomness compare:

For precise quantitative statements, counterexamples showing that no other relationships are possible, and explanations for why these notions essentially coincide for monotone functions, see the exercises.

In the proof of Proposition 1, the R.H.S. of the first inequality should be $2\exp(-t^2 2^{n+1})$ (without /), and the choice of $t$ should be $\sqrt{n}2^{-n/2}$ (without the constant 2).

Thanks, definitely a bug in there. Hopefully it’s accurate now.

Yes, it is correct now. I think the bound can be improved by a bit, since for each $x \in \{-1,1\}^n$ the random variable $x^S \boldsymbol{f}(x)$ takes values either in $\{0, -1\}$ or in $\{0, 1\}$, and so the R.H.S. can be $2\exp(-t^2 2^n)$?

Sorry, I meant to say $2\exp(-t^2 2^{n+1})$.

A few small things:

- In Examples 6, “uniform distribution on $F_2$” should be on “$F_2^n$” (exponent ^n is missing).

- In the proof of Proposition 12, the restricted function in the displayed formula $f_{J | z}$ I guess should be $f_{\overline{J} | z}$. Why not use $J$ instead of $\overline{J}$ everywhere?

- There is a lost curly bracket right after the beginning of that proof.

- In the statement of Corollary 14, what is “condition (3)”?

Thanks a lot AA, I think I fixed everything!

In the proof of the second statement of Prop. 12: I am probably missing something, but I think

$\widehat{g}\left(j\right)=\sum_{j\in S\subseteq U}\pm\widehat{f}\left(S\right)$ rather than $\widehat{f}\left(U\right)$.

Hmm, you’re right, this is a bad proof. The simplest correct (?) proof I can think of right now is the following:

Assume WLOG that the mean of $f$ is $0$. Let $U$ be a set with $0 < |U| \leq k$ whose Fourier coefficient has magnitude at least $\epsilon$. Consider restricting the coordinates in $U$ at random, forming the function $g$. Now $\mathop{\bf E}[\widehat{g}(\emptyset)^2] = \sum_{S \subseteq U} \widehat{f}(S)^2$ (this is Corollary 3.22), which is at least $\epsilon^2$ because of the $S = U$ term in the sum. Thus there exists a restriction $z$ such that $\widehat{f|_z}(\emptyset)^2 \geq \epsilon^2$, meaning $|\widehat{f|_z}(\emptyset)| \geq \epsilon$, so this restriction does the trick.

Do you see a simpler proof? I think you can also get the required restriction by a deterministic process (set the bits one at a time to make sure the coefficient exceeding $\epsilon$ always stays at least $\epsilon$ in magnitude). But maybe that’s too annoying to explain.

Thanks for finding this mistake!

I don’t know if it’s simpler, but here is another proof:

With $U$ and $f$ as in your proof, if we restrict $U$’s coordinates to $z$ to get the function $g$, then $h(z) := \widehat{g}\left(\varnothing\right) = \sum_{T\subseteq U}\widehat{f}\left(T\right)z^{T}$. As a function of $z$ it must be $\left\Vert h\right\Vert _{\infty}$-regular, but $\widehat{h}\left(U\right) > \epsilon$.

Less messy proof:

Assume $\left|\widehat{f}\left(U\right)\right| > \epsilon$ and restrict $U$’s coordinates to $z$ to get a function $g$; then $h\left(z\right) := \widehat{g}\left(\varnothing\right) - \widehat{f}\left(\varnothing\right) = \sum_{\varnothing \neq T\subseteq U}\widehat{f}\left(T\right)z^{T}$. We need to show that $\left\Vert h\right\Vert _{\infty} > \epsilon$ to be done. This follows from the fact that $\left\Vert h\right\Vert _{\infty} \geq \left|\mathop{\bf E}\left[h\chi_{U}\right]\right| = \left|\widehat{h}\left(U\right)\right| = \left|\widehat{f}\left(U\right)\right| > \epsilon$.

Great, thanks — I’ll use this proof in the final version of the book (with acknowledgment)!

There may be a small typo at the start of the proof of the second statement of Prop. 6.12 (middle of p. 135 of book): change “… where $0 < |U| \leq k$” to “… where $0 < |J| \leq k$”.

Great catch, thanks!