§6.1: Notions of pseudorandomness

The most obvious spectral property of a truly random function $\boldsymbol{f} : \{-1,1\}^n \to \{-1,1\}$ is that all of its Fourier coefficients are very small (as we saw in Exercise 5.8).

Let’s switch notation to $\boldsymbol{f} : \{-1,1\}^n \to \{0,1\}$; in this case $\boldsymbol{f}(\emptyset)$ will not be very small but rather very close to $1/2$. Generalizing:

Proposition 1 Let $n > 1$ and let $\boldsymbol{f} : \{-1,1\}^n \to \{0,1\}$ be a $p$-biased random function; i.e., each $\boldsymbol{f}(x)$ is $1$ with probability $p$ and $0$ with probability $1-p$, independently for all $x \in \{-1,1\}^n$. Then except with probability at most $2^{-n}$, all of the following hold: \[ |\widehat{\boldsymbol{f}}(\emptyset) - p| \leq 2\sqrt{n} 2^{-n/2}, \qquad \forall S \neq \emptyset \quad |\widehat{\boldsymbol{f}}(S)| \leq 2\sqrt{n} 2^{-n/2}. \]

Proof: We have $\widehat{\boldsymbol{f}}(S) = \sum_{x} \frac{1}{2^n} x^S \boldsymbol{f}(x)$, where the random variables $\boldsymbol{f}(x)$ are independent. If $S = \emptyset$ then the coefficients $\frac{1}{2^n} x^S$ sum to $1$ and the mean of $\widehat{\boldsymbol{f}}(S)$ is $p$; otherwise the coefficients sum to $0$ and the mean of $\widehat{\boldsymbol{f}}(S)$ is $0$. Either way we may apply the Hoeffding bound to conclude that \[ \mathop{\bf Pr}[|\widehat{\boldsymbol{f}}(S) - \mathop{\bf E}[\widehat{\boldsymbol{f}}(S)]| \geq t] \leq 2\exp(-t^2 \cdot 2^{n-1}) \] for any $t > 0$. Selecting $t = 2\sqrt{n} 2^{-n/2}$, the above bound is $2\exp(-2n) \leq 4^{-n}$. The result follows by taking a union bound over all $S \subseteq [n]$. $\Box$

This proposition motivates the following basic notion of “pseudorandomness”:

Definition 2 A function $f : \{-1,1\}^n \to {\mathbb R}$ is $\epsilon$-regular (sometimes called $\epsilon$-uniform) if $|\widehat{f}(S)| \leq \epsilon$ for all $S \neq \emptyset$.

Remark 3 By Exercise 3.8, every function $f$ is $\epsilon$-regular for $\epsilon = \|f\|_1$. We are often concerned with $f : \{-1,1\}^n \to [-1,1]$, in which case we focus on $\epsilon \leq 1$.

Examples 4 Proposition 1 states that a random $p$-biased function is $(2\sqrt{n} 2^{-n/2})$-regular with very high probability. A function is $0$-regular if and only if it is constant (even though you might not think of a constant function as very “random”). If $A \subseteq {\mathbb F}_2^n$ is an affine subspace of codimension $k$ then $1_A$ is $2^{-k}$-regular (Proposition 3.12). For $n$ even the inner product mod $2$ function and the complete quadratic function, $\mathrm{IP}_{n}, \mathrm{CQ}_n : {\mathbb F}_2^{n} \to \{0,1\}$, are $2^{-n/2-1}$-regular (Exercise 1.1). On the other hand, the parity functions $\chi_S : \{-1,1\}^n \to \{-1,1\}$ are not $\epsilon$-regular for any $\epsilon < 1$ (except for $S = \emptyset$). By Exercise 5.22, $\mathrm{Maj}_n$ is $\tfrac{1}{\sqrt{n}}$-regular.

The notion of regularity can be particularly useful for probability density functions; in this case it is traditional to use an alternate name:

Definition 5 If $\varphi : {\mathbb F}_2^n \to {\mathbb R}^{\geq 0}$ is a probability density which is $\epsilon$-regular, we call it an $\epsilon$-biased density. Equivalently, $\varphi$ is $\epsilon$-biased if and only if $|\mathop{\bf E}_{{\boldsymbol{x}} \sim \varphi}[\chi_\gamma({\boldsymbol{x}})]| \leq \epsilon$ for all $\gamma \in {\widehat{{\mathbb F}_2^n}} \setminus \{0\}$; thus one can think of “$\epsilon$-biased” as meaning “at most $\epsilon$-biased on subspaces”. Note that the marginal of such a distribution on any set of coordinates $J \subseteq [n]$ is also $\epsilon$-biased. If $\varphi$ is $\varphi_A = 1_A/\mathop{\bf E}[1_A]$ for some $A \subseteq {\mathbb F}_2^n$ we call $A$ an $\epsilon$-biased set.

Examples 6 For $\varphi$ a probability density we have $\|\varphi\|_1 = \mathop{\bf E}[\varphi] = 1$, so every density is $1$-biased. The density corresponding to the uniform distribution on ${\mathbb F}_2^n$, namely $\varphi \equiv 1$, is the only $0$-biased density. Densities corresponding to the uniform distribution on smaller affine subspaces are “maximally biased”: if $A \subseteq {\mathbb F}_2^n$ is an affine subspace of codimension less than $n$ then $\varphi_A$ is not $\epsilon$-biased for any $\epsilon < 1$ (Proposition 3.12 again). If $E = \{(0, \dots, 0), (1, \dots, 1)\}$ then $E$ is a $1/2$-biased set (an easy computation, see also Exercise 1.1(h)).

There is a “combinatorial” property of functions $f$ which is roughly equivalent to $\epsilon$-regularity. Recall from Exercise 1.28 that $\hat{\lVert} f \hat{\rVert}_4^4$ has an equivalent non-Fourier formula: $\mathop{\bf E}_{{\boldsymbol{x}},\boldsymbol{y},\boldsymbol{z}}[f({\boldsymbol{x}})f(\boldsymbol{y})f(\boldsymbol{z})f({\boldsymbol{x}}+\boldsymbol{y}+\boldsymbol{z})]$. We show (roughly speaking) that $f$ is regular if and only if this expectation is not much bigger than $\mathop{\bf E}[f]^4 = \mathop{\bf E}_{{\boldsymbol{x}},\boldsymbol{y},\boldsymbol{z},\boldsymbol{w}}[f({\boldsymbol{x}})f(\boldsymbol{y})f(\boldsymbol{z})f(\boldsymbol{w})]$:

Proposition 7 Let $f : {\mathbb F}_2^n \to {\mathbb R}$. Then

  1. If $f$ is $\epsilon$-regular then $\hat{\lVert} f \hat{\rVert}_4^4 – \mathop{\bf E}[f]^4 \leq \epsilon^2 \cdot \mathop{\bf Var}[f]$.
  2. If $f$ is not $\epsilon$-regular then $\hat{\lVert} f \hat{\rVert}_4^4 – \mathop{\bf E}[f]^4 \geq \epsilon^4$.

Proof: If $f$ is $\epsilon$-regular then \[ \hat{\lVert} f \hat{\rVert}_4^4 - \mathop{\bf E}[f]^4 = \sum_{S \neq \emptyset} \widehat{f}(S)^4 \leq \max_{S \neq \emptyset} \{\widehat{f}(S)^2\} \cdot \sum_{S \neq \emptyset} \widehat{f}(S)^2 \leq \epsilon^2 \cdot \mathop{\bf Var}[f]. \] On the other hand, if $f$ is not $\epsilon$-regular then $|\widehat{f}(T)| \geq \epsilon$ for some $T \neq \emptyset$; hence $\hat{\lVert} f \hat{\rVert}_4^4$ is at least $\widehat{f}(\emptyset)^4 + \widehat{f}(T)^4 \geq \mathop{\bf E}[f]^4 + \epsilon^4$. $\Box$

The condition of $\epsilon$-regularity — that all non empty-set coefficients are small — is quite strong. As we saw when investigating the $\frac{2}{\pi}$ Theorem in Chapter 5.4 it’s also interesting to consider $f$ that merely have $|\widehat{f}(i)| \leq \epsilon$ for all $i \in [n]$; for monotone $f$ this is the same as saying $\mathbf{Inf}_i[f] \leq \epsilon$ for $i$. This suggests two weaker possible notions of pseudorandomness: having all low-degree Fourier coefficients small, and having all influences small. We will consider both possibilities, starting with the second.

Now a randomly chosen $\boldsymbol{f} : \{-1,1\}^n \to \{-1,1\}$ will not have all of its influences small; in fact as we saw in Exercise 2.10, each $\mathbf{Inf}_i[\boldsymbol{f}]$ is $1/2$ in expectation. However for any $\delta > 0$ it will have all of its $(1-\delta)$-stable influences exponentially small (recall Definition 2.51). In the exercises you will show:

Fact 8 Fix $\delta \in [0,1]$ and let $\boldsymbol{f} : \{-1,1\}^n \to \{-1,1\}$ be a randomly chosen function. Then for any $i \in [n]$, \[ \mathop{\bf E}[\mathbf{Inf}_i^{(1-\delta)}[f]] = \frac{(1-\delta/2)^n}{2-\delta}. \]

This motivates a very important notion of pseudorandomness in the analysis of boolean functions: having all stable-influences small. Recalling the discussion surrounding Proposition 2.53, we can also describe this as having no “notable” coordinates.

Definition 9 We say that $f : \{-1,1\}^n \to {\mathbb R}$ has $(\epsilon,\delta)$-small stable influences, or no $(\epsilon,\delta)$-notable coordinates, if $\mathbf{Inf}_{i}^{(1-\delta)}[f] \leq \epsilon$ for each $i \in [n]$. This condition gets stronger as $\epsilon$ and $\delta$ decrease: when $\delta = 0$, meaning $\mathbf{Inf}_{i}[f] \leq \epsilon$ for all $i$, we simply say $f$ has $\epsilon$-small influences.

Examples 10 Besides random functions, important examples of boolean-valued functions with no notable coordinates are constants, majority, and large parities. Constant functions are the ultimate in this regard: they have $(0,0)$-small stable influences. (Indeed, constant functions are the only ones with $0$-small influences.) The $\mathrm{Maj}_n$ function has $\frac{1}{\sqrt{n}}$-small influences. To see the distinction between influences and stable influences, consider the parity functions $\chi_S$. Any parity function $\chi_S$ (with $S \neq \emptyset$) has at least one coordinate with maximal influence, $1$. But if $|S|$ is “large” then all of its stable influences will be small: we have $\mathbf{Inf}^{(1-\delta)}_i[\chi_S]$ equal to $(1-\delta)^{|S|-1}$ when $i \in S$ and equal to $0$ otherwise. I.e., $\chi_S$ has $((1-\delta)^{|S|-1}, \delta)$-small stable influences. In particular, $\chi_S$ has $(\epsilon, \delta)$-small stable influences whenever $|S| \geq \frac{\ln(e/\epsilon)}{\delta}$.

The prototypical example of a function $f : \{-1,1\}^n \to \{-1,1\}$ which does not have small stable influences is an unbiased $k$-junta. Such a function has $\mathop{\bf Var}[f] = 1$ and hence from Fact 2.52 the sum of its $(1-\delta)$-stable influences is at least $(1-\delta)^{k-1}$. Thus $\mathbf{Inf}^{(1-\delta)}_i[f] \geq (1-\delta)^{k-1}/k$ for at least one $i$; hence $f$ does not have $((1-\delta)^k/k, \delta)$-small stable influences for any $\delta \in (0,1)$. A somewhat different example is the function $f(x) = x_0 \mathrm{Maj}_n(x_1, \dots, x_n)$, which has $\mathbf{Inf}_0^{(1-\delta)}[f] \geq 1 – \sqrt{\delta}$; see the exercises.

Let’s return to considering the interesting condition that $|\widehat{f}(i)| \leq \epsilon$ for all $i \in [n]$. We will call this condition $(\epsilon,1)$-regularity. It is equivalent to saying that $f^{\leq 1}$ is $\epsilon$-regular, or that $f$ has at most $\epsilon$ “correlation” with every dictator: $|\langle f, \pm \chi_i \rangle| \leq \epsilon$ for all $i$. Our third notion of pseudorandomness extends this condition to higher degrees:

Definition 11 A function $f : \{-1,1\}^n \to {\mathbb R}$ is $(\epsilon,k)$-regular if $|\widehat{f}(S)| \leq \epsilon$ for all $0 < |S| \leq k$; equivalently, if $f^{\leq k}$ is $\epsilon$-regular. For $k = n$ (or $k = \infty$), this condition coincides with $\epsilon$-regularity. When $\varphi : {\mathbb F}_2^n \to {\mathbb R}^{\geq 0}$ is an $(\epsilon,k)$-regular probability density, it is more usual to call $\varphi$ (and the associated probability distribution) $(\epsilon,k)$-wise independent.

Below we give two alternate characterizations of $(\epsilon, k)$-regularity; however they are fairly “rough” in the sense that they have exponential losses on $k$. This can be acceptable if $k$ is thought of as a constant. The first characterization is that $f$ is $(\epsilon, k)$-regular if and only if fixing $k$ input coordinates changes $f$’s mean by at most $O(\epsilon)$. The second characterization is the condition that $f$ has $O(\epsilon)$ covariance with every $k$-junta.

Proposition 12 Let $f : \{-1,1\}^n \to {\mathbb R}$ and let $\epsilon \geq 0$, $k \in {\mathbb N}$.

  1. If $f$ is $(\epsilon,k)$-regular then any restriction of at most $k$ coordinates changes $f$’s mean by at most $2^{k} \epsilon$.
  2. If $f$ is not $(\epsilon,k)$-regular then some restriction to at most $k$ coordinates changes $f$’s mean by more than $\epsilon$.

Proposition 13 Let $f : \{-1,1\}^n \to {\mathbb R}$ and let $\epsilon \geq 0$, $k \in {\mathbb N}$.

  1. If $f$ is $(\epsilon,k)$-regular then $\mathop{\bf Cov}[f,h] \leq \hat{\lVert} h \hat{\rVert}_1 \epsilon$ for any $h : \{-1,1\}^n \to {\mathbb R}$ with $\deg(h) \leq k$. In particular, $\mathop{\bf Cov}[f,h] \leq 2^{k/2} \epsilon$ for any $k$-junta $h : \{-1,1\}^n \to \{-1,1\}$.
  2. If $f$ is not $(\epsilon,k)$-regular then $\mathop{\bf Cov}[f,h] > \epsilon$ for some $k$-junta $h : \{-1,1\}^n \to \{-1,1\}$.

We will prove Proposition 12, leaving the proof of Proposition 13 to the exercises.

Proof: For the first statement, suppose $f$ is $(\epsilon,k)$-regular and let $J \subseteq [n]$, $z \in \{-1,1\}^{J}$, where $|J| \leq k$. Then the statement holds because \[ \mathop{\bf E}[f_{\overline{J} \mid z}] = \widehat{f}(\emptyset) + \sum_{\emptyset \neq T \subseteq J} \widehat{f}(T)\,z^T \] (Exercise 1.16) and each of the at most $2^{k}$ terms $|\widehat{f}(T)\,z^T| = |\widehat{f}(T)|$ is at most $\epsilon$.

For the second statement, by subtracting a constant from $f$ we may assume without loss of generality that its mean is $0$. Suppose now that $|\widehat{f}(U)| > \epsilon$ where $0 < |U| \leq k$. Let $\boldsymbol{z} \sim \{-1,1\}^U$ be a random restriction to the coordinates $U$. By Corollary 3.22 we have
\mathop{\bf E}[\widehat{f_{\mid \boldsymbol{z}}}(\emptyset)^2] = \sum_{T \subseteq U} \widehat{f}(T)^2 \geq \widehat{f}(U)^2 > \epsilon^2.
\] Thus there must exist a particular $z$ such that $f_{\mid z}(\emptyset)^2 > \epsilon^2$; i.e., $|f_{\mid z}(\emptyset)| > \epsilon$. This restriction changes $f$’s mean by more than $\epsilon$.$\Box$

Taking $\epsilon = 0$ in the above two propositions we obtain:

Corollary 14 For $f : \{-1,1\}^n \to {\mathbb R}$, the following are equivalent:

  1. $f$ is $(0, k)$-regular.
  2. Every restriction of at most $k$ coordinates leaves $f$’s mean unchanged.
  3. $\mathbf{Cov}[f,h] = 0$ for every $k$-junta $h : \{-1,1\}^n \to \{-1,1\}$.

If $f$ is a probability density, condition (3) is equivalent to $\mathop{\bf E}_{{\boldsymbol{x}} \sim f}[h({\boldsymbol{x}})] = \mathop{\bf E}[h]$ for every $k$-junta $h : \{-1,1\}^n \to \{-1,1\}$.

For such functions, additional terminology is used:

Definition 15 If $f : \{-1,1\}^n \to \{-1,1\}$ is $(0,k)$-regular, it is also called $k$th-order correlation immune. If $f$ is in addition unbiased then it is called $k$-resilient. Finally, if $\varphi : {\mathbb F}_2^n \to {\mathbb R}^{\geq 0}$ is a $(0,k)$-regular probability density then we call $\varphi$ (and the associated probability distribution) $k$-wise independent.

Examples 16 Any parity function $\chi_S : \{-1,1\}^n \to \{-1,1\}$ with $|S| = k+1$ is $k$-resilient. More generally, so is $\chi_S \cdot g$ for any $g : \{-1,1\}^n \to \{-1,1\}$ which does not depend on the coordinates in $S$. For a good example of a correlation immune function which is not resilient, consider $h : \{-1,1\}^{3m} \to \{-1,1\}$ defined by $h = \chi_{\{1, \dots, 2m\}} \wedge \chi_{\{m+1, \dots, 3m\}}$. This $h$ is not unbiased, being $\mathsf{True}$ on only a $1/4$-fraction of inputs. However, its bias does not change unless at least $2m$ input bits are fixed; hence $h$ is $(2m-1)$th-order correlation immune.

We conclude this section with a diagram indicating how our various notions of pseudorandomness compare:

Comparing notions of pseudorandomness: arrows go from stronger notions to (strictly) weaker ones

For precise quantitative statements, counterexamples showing that no other relationships are possible, and explanations for why these notions essentially coincide for monotone functions, see the exercises.

17 comments to §6.1: Notions of pseudorandomness

  • Chin Ho Lee

    In the proof of proposition 1,

    the R.H.S. of the first inequality should be 2\exp(-t^2 2^{n+1}) (without /), and the choice of t should be \sqrt{n}2^{-n/2} (without the constant 2).

  • Chin Ho Lee

    Yes, it is correct now. I think the bound can be improved by a bit since for each x in {-1, 1}^n, the random variable x^S f(x) is either from {0, -1}, or {0, 1} and so the R.H.S. can be 2\exp(-t^2 2^n)?

  • AA

    A few small things:

    - In Examples 6, “uniform distribution on $F_2$” should be on “$F_2^n$” (exponent ^n is missing).

    - In the proof of Proposition 12, the restricted function in the displayed formula $f_{J | z}$ I guess should be $f_{\overline{J} | z}$. Why not use $J$ instead of $\overline{J}$ everywhere?

    - There is a lost curly bracket right after the beginning of that proof.

    - In the statement of Corollary 14, what is “condition (3)”?

  • Noam Lifshitz

    in the proof of the second statement of prop. 12 . I am probably missing somethitg but I think
    $\hat{g}\left(j\right)=\sum_{j\in S\subseteq U}\pm\hat{f}\left(s\right)$ rather than $\hat{f}\left(U\right)$

    • Hmm, you’re right, this is a bad proof. The simplest correct (?) proof I can think of right now is the following:

      Assume WOLOG that the mean of f is 0. Let U be a Fourier coefficient of cardinality between 0 and k and magnitude at least $\epsilon$. Consider restricting the coordinates in U at random, forming the function g. Now $\mathbf{E}[\widehat{g}(\emptyset)^2] = \sum_{S \subseteq U} \widehat{f}(S)^2$ (this is Corollary 3.22), which is at least $\epsilon^2$ because of the $S = U$ term in the sum. Thus there exists a restriction z such that $\widehat{f|_z}(\emptyset)^2 \geq \epsilon^2$, meaning $|\widehat{f|_z}(\emptyset)| \geq |\epsilon|$, so this restriction does the trick.

      Do you see a simpler proof? I think you can also get the required restriction by a deterministic process (set the bits one at a time to make sure the coefficient exceeding $\epsilon$ always stays at least $\epsilon$ in magnitude). But maybe that’s too annoying to explain.

      Thanks for finding this mistake!

  • Noam Lifshitz

    I don’t know if it’s simpler but here is another proof:
    if we restrict U ‘s coordinates to z to get the function g then :
    U,f as in your proof if we restrict U ‘s coordinates to z to get the function g then : $h\left(z\right):=\hat{g}\left(\varnothing\right)=\underset{T\subseteq U}{\sum}\hat{f}\left(T\right)z^{T} $ as a function of $z $ it must be $\left|h\right|_{\infty} $ regular but $\hat{h}\left(U\right)>\epsilon $

  • Noam Lifshitz

    Less messy proof:
    Assume $\left|\widehat{f}\left(U\right)\right|>\epsilon $ restrict U ‘s coordinates to z to get a function g then : $h\left(z\right):=\widehat{g}\left(\varnothing\right)-\widehat{f}\left(\varnothing\right)=\underset{T\subseteq U}{\sum}\widehat{f}\left(T\right)z^{T} $ We need to show that \left|h\right|_{\infty}>\epsilon to be done. This follows from the fact that \left|h\right|_{\infty}>\left|E\left(h\chi_{U}\right)\right|=\left|\widehat{h}\left(U\right)\right|=\left|\widehat{f}\left(U\right)\right|>\epsilon $

  • Noam Lifshitz

    and now hopefully with no LaTeX errors:

    Assume $\left|\widehat{f}\left(U\right)\right|>\epsilon $ restrict U ‘s coordinates to z to get a function g then : $h\left(z\right):=\widehat{g}\left(\varnothing\right)-\widehat{f}\left(\varnothing\right)=\underset{T\subseteq U}{\sum}\widehat{f}\left(T\right)z^{T} $ We need to show that \left|h\right|_{\infty}>\epsilon to be done. This follows from the fact that $\left|h\right|_{\infty}>\left|E\left(h\chi_{U}\right)\right|=\left|\widehat{h}\left(U\right)\right|=\left|\widehat{f}\left(U\right)\right|>\epsilon $

  • Noam Lifshitz

    Assume $\left|\widehat{f}\left(U\right)\right|>\epsilon $ restrict $U $’s coordinates to $z $ to get a function $g $ then : $h\left(z\right):=\widehat{g}\left(\varnothing\right)-\widehat{f}\left(\varnothing\right)=\underset{T\subseteq U}{\sum}\widehat{f}\left(T\right)z^{T} $ We need to show that $\left|h\right|_{\infty}>\epsilon $ to be done. This follows from the fact that $\left|h\right|_{\infty}\geq\left|E\left(h\chi_{U}\right)\right|=\left|\widehat{h}\left(U\right)\right|=\left|\widehat{f}\left(U\right)\right|>\epsilon $

  • Matt Franklin

    There may be a small typo at the start of the proof of the second statement on Prop. 6.12
    (middle of p. 135 of book): change “… where $0 < |U| \leq k$" to "… where $0 < |J| \leq k".

  • Ohad Klein

    In example 6, should “of codimension less than n” be “of positive codimension”?

Leave a Reply




You can use most standard LaTeX, as well as these HTML tags

<a href="" title=""> <abbr title=""> <acronym title=""> <b> <blockquote cite=""> <cite> <code> <del datetime=""> <em> <i> <q cite=""> <strike> <strong>