For $x \in \{-1,1\}^n$, the number $\chi_S(x) = \prod_{i \in S} x_i$ is in $\{-1,1\}$. Thus $\chi_S : \{-1,1\}^n \to \{-1,1\}$ is a boolean function; it computes the logical *parity*, or *exclusive-or* (XOR), of the bits $(x_i)_{i \in S}$. The parity functions play a special role in the analysis of boolean functions: the Fourier expansion \begin{equation} \label{eqn:parity-basis-expanion} f = \sum_{S \subseteq [n]} \widehat{f}(S)\,\chi_S \end{equation} shows that any $f$ can be represented as a linear combination of parity functions (over the reals).

It’s useful to explore this idea further from the perspective of linear algebra. The set of all functions $f : \{-1,1\}^n \to {\mathbb R}$ forms a vector space $V$, since we can add two functions (pointwise) and we can multiply a function by a real scalar. The vector space $V$ is $2^n$-dimensional: if we like we can think of the functions in this vector space as vectors in ${\mathbb R}^{2^n}$, where we stack the $2^n$ values $f(x)$ into a tall column vector (in some fixed order). Here we illustrate the Fourier expansion of the ${\textstyle \min_2}$ function (equation (1) in the previous section) from this perspective: \begin{equation} \label{eqn:min2-basis-epansion} {\textstyle \min_2} = \begin{bmatrix} +1 \\ -1 \\ -1 \\ -1 \end{bmatrix} \ =\ (-1/2) \begin{bmatrix} +1 \\ +1 \\ +1 \\ +1 \end{bmatrix} \ +(1/2) \begin{bmatrix} +1 \\ -1 \\ +1 \\ -1 \end{bmatrix} \ +(1/2) \begin{bmatrix} +1 \\ +1 \\ -1 \\ -1 \end{bmatrix} \ +(1/2) \begin{bmatrix} +1 \\ -1 \\ -1 \\ +1 \end{bmatrix}. \end{equation}

More generally, the Fourier expansion \eqref{eqn:parity-basis-expanion} shows that every function $f : \{-1,1\}^n \to {\mathbb R}$ in $V$ is a linear combination of the parity functions; i.e., the parity functions are a *spanning set* for $V$. Since the number of parity functions is $2^n = \dim V$, we can deduce that they are in fact a *linearly independent basis* for $V$. In particular this justifies the uniqueness of the Fourier expansion stated in Theorem 1.

We can also introduce an inner product on pairs of function $f, g : \{-1,1\}^n \to {\mathbb R}$ in $V$. The usual inner product on ${\mathbb R}^{2^n}$ would correspond to $\sum_{x \in \{-1,1\}^n} f(x)g(x)$, but it’s more convenient to scale this by a factor of $2^{-n}$, making it an average rather than a sum. In this way, a boolean function $f : \{-1,1\}^n \to \{-1,1\}$ will have $\langle f, f \rangle = 1$, i.e., be a “unit vector”.

Definition 3We define an inner product $\langle \cdot , \cdot \rangle$ on pairs of function $f, g : \{-1,1\}^n \to {\mathbb R}$ by \begin{equation} \label{eqn:inner-product} \langle f, g \rangle = 2^{-n} \sum_{x \in \{-1,1\}^n} f(x) g(x) = \mathop{\bf E}_{\boldsymbol{x} \sim \{-1,1\}^n} \left[f(\boldsymbol{x}) g(\boldsymbol{x})\right]. \end{equation} We also use the notation $\|f\|_2 = \sqrt{\langle f, f \rangle}$, and more generally, \[ \|f\|_p = \mathop{\bf E}[|f(\boldsymbol{x})|^p]^{1/p}. \]

Here we have introduced probabilistic notation which will be used heavily throughout the book:

Notation 4We write $\boldsymbol{x} \sim \{-1,1\}^n$ to denote that $\boldsymbol{x}$ is a uniformly chosen random string from $\{-1,1\}^n$. Equivalently, the $n$ coordinates $\boldsymbol{x}_i$ are independently chosen to be $+1$ with probability $1/2$ and $-1$ with probability $1/2$. We always write random variables inboldface. Probabilities $\mathop{\bf Pr}$ and expectations $\mathop{\bf E}$ will always be with respect to a uniformly random $\boldsymbol{x} \sim \{-1,1\}^n$ unless otherwise specified. Thus we might write the expectation in \eqref{eqn:inner-product} as $\mathop{\bf E}_{\boldsymbol{x}}[f(\boldsymbol{x}) g(\boldsymbol{x})]$ or $\mathop{\bf E}[f(\boldsymbol{x}) g(\boldsymbol{x})]$ or even $\mathop{\bf E}[f g]$.

Returning to the basis of parity functions for $V$, the crucial fact underlying all analysis of boolean functions is that this is an *orthonormal basis*.

Theorem 5The $2^n$ parity functions $\chi_S : \{-1,1\}^n \to \{-1,1\}$ form an orthonormal basis for the vector space $V$ of functions $\{-1,1\}^n \to {\mathbb R}$. That is, \begin{equation*} \langle \chi_S, \chi_T \rangle = \begin{cases} 1 & \text{if $S = T$,}\\ 0 & \text{if $S \neq T$.} \end{cases} \end{equation*}

Recalling the definition $\langle \chi_S, \chi_T \rangle = \mathop{\bf E}[\chi_S(\boldsymbol{x}) \chi_T(\boldsymbol{x})]$, Theorem 5 follows immediately from two facts:

Fact 6For $x \in \{-1,1\}^n$ it holds that $\chi_S(x) \chi_T(x) = \chi_{S \triangle T}(x)$, where $S \triangle T$ denotes symmetric difference.

*Proof:* $\displaystyle \chi_S(x) \chi_T(x) = \prod_{i \in S} x_i \prod_{i \in T} x_i = \prod_{i \in S \triangle T} x_i \prod_{i \in S \cap T} x_i^2 = \prod_{i \in S \triangle T} x_i = \chi_{S \triangle T}(x).$ $\Box$

Fact 7\[ \mathop{\bf E}[\chi_S(\boldsymbol{x})] = \mathop{\bf E}\Bigl[\prod_{i \in S} \boldsymbol{x}_i\Bigr] = \begin{cases} 1 & \text{if $S = \emptyset$,}\\ 0 & \text{if $S \neq \emptyset$.} \end{cases} \]

*Proof:* If $S = \emptyset$ then $\mathop{\bf E}[\chi_S(\boldsymbol{x})] = \mathop{\bf E}[1] = 1$. Otherwise, \[ \mathop{\bf E}\Bigl[\prod_{i \in S} \boldsymbol{x}_i\Bigr] = \prod_{i \in S} \mathop{\bf E}[\boldsymbol{x}_i] \] because the random bits $\boldsymbol{x}_1, \dots, \boldsymbol{x}_n$ are independent. But each of the factors $\mathop{\bf E}[\boldsymbol{x}_i]$ in the above (nonempty) product is $(1/2) (+1) + (1/2) (-1) = 0$. $\Box$

Maybe it’s just me, but I think it will be helpful to explain what a logical parity function is.

Definition 3: LHS of the inner product -> is it really g(x)?

No, it’s $\langle f, g \rangle = 2^{-n} \sum_{x \in \{-1,1\}^n} f(x) g(x) = \mathop{\bf E}_{\boldsymbol{x} \sim \{-1,1\}^n} \left[f(\boldsymbol{x}) g(\boldsymbol{x})\right]$. But yeah, on Chrome for me the first part of the equation is getting cut off and it looks like it just says $g(x) = \mathop{\bf E}_{\boldsymbol{x} \sim \{-1,1\}^n} \left[f(\boldsymbol{x}) g(\boldsymbol{x})\right]$. Weird. It looks okay on Firefox… Not sure what to say. This website is kind of held together with tape and strings Wait for the free pdf is my best suggestion.

Will do, thanks. Just wondered if it was a typo.

I confuse the notation S in Fact 1.7.

I wonder that the symbol S is a subset of [n] or a random variable.

If S is a subset of [n], E[\chi_S(x)]=Pr{S}=2^{-n} by definition 1.3.

What is the key point of my mistakes?

Thanks!

Hi Ming. Here S stands for a fixed (non-random) subset of [n].