For $x \in \{-1,1\}^n$, the number $\chi_S(x) = \prod_{i \in S} x_i$ is in $\{-1,1\}$. Thus $\chi_S : \{-1,1\}^n \to \{-1,1\}$ is a boolean function; it computes the logical *parity*, or *exclusive-or* (XOR), of the bits $(x_i)_{i \in S}$. The parity functions play a special role in the analysis of boolean functions: the Fourier expansion \begin{equation} \label{eqn:parity-basis-expanion} f = \sum_{S \subseteq [n]} \widehat{f}(S)\,\chi_S \end{equation} shows that any $f$ can be represented as a linear combination of parity functions (over the reals).

It’s useful to explore this idea further from the perspective of linear algebra. The set of all functions $f : \{-1,1\}^n \to {\mathbb R}$ forms a vector space $V$, since we can add two functions (pointwise) and we can multiply a function by a real scalar. The vector space $V$ is $2^n$-dimensional: if we like we can think of the functions in this vector space as vectors in ${\mathbb R}^{2^n}$, where we stack the $2^n$ values $f(x)$ into a tall column vector (in some fixed order). Here we illustrate the Fourier expansion of the ${\textstyle \min_2}$ function (equation (1) in the previous section) from this perspective: \begin{equation} \label{eqn:min2-basis-epansion} {\textstyle \min_2} = \begin{bmatrix} +1 \\ -1 \\ -1 \\ -1 \end{bmatrix} \ =\ (-1/2) \begin{bmatrix} +1 \\ +1 \\ +1 \\ +1 \end{bmatrix} \ +(1/2) \begin{bmatrix} +1 \\ -1 \\ +1 \\ -1 \end{bmatrix} \ +(1/2) \begin{bmatrix} +1 \\ +1 \\ -1 \\ -1 \end{bmatrix} \ +(1/2) \begin{bmatrix} +1 \\ -1 \\ -1 \\ +1 \end{bmatrix}. \end{equation}

More generally, the Fourier expansion \eqref{eqn:parity-basis-expanion} shows that every function $f : \{-1,1\}^n \to {\mathbb R}$ in $V$ is a linear combination of the parity functions; i.e., the parity functions are a *spanning set* for $V$. Since the number of parity functions is $2^n = \dim V$, we can deduce that they are in fact a *linearly independent basis* for $V$. In particular this justifies the uniqueness of the Fourier expansion stated in Theorem 1.

We can also introduce an inner product on pairs of function $f, g : \{-1,1\}^n \to {\mathbb R}$ in $V$. The usual inner product on ${\mathbb R}^{2^n}$ would correspond to $\sum_{x \in \{-1,1\}^n} f(x)g(x)$, but it’s more convenient to scale this by a factor of $2^{-n}$, making it an average rather than a sum. In this way, a boolean function $f : \{-1,1\}^n \to \{-1,1\}$ will have $\langle f, f \rangle = 1$, i.e., be a “unit vector”.

Definition 3We define an inner product $\langle \cdot , \cdot \rangle$ on pairs of function $f, g : \{-1,1\}^n \to {\mathbb R}$ by \begin{equation} \label{eqn:inner-product} \langle f, g \rangle = 2^{-n} \sum_{x \in \{-1,1\}^n} f(x) g(x) = \mathop{\bf E}_{\boldsymbol{x} \sim \{-1,1\}^n} \left[f(\boldsymbol{x}) g(\boldsymbol{x})\right]. \end{equation} We also use the notation $\|f\|_2 = \sqrt{\langle f, f \rangle}$, and more generally, \[ \|f\|_p = \mathop{\bf E}[|f(\boldsymbol{x})|^p]^{1/p}. \]

Here we have introduced probabilistic notation which will be used heavily throughout the book:

Notation 4We write $\boldsymbol{x} \sim \{-1,1\}^n$ to denote that $\boldsymbol{x}$ is a uniformly chosen random string from $\{-1,1\}^n$. Equivalently, the $n$ coordinates $\boldsymbol{x}_i$ are independently chosen to be $+1$ with probability $1/2$ and $-1$ with probability $1/2$. We always write random variables inboldface. Probabilities $\mathop{\bf Pr}$ and expectations $\mathop{\bf E}$ will always be with respect to a uniformly random $\boldsymbol{x} \sim \{-1,1\}^n$ unless otherwise specified. Thus we might write the expectation in \eqref{eqn:inner-product} as $\mathop{\bf E}_{\boldsymbol{x}}[f(\boldsymbol{x}) g(\boldsymbol{x})]$ or $\mathop{\bf E}[f(\boldsymbol{x}) g(\boldsymbol{x})]$ or even $\mathop{\bf E}[f g]$.

Returning to the basis of parity functions for $V$, the crucial fact underlying all analysis of boolean functions is that this is an *orthonormal basis*.

Theorem 5The $2^n$ parity functions $\chi_S : \{-1,1\}^n \to \{-1,1\}$ form an orthonormal basis for the vector space $V$ of functions $\{-1,1\}^n \to {\mathbb R}$. That is, \begin{equation*} \langle \chi_S, \chi_T \rangle = \begin{cases} 1 & \text{if $S = T$,}\\ 0 & \text{if $S \neq T$.} \end{cases} \end{equation*}

Recalling the definition $\langle \chi_S, \chi_T \rangle = \mathop{\bf E}[\chi_S(\boldsymbol{x}) \chi_T(\boldsymbol{x})]$, Theorem 5 follows immediately from two facts:

Fact 6For $x \in \{-1,1\}^n$ it holds that $\chi_S(x) \chi_T(x) = \chi_{S \triangle T}(x)$, where $S \triangle T$ denotes symmetric difference.

*Proof:* $\displaystyle \chi_S(x) \chi_T(x) = \prod_{i \in S} x_i \prod_{i \in T} x_i = \prod_{i \in S \triangle T} x_i \prod_{i \in S \cap T} x_i^2 = \prod_{i \in S \triangle T} x_i = \chi_{S \triangle T}(x).$ $\Box$

Fact 7\[ \mathop{\bf E}[\chi_S(\boldsymbol{x})] = \mathop{\bf E}\Bigl[\prod_{i \in S} \boldsymbol{x}_i\Bigr] = \begin{cases} 1 & \text{if $S = \emptyset$,}\\ 0 & \text{if $S \neq \emptyset$.} \end{cases} \]

*Proof:* If $S = \emptyset$ then $\mathop{\bf E}[\chi_S(\boldsymbol{x})] = \mathop{\bf E}[1] = 1$. Otherwise, \[ \mathop{\bf E}\Bigl[\prod_{i \in S} \boldsymbol{x}_i\Bigr] = \prod_{i \in S} \mathop{\bf E}[\boldsymbol{x}_i] \] because the random bits $\boldsymbol{x}_1, \dots, \boldsymbol{x}_n$ are independent. But each of the factors $\mathop{\bf E}[\boldsymbol{x}_i]$ in the above (nonempty) product is $(1/2) (+1) + (1/2) (-1) = 0$. $\Box$

Maybe it’s just me, but I think it will be helpful to explain what a logical parity function is.