§1.3: The orthonormal basis of parity functions

For $x \in \{-1,1\}^n$, the number $\chi_S(x) = \prod_{i \in S} x_i$ is in $\{-1,1\}$. Thus $\chi_S : \{-1,1\}^n \to \{-1,1\}$ is a boolean function; it computes the logical parity, or exclusive-or (XOR), of the bits $(x_i)_{i \in S}$. The parity functions play a special role in the analysis of boolean functions: the Fourier expansion \begin{equation} \label{eqn:parity-basis-expanion} f = \sum_{S \subseteq [n]} \widehat{f}(S)\,\chi_S \end{equation} shows that any $f$ can be represented as a linear combination of parity functions (over the reals).

It’s useful to explore this idea further from the perspective of linear algebra. The set of all functions $f : \{-1,1\}^n \to {\mathbb R}$ forms a vector space $V$, since we can add two functions (pointwise) and we can multiply a function by a real scalar. The vector space $V$ is $2^n$-dimensional: if we like we can think of the functions in this vector space as vectors in ${\mathbb R}^{2^n}$, where we stack the $2^n$ values $f(x)$ into a tall column vector (in some fixed order). Here we illustrate the Fourier expansion of the ${\textstyle \min_2}$ function (equation (1) in the previous section) from this perspective: \begin{equation} \label{eqn:min2-basis-epansion} {\textstyle \min_2} = \begin{bmatrix} +1 \\ -1 \\ -1 \\ -1 \end{bmatrix} \ =\ (-1/2) \begin{bmatrix} +1 \\ +1 \\ +1 \\ +1 \end{bmatrix} \ +(1/2) \begin{bmatrix} +1 \\ -1 \\ +1 \\ -1 \end{bmatrix} \ +(1/2) \begin{bmatrix} +1 \\ +1 \\ -1 \\ -1 \end{bmatrix} \ +(1/2) \begin{bmatrix} +1 \\ -1 \\ -1 \\ +1 \end{bmatrix}. \end{equation}

More generally, the Fourier expansion \eqref{eqn:parity-basis-expanion} shows that every function $f : \{-1,1\}^n \to {\mathbb R}$ in $V$ is a linear combination of the parity functions; i.e., the parity functions are a spanning set for $V$. Since the number of parity functions is $2^n = \dim V$, we can deduce that they are in fact a linearly independent basis for $V$. In particular this justifies the uniqueness of the Fourier expansion stated in Theorem 1.

We can also introduce an inner product on pairs of function $f, g : \{-1,1\}^n \to {\mathbb R}$ in $V$. The usual inner product on ${\mathbb R}^{2^n}$ would correspond to $\sum_{x \in \{-1,1\}^n} f(x)g(x)$, but it’s more convenient to scale this by a factor of $2^{-n}$, making it an average rather than a sum. In this way, a boolean function $f : \{-1,1\}^n \to \{-1,1\}$ will have $\langle f, f \rangle = 1$, i.e., be a “unit vector”.

Definition 3 We define an inner product $\langle \cdot , \cdot \rangle$ on pairs of function $f, g : \{-1,1\}^n \to {\mathbb R}$ by \begin{equation} \label{eqn:inner-product} \langle f, g \rangle = 2^{-n} \sum_{x \in \{-1,1\}^n} f(x) g(x) = \mathop{\bf E}_{\boldsymbol{x} \sim \{-1,1\}^n} \left[f(\boldsymbol{x}) g(\boldsymbol{x})\right]. \end{equation} We also use the notation $\|f\|_2 = \sqrt{\langle f, f \rangle}$, and more generally, \[ \|f\|_p = \mathop{\bf E}[|f(\boldsymbol{x})|^p]^{1/p}. \]

Here we have introduced probabilistic notation which will be used heavily throughout the book:

Notation 4 We write $\boldsymbol{x} \sim \{-1,1\}^n$ to denote that $\boldsymbol{x}$ is a uniformly chosen random string from $\{-1,1\}^n$. Equivalently, the $n$ coordinates $\boldsymbol{x}_i$ are independently chosen to be $+1$ with probability $1/2$ and $-1$ with probability $1/2$. We always write random variables in boldface. Probabilities $\mathop{\bf Pr}$ and expectations $\mathop{\bf E}$ will always be with respect to a uniformly random $\boldsymbol{x} \sim \{-1,1\}^n$ unless otherwise specified. Thus we might write the expectation in \eqref{eqn:inner-product} as $\mathop{\bf E}_{\boldsymbol{x}}[f(\boldsymbol{x}) g(\boldsymbol{x})]$ or $\mathop{\bf E}[f(\boldsymbol{x}) g(\boldsymbol{x})]$ or even $\mathop{\bf E}[f g]$.

Returning to the basis of parity functions for $V$, the crucial fact underlying all analysis of boolean functions is that this is an orthonormal basis.

Theorem 5 The $2^n$ parity functions $\chi_S : \{-1,1\}^n \to \{-1,1\}$ form an orthonormal basis for the vector space $V$ of functions $\{-1,1\}^n \to {\mathbb R}$. That is, \begin{equation*} \langle \chi_S, \chi_T \rangle = \begin{cases} 1 & \text{if $S = T$,}\\ 0 & \text{if $S \neq T$.} \end{cases} \end{equation*}

Recalling the definition $\langle \chi_S, \chi_T \rangle = \mathop{\bf E}[\chi_S(\boldsymbol{x}) \chi_T(\boldsymbol{x})]$, Theorem 5 follows immediately from two facts:

Fact 6 For $x \in \{-1,1\}^n$ it holds that $\chi_S(x) \chi_T(x) = \chi_{S \triangle T}(x)$, where $S \triangle T$ denotes symmetric difference.

Proof: $\displaystyle \chi_S(x) \chi_T(x) = \prod_{i \in S} x_i \prod_{i \in T} x_i = \prod_{i \in S \triangle T} x_i \prod_{i \in S \cap T} x_i^2 = \prod_{i \in S \triangle T} x_i = \chi_{S \triangle T}(x).$ $\Box$

Fact 7 \[ \mathop{\bf E}[\chi_S(\boldsymbol{x})] = \mathop{\bf E}\Bigl[\prod_{i \in S} \boldsymbol{x}_i\Bigr] = \begin{cases} 1 & \text{if $S = \emptyset$,}\\ 0 & \text{if $S \neq \emptyset$.} \end{cases} \]

Proof: If $S = \emptyset$ then $\mathop{\bf E}[\chi_S(\boldsymbol{x})] = \mathop{\bf E}[1] = 1$. Otherwise, \[ \mathop{\bf E}\Bigl[\prod_{i \in S} \boldsymbol{x}_i\Bigr] = \prod_{i \in S} \mathop{\bf E}[\boldsymbol{x}_i] \] because the random bits $\boldsymbol{x}_1, \dots, \boldsymbol{x}_n$ are independent. But each of the factors $\mathop{\bf E}[\boldsymbol{x}_i]$ in the above (nonempty) product is $(1/2) (+1) + (1/2) (-1) = 0$. $\Box$

1 comment to §1.3: The orthonormal basis of parity functions

Leave a Reply




You can use most standard LaTeX, as well as these HTML tags

<a href="" title=""> <abbr title=""> <acronym title=""> <b> <blockquote cite=""> <cite> <code> <del datetime=""> <em> <i> <q cite=""> <strike> <strong>