# §8.3: Orthogonal decomposition

In this section we describe a basis-free kind of “Fourier expansion” for functions on general product domains. We will refer to it as the orthogonal decomposition of $f \in L^2(\Omega^n, \pi^{\otimes n})$ though it goes by several other names in the literature: e.g., Hoeffding, Efron–Stein, or ANOVA decomposition.

The general idea is to express \begin{equation} \label{eqn:pm1-orthog-decomp} f = \sum_{S \subseteq [n]} f^{=S} \end{equation} where each function $f^{=S} \in L^2(\Omega^n, \pi^{\otimes n})$ gives the “contribution to $f$ coming from coordinates $S$ (but not from any subset of $S$)”.

To make this more precise, let’s start with the familiar case of $f : \{-1,1\}^n \to {\mathbb R}$. Here it is possible to define the functions $f^{=S} : \{-1,1\}^n \to {\mathbb R}$ simply by $f^{=S} = \widehat{f}(S)\,\chi_S$. (Later we will give an equivalent definition which doesn’t involve the Fourier basis.) This definition satisfies \eqref{eqn:pm1-orthog-decomp} as well as the following two properties:

1. $f^{=S}$ depends only on the coordinates in $S$.
2. If $T \subsetneq S$ and $g$ is a function depending only on the coordinates in $T$, then $\langle f^{=S}, g \rangle = 0$.

These properties describe what we mean precisely when we say that $f^{=S}$ is the “contribution to $f$ coming from coordinates $S$ (but not from any subset of $S$)”. Furthermore, the decomposition \eqref{eqn:pm1-orthog-decomp} is orthogonal, meaning $\langle f^{=S}, f^{=T} \rangle = 0$ whenever $S \neq T$.
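These properties are easy to verify by brute force. The following sketch (in Python; the choice of $\mathrm{Maj}_3$ as the test function, and all identifier names, are my own illustrations, not from the text) computes $f^{=S} = \widehat{f}(S)\,\chi_S$ for every $S$ and checks both that the parts sum back to $f$ and that the decomposition is orthogonal over $\{-1,1\}^3$:

```python
from itertools import product, combinations

n = 3
cube = list(product([-1, 1], repeat=n))

def maj3(x):             # an arbitrary test function: majority of three bits
    return 1 if sum(x) > 0 else -1

def chi(S, x):           # the character chi_S(x) = prod_{i in S} x_i
    out = 1
    for i in S:
        out *= x[i]
    return out

def fourier_coeff(f, S):  # hat f(S) = E[f(x) chi_S(x)], uniform measure
    return sum(f(x) * chi(S, x) for x in cube) / len(cube)

def inner(g, h):          # <g, h> = E[g h], uniform measure
    return sum(g(x) * h(x) for x in cube) / len(cube)

subsets = [S for k in range(n + 1) for S in combinations(range(n), k)]

# f^{=S} = hat f(S) chi_S
f_eq = {S: (lambda x, S=S: fourier_coeff(maj3, S) * chi(S, x)) for S in subsets}

for x in cube:
    # the parts f^{=S} sum back to f
    assert abs(sum(f_eq[S](x) for S in subsets) - maj3(x)) < 1e-9

for S in subsets:
    for T in subsets:
        if S != T:
            # the decomposition is orthogonal: <f^{=S}, f^{=T}> = 0
            assert abs(inner(f_eq[S], f_eq[T])) < 1e-9
```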

To make this definition basis-free, recall the “projection of $f$ onto coordinates $J$”, $f^{\subseteq J}$, from Definition 17. You can think of $f^{\subseteq J}$ as the “contribution to $f$ coming from coordinates $J$ (collectively)”. It has a probabilistic definition not depending on any basis, and with the definition $f^{=S} = \widehat{f}(S)\,\chi_S$ we have from Proposition 19 that \begin{equation} \label{eqn:orthog-invert-me} f^{\subseteq J} = \sum_{S \subseteq J} f^{=S}. \end{equation} It is precisely by inverting \eqref{eqn:orthog-invert-me} that we can give a basis-free definition of the functions $f^{=S}$.

Let’s do this inversion for a general $f \in L^2(\Omega^n, \pi^{\otimes n})$. The projection functions $f^{\subseteq J} \in L^2(\Omega^n, \pi^{\otimes n})$ can be defined as in Definition 17. If we want \eqref{eqn:orthog-invert-me} to hold for $J = \emptyset$ then we should define $f^{=\emptyset} = f^{\subseteq \emptyset}$ (which is the constant function equal to $\mathop{\bf E}[f]$). Given this, if we want \eqref{eqn:orthog-invert-me} to hold for singleton sets $J = \{j\}$ then we need $f^{\subseteq \{j\}} = f^{=\emptyset} + f^{=\{j\}} \quad\Leftrightarrow\quad f^{=\{j\}} = f^{\subseteq \{j\}} - f^{\subseteq \emptyset}.$ In other words, $f^{= \{j\}}(x) = \mathop{\bf E}_{{\boldsymbol{x}} \sim \pi^{\otimes n}} [f({\boldsymbol{x}}) \mid {\boldsymbol{x}}_j = x_j] - \mathop{\bf E}_{{\boldsymbol{x}} \sim \pi^{\otimes n}} [f({\boldsymbol{x}})].$ Notice this function only depends on the input value $x_j$; it measures the change in expectation of $f$ when you learn the value $x_j$. Moving on to sets of cardinality $2$, if we want \eqref{eqn:orthog-invert-me} to hold for $J = \{i,j\}$ then we need \begin{align*} f^{\subseteq \{i,j\}} &= f^{=\emptyset} + f^{=\{i\}} + f^{=\{j\}} + f^{=\{i,j\}} \\ &= f^{\subseteq \emptyset} + (f^{\subseteq \{i\}} - f^{\subseteq \emptyset}) + (f^{\subseteq \{j\}} - f^{\subseteq \emptyset}) + f^{=\{i,j\}} \end{align*} and hence $f^{=\{i,j\}} = f^{\subseteq \{i,j\}} - f^{\subseteq \{i\}} - f^{\subseteq \{j\}} + f^{\subseteq \emptyset}.$ It’s clear that we can continue in this way, defining all the functions $f^{=S}$ by the principle of inclusion-exclusion. To show this definition leads to an orthogonal decomposition we will need the following lemma:
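This inclusion-exclusion can be sanity-checked by brute force. In the following sketch (Python; the alphabet, the distribution, and the test function are arbitrary choices for illustration), $f^{\subseteq J}$ is computed by averaging out the coordinates outside $J$, and $f^{=S}$ by inclusion-exclusion over the subsets of $S$:

```python
from itertools import product, combinations

Omega = [0, 1, 2]   # small alphabet with the uniform distribution (arbitrary choice)
n = 2
domain = list(product(Omega, repeat=n))

def f(x):           # an arbitrary test function in L^2(Omega^n)
    return x[0] ** 2 + 2 * x[1] + (1 if x == (2, 2) else 0)

def proj(J):
    """f^{subseteq J}: keep the coordinates in J, average out the rest."""
    def g(x):
        return sum(f(tuple(x[i] if i in J else y[i] for i in range(n)))
                   for y in domain) / len(domain)
    return g

def eq_part(S):
    """f^{=S} = sum_{J subseteq S} (-1)^{|S|-|J|} f^{subseteq J}."""
    def g(x):
        return sum((-1) ** (len(S) - k) * proj(set(J))(x)
                   for k in range(len(S) + 1)
                   for J in combinations(S, k))
    return g

subsets = [set(S) for k in range(n + 1) for S in combinations(range(n), k)]
mean = proj(set())(domain[0])        # f^{=emptyset} is the constant E[f]

for x in domain:
    # the parts sum back to f ...
    assert abs(sum(eq_part(S)(x) for S in subsets) - f(x)) < 1e-9
    # ... and the singleton part is f^{subseteq {j}} - E[f]
    assert abs(eq_part({0})(x) - (proj({0})(x) - mean)) < 1e-9
```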

Lemma 34 Let $f, g \in L^2(\Omega^n, \pi^{\otimes n})$. Assume that $f$ does not depend on any coordinate outside $I \subseteq [n]$, and $g$ does not depend on any coordinate outside $J \subseteq [n]$. Then $\langle f, g \rangle = \langle f^{\subseteq I \cap J}, g^{\subseteq I \cap J} \rangle$.

Proof: We may assume without loss of generality that $I \cup J = [n]$. Given any $x \in \Omega^n$ we can break it into the parts $(x_{I \cap J}, x_{I \setminus J}, x_{J \setminus I})$. We then have \begin{align*} \langle f, g\rangle = \mathop{\bf E}_{{\boldsymbol{x}}_{I \cap J}, {\boldsymbol{x}}_{I \setminus J}, {\boldsymbol{x}}_{J \setminus I}}[f({\boldsymbol{x}}_{I \cap J}, {\boldsymbol{x}}_{I \setminus J}) \cdot g({\boldsymbol{x}}_{I \cap J}, {\boldsymbol{x}}_{J \setminus I})], \end{align*} where we have abused notation slightly by writing $f$ and $g$ as functions just of the coordinates on which they actually depend. Since ${\boldsymbol{x}}_{I \setminus J}$ and ${\boldsymbol{x}}_{J \setminus I}$ are independent, the above equals $\mathop{\bf E}_{{\boldsymbol{x}}_{I \cap J}}\left[\mathop{\bf E}_{{\boldsymbol{x}}_{I \setminus J}}[f({\boldsymbol{x}}_{I \cap J}, {\boldsymbol{x}}_{I \setminus J})] \cdot \mathop{\bf E}_{{\boldsymbol{x}}_{J \setminus I}}[g({\boldsymbol{x}}_{I \cap J}, {\boldsymbol{x}}_{J \setminus I})] \right].$ But now $\mathop{\bf E}_{{\boldsymbol{x}}_{I \setminus J}}[f({\boldsymbol{x}}_{I \cap J}, {\boldsymbol{x}}_{I \setminus J})]$ is nothing more than $f^{\subseteq I \cap J}({\boldsymbol{x}}_{I \cap J})$, and similarly $\mathop{\bf E}_{{\boldsymbol{x}}_{J \setminus I}}[g({\boldsymbol{x}}_{I \cap J}, {\boldsymbol{x}}_{J \setminus I})] = g^{\subseteq I \cap J}({\boldsymbol{x}}_{I \cap J})$. Thus the above equals $\mathop{\bf E}_{{\boldsymbol{x}}_{I \cap J}}[f^{\subseteq I \cap J}({\boldsymbol{x}}_{I \cap J}) \cdot g^{\subseteq I \cap J}({\boldsymbol{x}}_{I \cap J})] = \langle f^{\subseteq I \cap J}, g^{\subseteq I \cap J} \rangle. \qquad \Box$
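Lemma 34 is also easy to test numerically. The sketch below (Python, with uniform bits and two arbitrary test functions of my own choosing) takes $I = \{0,1\}$ and $J = \{1,2\}$ and checks that $\langle f, g \rangle = \langle f^{\subseteq I \cap J}, g^{\subseteq I \cap J} \rangle$:

```python
from itertools import product

Omega = [0, 1]       # uniform bits (an arbitrary choice of product space)
n = 3
domain = list(product(Omega, repeat=n))

def f(x):            # depends only on the coordinates in I = {0, 1}
    return 2 * x[0] + x[1] ** 2 + 1

def g(x):            # depends only on the coordinates in J = {1, 2}
    return x[1] * x[2] - 3 * x[2]

def inner(u, v):     # <u, v> = E[u v] under the uniform measure
    return sum(u(x) * v(x) for x in domain) / len(domain)

def proj(u, J):      # u^{subseteq J}: average out the coordinates outside J
    def h(x):
        return sum(u(tuple(x[i] if i in J else y[i] for i in range(n)))
                   for y in domain) / len(domain)
    return h

IJ = {1}             # I cap J
assert abs(inner(f, g) - inner(proj(f, IJ), proj(g, IJ))) < 1e-9
```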

We can now give the main theorem on orthogonal decomposition:

Theorem 35 Let $f \in L^2(\Omega^n, \pi^{\otimes n})$. Then $f$ has a unique decomposition as $f = \sum_{S \subseteq [n]} f^{=S}$ where the functions $f^{=S} \in L^2(\Omega^n, \pi^{\otimes n})$ satisfy the following:

(i) $f^{=S}$ depends only on the coordinates in $S$.

(ii) If $T \subsetneq S$ and $g \in L^2(\Omega^n, \pi^{\otimes n})$ depends only on the coordinates in $T$, then $\langle f^{=S}, g \rangle = 0$.

This decomposition has the following additional properties:

(iii) Condition (ii) additionally holds whenever $S \not \subseteq T$.

(iv) The decomposition is orthogonal: $\langle f^{=S}, f^{=T} \rangle = 0$ for $S \neq T$.

(v) $\sum_{S \subseteq T} f^{=S} = f^{\subseteq T}$.

(vi) For each $S \subseteq [n]$, the mapping $f \mapsto f^{=S}$ is a linear operator.

Proof: We first show the existence of a decomposition satisfying (i)–(vi). We then show uniqueness for decompositions satisfying (i) and (ii). As suggested above, for each $S \subseteq [n]$ we define $f^{=S} = \sum_{J \subseteq S} (-1)^{|S| - |J|} f^{\subseteq J},$ where the functions $f^{\subseteq J} \in L^2(\Omega^n, \pi^{\otimes n})$ are as in Definition 17. Since each $f^{\subseteq J}$ depends only on the coordinates in $J$, condition (i) certainly holds. It is also immediate that condition (v) holds by inclusion-exclusion; you are asked to prove this explicitly in the exercises. Condition (vi) also follows because each $f \mapsto f^{\subseteq J}$ is a linear operator, as discussed after Definition 17.

We now verify (ii). Assume $T \subsetneq S$ and that $g \in L^2(\Omega^n, \pi^{\otimes n})$ only depends on the coordinates in $T$. We have \begin{equation} \label{eqn:pair-me} \langle f^{=S}, g \rangle = \sum_{J \subseteq S} (-1)^{|S| - |J|} \langle f^{\subseteq J}, g \rangle. \end{equation} Take any $i \in S \setminus T$ and pair up the summands in \eqref{eqn:pair-me} as $J'$, $J''$, where $J' \not \ni i$ and $J'' = J' \cup \{i\}$. By Lemma 34 we have $\langle f^{\subseteq J''}, g \rangle = \langle f^{\subseteq J'' \cap T}, g^{\subseteq T} \rangle = \langle f^{\subseteq J' \cap T}, g^{\subseteq T} \rangle,$ the latter equality using $i \not \in T$. But the signs $(-1)^{|S| - |J'|}$ and $(-1)^{|S| - |J''|}$ are opposite, so the summands in \eqref{eqn:pair-me} cancel in pairs. This shows the sum is $0$, confirming (ii).

We complete the existence proof by noting that (ii) $\Rightarrow$ (iii) $\Rightarrow$ (iv) (assuming (i)). The first implication is because $\langle f^{=S}, g \rangle = \langle f^{= S}, g^{\subseteq S \cap T} \rangle$ when $g$ depends only on the coordinates in $T$ (Lemma 34), and $S \cap T \subsetneq S$ when $S \not \subseteq T$. The second implication is because $S \neq T$ implies either $S \not \subseteq T$ or $T \not \subseteq S$.

It remains to prove the uniqueness statement. Suppose $f$ has two representations satisfying (i) and (ii). By subtracting them we get a decomposition of the $0$ function that satisfies (i) and (ii); our goal is to show that each function in this decomposition is the $0$ function. We can do this by showing that any decomposition satisfying (i) and (ii) also satisfies “Parseval’s Theorem”: $\langle f, f \rangle = \sum_{S \subseteq [n]} \|f^{=S}\|_2^2$. But this is an easy consequence of (iv), which we just noted is itself a consequence of (i) and (ii). $\Box$

We can connect the orthogonal decomposition of $f$ to its expansion under Fourier bases as follows:

Proposition 36 Let $f \in L^2(\Omega^n, \pi^{\otimes n})$ have orthogonal decomposition $f = \sum_{S \subseteq [n]} f^{=S}$. Fix any Fourier basis $\phi_0, \dots, \phi_{m-1}$ for $L^2(\Omega, \pi)$. Then \begin{equation} \label{eqn:orthog-decomp-basis} f^{=S} = \sum_{\substack{\alpha \in {\mathbb N}^n_{< m} \\ \mathrm{supp}(\alpha) = S}} \widehat{f}(\alpha)\,\phi_{\alpha}. \end{equation}

Proof: This follows easily from the uniqueness part of Theorem 35. If we take \eqref{eqn:orthog-decomp-basis} as the definition of functions $f^{=S}$, it is immediate that $\sum_S f^{=S} = f$ and that $f^{=S}$ depends only on the coordinates in $S$. Further, if $g$ depends only on coordinates $T \subsetneq S$ then $f^{=S}$ and $g$ have disjoint Fourier support by Corollary 20; hence $\langle f^{=S}, g \rangle = 0$ by Plancherel (Proposition 16). $\Box$

Example 37 Let’s compute the orthogonal decomposition of the function $f : \{a,b,c\}^2 \to \{0,1\}$ from Example 15. Recall that in this example $\{a,b,c\}$ has the uniform distribution and $f(x_1,x_2) = 1$ if and only if $x_1 = x_2 = c$. First, $f^{=\emptyset} = \mathop{\bf E}[f] = \tfrac{1}{9}.$ Next, for $i = 1,2$ we have that $f^{\subseteq \{i\}}(x)$ is $\frac13$ if $x_i = c$ and $0$ otherwise; hence $f^{= \{i\}}(x_1,x_2) = \begin{cases} +\frac{2}{9} & \text{if } x_i = c, \\ -\frac{1}{9} & \text{else.} \end{cases}$ Finally, it’s easiest to compute $f^{=\{1,2\}}$ as $f - f^{=\emptyset} - f^{=\{1\}} - f^{=\{2\}}$; this yields $f^{= \{1,2\}}(x_1,x_2) = \begin{cases} +\frac{4}{9} & \text{if } x_1 = x_2 = c, \\ -\frac{2}{9} & \text{if exactly one of } x_1,\ x_2 \text{ is } c, \\ +\frac{1}{9} & \text{if } x_1, x_2 \neq c. \end{cases}$ You can check (exercise) that this is consistent with Proposition 36 and the Fourier expansion from Example 15.
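These values are easy to verify mechanically; here is a short Python check of the computation (brute force over the nine inputs; identifier names are my own):

```python
from itertools import product

Omega = ['a', 'b', 'c']          # uniform distribution on {a, b, c}
domain = list(product(Omega, repeat=2))

def f(x):
    return 1 if x == ('c', 'c') else 0

E = sum(f(x) for x in domain) / 9        # f^{=emptyset} = E[f]
assert abs(E - 1 / 9) < 1e-12

def proj1(x1):       # f^{subseteq {1}}: fix x_1, average over x_2
    return sum(f((x1, y)) for y in Omega) / 3

def eq1(x1):         # f^{={1}} = f^{subseteq {1}} - f^{subseteq emptyset}
    return proj1(x1) - E

assert abs(eq1('c') - 2 / 9) < 1e-12     # +2/9 when x_1 = c
assert abs(eq1('a') + 1 / 9) < 1e-12     # -1/9 otherwise

def eq12(x):         # f^{={1,2}} = f - f^{=emptyset} - f^{={1}} - f^{={2}}
    return f(x) - E - eq1(x[0]) - eq1(x[1])  # by symmetry, f^{={2}}(x) = eq1(x_2)

assert abs(eq12(('c', 'c')) - 4 / 9) < 1e-12
assert abs(eq12(('a', 'c')) + 2 / 9) < 1e-12
assert abs(eq12(('a', 'b')) - 1 / 9) < 1e-12
```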

We can write all of the Fourier formulas from Section 2 in terms of the orthogonal decomposition; e.g., $\langle f, g \rangle = \sum_{S \subseteq [n]} \langle f^{=S}, g^{=S} \rangle, \quad \mathbf{Inf}_i[f] = \sum_{S \ni i} \|f^{=S}\|_2^2, \quad \mathrm{T}_\rho f = \sum_{S \subseteq [n]} \rho^{|S|} f^{=S}.$ These formulas can be proved either by using the connection from Proposition 36 or by reasoning directly from the defining Theorem 35; see the exercises. The orthogonal decomposition also gives us the natural way of stratifying $f$ by degree: we end this section by generalizing some more definitions from Chapter 1.4:

Definition 38 For $f \in L^2(\Omega^n, \pi^{\otimes n})$ and $k \in {\mathbb N}$ we define the degree $k$ part of $f$ to be $f^{=k} = \sum_{|S| = k} f^{=S}$ and the weight of $f$ at degree $k$ to be $\mathbf{W}^{k}[f] = \|f^{=k}\|_2^2$. We also use notation like $f^{\leq k} = \sum_{|S| \leq k} f^{=S}$ and $\mathbf{W}^{> k}[f] = \sum_{|S| > k} \|f^{=S}\|_2^2$.
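As a worked illustration (not from the text), one can compute the degree weights of the function from Example 37 directly from its orthogonal decomposition and confirm the Parseval identity $\mathop{\bf E}[f^2] = \sum_k \mathbf{W}^{k}[f]$:

```python
from itertools import product

Omega = ['a', 'b', 'c']                      # uniform distribution on {a, b, c}
domain = list(product(Omega, repeat=2))

def f(x):                                    # the function from Example 37
    return 1 if x == ('c', 'c') else 0

E = 1 / 9                                    # f^{=emptyset} = E[f]

def eq1(xi):                                 # f^{={i}} from Example 37
    return 2 / 9 if xi == 'c' else -1 / 9

def eq12(x):                                 # f^{={1,2}} = f - f^{=emptyset} - f^{={1}} - f^{={2}}
    return f(x) - E - eq1(x[0]) - eq1(x[1])

def sq_norm(vals):                           # ||g||_2^2 = E[g^2], uniform measure
    return sum(v * v for v in vals) / len(vals)

W0 = E * E                                   # W^0[f]
W1 = 2 * sq_norm([eq1(a) for a in Omega])    # W^1[f]: two symmetric singleton parts
W2 = sq_norm([eq12(x) for x in domain])      # W^2[f]

# Parseval: E[f^2] = W^0 + W^1 + W^2
assert abs((W0 + W1 + W2) - sum(f(x) ** 2 for x in domain) / 9) < 1e-12
```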
