# §1.5: Probability densities and convolution

For variety’s sake, in this section we write the Hamming cube as ${\mathbb F}_2^n$ rather than $\{-1,1\}^n$. In developing the Fourier expansion, we have generalized from boolean-valued boolean functions $f : {\mathbb F}_2^n \to \{-1,1\}$ to real-valued boolean functions $f : {\mathbb F}_2^n \to {\mathbb R}$. Boolean-valued functions arise more often in combinatorial problems, but there are important classes of real-valued boolean functions. One example is probability densities.

Definition 20 A (probability) density function on the Hamming cube ${\mathbb F}_2^n$ is any nonnegative function $\varphi : {\mathbb F}_2^n \to {\mathbb R}^{\geq 0}$ satisfying $\mathop{\bf E}_{\boldsymbol{x} \sim {\mathbb F}_2^n}[\varphi(\boldsymbol{x})] = 1.$ We write $\boldsymbol{y} \sim \varphi$ to denote that $\boldsymbol{y}$ is a random string drawn from the associated probability distribution, defined by $\phantom{ \quad \forall y \in {\mathbb F}_2^n.} \mathop{\bf Pr}_{\boldsymbol{y} \sim \varphi}[\boldsymbol{y} = y] = \varphi(y) \frac{1}{2^n} \quad \forall y \in {\mathbb F}_2^n.$

Here you should think of $\varphi(y)$ as being the relative density of $y$ with respect to the uniform distribution on ${\mathbb F}_2^n$. For example, we have:

Fact 21 If $\varphi$ is a density function and $g : \{-1,1\}^n \to {\mathbb R}$, then $\mathop{\bf E}_{\boldsymbol{y} \sim \varphi}[g(\boldsymbol{y})] = \langle \varphi, g \rangle = \mathop{\bf E}_{\boldsymbol{x} \sim {\mathbb F}_2^n}[\varphi(\boldsymbol{x}) g(\boldsymbol{x})].$

The simplest example of a probability density is just the constant function $1$, which corresponds to the uniform probability distribution on ${\mathbb F}_2^n$. The most common case arises from the uniform distribution over some subset $A \subseteq {\mathbb F}_2^n$.

Definition 22 If $A \subseteq {\mathbb F}_2^n$ we write $1_A : {\mathbb F}_2^n \to \{0,1\}$ for the $0$-$1$ indicator function of $A$; i.e., \begin{equation*} 1_A(x) = \begin{cases} 1 & \text{if $x \in A$,} \\ 0 & \text{if $x \notin A$.} \end{cases} \end{equation*} Assuming $A \neq \emptyset$ we write $\varphi_A$ for the density function associated to the uniform distribution on $A$; i.e., \begin{equation*} \varphi_A = \tfrac{1}{\mathop{\bf E}[1_A]} 1_A. \end{equation*} We typically write $\boldsymbol{y} \sim A$ rather than $\boldsymbol{y} \sim \varphi_A$.

A simple but useful example is when $A$ is the singleton set $A = \{0\}$. (Here $0$ is denoting the vector $(0, 0, \dots, 0) \in {\mathbb F}_2^n$.) In this case the function $\varphi_{\{0\}}$ takes value $2^n$ on input $0 \in {\mathbb F}_2^n$ and is zero elsewhere on ${\mathbb F}_2^n$. In the exercises you will verify the Fourier expansion of $\varphi_{\{0\}}$:

Fact 23 Every Fourier coefficient of $\varphi_{\{0\}}$ is $1$; i.e., its Fourier expansion is $\varphi_{\{0\}}(y) = \sum_{S \subseteq [n]} \chi_S(y).$

We record here some basic facts about “distances” between probability distributions (we will give reminders on the formal definitions of these distances later).

Proposition 24 If $\varphi$ and $\psi$ are probability densities on ${\mathbb F}_2^n$ (which we identify with their associated probability distributions) then:

1. their total variation distance is $d_{\mathrm{TV}}(\varphi,\psi) = \tfrac{1}{2} \|\varphi – \psi\|_1$;
2. the collision probability of $\varphi$ is $\|\varphi\|_2^2/2^n$;
3. the $\chi^2$-distance of $\varphi$ from uniform is $d_{\chi^2}(\varphi,1) = \|\varphi – 1\|_2^2 = \mathop{\bf Var}[\varphi]$;
4. the total variation distance of $\varphi$ from uniform satisfies $d_{\mathrm{TV}}(\varphi,1) \leq \tfrac{1}{2} \sqrt{\mathop{\bf Var}[\varphi]}$.

The proof of Proposition 24 is left to the exercises.

We now introduce an operation on functions which interacts particularly nicely with density functions, namely convolution.

Definition 25 Let $f, g : {\mathbb F}_2^n \to {\mathbb R}$. Their convolution is the function $f * g : {\mathbb F}_2^n \to {\mathbb R}$ defined by \begin{equation*} (f * g)(x) = \mathop{\bf E}_{\boldsymbol{y} \sim {\mathbb F}_2^n}[f(\boldsymbol{y}) g(x - \boldsymbol{y})] = \mathop{\bf E}_{\boldsymbol{y} \sim {\mathbb F}_2^n}[f(x-\boldsymbol{y})g(\boldsymbol{y})]. \end{equation*} Since subtraction is equivalent to addition in ${\mathbb F}_2^n$ we may also write \begin{equation*} (f * g)(x) = \mathop{\bf E}_{\boldsymbol{y} \sim {\mathbb F}_2^n}[f(\boldsymbol{y}) g(x + \boldsymbol{y})] = \mathop{\bf E}_{\boldsymbol{y} \sim {\mathbb F}_2^n}[f(x+\boldsymbol{y})g(\boldsymbol{y})]. \end{equation*} If we were representing the Hamming cube by $\{-1,1\}^n$ rather than ${\mathbb F}_2^n$ we would replace $x + \boldsymbol{y}$ with $x \circ \boldsymbol{y}$, where $\circ$ denotes entry-wise multiplication.

In the exercises you are asked to verify that convolution is associative and commutative: $f * (g * h) = (f * g) * h, \qquad f * g = g * f.$

Using Fact 21 we can deduce the following two simple results:

Proposition 26 If $\varphi$ is a density function on ${\mathbb F}_2^n$ and $g : {\mathbb F}_2^n \to {\mathbb R}$ then $\varphi * g(x) = \mathop{\bf E}_{\boldsymbol{y} \sim \varphi}[g(x - \boldsymbol{y})] = \mathop{\bf E}_{\boldsymbol{y} \sim \varphi}[g(x + \boldsymbol{y})].$ In particular, $\mathop{\bf E}_{\boldsymbol{y} \sim \varphi}[g(\boldsymbol{y})] = \varphi * g(0)$.

Proposition 27 If $g = \psi$ is itself a probability density function then so is $\varphi * \psi$; it represents the distribution on $\boldsymbol{x} \in {\mathbb F}_2^n$ given by choosing $\boldsymbol{y} \sim \varphi$ and $\boldsymbol{z} \sim \psi$ independently and setting $\boldsymbol{x} = \boldsymbol{y} + \boldsymbol{z}$.

The most important theorem about convolution is that it corresponds to multiplication of Fourier coefficients:

Theorem 28 Let $f, g : {\mathbb F}_2^n \to {\mathbb R}$. Then for all $S \subseteq [n]$, \begin{equation*} \widehat{f * g}(S) = \widehat{f}(S) \widehat{g}(S). \end{equation*}

Proof: We have \begin{align*} \label{eqn:} \widehat{f * g}(S) &= \mathop{\bf E}_{\boldsymbol{x} \sim {\mathbb F}_2^n}[ (f * g)(\boldsymbol{x}) \chi_S(\boldsymbol{x})] \tag{the Fourier formula}\\ &= \mathop{\bf E}_{\boldsymbol{x} \sim {\mathbb F}_2^n}\left[ \mathop{\bf E}_{\boldsymbol{y} \sim {\mathbb F}_2^n}[f(\boldsymbol{y}) g(\boldsymbol{x} - \boldsymbol{y})] \chi_S(\boldsymbol{x})\right] \tag{by definition}\\ &= \mathop{\bf E}_{\substack{\boldsymbol{y}, \boldsymbol{z} \sim {\mathbb F}_2^n \\ \text{independently}}}[f(\boldsymbol{y}) g(\boldsymbol{z}) \chi_S(\boldsymbol{y} + \boldsymbol{z})] \quad \qquad\qquad\qquad\qquad \tag{as $x – \boldsymbol{y}$ is uniform on ${\mathbb F}_2^n$ $\forall x$}\\ &= \mathop{\bf E}_{\boldsymbol{y}, \boldsymbol{z} \sim {\mathbb F}_2^n}[f(\boldsymbol{y}) \chi_S(\boldsymbol{y}) g(\boldsymbol{z}) \chi_S(\boldsymbol{z})] \tag{by a basic identity from §1.2}\\ &= \widehat{f}(S) \widehat{g}(S) \tag*{(Fourier formula, independence),} \end{align*} as claimed. $\Box$

### 5 comments to §1.5: Probability densities and convolution

• miforbes

I think that Prop 26 could use some parentheses around the convolutions.

• Ryan O'Donnell

I think it’s okay as-is, but if it causes too much confusion I’ll change it.

thanks…

• Stefan Schneider

In the second formula of definition 20, is the factor of $1/2^n$ a typo?

• Stefan Schneider

Never mind, I misread the definition.

• Yongzhi

I think that the domain of the function g in Fact 21 should be F^n_2.