# Chapter 5 exercises

1. Suppose $f : \{-1,1\}^n \to \{-1,1\}$ is an LTF. Show that it can be expressed as $f(x) = \mathrm{sgn}(a_0 + a_1 x_1 + \cdots a_n x_n)$ where the $a_i$’s are integers. (Hint: first obtain rational $a_i$’s by a perturbation.)
2. Show also that a degree-$d$ PTF has a representation in which all of the degree-$d$ polynomial’s coefficients are integers.
1. Let $f(x) = \mathrm{sgn}(a_0 + a_1 x_1 + \cdots a_n x_n)$ be an LTF.
1. Show that if $a_0 = 0$ then $\mathop{\bf E}[f] = 0$. (Hint: show that $f$ is in fact an odd function.)
2. Show that if $a_0 \geq 0$ then $\mathop{\bf E}[f] \geq 0$. Show that the converse need not hold.
3. Suppose $g : \{-1,1\}^n \to \{-1,1\}$ is an LTF with $\mathop{\bf E}[f] = 0$. Show that $g$ can be represented as $g(x) = \mathrm{sgn}(c_1 x_1 + \cdots + c_n x_n)$.
2. Suppose $f(x) = \mathrm{sgn}(a_0 + a_1 x_1 + \cdots a_n x_n)$ is an LTF with $|a_1| \geq |a_2| \geq \cdots \geq |a_n|$. Show that $\mathbf{Inf}_1[f] \geq \mathbf{Inf}_2[f] \geq \cdots \geq \mathbf{Inf}_n[f]$. (Hint: why does it suffice to prove this for $n=2$?)
1. Show that the number of functions $f : \{-1,1\}^n \to \{-1,1\}$ which are LTFs is at most $2^{n^2 + O(n)}$. (Hint: Chow’s Theorem.)
2. More generally, show that the number of functions $f : \{-1,1\}^n \to \{-1,1\}$ which are degree-$k$ PTFs is at most $2^{n^{k+1} + O(n)}$.
1. Suppose $\ell : \{-1,1\}^n \to {\mathbb R}$ is defined by $\ell(x) = a_0 + a_1 x_1 + \cdots a_n x_n$. Define $\widetilde{\ell} : \{-1,1\}^{n+1} \to {\mathbb R}$ by $\widetilde{\ell}(x_0, \dots, x_n) = a_0 x_0 + a_1 x_1 + \cdots a_n x_n$. Show that $\|\widetilde{\ell}\|_1 = \|\ell\|_1$ and $\|\widetilde{\ell}\|^2_2 = \|\ell\|_2^2$.
2. Complete the proof of Theorem 2.
3. Let $f : \{-1,1\}^n \to \{-1,1\}$ be an unbiased linear threshold function. Show that $\mathbf{Inf}_i[f] \geq \frac{1}{\sqrt{2 n}}$ for some $i \in [n]$, improving the KKL Theorem for LTFs.
4. Consider the following “correlation distillation” problem (cf. Exercise 2.49). For each $i \in [n]$ there is a number $\rho_i \in [-1,1]$ and an independent sequence of pairs of $\rho_i$-correlated bits, $(\boldsymbol{a}_1^{(1)}, \boldsymbol{b}_2^{(1)})$, $(\boldsymbol{a}_1^{(2)}, \boldsymbol{b}_2^{(2)})$, $(\boldsymbol{a}_1^{(3)}, \boldsymbol{b}_2^{(3)})$, etc. Party $A$ on Earth has access to the stream of bits $\boldsymbol{a}_1^{(1)}$, $\boldsymbol{a}_1^{(2)}$, $\boldsymbol{a}_1^{(3)}$,$\dots$. and a party $B$ on Venus has access to the stream $\boldsymbol{b}_1^{(1)}$, $\boldsymbol{b}_1^{(2)}$, $\boldsymbol{b}_1^{(3)}$, $\dots$. Neither party knows the numbers $\rho_1, \dots, \rho_n$. The goal is for $B$ to estimate these correlations. To assist in this, $A$ can send a small number of bits to $B$. A reasonable strategy is for $A$ to send $f(\boldsymbol{a}^{(1)})$, $f(\boldsymbol{a}^{(2)})$, $f(\boldsymbol{a}^{(3)})$, $\dots$ to $B$, where $f : \{-1,1\}^n \to \{-1,1\}$ is some boolean function. Using this information $B$ can try to estimate $\mathop{\bf E}[f(\boldsymbol{a}) \boldsymbol{b}_i]$ for each $i$.
1. Show that $\mathop{\bf E}[f(\boldsymbol{a}) \boldsymbol{b}_i] = \widehat{f}(i) \rho_i$.
2. This motivates choosing an $f$ for which all $\widehat{f}(i)$ are large. If we also insist all $\widehat{f}(i)$ be equal, show that majority functions $f$ maximize this common value.
5. For $n \geq 2$, let $\boldsymbol{f} : \{-1,1\}^n \to \{-1,1\}$ be a randomly chosen function (as in Exercise 1.8). Show that $\hat{\lVert} \boldsymbol{f} \hat{\rVert}_\infty \leq 2\sqrt{n} 2^{-n/2}$ except with probability at most $2^{-n}$.
6. Prove Theorem 6.
1. Give as simple a proof as you can that the parity function $\chi_{[n]} : \{-1,1\}^n \to \{-1,1\}$ is not a PTF of degree $n-1$.
2. Show that if $f : \{-1,1\}^n \to \{-1,1\}$ is not $\pm \chi_{[n]}$ then it is a PTF of degree $n-1$. (Hint: consider $f^{\leq n-1}$.)
7. For each $k \in {\mathbb N}^+$, show that there is a degree-$k$ PTF $f$ with $\mathbf{W}^{\leq k}[f] < 2^{1-k}$.
8. In this exercise you will show that threshold-of-parities circuits can be effectively simulated by threshold-of-threshold circuits, but not the converse.
1. Let $f : \{-1,1\}^n \to \{-1,1\}$ be a symmetric function. Show that $f$ is computable as the sum of at most $2n$ LTFs, plus a constant.
2. Deduce that if $f : \{-1,1\}^n \to \{-1,1\}$ is computable by a size-$s$ threshold-of-parities circuit then it is also computable by a size-$2ns$ threshold-of-thresholds circuit.
3. Show that the complete quadratic function $\mathrm{CQ}_n : {\mathbb F}_2^{n} \to \{-1,1\}$ (see Exercise 1.1) is computable by a size-$2n$ threshold-of-thresholds circuit.
4. Assume $n$ even. Show that any threshold-of-parities circuit for $\mathrm{CQ}_n$ requires size $2^{n/2}$.
9. Let $f : \{-1,1\}^n \to \{-1,1\}$ be computable by a DNF of size $s$. Show that $f$ has a PTF representation of sparsity $O(n s^2)$. (Hint: approximate the ANDs using Theorem 10.)
10. In contrast to the previous exercise, show that there is a function $f : \{-1,1\}^n \to \{-1,1\}$ computable by a depth-$3$ $\mathsf{AC}^0$ circuit (see Chapter 4.5) but requiring threshold-of-parities circuits of size at least $n^{\log n}$. (Hint: involve the inner product mod $2$ function and Exercise 4.12.)
11. Let $\mathcal{F}$ be a nonempty collection of subsets $S \subseteq [n]$. For each $a \in \{-1,1\}^n$, write $1_{\{a\}} : \{-1,1\}^n \to \{0,1\}$ for the indicator of $\{a\}$, write $1_{\{a\}}^\mathcal{F} : \{-1,1\}^n \to {\mathbb R}$ for $\sum_{S \in \mathcal{F}} \widehat{1_{\{a\}}}(S)\,\chi_S$, and write $\psi_a = \frac{2^n}{|\mathcal{F}|} \cdot 1_{\{a\}}^\mathcal{F}$.

1. Show that $\psi_a(a) = 1$ and $\mathop{\bf E}[\psi_a^2] = \tfrac{1}{|\mathcal{F}|}$. Show also that for all $x \in \{-1,1\}^n$, $\psi_a(x) = \psi_x(a)$ and $\sum_{a : a \neq x} \psi_a(x)^2 = \frac{2^n}{|\mathcal{F}|} – 1$.
2. Fix $0 < \epsilon < 1$ and suppose that $|\mathcal{F}| \geq (1 – \tfrac{\epsilon^2}{6n}) 2^n$. Let $\boldsymbol{f} : \{-1,1\}^n \to \{-1,1\}$ be a random function as in Exercise 1.8. Show that for each $x \in \{-1,1\}^n$, except with probability at most $4^{-n}$ it holds that $|\sum_{a : a \neq x} \boldsymbol{f}(a) \psi_a(x)| < \epsilon$.
3. Deduce that for all but a $2^{-n}$ fraction of functions $f : \{-1,1\}^n \to \{-1,1\}$, there a multilinear polynomial $q : \{-1,1\}^n \to {\mathbb R}$ supported on the monomials $\{\chi_S : S \in \mathcal{F}\}$ such that $\|f – q\|_\infty < \epsilon$.
4. Deduce that all but a $2^{-n}$ fraction of functions $f : \{-1,1\}^n \to \{-1,1\}$ have PTF representation of degree at most $n/2 + O(\sqrt{n \log n})$.
1. Show that in the Berry–Esseen Theorem we can also conclude $|\mathop{\bf Pr}[\boldsymbol{S} < u] – \mathop{\bf Pr}[\boldsymbol{Z} < u]| \leq B \epsilon.$ (Hint: you’ll need that $\lim_{\delta \to 0^+} \mathop{\bf Pr}[\boldsymbol{Z} \leq u - \delta] = \mathop{\bf Pr}[\boldsymbol{Z} \leq u]$.)
2. Deduce that if $I \subseteq {\mathbb R}$ is any interval we can also conclude $|\mathop{\bf Pr}[\boldsymbol{S} \in I] – \mathop{\bf Pr}[\boldsymbol{Z} \in I]| \leq 2B \beta.$
12. Show that the assumptions $\mathop{\bf E}[\boldsymbol{X}_i] = 0$ and $\sum_{i=1}^n \mathop{\bf Var}[\boldsymbol{X}_i] = 1$ in the Berry–Esseen Theorem are not restrictive, as follows. Let $\boldsymbol{X}_1, \dots, \boldsymbol{X}_n$ be independent random variables with finite means and variances. Let $\boldsymbol{S} = \sum_{i=1}^n \boldsymbol{X}_i$ and let $\boldsymbol{Z} \sim \mathrm{N}(\mu,\sigma^2)$, where $\mu = \sum_{i=1}^n \mathop{\bf E}[\boldsymbol{X}_i]$ and $\sigma^2 = \sum_{i=1}^n \mathop{\bf Var}[\boldsymbol{X}_i]$. Assuming $\sigma^2 > 0$, show that for all $u \in {\mathbb R}$, $|\mathop{\bf Pr}[\boldsymbol{S} \leq u] – \mathop{\bf Pr}[\boldsymbol{Z} \leq u]| \leq B \epsilon / \sigma^3,$ where $\epsilon = \sum_{i=1}^n \|\boldsymbol{X}_i - \mathop{\bf E}[\boldsymbol{X}_i]\|_3^3.$
1. Use the generalized Binomial Theorem to compute the power series for $(1-z^2)^{-1/2}$, valid for $|z| < 1$.
2. Integrate to obtain the power series for $\arcsin z$ given in Section 3, valid for $|z| < 1$.
3. Confirm that equality holds also for $z = \pm 1$.
13. Verify that the random vector $\vec{\boldsymbol{S}}$ defined in equation (6) of Section 2 has $\mathop{\bf E}[\vec{\boldsymbol{S}}_1] = \mathop{\bf E}[\vec{\boldsymbol{S}}_2] = 0$, $\mathop{\bf E}[\vec{\boldsymbol{S}}_1^2] = \mathop{\bf E}[\vec{\boldsymbol{S}}_2^2] = 1$, $\mathop{\bf E}[\vec{\boldsymbol{S}}_1\vec{\boldsymbol{S}}_2] = \rho$; i.e., $\mathop{\bf E}[\vec{\boldsymbol{S}}] = \begin{bmatrix} 0 \\ 0 \end{bmatrix}$ and $\mathop{\bf Cov}[\vec{\boldsymbol{S}}] = \begin{bmatrix} 1 & \rho \\ \rho & 1 \end{bmatrix}$.
14. In this exercise you will prove Sheppard’s Formula.

1. Let $\boldsymbol{g}_1, \boldsymbol{g}_2$ be independent standard Gaussians and let $\boldsymbol{g}_1' = \boldsymbol{g}_1, \qquad \boldsymbol{g}_2' = \rho \boldsymbol{g}_1 + \sqrt{1-\rho^2} \boldsymbol{g}_2.$ Show that $(\boldsymbol{g}_1′, \boldsymbol{g}_2′)$ has the same distribution as the random vector $(\boldsymbol{z}_1, \boldsymbol{z}_2)$ in Sheppard’s formula.
2. Assume for now that $\rho \in [0,1)$. Show that $\boldsymbol{g}_1′, \boldsymbol{g}_2′ > 0$ if and only if $(\boldsymbol{g}_1, \boldsymbol{g}_2)$ is in a certain region $R \subseteq {\mathbb R}^2$ consisting of a quadrant plus a wedge of angle $\arcsin \rho$.
3. Using the circular symmetry of the random vector $(\boldsymbol{g}_1, \boldsymbol{g}_2)$, deduce Sheppard’s formula for $\rho \in [0,1)$.
4. Treat the cases $\rho \in (-1,0)$, $\rho = \pm 1$.
15. Prove Corollary 17.
16. Fix $n$ odd. Using Theorem 16 show that $|\widehat{\mathrm{Maj}_n}(S)|$ is a decreasing function of $|S|$ for odd $1 \leq |S| \leq \frac{n-1}{2}$. Deduce (using also Corollary 17) that $\hat{\lVert} \mathrm{Maj}_n \hat{\rVert}_\infty = \mathrm{Maj}_n(\{1\}) \sim \frac{\sqrt{2/\pi}}{\sqrt{n}}$.
17. Prove Corollary 18.
18. Prove Theorem 15. (Hint: Corollary 18).

### 10 comments to Chapter 5 exercises

• Andy D

Hey Ryan, enjoying your posts; wish I had time to work on these!

Looks like you want a “sgn” in Ex. 1.a.

• Ryan O'Donnell

Yep, fixed. Thanks Andy.

• Amirali

In the second sentence of exercise 8, have you misplaced the \hat symbols or was that deliberate?

• Ryan O'Donnell

Thanks Amirali — I think it’s okay though… There are hats on the norm signs, to denote the spectral norm. Perhaps its hard to see on your browser? Or maybe I’m missing something.

• Noam Lifshitz

I think you can take $n+1$ instead of $2n$ in ex 12 a.

• Ryan O'Donnell

Hi Noam — that may be so (I didn’t think about it too carefully). But for safety I think I will stick with 2n, which is at least n+1 anyway.

thanks!
Ryan

• Ohad Klein

Are the indexing in (the start of) 7 OK?

• Ryan O'Donnell

No, it’s definitely messed up. Will fix!

• Chengyu

Ex 2.c
It should be “Suppose … is an LTF with $\textbf{E} [g] = 0$.”

• Ryan O'Donnell

Thank you! Sorry for the delay in replying.