
Chapter 5 exercises

 Suppose $f : \{1,1\}^n \to \{1,1\}$ is an LTF. Show that it can be expressed as $f(x) = \mathrm{sgn}(a_0 + a_1 x_1 + \cdots a_n x_n)$ where the $a_i$’s are integers. (Hint: first obtain rational $a_i$’s by a perturbation.)
 Show also that a degree$d$ PTF has a representation in which all of the degree$d$ polynomial’s coefficients are integers.
 Let $f(x) = \mathrm{sgn}(a_0 + a_1 x_1 + \cdots a_n x_n)$ be an LTF.
 Show that if $a_0 = 0$ then $\mathop{\bf E}[f] = 0$. (Hint: show that $f$ is in fact an odd function.)
 Show that if $a_0 \geq 0$ then $\mathop{\bf E}[f] \geq 0$. Show that the converse need not hold.
 Suppose $g : \{1,1\}^n \to \{1,1\}$ is an LTF with $\mathop{\bf E}[f] = 0$. Show that $g$ can be represented as $g(x) = \mathrm{sgn}(c_1 x_1 + \cdots + c_n x_n)$.
 Suppose $f(x) = \mathrm{sgn}(a_0 + a_1 x_1 + \cdots a_n x_n)$ is an LTF with $a_1 \geq a_2 \geq \cdots \geq a_n$. Show that $\mathbf{Inf}_1[f] \geq \mathbf{Inf}_2[f] \geq \cdots \geq \mathbf{Inf}_n[f]$. (Hint: why does it suffice to prove this for $n=2$?)

 Show that the number of functions $f : \{1,1\}^n \to \{1,1\}$ which are LTFs is at most $2^{n^2 + O(n)}$. (Hint: Chow’s Theorem.)
 More generally, show that the number of functions $f : \{1,1\}^n \to \{1,1\}$ which are degree$k$ PTFs is at most $2^{n^{k+1} + O(n)}$.

 Suppose $\ell : \{1,1\}^n \to {\mathbb R}$ is defined by $\ell(x) = a_0 + a_1 x_1 + \cdots a_n x_n$. Define $\widetilde{\ell} : \{1,1\}^{n+1} \to {\mathbb R}$ by $\widetilde{\ell}(x_0, \dots, x_n) = a_0 x_0 + a_1 x_1 + \cdots a_n x_n$. Show that $\\widetilde{\ell}\_1 = \\ell\_1$ and $\\widetilde{\ell}\^2_2 = \\ell\_2^2$.
 Complete the proof of Theorem 2.
 Let $f : \{1,1\}^n \to \{1,1\}$ be an unbiased linear threshold function. Show that $\mathbf{Inf}_i[f] \geq \frac{1}{\sqrt{2 n}}$ for some $i \in [n]$, improving the KKL Theorem for LTFs.
 Consider the following “correlation distillation” problem (cf. Exercise 2.49). For each $i \in [n]$ there is a number $\rho_i \in [1,1]$ and an independent sequence of pairs of $\rho_i$correlated bits, $(\boldsymbol{a}_1^{(1)}, \boldsymbol{b}_2^{(1)})$, $(\boldsymbol{a}_1^{(2)}, \boldsymbol{b}_2^{(2)})$, $(\boldsymbol{a}_1^{(3)}, \boldsymbol{b}_2^{(3)})$, etc. Party $A$ on Earth has access to the stream of bits $\boldsymbol{a}_1^{(1)}$, $\boldsymbol{a}_1^{(2)}$, $\boldsymbol{a}_1^{(3)}$,$\dots$. and a party $B$ on Venus has access to the stream $\boldsymbol{b}_1^{(1)}$, $\boldsymbol{b}_1^{(2)}$, $\boldsymbol{b}_1^{(3)}$, $\dots$. Neither party knows the numbers $\rho_1, \dots, \rho_n$. The goal is for $B$ to estimate these correlations. To assist in this, $A$ can send a small number of bits to $B$. A reasonable strategy is for $A$ to send $f(\boldsymbol{a}^{(1)})$, $f(\boldsymbol{a}^{(2)})$, $f(\boldsymbol{a}^{(3)})$, $\dots$ to $B$, where $f : \{1,1\}^n \to \{1,1\}$ is some boolean function. Using this information $B$ can try to estimate $\mathop{\bf E}[f(\boldsymbol{a}) \boldsymbol{b}_i]$ for each $i$.
 Show that $\mathop{\bf E}[f(\boldsymbol{a}) \boldsymbol{b}_i] = \widehat{f}(i) \rho_i$.
 This motivates choosing an $f$ for which all $\widehat{f}(i)$ are large. If we also insist all $\widehat{f}(i)$ be equal, show that majority functions $f$ maximize this common value.
 For $n \geq 2$, let $\boldsymbol{f} : \{1,1\}^n \to \{1,1\}$ be a randomly chosen function (as in Exercise 1.8). Show that $\hat{\lVert} \boldsymbol{f} \hat{\rVert}_\infty \leq 2\sqrt{n} 2^{n/2}$ except with probability at most $2^{n}$.
 Prove Theorem 6.

 Give as simple a proof as you can that the parity function $\chi_{[n]} : \{1,1\}^n \to \{1,1\}$ is not a PTF of degree $n1$.
 Show that if $f : \{1,1\}^n \to \{1,1\}$ is not $\pm \chi_{[n]}$ then it is a PTF of degree $n1$. (Hint: consider $f^{\leq n1}$.)
 For each $k \in {\mathbb N}^+$, show that there is a degree$k$ PTF $f$ with $\mathbf{W}^{\leq k}[f] < 2^{1k}$.
 In this exercise you will show that thresholdofparities circuits can be effectively simulated by thresholdofthreshold circuits, but not the converse.
 Let $f : \{1,1\}^n \to \{1,1\}$ be a symmetric function. Show that $f$ is computable as the sum of at most $2n$ LTFs, plus a constant.
 Deduce that if $f : \{1,1\}^n \to \{1,1\}$ is computable by a size$s$ thresholdofparities circuit then it is also computable by a size$2ns$ thresholdofthresholds circuit.
 Show that the complete quadratic function $\mathrm{CQ}_n : {\mathbb F}_2^{n} \to \{1,1\}$ (see Exercise 1.1) is computable by a size$2n$ thresholdofthresholds circuit.
 Assume $n$ even. Show that any thresholdofparities circuit for $\mathrm{CQ}_n$ requires size $2^{n/2}$.
 Let $f : \{1,1\}^n \to \{1,1\}$ be computable by a DNF of size $s$. Show that $f$ has a PTF representation of sparsity $O(n s^2)$. (Hint: approximate the ANDs using Theorem 10.)
 In contrast to the previous exercise, show that there is a function $f : \{1,1\}^n \to \{1,1\}$ computable by a depth$3$ $\mathsf{AC}^0$ circuit (see Chapter 4.5) but requiring thresholdofparities circuits of size at least $n^{\log n}$. (Hint: involve the inner product mod $2$ function and Exercise 4.12.)
 Let $\mathcal{F}$ be a nonempty collection of subsets $S \subseteq [n]$. For each $a \in \{1,1\}^n$, write $1_{\{a\}} : \{1,1\}^n \to \{0,1\}$ for the indicator of $\{a\}$, write $1_{\{a\}}^\mathcal{F} : \{1,1\}^n \to {\mathbb R}$ for $\sum_{S \in \mathcal{F}} \widehat{1_{\{a\}}}(S)\,\chi_S$, and write $\psi_a = \frac{2^n}{\mathcal{F}} \cdot 1_{\{a\}}^\mathcal{F}$.
 Show that $\psi_a(a) = 1$ and $\mathop{\bf E}[\psi_a^2] = \tfrac{1}{\mathcal{F}}$. Show also that for all $x \in \{1,1\}^n$, $\psi_a(x) = \psi_x(a)$ and $\sum_{a : a \neq x} \psi_a(x)^2 = \frac{2^n}{\mathcal{F}} – 1$.
 Fix $0 < \epsilon < 1$ and suppose that $\mathcal{F} \geq (1 – \tfrac{\epsilon^2}{6n}) 2^n$. Let $\boldsymbol{f} : \{1,1\}^n \to \{1,1\}$ be a random function as in Exercise 1.8. Show that for each $x \in \{1,1\}^n$, except with probability at most $4^{n}$ it holds that $\sum_{a : a \neq x} \boldsymbol{f}(a) \psi_a(x) < \epsilon$.
 Deduce that for all but a $2^{n}$ fraction of functions $f : \{1,1\}^n \to \{1,1\}$, there a multilinear polynomial $q : \{1,1\}^n \to {\mathbb R}$ supported on the monomials $\{\chi_S : S \in \mathcal{F}\}$ such that $\f – q\_\infty < \epsilon$.
 Deduce that all but a $2^{n}$ fraction of functions $f : \{1,1\}^n \to \{1,1\}$ have PTF representation of degree at most $n/2 + O(\sqrt{n \log n})$.

 Show that in the Berry–Esseen Theorem we can also conclude \[ \mathop{\bf Pr}[\boldsymbol{S} < u] – \mathop{\bf Pr}[\boldsymbol{Z} < u] \leq B \epsilon. \] (Hint: you’ll need that $\lim_{\delta \to 0^+} \mathop{\bf Pr}[\boldsymbol{Z} \leq u  \delta] = \mathop{\bf Pr}[\boldsymbol{Z} \leq u]$.)
 Deduce that if $I \subseteq {\mathbb R}$ is any interval we can also conclude \[ \mathop{\bf Pr}[\boldsymbol{S} \in I] – \mathop{\bf Pr}[\boldsymbol{Z} \in I] \leq 2B \beta. \]
 Show that the assumptions $\mathop{\bf E}[\boldsymbol{X}_i] = 0$ and $\sum_{i=1}^n \mathop{\bf Var}[\boldsymbol{X}_i] = 1$ in the Berry–Esseen Theorem are not restrictive, as follows. Let $\boldsymbol{X}_1, \dots, \boldsymbol{X}_n$ be independent random variables with finite means and variances. Let $\boldsymbol{S} = \sum_{i=1}^n \boldsymbol{X}_i$ and let $\boldsymbol{Z} \sim \mathrm{N}(\mu,\sigma^2)$, where $\mu = \sum_{i=1}^n \mathop{\bf E}[\boldsymbol{X}_i]$ and $\sigma^2 = \sum_{i=1}^n \mathop{\bf Var}[\boldsymbol{X}_i]$. Assuming $\sigma^2 > 0$, show that for all $u \in {\mathbb R}$, \[ \mathop{\bf Pr}[\boldsymbol{S} \leq u] – \mathop{\bf Pr}[\boldsymbol{Z} \leq u] \leq B \epsilon / \sigma^3, \] where \[ \epsilon = \sum_{i=1}^n \\boldsymbol{X}_i  \mathop{\bf E}[\boldsymbol{X}_i]\_3^3. \]

 Use the generalized Binomial Theorem to compute the power series for $(1z^2)^{1/2}$, valid for $z < 1$.
 Integrate to obtain the power series for $\arcsin z$ given in Section 3, valid for $z < 1$.
 Confirm that equality holds also for $z = \pm 1$.
 Verify that the random vector $\vec{\boldsymbol{S}}$ defined in equation (6) of Section 2 has $\mathop{\bf E}[\vec{\boldsymbol{S}}_1] = \mathop{\bf E}[\vec{\boldsymbol{S}}_2] = 0$, $\mathop{\bf E}[\vec{\boldsymbol{S}}_1^2] = \mathop{\bf E}[\vec{\boldsymbol{S}}_2^2] = 1$, $\mathop{\bf E}[\vec{\boldsymbol{S}}_1\vec{\boldsymbol{S}}_2] = \rho$; i.e., $\mathop{\bf E}[\vec{\boldsymbol{S}}] = \begin{bmatrix} 0 \\ 0 \end{bmatrix}$ and $\mathop{\bf Cov}[\vec{\boldsymbol{S}}] = \begin{bmatrix} 1 & \rho \\ \rho & 1 \end{bmatrix}$.
 In this exercise you will prove Sheppard’s Formula.
 Let $\boldsymbol{g}_1, \boldsymbol{g}_2$ be independent standard Gaussians and let \[ \boldsymbol{g}_1' = \boldsymbol{g}_1, \qquad \boldsymbol{g}_2' = \rho \boldsymbol{g}_1 + \sqrt{1\rho^2} \boldsymbol{g}_2. \] Show that $(\boldsymbol{g}_1′, \boldsymbol{g}_2′)$ has the same distribution as the random vector $(\boldsymbol{z}_1, \boldsymbol{z}_2)$ in Sheppard’s formula.
 Assume for now that $\rho \in [0,1)$. Show that $\boldsymbol{g}_1′, \boldsymbol{g}_2′ > 0$ if and only if $(\boldsymbol{g}_1, \boldsymbol{g}_2)$ is in a certain region $R \subseteq {\mathbb R}^2$ consisting of a quadrant plus a wedge of angle $\arcsin \rho$.
 Using the circular symmetry of the random vector $(\boldsymbol{g}_1, \boldsymbol{g}_2)$, deduce Sheppard’s formula for $\rho \in [0,1)$.
 Treat the cases $\rho \in (1,0)$, $\rho = \pm 1$.
 Prove Corollary 17.
 Fix $n$ odd. Using Theorem 16 show that $\widehat{\mathrm{Maj}_n}(S)$ is a decreasing function of $S$ for odd $1 \leq S \leq \frac{n1}{2}$. Deduce (using also Corollary 17) that $\hat{\lVert} \mathrm{Maj}_n \hat{\rVert}_\infty = \mathrm{Maj}_n(\{1\}) \sim \frac{\sqrt{2/\pi}}{\sqrt{n}}$.
 Prove Corollary 18.
 Prove Theorem 15. (Hint: Corollary 18).


Hey Ryan, enjoying your posts; wish I had time to work on these!
Looks like you want a “sgn” in Ex. 1.a.
Yep, fixed. Thanks Andy.
In the second sentence of exercise 8, have you misplaced the \hat symbols or was that deliberate?
Thanks Amirali — I think it’s okay though… There are hats on the norm signs, to denote the spectral norm. Perhaps its hard to see on your browser? Or maybe I’m missing something.
I think you can take $n+1$ instead of $2n$ in ex 12 a.
Hi Noam — that may be so (I didn’t think about it too carefully). But for safety I think I will stick with 2n, which is at least n+1 anyway.
thanks!
Ryan