<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Analysis of Boolean Functions</title>
	<atom:link href="http://www.contrib.andrew.cmu.edu/~ryanod/index.php?feed=rss2" rel="self" type="application/rss+xml" />
	<link>http://www.contrib.andrew.cmu.edu/~ryanod</link>
	<description>My tagline</description>
	<lastBuildDate>Fri, 17 May 2013 02:26:58 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.1.3</generator>
		<item>
		<title>§8.4: $p$-biased analysis</title>
		<link>http://www.contrib.andrew.cmu.edu/~ryanod/?p=1339</link>
		<comments>http://www.contrib.andrew.cmu.edu/~ryanod/?p=1339#comments</comments>
		<pubDate>Wed, 08 May 2013 21:45:47 +0000</pubDate>
		<dc:creator>Ryan O'Donnell</dc:creator>
				<category><![CDATA[All chapter sections]]></category>
		<category><![CDATA[Chapter 8: Generalized domains]]></category>
		<category><![CDATA[biased Fourier analysis]]></category>
		<category><![CDATA[derivative]]></category>
		<category><![CDATA[graph property]]></category>
		<category><![CDATA[influence]]></category>
		<category><![CDATA[random graph]]></category>
		<category><![CDATA[Russo-Margulis formula]]></category>
		<category><![CDATA[sharp threshold]]></category>
		<category><![CDATA[threshold phenomena]]></category>
		<category><![CDATA[transitive-symmetric function]]></category>

		<guid isPermaLink="false">http://www.contrib.andrew.cmu.edu/~ryanod/?p=1339</guid>
		<description><![CDATA[<p> Perhaps the most common generalized domain in analysis of boolean functions is the case of the hypercube with &#8220;biased&#8221; bits. In this setting we think of a random input in $\{-1,1\}^n$ as having each bit independently equal to $-1$ ($\mathsf{True}$) with probability $p \in (0,1)$ and equal to $1$ ($\mathsf{False}$) with probability $q = [...]]]></description>
			<content:encoded><![CDATA[<p><a name="secp-biased"></a> Perhaps the most common generalized domain in analysis of boolean functions is the case of the hypercube with &#8220;biased&#8221; bits.<br />
<span id="more-1339"></span><br />
In this setting we think of a random input in $\{-1,1\}^n$ as having each bit independently equal to $-1$ ($\mathsf{True}$) with probability $p \in (0,1)$ and equal to $1$ ($\mathsf{False}$) with probability $q = 1-p$. (We could also consider different parameters $p_i$ for each coordinate; see the exercises.) In the notation of the chapter this means $L^2(\Omega^n, \pi_p^{\otimes n})$, where $\Omega = \{-1,1\}$ and $\pi_p$ is the distribution on $\Omega$ defined by $\pi_p(-1) = p$, $\pi_p(1) = q$. This context is often referred to as <em>$p$-biased Fourier analysis</em>, though it would be more consistent with our terminology if it were called &#8220;$\mu$-biased&#8221;, where \[ \mu = \mathop{\bf E}_{{\boldsymbol{x}}_i \sim \pi_p}[{\boldsymbol{x}}_i] = q-p = 1-2p. \] One of the more interesting features of the setting is that we can fix a combinatorial boolean function $f : \{-1,1\}^n \to \{-1,1\}$ and then consider its properties for various $p$ between $0$ and $1$; we will discuss this further later in this section. We will also sometimes use the abbreviated notation $\mathop{\bf Pr}_{\pi_p}[\cdot]$ in place of $\mathop{\bf Pr}_{{\boldsymbol{x}} \sim \pi_p^{\otimes n}}[\cdot]$, and similarly $\mathop{\bf E}_{\pi_p}[\cdot]$. The $p$-biased hypercube is one of the generalized domains where it can pay to look at an explicit Fourier basis. In fact, since we have $|\Omega| = 2$ there is a <em>unique</em> Fourier basis $\{\phi_0, \phi_1\}$ (up to negating $\phi_1$). For notational simplicity we&#8217;ll write $\phi$ instead of $\phi_1$ and use &#8220;set notation&#8221; rather than multi-index notation: </p>
<blockquote><p><b>Definition 39</b> <a name="defp-biased-phi"></a> In the context of $p$-biased Fourier analysis we define the basis function $\phi : \{-1,1\} \to {\mathbb R}$ by \[ \phi(x_i) = \frac{x_i - \mu}{\sigma}, \] where \[ \mu = \mathop{\bf E}_{{\boldsymbol{x}}_i \sim \pi_p}[{\boldsymbol{x}}_i] = q-p = 1- 2p, \quad \sigma = \mathop{\bf stddev}_{{\boldsymbol{x}}_i \sim \pi_p}[{\boldsymbol{x}}_i] = \sqrt{4pq} = 2\sqrt{p}\sqrt{1-p}. \] Note that $\sigma^2 = 1 &#8211; \mu^2$. We also have the formula $\phi(1) = \sqrt{p/q}$, $\phi(-1) = -\sqrt{q/p}$. </p></blockquote>
<p> We will use the notation $\mu$ and $\sigma$ throughout this section. It&#8217;s clear that $\{1, \phi\}$ is indeed a Fourier basis for $L^2(\{-1,1\}, \pi_p)$ because $\mathop{\bf E}[\phi({\boldsymbol{x}}_i)] = 0$ and $\mathop{\bf E}[\phi({\boldsymbol{x}}_i)^2] = 1$ by design. </p>
<blockquote><p><b>Definition 40</b> In the context of $L^2(\{-1,1\}^n, \pi_p^{\otimes n})$ we define the product Fourier basis functions $(\phi_S)_{S \subseteq [n]}$ by \[ \phi_S(x) = \prod_{i \in S} \phi(x_i). \] Given $f \in L^2(\{-1,1\}^n, \pi_p^{\otimes n})$ we write $\widehat{f}(S)$ for the associated Fourier coefficient; i.e., \[ \widehat{f}(S) = \mathop{\bf E}_{{\boldsymbol{x}} \sim \pi_p^{\otimes n}}[f({\boldsymbol{x}})\,\phi_S({\boldsymbol{x}})]. \] Thus we have the biased Fourier expansion \[ f(x) = \sum_{S \subseteq [n]} \widehat{f}(S)\,\phi_S(x). \] </p></blockquote>
<p> Although the notation is very similar to that of the classic uniform-distribution Fourier analysis, we caution that in general, \[ \phi_S \phi_T \neq \phi_{S \triangle T}. \] </p>
<blockquote><p><b>Example 41</b> Let $\chi_i \in L^2(\{-1,1\}^n, \pi_p^{\otimes n})$ be the $i$th dictator function, $\chi_i(x) = x_i$, viewed under the $p$-biased distribution. We have \[ \phi(x_i) = \frac{x_i - \mu}{\sigma} \quad\Rightarrow\quad x_i = \mu + \sigma\phi(x_i), \] and the latter is evidently $f$&#8217;s (biased) Fourier expansion. I.e., \[ \widehat{\chi_i}(\emptyset) = \mu, \quad \widehat{\chi_i}(\{i\}) = \sigma, \quad \widehat{\chi_i}(S) = 0 \text{ otherwise}. \] </p></blockquote>
<p><p>
This example lets us see a link between a function&#8217;s &#8220;usual&#8221; Fourier expansion and its biased Fourier expansion. (For more on this, see the exercises.) Let&#8217;s abuse notation a little by writing simply $\phi_i$ instead of $\phi(x_i)$. We have the formulas \begin{equation} \label{eqn:p-biased-substitution} \phi_i = \frac{x_i &#8211; \mu}{\sigma} \quad\Leftrightarrow\quad x_i = \mu + \sigma \phi_i, \end{equation} and we can go from the usual Fourier expansion to the biased Fourier expansion simply by plugging in the latter. </p>
<blockquote><p><b>Example 42</b> Recall the &#8220;selection function&#8221; $\mathrm{Sel} : \{-1,1\}^3 \to \{-1,1\}$ from Exercise <a href="http://www.contrib.andrew.cmu.edu/~ryanod/?p=301#excompute-expansions" target="_blank">1.1(j)</a>; $\mathrm{Sel}(x_1, x_2, x_2)$ outputs $x_2$ if $x_1 = -1$ and outputs $x_3$ if $x_1 = 1$. The usual Fourier expansion of $\mathrm{Sel}$ is \[ \mathrm{Sel}(x_1, x_2, x_3) = \tfrac{1}{2} x_2 + \tfrac{1}{2} x_3 -\tfrac{1}{2} x_1 x_2 + \tfrac{1}{2} x_1 x_3. \] Using the substitution from \eqref{eqn:p-biased-substitution} we get \begin{align} \mathrm{Sel}(x_1, x_2, x_3) &#038;= \tfrac{1}{2} (\mu + \sigma \phi_2) + \tfrac{1}{2} (\mu + \sigma \phi_3) -\tfrac{1}{2} (\mu + \sigma \phi_1) (\mu + \sigma \phi_2) + \tfrac{1}{2} (\mu + \sigma \phi_1) (\mu + \sigma \phi_3) \nonumber\\ &#038;= \mu + (\tfrac{1}{2} &#8211; \tfrac{1}{2} \mu)\sigma \, \phi_{2} + (\tfrac{1}{2} + \tfrac{1}{2} \mu) \sigma \, \phi_{3} &#8211; \tfrac{1}{2} \sigma^2 \, \phi_1 \phi_2 + \tfrac{1}{2} \sigma^2 \, \phi_1 \phi_3. \label{eqn:sel-biased} \end{align} Thus if we write $\mathrm{Sel}^{(p)}$ for the selection function thought of as an element of $L^2(\{-1,1\}^3, \pi_p^{\otimes 3})$, we have \begin{gather*} \widehat{\mathrm{Sel}^{(p)}}(\emptyset) = \mu, \quad \widehat{\mathrm{Sel}^{(p)}}(2) = (\tfrac{1}{2} &#8211; \tfrac{1}{2} \mu)\sigma, \quad \widehat{\mathrm{Sel}^{(p)}}(3) = (\tfrac{1}{2} + \tfrac{1}{2} \mu)\sigma, \\ \widehat{\mathrm{Sel}^{(p)}}(\{1,2\}) = -\tfrac{1}{2}\sigma^2, \quad \widehat{\mathrm{Sel}^{(p)}}(\{1,3\}) = \tfrac{1}{2}\sigma^2, \quad \widehat{\mathrm{Sel}^{(p)}}(S) = 0 \text{ else}. \end{gather*} By the Fourier formulas of Section <a href="http://www.contrib.andrew.cmu.edu/~ryanod/?p=1315#secgeneral-fourier-formulas"  target="_blank">2</a> we can deduce, e.g., that $\mathop{\bf E}[\mathrm{Sel}^{(p)}] = \mu$, $\mathbf{Inf}_1[\mathrm{Sel}^{(p)}] = (-\tfrac{1}{2} \sigma^2)^2 + (\tfrac{1}{2} \sigma^2)^2 = \tfrac{1}{2} \sigma^4$, etc. </p></blockquote>
<p> Let&#8217;s codify a piece of notation from this example: </p>
<blockquote><p><b>Notation 43</b> <a name="notsupmu"></a> Let $f : \{-1,1\}^n \to {\mathbb R}$ and let $p \in (0,1)$. We write $f^{(p)}$ for the function when viewed as an element of $L^2(\{-1,1\}^n, \pi_{p}^{\otimes n})$. </p></blockquote>
<p><p>
<br/><br />
 We now discuss derivative operators. We would like to define an operator $\mathrm{D}_i$ on $L^2(\{-1,1\}^n, \pi_p^{\otimes n})$ which acts like differentiation on the biased Fourier expansion. E.g., referring to \eqref{eqn:sel-biased} we would like to have \[ \mathrm{D}_3 \mathrm{Sel}^{(p)} = (\tfrac{1}{2} + \tfrac{1}{2} \mu) \sigma + \tfrac{1}{2} \sigma^2 \, \phi_1. \] In general we are seeking $\frac{\partial}{\partial \phi_i}$ which, by basic calculus and the relationship \eqref{eqn:p-biased-substitution}, satisfies \[ \frac{\partial}{\partial \phi_i} = \frac{\partial x_i}{\partial \phi_i} \cdot \frac{\partial}{\partial x_i} = \sigma \cdot \frac{\partial}{\partial x_i}. \] Recognizing $\frac{\partial}{\partial x_i}$ as the &#8220;usual&#8221; $i$th derivative operator, we are led to the following: </p>
<blockquote><p><b>Definition 44</b> <a name="defbiased-deriv"></a> For $i \in [n]$, the <em>$i$th (discrete) derivative</em> operator $\mathrm{D}_i$ on $L^2(\{-1,1\}^n, \pi_p^{\otimes n})$ is defined by \[ \mathrm{D}_i f(x) = \sigma \cdot \frac{f(x^{(i\mapsto 1)}) - f(x^{(i \mapsto -1)})}{2}. \] Note that this defines a different operator for each value of $p$. We sometimes write the above definition as \[ \mathrm{D}_{\phi_i} = \sigma \cdot \mathrm{D}_{x_i}. \] With respect to the biased Fourier expansion of $f \in L^2(\{-1,1\}^n, \pi_p^{\otimes n})$ the operator $\mathrm{D}_i$ satisfies \begin{equation} \label{eqn:p-biased-deriv} \mathrm{D}_i f = \sum_{S \ni i} \widehat{f}(S)\,\phi_{S \setminus \{i\}}. \end{equation} </p></blockquote>
<p> Given this definition we can derive some addition formulas for influences, including a generalization of Proposition <a href="http://www.contrib.andrew.cmu.edu/~ryanod/?p=351#propmonotone-influences" target="_blank">2.20</a>: </p>
<blockquote><p><b>Proposition 45</b> <a name="propmonotone-influences-biased"></a> Suppose $f \in L^2(\{-1,1\}^n, \pi_p^{\otimes n})$ is boolean-valued (i.e., has range $\{-1,1\}$). Then \[ \mathbf{Inf}_i[f] = \sigma^2 \mathop{\bf Pr}_{{\boldsymbol{x}} \sim \pi_p^{\otimes n}}[f({\boldsymbol{x}}) \neq f({\boldsymbol{x}}^{\oplus i})] \] for each $i \in [n]$. If furthermore $f$ is monotone then $\mathbf{Inf}_i[f] = \sigma \widehat{f}(i)$. </p></blockquote>
<p> <br/><em>Proof:</em>  Using Definition <a href="#defbiased-deriv">44</a>&#8216;s notation and \eqref{eqn:p-biased-deriv} we have \[ \mathbf{Inf}_i[f] = \mathop{\bf E}_{\pi_p}[(\mathrm{D}_{\phi_i} f)^2] = \sigma^2 \mathop{\bf E}_{\pi_p}[(\mathrm{D}_{x_i} f)^2]. \] Since $(\mathrm{D}_{x_i} f)^2$ is the $0$-$1$ indicator that $i$ is pivotal for $f$, the first formula follows. When $f$ is monotone we furthermore have that $(\mathrm{D}_{x_i} f)^2 = \mathrm{D}_{x_i} f$ and hence \[ \mathbf{Inf}_i[f] = \sigma^2 \mathop{\bf E}_{\pi_p}[\mathrm{D}_{x_i} f] = \sigma \mathop{\bf E}_{\pi_p}[\mathrm{D}_{\phi_i} f] = \sigma \widehat{f}(i), \] as claimed. $\Box$</p>
<p>
<br/></p>
<p>
 The remainder of this section is devoted to the topic of <em>threshold phenomena</em> in boolean functions. Much of the motivation for this comes from theory of random graphs, which we now briefly introduce. </p>
<blockquote><p><b>Definition 46</b> Given an undirected graph $G$ on $v \geq 2$ vertices, we identify it with the string in $\{\mathsf{True}, \mathsf{False}\}^{\binom{v}{2}}$ which indicates which edges are present ($\mathsf{True}$) and which are absent ($\mathsf{False}$). We write $\mathcal{G}(v,p)$ for the distribution $\pi_{p}^{\otimes \binom{v}{2}}$; this is called the <em>Erdős&#8211;Rényi random graph model</em>. Note that if we permute the $v$ vertices of a graph, this induces a permutation on the $\binom{v}{2}$ edges. A ($v$-vertex) <em>graph property</em> is a boolean function $f : \{\mathsf{True}, \mathsf{False}\}^{\binom{v}{2}} \to \{\mathsf{True}, \mathsf{False}\}$ which is invariant under all $v!$ such permutations of its input; colloquially, this means that $f$ &#8220;does not depend on the names of the vertices&#8221;. </p></blockquote>
<p>   Graph properties are always transitive-symmetric functions in the sense of Definition <a href="http://www.contrib.andrew.cmu.edu/~ryanod/?p=344#deftransitive-symmetric" target="_blank">2.10</a>. </p>
<blockquote><p><b>Examples 47</b> The following are all $v$-vertex graph properties: \begin{align*} \text{Conn}(G) &#038;= \mathsf{True} \text{ if $G$ is connected;} \\ \text{3Col}(G) &#038;= \mathsf{True} \text{ if $G$ is $3$-colourable;} \\ \text{Clique}_{k}(G) &#038;= \mathsf{True} \text{ if $G$ is contains a clique on at least $k$ vertices;} \\ \mathrm{Maj}_{n}(G) &#038;= \mathsf{True} \textstyle \text{ (assuming $n = \binom{v}{2}$ is odd) if $G$ has at least $\binom{v}{2}/2$ edges;} \\ \chi_{[n]}(G) &#038;= \mathsf{True} \text{ if $G$ has an odd number of edges.} \\ \end{align*} Note that each of these actually defines a family of boolean functions, one for each value of $v$; this is the typical situation in the study of graph properties. An example of a function $f : \{\mathsf{True},\mathsf{False}\}^{\binom{v}{2}} \to \{\mathsf{True}, \mathsf{False}\}$ which is <em>not</em> a graph property is the one defined by $f(G) = \mathsf{True}$ if vertex #$1$ has at least one neighbour; this $f$ is not invariant under permuting the vertices. </p></blockquote>
<p> Graph properties which are <em>monotone</em> are particularly nice to study; these are the ones for which adding edges can never make the property go from $\mathsf{True}$ to $\mathsf{False}$. The properties $\text{Conn}$, $\text{Clique}_{k}$, and $\mathrm{Maj}_n$ defined above are all monotone, as is $\neg\text{3Col}$. Take any monotone graph property, say $\text{Conn}$; a typical question in random graph theory would be, &#8220;how many edges does a graph need to have before it is likely to be connected?&#8221; Or more precisely, how does $\mathop{\bf Pr}_{\boldsymbol{G} \sim \mathcal{G}(v,p)}[\text{Conn}(\boldsymbol{G}) = \mathsf{True}]$ vary as $p$ increases from $0$ to $1$?</p>
<p>
There&#8217;s no need to ask this question just for graph properties. Given any monotone boolean function $f : \{\mathsf{True},\mathsf{False}\}^n \to \{\mathsf{True}, \mathsf{False}\}$ it is intuitively clear that when $p$ increases from $0$ to $1$ this causes $\mathop{\bf Pr}_{\pi_p} [f({\boldsymbol{x}}) = \mathsf{True}]$ to increase from $0$ to $1$ (unless $f$ is a constant function). As illustration, we show a plot of $\mathop{\bf Pr}_{\pi_p} [f({\boldsymbol{x}}) = \mathsf{True}]$ versus $p$ for the dictator function, $\mathrm{AND}_2$, and $\mathrm{Maj}_{101}$.</p>
<div id="attachment_1343" class="wp-caption aligncenter" style="width: 423px"><a href="http://www.contrib.andrew.cmu.edu/~ryanod/wp-content/uploads/2013/05/threshold-examples.png"><img src="http://www.contrib.andrew.cmu.edu/~ryanod/wp-content/uploads/2013/05/threshold-examples.png" alt="" title="Threshold examples" width="413" height="326" class="size-full wp-image-1343" /></a><p class="wp-caption-text"> Plot of Pr_{π_p}(f(x) = True) versus p for f a dictator (dotted), f = AND_2 (dashed), and f =Maj_{101} (solid)</p></div>
<p>
The <em>Margulis&#8211;Russo Formula</em> quantifies the rate at which $\mathop{\bf Pr}_{\pi_p} [f({\boldsymbol{x}}) = \mathsf{True}]$ increases with $p$; specifically, it relates the slope of the curve at $p$ to the total influence of $f$ under $\pi_p^{\otimes n}$. To prove the formula we switch to $\pm 1$ notation. </p>
<blockquote><p><b>Margulis&#8211;Russo Formula</b> Let $f : \{-1,1\}^n \to {\mathbb R}$. Recalling Notation <a href="#notsupmu">43</a> and the relation $\mu = 1-2p$, we have \begin{equation} \label{eqn:marg-russo1} \frac{d}{d\mu}\mathop{\bf E}[f^{(p)}] = \frac{1}{\sigma} \cdot \sum_{i=1}^n \widehat{f^{(p)}}(i). \end{equation} In particular, if $f : \{-1,1\}^n \to \{-1,1\}$ is monotone then \begin{equation} \label{eqn:marg-russo2} \frac{d}{dp}\,\mathop{\bf Pr}_{{\boldsymbol{x}} \sim \pi_p^{\otimes n}}[f({\boldsymbol{x}}) = -1] = \frac{d}{d\mu}\mathop{\bf E}[f^{(p)}] = \frac{1}{\sigma^2} \cdot \mathbf{I}[f^{(p)}]. \end{equation}
</p></blockquote>
<p>  <br/><em>Proof:</em>  Treating $f$ as a multilinear polynomial over $x_1, \dots, x_n$ we have \[ \mathop{\bf E}[f^{(p)}] = \mathrm{T}_\mu f(1, \dots, 1) = f(\mu, \dots, \mu) \] (this also follows from Exercise <a href="http://www.contrib.andrew.cmu.edu/~ryanod/?p=301#exmultilin-interp" target="_blank">1.5</a>). By basic calculus, \[ \frac{d}{d\mu}f(\mu, \dots, \mu) = \sum_{i=1}^n \mathrm{D}_{x_i} f(\mu, \dots, \mu). \] But \[ \mathrm{D}_{x_i} f(\mu, \dots, \mu) = \mathop{\bf E}[\mathrm{D}_{x_i} f^{(p)}] = \frac{1}{\sigma} \mathop{\bf E}[\mathrm{D}_{\phi_i} f^{(p)}] = \frac{1}{\sigma} \widehat{f^{(p)}}(i), \] completing the proof of \eqref{eqn:marg-russo1}. As for \eqref{eqn:marg-russo2}, the second equality follows immediately from Proposition <a href="#propmonotone-influences-biased">45</a>. The first equality holds because $\mu = 1-2p$ and $\mathop{\bf E}[f] = 1-2\mathop{\bf Pr}[f = -1]$; the two factors of $-2$ cancel. $\Box$</p>
<p>
Looking again at the figure we see that the plot for $\mathrm{Maj}_{101}$ looks very much like a step function, jumping from nearly $0$ to nearly $1$ around the critical value $p = 1/2$. For $\mathrm{Maj}_n$, this &#8220;sharp threshold at $p = 1/2$&#8221; becomes more and more pronounced as $n$ increases. This is clearly suggested by the Margulis&#8211;Russo Formula: the derivative of the curve at $p = 1/2$ is equal to $\mathbf{I}[\mathrm{Maj}_n]$ (the usual, uniform-distribution total influence), which has the very large value $\Theta(\sqrt{n})$ (Theorem <a href="http://www.contrib.andrew.cmu.edu/~ryanod/?p=368#thmmaj-maximizes-deg-1-sum" target="_blank">2.32</a>). Such sharp thresholds exist for many boolean functions; we give some examples: </p>
<blockquote><p><b>Examples 48</b> <a name="egsharp-thresh"></a> In the exercises you are asked to show that for every $\epsilon &gt; 0$ there is a $C$ such that \[ \mathop{\bf Pr}_{\pi_{1/2 - C/\sqrt{n}}}[\mathrm{Maj}_n = \mathsf{True}] \leq \epsilon, \quad \mathop{\bf Pr}_{\pi_{1/2 + C/\sqrt{n}}}[\mathrm{Maj}_n = \mathsf{True}] \geq 1 &#8211; \epsilon. \] Regarding the Erdős&#8211;Rényi graph model, the following facts are known: \begin{align*} \mathop{\bf Pr}_{\boldsymbol{G} \sim \mathcal{G}(v,p)}[\text{Clique}_{\log v}(\boldsymbol{G}) = \mathsf{True}] &#038;\xrightarrow[v \to \infty]{} \begin{cases} 0 &#038; \text{if } p &lt; 1/4, \\ 1 &#038; \text{if } p &gt; 1/4. \end{cases} \\ \mathop{\bf Pr}_{\boldsymbol{G} \sim \mathcal{G}(v,p)}[\text{Conn}(\boldsymbol{G}) = \mathsf{True}] &#038;\xrightarrow[v \to \infty]{} \begin{cases} 0 &#038; \text{if } p &lt; \frac{\ln v}{v}(1-\tfrac{\log \log v}{\log v}), \\ 1 &#038; \text{if } p &gt; \frac{\ln v}{v}(1+\tfrac{\log \log v}{\log v}). \end{cases} \\ \end{align*} </p></blockquote>
<p> In the above examples you can see that the &#8220;jump&#8221; occurs at various values of $p$. To investigate this phenomenon, we first single out the value for which $\mathop{\bf Pr}_{\pi_p}[f({\boldsymbol{x}}) = \mathsf{True}] = 1/2$: </p>
<blockquote><p><b>Definition 49</b> <a name="defcrit-prob"></a> Let $f : \{\mathsf{True},\mathsf{False}\}^n \to \{\mathsf{True}, \mathsf{False}\}$ be monotone and nonconstant. The <em>critical probability</em> for $f$, denoted $p_c$, is the unique value in $(0,1)$ for which $\mathop{\bf Pr}_{{\boldsymbol{x}} \sim \pi_p^{\otimes n}}[f({\boldsymbol{x}}) = \mathsf{True}] = 1/2$. We also write $q_c = 1-p_c$, $\mu_c = q_c &#8211; p_c = 1-2p_c$, and $\sigma = \sqrt{4p_cq_c}$. </p></blockquote>
<p>   In the exercises you are asked to verify that $p_c$ is well defined.</p>
<p>
Now suppose that $\Delta$ is the derivative of $\mathop{\bf Pr}_{\pi_p}[f({\boldsymbol{x}}) = \mathsf{True}]$ at $p = p_c$. Roughly speaking, we would expect $\mathop{\bf Pr}_{\pi_p}[f({\boldsymbol{x}}) = \mathsf{True}]$ to jump from near $0$ to near $1$ in an interval of around $p_c$ of width about $1/\Delta$. Thus a &#8220;sharp threshold&#8221; should roughly correspond to the case that $1/\Delta$ is small, even compared to $\min(p_c, q_c)$. The Margulis&#8211;Russo Formula says that $\Delta = \frac{1}{\sigma_c^2} \mathbf{I}[f^{(p_c)}]$, and since $\min(p_c,q_c)$ is proportional to $4p_cq_c = \sigma_c^2$ it follows that $1/\Delta$ is &#8220;small&#8221; compared to $\min(p_c,q_c)$ if and only if $\mathbf{I}[f^{(p_c)}]$ is &#8220;large&#8221;. Thus we have a neat criterion:</p>
<p>
  <b>Sharp threshold principle:</b> <em>Let $f : \{\mathsf{True}, \mathsf{False}\}^n \to \{\mathsf{True}, \mathsf{False}\}$ be monotone. Then, roughly speaking, $\mathop{\bf Pr}_{\pi_p}[f({\boldsymbol{x}}) = \mathsf{True}]$ has a &#8220;sharp threshold&#8221; if and only if $f$ has &#8220;large&#8221; (&#8220;superconstant&#8221;) total influence under its critical probability distribution.</em></p>
<p>
Of course this should all be made a bit more precise; see the exercises for details. In light of this principle, we can prove that a given $f$ has a sharp threshold by proving that $\mathbf{I}[f^{(p_c)}]$ is not &#8220;small&#8221;. This strongly motivates the problem of &#8220;characterizing&#8221; functions $f \in L^2(\{-1,1\}^n, \pi_p^{\otimes n})$ for which $\mathbf{I}[f]$ is small. Friedgut&#8217;s Junta Theorem, mentioned at the end of Section <a href="http://www.contrib.andrew.cmu.edu/~ryanod/?p=535#seclow-degree" target="_blank">3.1</a> and proved in a later chapter, tells us that in the uniform distribution case $p = 1/2$, the only way $\mathbf{I}[f]$ can be small is if $f$ is close to a junta. In particular, any monotone graph property with $p_c = 1/2$ must have a sharp threshold: since it is transitive-symmetric, all $n$ coordinates are equally influential and it can&#8217;t be close to a junta. These results also hold so long as $p$ is bounded away from $0$ and $1$, as we will show in Chapter 9. However many interesting monotone graph properties have $p_c$ very close to $0$: e.g. connectivity, as we saw in Examples <a href="#egsharp-thresh">48</a>. Characterizing the functions $f \in L^2(\{-1,1\}^n, \pi_p^{\otimes n})$ with small $\mathbf{I}[f]$ when $p = o_n(1)$ is a trickier task; see the work of Friedgut, Bourgain, and Hatami described in Chapter 10.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.contrib.andrew.cmu.edu/~ryanod/?feed=rss2&#038;p=1339</wfw:commentRss>
		<slash:comments>3</slash:comments>
		</item>
		<item>
		<title>§8.3: Orthogonal decomposition</title>
		<link>http://www.contrib.andrew.cmu.edu/~ryanod/?p=1323</link>
		<comments>http://www.contrib.andrew.cmu.edu/~ryanod/?p=1323#comments</comments>
		<pubDate>Mon, 29 Apr 2013 02:20:07 +0000</pubDate>
		<dc:creator>Ryan O'Donnell</dc:creator>
				<category><![CDATA[All chapter sections]]></category>
		<category><![CDATA[Chapter 8: Generalized domains]]></category>
		<category><![CDATA[ANOVA]]></category>
		<category><![CDATA[Efron-Stein]]></category>
		<category><![CDATA[Hoeffding decomposition]]></category>
		<category><![CDATA[orthogonal decomposition]]></category>
		<category><![CDATA[product probability space]]></category>

		<guid isPermaLink="false">http://www.contrib.andrew.cmu.edu/~ryanod/?p=1323</guid>
		<description><![CDATA[<p>In this section we describe a basis-free kind of &#8220;Fourier expansion&#8221; for functions on general product domains. We will refer to it as the orthogonal decomposition of $f \in L^2(\Omega^n, \pi^{\otimes n})$ though it goes by several other names in the literature: e.g., Hoeffding, Efron&#8211;Stein, or ANOVA decomposition. The general idea is to express \begin{equation} [...]]]></description>
			<content:encoded><![CDATA[<p>In this section we describe a basis-free kind of &#8220;Fourier expansion&#8221; for functions on general product domains. We will refer to it as the <em>orthogonal decomposition</em> of $f \in L^2(\Omega^n, \pi^{\otimes n})$ though it goes by several other names in the literature: e.g., <em>Hoeffding</em>, <em>Efron&#8211;Stein</em>, or <em>ANOVA decomposition</em>.<br />
<span id="more-1323"></span><br />
The general idea is to express \begin{equation} \label{eqn:pm1-orthog-decomp} f = \sum_{S \subseteq [n]} f^{=S} \end{equation} where each function $f^{=S} \in L^2(\Omega^n, \pi^{\otimes n})$ gives the &#8220;contribution to $f$ coming from coordinates $S$ (but not from any subset of $S$)&#8221;.</p>
<p>
To make this more precise, let&#8217;s start with the familiar case of $f : \{-1,1\}^n \to {\mathbb R}$. Here it is possible to define the functions $f^{=S} : \{-1,1\}^n \to {\mathbb R}$ simply by $f^{=S} = \widehat{f}(S)\,\chi_S$. (Later we will give an equivalent definition which doesn&#8217;t involve the Fourier basis.) This definition satisfies \eqref{eqn:pm1-orthog-decomp} as well as the following two properties: </p>
<ol>
<li> $f^{=S}$ depends only on the coordinates in $S$.
<li> If $T \subsetneq S$ and $g$ is a function depending only on the coordinates in $T$, then $\langle f^{=S}, g \rangle = 0$.
</ol>
<p> These properties describe what we mean precisely when we say that $f^{=S}$ is the &#8220;contribution to $f$ coming from coordinates $S$ (but not from any subset of $S$)&#8221;. Furthermore, the decomposition \eqref{eqn:pm1-orthog-decomp} is <em>orthogonal</em>, meaning $\langle f^{=S}, f^{=T} \rangle = 0$ whenever $S \neq T$.</p>
<p>
To make this definition basis-free, recall the &#8220;projection of $f$ onto coordinates $J$&#8221;, $f^{\subseteq J}$ Definition <a href="http://www.contrib.andrew.cmu.edu/~ryanod/?p=1315#defprojection-onto-coords" target = "_blank">17</a>. You can think of $f^{\subseteq J}$ as the &#8220;contribution to $f$ coming from coordinates $J$ (collectively)&#8221;. It has a probabilistic definition not depending on any basis, and with the definition $f^{=S} = \widehat{f}(S)\,\chi_S$ we have from Proposition <a href="http://www.contrib.andrew.cmu.edu/~ryanod/?p=1315#propprojection-formula" target="_blank">19</a> that \begin{equation} \label{eqn:orthog-invert-me} f^{\subseteq J} = \sum_{S \subseteq J} f^{=S}. \end{equation} It is precisely by inverting \eqref{eqn:orthog-invert-me} that we can give a basis-free definition of the functions $f^{=S}$.</p>
<p>
Let&#8217;s do this inversion for a general $f \in L^2(\Omega^n, \pi^{\otimes n})$. The projection functions $f^{\subseteq J} \in L^2(\Omega^n, \pi^{\otimes n})$ can be defined as in Definition <a href="http://www.contrib.andrew.cmu.edu/~ryanod/?p=1315#defprojection-onto-coords" target="_blank">17</a>. If we want \eqref{eqn:orthog-invert-me} to hold for $J = \emptyset$ then we should define \[ f^{=\emptyset} = f^{\subseteq \emptyset} \] (which is the constant function equal to $\mathop{\bf E}[f]$). Given this, if we want \eqref{eqn:orthog-invert-me} to hold for singleton sets $J = \{j\}$ then we need \[ f^{\subseteq \{j\}} = f^{=\emptyset} + f^{=\{j\}} \quad\Leftrightarrow\quad f^{=\{j\}} = f^{\subseteq \{j\}} - f^{\subseteq \emptyset}. \] In other words, \[ f^{= \{j\}}(x) = \mathop{\bf E}_{{\boldsymbol{x}} \sim \pi^{\otimes n}} [f \mid {\boldsymbol{x}}_j = x_j] &#8211; \mathop{\bf E}_{{\boldsymbol{x}} \sim \pi^{\otimes n}} [f({\boldsymbol{x}})]. \] Notice this function only depends on the input value $x_j$; it measures the change in expectation of $f$ if you know the value $x_j$. Moving on to sets of cardinality $2$, if we want \eqref{eqn:orthog-invert-me} to hold for $J = \{i,j\}$ then we need \begin{align*} f^{\subseteq \{i,j\}} &#038;= f^{=\emptyset} + f^{=\{i\}} + f^{=\{j\}} + f^{=\{i,j\}} \\ &#038;= f^{\subseteq \emptyset} + (f^{\subseteq \{i\}} &#8211; f^{\subseteq \emptyset}) + (f^{\subseteq \{j\}} &#8211; f^{\subseteq \emptyset}) + f^{=\{i,j\}} \end{align*} and hence \[ f^{=\{i,j\}} = f^{\subseteq \{i,j\}} - f^{\subseteq \{i\}} - f^{\subseteq \{j\}} + f^{\subseteq \emptyset}. \] It&#8217;s clear that we can continue this and define all the functions $f^{=S}$ by the principle of inclusion-exclusion. To show this definition leads to an orthogonal decomposition we will need the following lemma: </p>
<blockquote><p><b>Lemma 34</b> <a name="lemorthog-subsets"></a> Let $f, g \in L^2(\Omega^n, \pi^{\otimes n})$. Assume that $f$ does not depend on any coordinate outside $I \subseteq [n]$, and $g$ does not depend on any coordinate outside $J \subseteq [n]$. Then $\langle f, g \rangle = \langle f^{\subseteq I \cap J}, g^{\subseteq I \cap J} \rangle$. </p></blockquote>
<p> <br/><em>Proof:</em>  We may assume without loss of generality that $I \cup J = [n]$. Given any $x \in \Omega^n$ we can break it into the parts $(x_{I \cap J}, x_{I \setminus J}, x_{J \setminus I})$. We then have \begin{align*} \langle f, g\rangle = \mathop{\bf E}_{{\boldsymbol{x}}_{I \cap J}, {\boldsymbol{x}}_{I \setminus J}, {\boldsymbol{x}}_{J \setminus I}}[f({\boldsymbol{x}}_{I \cap J}, {\boldsymbol{x}}_{I \setminus J}) \cdot g({\boldsymbol{x}}_{I \cap J}, {\boldsymbol{x}}_{J \setminus I})], \end{align*} where we have abused notation slightly by writing $f$ and $g$ as functions just of the coordinates on which they actually depend. Since ${\boldsymbol{x}}_{I \setminus J}$ and ${\boldsymbol{x}}_{J \setminus I}$ are independent, the above equals \[ \mathop{\bf E}_{{\boldsymbol{x}}_{I \cap J}}\left[\mathop{\bf E}_{{\boldsymbol{x}}_{I \setminus J}}[f({\boldsymbol{x}}_{I \cap J}, {\boldsymbol{x}}_{I \setminus J})] \cdot \mathop{\bf E}_{{\boldsymbol{x}}_{J \setminus I}}[g({\boldsymbol{x}}_{I \cap J}, {\boldsymbol{x}}_{J \setminus I})] \right]. \] But now $\mathop{\bf E}_{{\boldsymbol{x}}_{I \setminus J}}[f({\boldsymbol{x}}_{I \cap J}, {\boldsymbol{x}}_{I \setminus J})]$ is nothing more than $f^{\subseteq I \cap J}({\boldsymbol{x}}_{I \cap J})$, and similarly $\mathop{\bf E}_{{\boldsymbol{x}}_{J \setminus I}}[g({\boldsymbol{x}}_{I \cap J}, {\boldsymbol{x}}_{J \setminus I})] = g^{\subseteq I \cap J}({\boldsymbol{x}}_{I \cap J})$. Thus the above equals \[ \mathop{\bf E}_{{\boldsymbol{x}}_{I \cap J}}[f^{\subseteq I \cap J}({\boldsymbol{x}}_{I \cap J}) \cdot g^{\subseteq I \cap J}({\boldsymbol{x}}_{I \cap J})] = \langle f^{\subseteq I \cap J}, g^{\subseteq I \cap J} \rangle. \qquad \Box\]</p>
<p>
We can now give the main theorem on orthogonal decomposition: </p>
<blockquote><p><b>Theorem 35</b> <a name="thmorthogonal-decomposition"></a> Let $f \in L^2(\Omega^n, \pi^{\otimes n})$. Then $f$ has a unique decomposition as \[ f = \sum_{S \subseteq [n]} f^{=S} \] where the functions $f^{=S} \in L^2(\Omega^n, \pi^{\otimes n})$ satisfy the following: </p>
<ol type = "i">
<li> <a name="itemfirst-orthog"></a> $f^{=S}$ depends only on the coordinates in $S$.
<li> <a name="itemorthog-decomp-ortho"></a> If $T \subsetneq S$ and $g \in L^2(\Omega^n, \pi^{\otimes n})$ depends only on the coordinates in $T$ then $\langle f^{=S}, g \rangle = 0$.
</ol>
<p> This decomposition has the following additional properties: </p>
<ol type = "i" start="3">
<li> <a name="itemorthog-decomp-ortho2"></a> Condition (ii) additionally holds whenever $S \not \subseteq T$.
<li> <a name="itemorthog-decomp-ortho3"></a> The decomposition is <em>orthogonal</em>: $\langle f^{=S}, f^{=T} \rangle = 0$ for $S \neq T$.
<li> <a name="itemby-incl-excl"></a> $\sum_{S \subseteq T} f^{=S} = f^{\subseteq T}$.
<li> <a name="itemorthog-linear"></a> For each $S \subseteq [n]$, the mapping $f \mapsto f^{=S}$ is a linear operator.
</ol>
</blockquote>
<p> <br/><em>Proof:</em>  We first show the existence of a decomposition satisfying (i)&#8211;(vi). We then show uniqueness for decompositions satisfying (i) and (ii). As suggested above, for each $S \subseteq [n]$ we define \[ f^{=S} = \sum_{J \subseteq S} (-1)^{|S| - |J|} f^{\subseteq J}, \] where the functions $f^{\subseteq J} \in L^2(\Omega^n, \pi^{\otimes n})$ are as in Definition <a href="http://www.contrib.andrew.cmu.edu/~ryanod/?p=1315#defprojection-onto-coords" target="_blank">17</a>. Since each $f^{\subseteq J}$ depends only on the coordinates in $J$, condition (i) certainly holds. It is also immediate that condition (v) holds by inclusion-exclusion; you are asked to prove this explicitly in the exercises. Condition (vi) also follows because each $f \mapsto f^{\subseteq J}$ is a linear operator, as discussed after Definition <a href="http://www.contrib.andrew.cmu.edu/~ryanod/?p=1315#defprojection-onto-coords" target="_blank">17</a>.</p>
<p>
 We now verify (ii). Assume $T \subsetneq S$ and that $g \in L^2(\Omega^n, \pi^{\otimes n})$ only depends on the coordinates in $T$. We have \begin{equation} \label{eqn:pair-me} \langle f^{=S}, g \rangle = \sum_{J \subseteq S} (-1)^{|S| &#8211; |J|} \langle f^{\subseteq J}, g \rangle. \end{equation} Take any $i \in S \setminus T$ and pair up the summands in \eqref{eqn:pair-me} as $J&#8217;$, $J&#8221;$, where $J&#8217; \not \ni i$ and $J&#8221; = J&#8217; \cup \{i\}$. By Lemma <a href="#lemorthog-subsets">34</a> we have \[ \langle f^{\subseteq J''}, g \rangle = \langle f^{\subseteq J'' \cap T}, g^{\subseteq T} \rangle = \langle f^{\subseteq J' \cap T}, g^{\subseteq T} \rangle, \] the latter equality using $i \not \in T$. But the signs $(-1)^{|S| &#8211; |J&#8217;|}$ and $(-1)^{|S| &#8211; |J&#8221;|}$ are opposite, so the summands in \eqref{eqn:pair-me} cancel in pairs. This shows the sum is $0$, confirming (ii).</p>
<p>
 We complete the existence proof by noting that (ii) $\Rightarrow$ (iii) $\Rightarrow$ (iv) (assuming (i)). The first implication is because $\langle f^{=S}, g \rangle = \langle f^{= S}, g^{\subseteq S \cap T} \rangle$ when $g$ depends only on the coordinates in $T$ (Lemma <a href="#lemorthog-subsets">34</a>), and $S \cap T \subsetneq S$ when $S \not \subseteq T$. The second implication is because $S \neq T$ implies either $S \not \subseteq T$ or $T \not \subseteq S$.</p>
<p>
 It remains to prove the uniqueness statement. Suppose $f$ has two representations satisfying (i) and (ii). By subtracting them we get a decomposition of the $0$ function that satisfies (i) and (ii); our goal is to show that each function in this decomposition is the $0$ function. We can do this by showing that any decomposition satisfying (i) and (ii) also satisfies &#8220;Parseval&#8217;s Theorem&#8221;: $\langle f, f \rangle = \sum_{S \subseteq [n]} \|f^{=S}\|_2^2$. But this is an easy consequence of (iv), which we just noted is itself a consequence of  (i) and (ii). $\Box$</p>
<p>
We can connect the orthogonal decomposition of $f$ to its expansion under Fourier bases as follows: </p>
<blockquote><p><b>Proposition 36</b> <a name="proporthog-decomp-basis"></a> Let $f \in L^2(\Omega^n, \pi^{\otimes n})$ have orthogonal decomposition $f = \sum_{S \subseteq [n]} f^{=S}$. Fix any Fourier basis $\phi_0, \dots, \phi_{m-1}$ for $L^2(\Omega, \pi)$. Then \begin{equation} \label{eqn:orthog-decomp-basis} f^{=S} = \sum_{\substack{\alpha \in {\mathbb N}^n_{&lt; m} \\ \mathrm{supp}(\alpha) = S}} \widehat{f}(\alpha)\,\phi(\alpha). \end{equation} </p></blockquote>
<p> <br/><em>Proof:</em>  This follows easily from the uniqueness part of Theorem <a href="#thmorthogonal-decomposition">35</a>. If we take \eqref{eqn:orthog-decomp-basis} as the definition of functions $f^{=S}$, it is immediate that $\sum_S f^{=S} = f$ and that $f^{=S}$ depends only on the coordinates in $S$. Further, if $g$ depends only on coordinates $T \subsetneq S$ then $f^{=S}$ and $g$ have disjoint Fourier support by Corollary <a href="http://www.contrib.andrew.cmu.edu/~ryanod/?p=1315#corf-depends" target="_blank">20</a>; hence $\langle f^{=S}, g \rangle = 0$ by Plancherel (Proposition <a href="http://www.contrib.andrew.cmu.edu/~ryanod/?p=1315#propgeneral-plancherel-etc" target="_blank">16</a>). $\Box$</p>
<blockquote><p><b>Example 37</b> <a name="egorthog-decomp1"></a> Let&#8217;s compute the orthogonal decomposition of the function $f : \{a,b,c\}^2 \to \{0,1\}$ from Example <a href="http://www.contrib.andrew.cmu.edu/~ryanod/?p=1305#egand2-3ary" target="_blank">15</a>. Recall that in this example $\{a,b,c\}$ has the uniform distribution and $f(x_1,x_2) = 1$ if and only if $x_1 = x_2 = c$. First, \[ f^{=\emptyset} = \mathop{\bf E}[f] = \tfrac{1}{9}. \] Next, for $i = 1,2$ we have that $f^{\subseteq \{i\}}(x)$ is $\frac13$ if $x_i = c$ and $0$ otherwise; hence \[ f^{= \{i\}}(x_1,x_2) = \begin{cases} +\frac{2}{9} &#038; \text{if } x_i = c, \\ -\frac{1}{9} &#038; \text{else.} \end{cases} \] Finally, it&#8217;s easiest to compute $f^{=\{1,2\}}$ as $f &#8211; f^{=\emptyset} &#8211; f^{=\{1\}} &#8211; f^{=\{2\}}$; this yields \[ f^{= \{1,2\}}(x_1,x_2) = \begin{cases} +\frac{4}{9} &#038; \text{if } x_1 = x_2 = c, \\ -\frac{2}{9} &#038; \text{if exactly one of } x_1,\ x_2 \text{ is } c, \\ +\frac{1}{9} &#038; \text{if } x_1, x_2 \neq c. \end{cases} \] You can check (exercise) that this is consistent with Proposition <a href="#proporthog-decomp-basis">36</a> and the Fourier expansion from Example <a href="http://www.contrib.andrew.cmu.edu/~ryanod/?p=1305#egand2-3ary" target="_blank">15</a>. </p></blockquote>
<p> We can write all of the Fourier formulas from Section <a href="http://www.contrib.andrew.cmu.edu/~ryanod/?p=1315#secgeneral-fourier-formulas" target="_blank">2</a> in terms of the orthogonal decomposition; e.g., \[ \langle f, g \rangle = \sum_{S \subseteq [n]} \langle f^{=S}, g^{=S} \rangle, \quad \mathbf{Inf}_i[f] = \sum_{S \ni i} \|f^{=S}\|_2^2, \quad \mathrm{T}_\rho f = \sum_{S \subseteq [n]} \rho^{|S|} f^{=S}. \] These formulas can be proved either by using the connection from Proposition <a href="#proporthog-decomp-basis">36</a> or by reasoning directly from the defining Theorem <a href="#thmorthogonal-decomposition">35</a>; see the exercises. The orthogonal decomposition also gives us the natural way of stratifying $f$ by degree: we end this section by generalizing some more definitions from Chapter <a href="http://www.contrib.andrew.cmu.edu/~ryanod/?p=253#secbasic-fourier-formulas" target="_blank">1.4</a>: </p>
<blockquote><p><b>Definition 38</b> For $f \in L^2(\Omega^n, \pi^{\otimes n})$ and $k \in {\mathbb N}$ we define the <em>degree $k$ part of $f$</em> to be $f^{=k} = \sum_{|S| = k} f^{=S}$ and the <em>weight of $f$ at degree $k$</em> to be $\mathbf{W}^{k}[f] = \|f^{=k}\|_2^2$. We also use notation like $f^{\leq k} = \sum_{|S| \leq k} f^{=S}$ and $\mathbf{W}^{&gt; k}[f] = \sum_{|S| &gt; k} \|f^{=S}\|_2^2$. </p></blockquote>
]]></content:encoded>
			<wfw:commentRss>http://www.contrib.andrew.cmu.edu/~ryanod/?feed=rss2&#038;p=1323</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>§8.2: Generalized Fourier formulas</title>
		<link>http://www.contrib.andrew.cmu.edu/~ryanod/?p=1315</link>
		<comments>http://www.contrib.andrew.cmu.edu/~ryanod/?p=1315#comments</comments>
		<pubDate>Wed, 17 Apr 2013 14:38:18 +0000</pubDate>
		<dc:creator>Ryan O'Donnell</dc:creator>
				<category><![CDATA[All chapter sections]]></category>
		<category><![CDATA[Chapter 8: Generalized domains]]></category>
		<category><![CDATA[degree]]></category>
		<category><![CDATA[expectation operator]]></category>
		<category><![CDATA[influence]]></category>
		<category><![CDATA[Laplacian operator]]></category>
		<category><![CDATA[noise operator]]></category>
		<category><![CDATA[noise stability]]></category>
		<category><![CDATA[Parseval]]></category>
		<category><![CDATA[Plancherel]]></category>
		<category><![CDATA[projection onto coordinates]]></category>
		<category><![CDATA[stable influence]]></category>
		<category><![CDATA[total influence]]></category>

		<guid isPermaLink="false">http://www.contrib.andrew.cmu.edu/~ryanod/?p=1315</guid>
		<description><![CDATA[<p>In this section we will revisit a number of combinatorial/probabilistic notions and show that for functions $f \in L^2(\Omega^n, \pi^{\otimes n})$, these notions have familiar Fourier formulas which don&#8217;t depend on the Fourier basis. </p> <p> The orthonormality of Fourier bases gives us some formulas almost immediately: </p> <p>Proposition 16 Let $f, g \in L^2(\Omega^n, [...]]]></description>
			<content:encoded><![CDATA[<p>In this section we will revisit a number of combinatorial/probabilistic notions and show that for functions $f \in L^2(\Omega^n, \pi^{\otimes n})$, these notions have familiar Fourier formulas which don&#8217;t depend on the Fourier basis.<br />
<span id="more-1315"></span></p>
<p>
The orthonormality of Fourier bases gives us some formulas almost immediately: </p>
<blockquote><p><b>Proposition 16</b> <a name="propgeneral-plancherel-etc"></a> Let $f, g \in L^2(\Omega^n, \pi^{\otimes n})$. Then for any fixed product Fourier basis, the following formulas hold: \begin{align*} \mathop{\bf E}[f] &#038;= \widehat{f}(0)\\ \mathop{\bf E}[f^2] &#038;= \sum_{\alpha \in {\mathbb N}_{&lt; m}^n} \widehat{f}(\alpha)^2 \tag{Parseval} \\ \mathop{\bf Var}[f] &#038;= \sum_{\alpha \neq 0} \widehat{f}(\alpha)^2 \\ \langle f, g \rangle &#038;= \sum_{\alpha \in {\mathbb N}_{&lt;m}^n} \widehat{f}(\alpha)\widehat{g}(\alpha) \tag{Plancherel} \\ \mathop{\bf Cov}[f,g] &#038;= \sum_{\alpha \neq 0} \widehat{f}(\alpha)\widehat{g}(\alpha). \end{align*} </p></blockquote>
<p> <br/><em>Proof:</em>  We verify Plancherel&#8217;s Theorem, from which the other identities follow (see the exercises): \begin{align*} \langle f, g \rangle &#038;= \Bigl\langle \sum_{\alpha \in {\mathbb N}_{&lt; m}^n} \widehat{f}(\alpha) \phi_\alpha, \sum_{\beta \in {\mathbb N}_{&lt; m}^n} \widehat{g}(\beta) \phi_\beta \Bigr\rangle \\ &#038;= \sum_{\alpha, \beta \in {\mathbb N}_{&lt; m}^n} \widehat{f}(\alpha) \widehat{g}(\beta) \langle \phi_\alpha, \phi_\beta \rangle \\ &#038;= \sum_{\alpha \in {\mathbb N}_{&lt; m}^n} \widehat{f}(\alpha) \widehat{g}(\alpha) \end{align*} by orthonormality of $(\phi_\alpha)_{\alpha \in {\mathbb N}_{&lt;m}^n}$. $\Box$</p>
<p>
We now give a definition which will be the key for developing basis-independent Fourier expansions. In the case of $L^2(\{-1,1\})$ this definition appeared already in Exercise <a href="http://www.contrib.andrew.cmu.edu/~ryanod/?p=602#exorthog1" target="_blank">3.28</a>. </p>
<blockquote><p><b>Definition 17</b> <a name="defprojection-onto-coords"></a> Let $J \subseteq [n]$ and write ${\overline{J}} = [n] \setminus J$. Given $f \in L^2(\Omega^n, \pi^{\otimes n})$, the <em>projection of $f$ on coordinates $J$</em> is the function $f^{\subseteq J} \in L^2(\Omega^n, \pi^{\otimes n})$ defined by \[ f^{\subseteq J}(x) = \mathop{\bf E}_{{\boldsymbol{x}}' \sim \pi^{\otimes {\overline{J}}}}[f(x_J, {\boldsymbol{x}}')], \] where $x_J \in \Omega^J$ denotes the values of $x$ in the $J$-coordinates. In other words, $f^{\subseteq J}(x)$ is the expectation of $f$ when the ${\overline{J}}$-coordinates of $x$ are rerandomized. Note that we take $f^{\subseteq J}$ to have $\Omega^n$ as its domain, even though it only depends on the coordinates in $J$.</p>
<p>
 Forming $f^{\subseteq J}$ is indeed the application of a projection linear operator to $f$, namely the <em>expectation over ${\overline{J}}$ operator</em>, $\mathrm{E}_{{\overline{J}}}$. We take this as the definition of the operator: $\mathrm{E}_{{\overline{J}}} f = f^{\subseteq J}$. When ${\overline{J}} = \{i\}$ is a singleton we write simply $\mathrm{E}_i$. </p></blockquote>
<blockquote><p><b>Remark 18</b> This definition of $\mathrm{E}_i$ is consistent with Definition <a href="http://www.contrib.andrew.cmu.edu/~ryanod/?p=351#defexpectation-operator" target="_blank">2.22</a>. You are asked to verify that $\mathrm{E}_{{\overline{J}}}$ is indeed a projection, self-adjoint linear operator in the exercises. </p></blockquote>
<blockquote><p><b>Proposition 19</b> <a name="propprojection-formula"></a> Let $J \subseteq [n]$ and $f \in L^2(\Omega^n, \pi^{\otimes n})$. Then for any fixed product Fourier basis, \[ f^{\subseteq J} = \sum_{\substack{\alpha \in {\mathbb N}_{&lt; m}^n \\ \mathrm{supp}(\alpha) \subseteq J}} \widehat{f}(\alpha)\,\phi_\alpha. \] </p></blockquote>
<p> <br/><em>Proof:</em>  Since $\mathrm{E}_{{\overline{J}}}$ is a linear operator, it suffices to verify for all $\alpha$ that \[ \phi_\alpha^{\subseteq J} = \begin{cases} \phi_\alpha &#038; \text{if } \mathrm{supp}(\alpha) \subseteq J, \\ 0 &#038; \text{otherwise}. \end{cases} \] If $\mathrm{supp}(\alpha) \subseteq J$ then $\phi_\alpha$ does not depend on the coordinates outside $J$; hence indeed $\phi_\alpha^{\subseteq J}= \phi_\alpha$. So suppose $\mathrm{supp}(\alpha) \not \subseteq J$. Since $\phi_\alpha(x) = \bigl(\mathop{{\textstyle \prod}}_{i\in J} \phi_{\alpha_i}(x_i)\bigr)\bigl(\mathop{{\textstyle \prod}}_{i\in {\overline{J}}} \phi_{\alpha_i}(x_i)\bigr)$, we can write $\phi_\alpha = \phi_{\alpha_J} \cdot \phi_{\alpha_{\overline{J}}}$, where $\phi_{\alpha_J}$ depends only on the coordinates in $J$, $\phi_{\alpha_{{\overline{J}}}}$ depends only on the coordinates in ${\overline{J}}$, and $\mathop{\bf E}[\alpha_{{\overline{J}}}] = 0$ precisely because $\mathrm{supp}(\alpha) \not\subseteq J$. Thus for every $x \in \Omega^n$, \[ \phi_{\alpha}^{\subseteq J}(x) = \mathop{\bf E}_{{\boldsymbol{x}}' \sim \pi^{\otimes {\overline{J}}}}[\phi_{\alpha_J}(x_J) \phi_{\alpha_{{\overline{J}}}}({\boldsymbol{x}}')] = \phi_{\alpha_J}(x_J) \cdot \mathop{\bf E}_{{\boldsymbol{x}}&#8217; \sim \pi^{\otimes {\overline{J}}}}[ \phi_{\alpha_{{\overline{J}}}}({\boldsymbol{x}}')] = 0 \] as needed. $\Box$</p>
<blockquote><p><b>Corollary 20</b> <a name="corf-depends"></a> Let $f \in L^2(\Omega^n,\pi^{\otimes n})$ and fix a product Fourier basis. If $f$ depends only on the coordinates in $J \subseteq [n]$ then $\widehat{f}(\alpha) = 0$ whenever $\mathrm{supp}(\alpha) \not \subseteq J$. </p></blockquote>
<p> <br/><em>Proof:</em>  This follows from Proposition <a href="#propprojection-formula">19</a> because $f = f^{\subseteq J}$. $\Box$</p>
<blockquote><p><b>Corollary 21</b> <a name="corgeneral-expec-formula"></a> Let $i \in [n]$ and $f \in L^2(\Omega^n, \pi^{\otimes n})$. Then for any fixed product Fourier basis, \[ \mathrm{E}_i f = \sum_{\alpha : \alpha_i = 0} \widehat{f}(\alpha)\,\phi_\alpha. \] </p></blockquote>
<p><p>
 Let us now define influences for functions $f \in L^2(\Omega^n, \pi^{\otimes n})$. In the case of $\Omega = \{-1,1\}$, our definition of $\mathbf{Inf}_i[f]$ from Chapter <a href="http://www.contrib.andrew.cmu.edu/~ryanod/?p=351#secinfluences" target="_blank">2.2</a> was $\mathop{\bf E}[\mathrm{D}_i f]$. However the notion of a derivative operator does not make sense for more general domains $\Omega$. In fact, even in the case of $\Omega = \{-1,1\}$ it isn&#8217;t a basis-invariant notion: the choice of $\frac{f(x^{(i\mapsto 1)}) &#8211; f(x^{(i \mapsto -1)})}{2}$ rather than $\frac{f(x^{(i\mapsto -1)}) &#8211; f(x^{(i \mapsto 1)})}{2}$ is inherently arbitrary. Instead we can fall back on the <em>Laplacian</em> operators, and take the identity $\mathbf{Inf}_i[f] = \langle f, \mathrm{L}_i f \rangle$ from Proposition <a href="http://www.contrib.andrew.cmu.edu/~ryanod/?p=351#propdir-laplacian-facts" target="_blank">2.25</a> as a definition. </p>
<blockquote><p><b>Definition 22</b> Let $i \in [n]$ and $f \in L^2(\Omega^n, \pi^{\otimes n})$. The <em>$i$th coordinate Laplacian operator</em> $\mathrm{L}_i$ is the self-adjoint, projection linear operator defined by \[ \mathrm{L}_i f = f - \mathrm{E}_i f. \] The <em>influence of coordinate $i$ on $f$</em> is defined to be \[ \mathbf{Inf}_i[f] = \langle f, \mathrm{L}_i f \rangle = \langle \mathrm{L}_i f, \mathrm{L}_i f \rangle. \] The <em>total influence</em> of $f$ is defined to be $\mathbf{I}[f] = \sum_{i=1}^n \mathbf{Inf}_i[f]$. </p></blockquote>
<p>   You can think of $\mathrm{L}_i f$ as &#8220;the part of $f$ which depends on the $i$th coordinate&#8221;. </p>
<blockquote><p><b>Proposition 23</b> <a name="propgeneral-laplac-influence"></a> Let $i \in [n]$ and $f \in L^2(\Omega^n, \pi^{\otimes n})$. Then for any fixed product Fourier basis, \begin{align*} \mathrm{L}_i f = \sum_{\alpha : \alpha_i \neq 0} \widehat{f}(\alpha)\,\phi_\alpha, \quad \mathbf{Inf}_i[f] = \sum_{\alpha : \alpha_i \neq 0} \widehat{f}(\alpha)^2, \quad \mathbf{I}[f] = \sum_{\alpha} \#\alpha \cdot \widehat{f}(\alpha)^2, \end{align*} </p></blockquote>
<p> <br/><em>Proof:</em>  The first formula is immediate from Corollary <a href="#corgeneral-expec-formula">21</a>, the second from Plancherel, and the third from summing over $i$. $\Box$</p>
<p> In the exercises you are asked to verify the following formulas which are often useful for computations: </p>
<blockquote><p><b>Proposition 24</b> <a name="propdir-laplacian-facts2"></a> Let $i \in [n]$ and $f \in L^2(\Omega^n, \pi^{\otimes n})$. Then \[ \mathbf{Inf}_i[f] = \mathop{\bf E}_{{\boldsymbol{x}} \sim \pi^{\otimes n}}[\mathop{\bf Var}_{{\boldsymbol{x}}_i' \sim \pi}[f({\boldsymbol{x}}_1, \dots, {\boldsymbol{x}}_{i-1}, {\boldsymbol{x}}_i', {\boldsymbol{x}}_{i+1}, \dots, {\boldsymbol{x}}_n)]]. \] If furthermore $f$&#8217;s range is $\{-1,1\}$ then \[ \mathbf{Inf}_i[f] = \mathop{\bf E}[|\mathrm{L}_i f|] = 2\mathop{\bf Pr}_{\substack{{\boldsymbol{x}} \sim \pi^{\otimes n} \\ {\boldsymbol{x}}_i&#8217; \sim \pi}}[f({\boldsymbol{x}}) \neq f({\boldsymbol{x}}_1, \dots, {\boldsymbol{x}}_{i-1}, {\boldsymbol{x}}_i', {\boldsymbol{x}}_{i+1}, \dots, {\boldsymbol{x}}_n)]. \] </p></blockquote>
<blockquote><p><b>Example 25</b> Let&#8217;s continue Example <a href="#egand2-3ary">15</a>, in which $\{a,b,c\}$ has the uniform distribution and $f : \{a,b,c\}^2 \to \{0,1\}$ is $1$ if and only if both inputs are $c$. We compute $\mathbf{Inf}_1[f]$ two ways. Using Proposition <a href="#propdir-laplacian-facts2">24</a> we have $\mathop{\bf Var}[f({\boldsymbol{x}}_1, a)] = \mathop{\bf Var}[f({\boldsymbol{x}}_1, b)] = 0$ and $\mathop{\bf Var}[f({\boldsymbol{x}}_1, c)] = \frac13 \cdot \frac23 = \frac29$ (because $f({\boldsymbol{x}}_1, c)$ is Bernoulli with parameter $\frac13$); thus $\mathbf{Inf}_1[f] = \frac13 \cdot \frac29 = \frac{2}{27}$. Alternatively, using the formula from Proposition <a href="#propgeneral-laplac-influence">23</a> as well as the Fourier expansion from Example <a href="#egand2-3ary">15</a>, we can compute $\mathbf{Inf}_1[f] = (-\tfrac{\sqrt{2}}{18})^2 + (-\tfrac{\sqrt{6}}{18})^2 + (\tfrac{1}{18})^2 + (\tfrac{\sqrt{12}}{36})^2 +(\tfrac{\sqrt{12}}{36})^2 +(\tfrac16)^2 = \tfrac{2}{27}$. </p></blockquote>
<p> <br/></p>
<p>
Next, we straightforwardly extend our definitions of the noise operator and noise stability to general product spaces. </p>
<blockquote><p><b>Definition 26</b> <a name="defnoise-general"></a> Fix a finite product probability space $(\Omega^n, \pi^{\otimes n})$. For $\rho \in [0,1]$ and $x \in \Omega^n$ we write $\boldsymbol{y} \sim N_\rho(x)$ to denote that $\boldsymbol{y} \in \Omega^n$ is randomly chosen as follows: for each $i \in [n]$ independently, \[ \boldsymbol{y}_i = \begin{cases} x_i &#038; \text{with probability $\rho$,}\\ \text{drawn from $\pi$} &#038; \text{with probability $1-\rho$.} \end{cases} \] If ${\boldsymbol{x}} \sim \pi^{\otimes n}$ and $\boldsymbol{y} \sim N_\rho({\boldsymbol{x}})$, we say that $({\boldsymbol{x}}, \boldsymbol{y})$ is a <em>$\rho$-correlated pair under $\pi^{\otimes n}$</em>. (This definition is symmetric in ${\boldsymbol{x}}$ and $\boldsymbol{y}$.) </p></blockquote>
<blockquote><p><b>Definition 27</b> <a name="defgeneral-T-rho"></a> For a fixed space $L^2(\Omega^n, \pi^{\otimes n})$ and $\rho \in [0,1]$, the <em>noise operator with parameter $\rho$</em> is the linear operator $\mathrm{T}_\rho$ on functions $f \in L^2(\Omega^n,\pi^{\otimes n})$ defined by \[ \mathrm{T}_\rho f(x) = \mathop{\bf E}_{\boldsymbol{y} \sim N_\rho(x)}[f(\boldsymbol{y})]. \] The <em>noise stability of $f$ at $\rho$</em> is \[ \mathbf{Stab}_\rho[f] = \langle f, \mathrm{T}_\rho f \rangle = \mathop{\bf E}_{{\substack{({\boldsymbol{x}}, \boldsymbol{y}) \text{ $\rho$-correlated}\\ \text{under } \pi^{\otimes n}}}}[f({\boldsymbol{x}}) f(\boldsymbol{y})]. \] </p></blockquote>
<blockquote><p><b>Proposition 28</b> <a name="propgeneral-trho"></a> Let $\rho \in [0,1]$ and let $f \in L^2(\Omega^n, \pi^{\otimes n})$. Then for any fixed product Fourier basis, \[ \mathrm{T}_\rho f = \sum_{\alpha \in {\mathbb N}_{&lt;m}^n} \rho^{\#\alpha} \widehat{f}(\alpha)\,\phi_\alpha, \qquad \mathbf{Stab}_\rho[f] = \sum_{\alpha \in {\mathbb N}_{&lt;m}^n} \rho^{\#\alpha} \widehat{f}(\alpha)^2. \] </p></blockquote>
<p> <br/><em>Proof:</em>  Let $\boldsymbol{J}$ denote a $\rho$-random subset of $[n]$; i.e., $\boldsymbol{J}$ is formed by including each $i \in [n]$ independently with probability $\rho$. Then by definition $T_\rho f (x) = \mathop{\bf E}_{\boldsymbol{J}} [f^{\subseteq \boldsymbol{J}}(x)]$ and so from Proposition <a href="#propprojection-formula">19</a> we get \[ T_\rho f (x) = \mathop{\bf E}_{\boldsymbol{J}} [f^{\subseteq J}(x)] = \mathop{\bf E}_{\boldsymbol{J}} \Bigl[\sum_{\substack{\alpha \in {\mathbb N}_{&lt;m}^n \\ \mathrm{supp}(\alpha) \subseteq \boldsymbol{J}}} \widehat{f}(\alpha)\,\phi_\alpha(x)\Bigr]= \sum_{\alpha \in {\mathbb N}_{&lt;m}^n} \rho^{\#\alpha} \widehat{f}(\alpha)\,\phi_\alpha(x), \] since for a fixed $\alpha$, the probability of $\mathrm{supp}(\alpha) \subseteq \boldsymbol{J}$ is $\rho^{\#\alpha}$. The formula for $\mathbf{Stab}_\rho[f]$ now follows from Plancherel. $\Box$</p>
<blockquote><p><b>Remark 29</b> The first formula in this proposition may be used to extend the definition of $\mathrm{T}_\rho f$ to values of $\rho$ outside $[0,1]$. </p></blockquote>
<p> We also define $\rho$-stable influences. The factor of $\rho^{-1}$ in our definition is for consistency with the $L^2(\{-1,1\}^n)$ case. </p>
<blockquote><p><b>Definition 30</b> <a name="defstable-influence-general"></a> For $f \in L^2(\Omega^n, \pi^{\otimes n})$, $\rho \in (0,1]$, and $i \in [n]$, the <em>$\rho$-stable influence</em> of $i$ on $f$ is \[ \mathbf{Inf}_i^{(\rho)}[f] = \rho^{-1} \mathbf{Stab}_\rho[\mathrm{L}_i f] = \sum_{\alpha : \alpha_i \neq 0} \rho^{\#\alpha-1} \widehat{f}(\alpha)^2. \] We also define $\mathbf{I}^{(\rho)}[f] = \sum_{i=1}^n \mathbf{Inf}_i^{(\rho)}[f]$. </p></blockquote>
<p> Just as in the case of $L^2(\{-1,1\}^n)$ we can use stable influences to define the &#8220;notable&#8221; coordinates of a function, of which there is a bounded quantity. A verbatim repetition of the proof of Proposition <a href="http://www.contrib.andrew.cmu.edu/~ryanod/?p=406#propfew-stable-influences" target="_blank">2.53</a> yields the following generalization: </p>
<blockquote><p><b>Proposition 31</b> <a name="propfew-stable-influences-general"></a> Suppose $f \in L^2(\Omega^n, \pi^{\otimes n})$ has $\mathop{\bf Var}[f] \leq 1$. Given $0 &lt; \delta &lt; 1$, $0 &lt; \epsilon \leq 1$, let $J = \{ i \in [n] : \mathbf{Inf}_i^{(1-\delta)}[f] \geq \epsilon\}$. Then $|J| \leq \frac{1}{\delta \epsilon}$. </p></blockquote>
<p><p>
We end this section by discussing the &#8220;degree&#8221; of functions on general product spaces. For $f \in L^2(\{-1,1\}^n)$ the Fourier expansion is a real polynomial; this yields an obvious definition for degree. But for general $f \in L^2(\Omega^n, \pi^{\otimes n})$ the domain is just an abstract set so we need to look for a more intrinsic definition. We take our cue from Exercise <a href="http://www.contrib.andrew.cmu.edu/~ryanod/?p=301#exdegree" target="_blank">1.11(b)</a>: </p>
<blockquote><p><b>Definition 32</b> <a name="defgeneral-degree"></a> Let $f \in L^2(\Omega^n, \pi^{\otimes n})$ be nonzero. The <em>degree</em> of $f$, written $\deg(f)$, is the least $k \in {\mathbb N}$ such that $f$ is a sum of $k$-juntas (functions depending on at most $k$ coordinates). </p></blockquote>
<blockquote><p><b>Proposition 33</b> <a name="propgeneral-degree"></a> Let $f \in L^2(\Omega^n, \pi^{\otimes n})$ be nonzero. Then for any fixed product Fourier basis we have $\deg(f) = \max\{\#\alpha : \widehat{f}(\alpha) \neq 0\}$. </p></blockquote>
<p> <br/><em>Proof:</em>  The inequality $\deg(f) \leq \max\{\#\alpha : \widehat{f}(\alpha) \neq 0\}$ is immediate from the Fourier expansion: \[ f = \sum_{\alpha : \widehat{f}(\alpha) \neq 0} \widehat{f}(\alpha)\,\phi_\alpha \] and each function $\widehat{f}(\alpha)\,\phi_\alpha$ depends on at most $\#\alpha$ coordinates. For the reverse inequality, suppose $f = g_1 + \cdots + g_m$ where each $g_i$ depends on at most $k$ coordinates. By Corollary <a href="#corf-depends">20</a> each $g_i$ has its Fourier support on functions $\phi_\alpha$ with $\#\alpha \leq k$. But $\widehat{f}(\alpha) = \widehat{g_1}(\alpha) + \cdots + \widehat{g_m}(\alpha)$ so the same is true of $f$. $\Box$</p>
]]></content:encoded>
			<wfw:commentRss>http://www.contrib.andrew.cmu.edu/~ryanod/?feed=rss2&#038;p=1315</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>§8.1: Fourier bases for product spaces</title>
		<link>http://www.contrib.andrew.cmu.edu/~ryanod/?p=1305</link>
		<comments>http://www.contrib.andrew.cmu.edu/~ryanod/?p=1305#comments</comments>
		<pubDate>Thu, 11 Apr 2013 16:50:14 +0000</pubDate>
		<dc:creator>Ryan O'Donnell</dc:creator>
				<category><![CDATA[All chapter sections]]></category>
		<category><![CDATA[Chapter 8: Generalized domains]]></category>
		<category><![CDATA[Fourier basis]]></category>
		<category><![CDATA[indicator basis]]></category>
		<category><![CDATA[multi-index]]></category>
		<category><![CDATA[orthonormal]]></category>
		<category><![CDATA[product basis]]></category>
		<category><![CDATA[product probability space]]></category>

		<guid isPermaLink="false">http://www.contrib.andrew.cmu.edu/~ryanod/?p=1305</guid>
		<description><![CDATA[<p>We will now begin to discuss functions on (finite) product probability spaces. </p> <p>Definition 1 Let $(\Omega, \pi)$ be a finite probability space and assume $\pi$ has full support. For $n \in {\mathbb N}^+$ we write $L^2(\Omega^n, \pi^{\otimes n})$ for the (real) inner product space of functions $f : \Omega^n \to {\mathbb R}$, with inner [...]]]></description>
			<content:encoded><![CDATA[<p>We will now begin to discuss functions on (finite) product probability spaces.<br />
<span id="more-1305"></span></p>
<blockquote><p><b>Definition 1</b> <a name="defgeneral-inner-prod-space"></a> Let $(\Omega, \pi)$ be a finite probability space and assume $\pi$ has full support. For $n \in {\mathbb N}^+$ we write $L^2(\Omega^n, \pi^{\otimes n})$ for the (real) inner product space of functions $f : \Omega^n \to {\mathbb R}$, with inner product \[ \langle f, g \rangle = \mathop{\bf E}_{{\boldsymbol{x}} \sim \pi^{\otimes n}}[f({\boldsymbol{x}})g({\boldsymbol{x}})]. \] Here $\pi^{\otimes n}$ denotes the product probability distribution on $\Omega^n$. </p></blockquote>
<blockquote><p><b>Example 2</b> A simple example to keep in mind is $\Omega = \{a,b,c\}$ with $\pi(a) = \pi(b) = \pi(c) = 1/3$. Here $a$, $b$, and $c$ are simply abstract set elements. </p></blockquote>
<p> We can (and will) generalize to non-discrete probability spaces, and to complex inner product spaces. However we will keep to the above definition for now. </p>
<blockquote><p><b>Notation 3</b> We will write $\pi_{1/2}$ for the uniform probability distribution on $\{-1,1\}$. Thus so far in this book we have been studying functions in $L^2(\{-1,1\}^n, \pi_{1/2}^{\otimes n})$. For simplicity, we will write this as $L^2(\{-1,1\}^n)$. </p></blockquote>
<blockquote><p><b>Notation 4</b> Much of the notation we used for $L^2(\{-1,1\}^n)$ extends naturally to the case of $L^2(\Omega^n, \pi^{\otimes n})$: e.g., $\|f\|_p = \mathop{\bf E}_{{\boldsymbol{x}} \sim \pi^{\otimes n}}[|f({\boldsymbol{x}})|^p]^{1/p}$, or the restriction notation from Chapter <a href="http://www.contrib.andrew.cmu.edu/~ryanod/?p=560" target="_blank">3.3</a>. </p></blockquote>
<p><p>
As we described in Chapter <a href="http://www.contrib.andrew.cmu.edu/~ryanod/?p=253" target="_blank">1.4</a>, the essence of boolean Fourier analysis is in deriving combinatorial properties of a boolean function $f : \{-1,1\}^n \to {\mathbb R}$ from its coefficients over a particular basis of $L^2(\{-1,1\}^n)$, the basis of parity functions. We would like to achieve the same thing more generally for functions in $L^2(\Omega^n, \pi^{\otimes n})$. We begin by considering vector space bases more generally. </p>
<blockquote><p><b>Definition 5</b> Let $|\Omega| = m$. The <em>indicator basis</em> (or <em>standard basis</em>) for $L^2(\Omega, \pi)$ is just the set of $m$ indicator functions $(1_{x})_{x \in \Omega}$, where \[ 1_{x}(y) = \begin{cases} 1 &#038; \text{if } y = x, \\ 0 &#038; \text{if } y \neq x. \end{cases} \] </p></blockquote>
<blockquote><p><b>Fact 6</b> <a name="factl2-indic-basis"></a> The indicator basis is indeed a basis for $L^2(\Omega,\pi)$ since the functions $(1_{x})_{x \in \Omega}$ are nonzero, spanning, and orthogonal. Hence $\dim(L^2(\Omega,\pi)) = m$. </p></blockquote>
<p><p>
We will usually fix $\Omega$ and $\pi$ and then consider $L^2(\Omega^n, \pi^{\otimes n})$ for $n \in {\mathbb N}^+$. Applying the above definition gives us an indicator basis $(1_{x})_{x \in \Omega^n}$ for the $m^n$-dimensional space $L^2(\Omega^n, \pi^{\otimes n})$. The representation of $f \in L^2(\Omega,\pi)$ in this basis is just $f = \sum_{x \in \Omega} f(x)1_{x}$. This is not very interesting; the coefficients are just the values of $f$ so they don&#8217;t tell us anything new about the function. We would like a different basis that will generate useful &#8220;Fourier formulas&#8221; as in Chapter <a href="http://www.contrib.andrew.cmu.edu/~ryanod/?p=253" target="_blank">1.4</a>.</p>
<p>
For inspiration, let&#8217;s look critically at the familiar case of $L^2(\{-1,1\}^n)$. Here we used the basis of all parity functions, $\chi_S(x) = \prod_{i\in S} x_i$. It will be helpful to think of the basis function $\chi_S : \{-1,1\}^n \to {\mathbb R}$ as follows: identify $S$ with its $0$-$1$ indicator vector and write \begin{gather*} \chi_S(x) = \prod_{i=1}^n \phi_{S_i}(x_i), \qquad \text{where} \quad \phi_0 \equiv 1, \quad \phi_1 = \textit{id}. \end{gather*} (Here $\textit{id}$ is just the identity map $\textit{id}(b) = b$.) We will identify three properties of this basis which we&#8217;d like to generalize.</p>
<p>
First, the parity basis is a <em>product basis</em>. We can break down its &#8220;product structure&#8221; as follows: For each coordinate $i \in [n]$ of the product domain $\{-1,1\}^n$, the set $\{1, \textit{id}\}$ is a basis for the $2$-dimensional space $L^2(\{-1,1\}, \pi_{1/2})$. We then get a basis for the $2^n$-dimensional product space $L^2(\{-1,1\}^n)$ by taking all possible $n$-fold products. More generally, suppose we are given an inner product space $L^2(\Omega, \pi)$ with $|\Omega| = m$. Let $\phi_0, \dots, \phi_{m-1}$ be any basis for this space. Then the set of all products $\phi_{i_1} \phi_{i_2} \cdots \phi_{i_n}$ ($0 \leq i_j &lt; m$) forms a basis for the space $L^2(\Omega^n, \pi^{\otimes n})$.</p>
<p>
Second, it is convenient that the parity basis is <em>orthonormal</em>. We will later check that if a basis $\phi_0, \dots, \phi_{m-1}$ for $L^2(\Omega, \pi)$ is orthonormal, then so too is the associated product basis for $L^2(\Omega^n, \pi^{\otimes n})$. This relies on the fact that $\pi^{\otimes n}$ is the product distribution. For example, the parity basis for $L^2(\{-1,1\}^n)$ is orthonormal because the basis $\{1, \textit{id}\}$ for $L^2(\{-1,1\}, \pi_{1/2})$ is orthonormal: $\mathop{\bf E}[1^2] = \mathop{\bf E}[{\boldsymbol{x}}_i^2] = 1$, $\mathop{\bf E}[1 \cdot {\boldsymbol{x}}_i] = 0$. Orthonormality is the property that makes Parseval&#8217;s Theorem hold; in the general context, this means that if $f \in L^2(\Omega, \pi)$ has the representation $\sum_{i=0}^{m-1} c_i \phi_i$ then $\mathop{\bf E}[f^2] = \sum_{i=0}^{m-1} c_i^2$.</p>
<p>
Finally, the parity basis contains the constant function $1$. This fact leads to several of our pleasant Fourier formulas. In particular, when you take an orthonormal basis $\phi_0, \dots, \phi_{m-1}$ for $L^2(\Omega, \pi)$ which has $\phi_0 \equiv 1$, then $0 = \langle \phi_0, \phi_i\rangle = \mathop{\bf E}_{{\boldsymbol{x}} \sim \pi} [\phi_i({\boldsymbol{x}})]$ for all $i &gt; 0$. Hence if $f \in L^2(\Omega,\pi)$ has the expansion $f= \sum_{i=0}^{m-1} c_i \phi_i$, then $\mathop{\bf E}[f] = c_0$ and $\mathop{\bf Var}[f] = \sum_{i &gt; 0} c_i^2$.</p>
<p>
<br/></p>
<p>
We encapsulate the second and third properties with a definition: </p>
<blockquote><p><b>Definition 7</b> A <em>Fourier basis</em> for an inner product space $L^2(\Omega, \pi)$ is an orthonormal basis $\phi_0, \dots, \phi_{m-1}$ with $\phi_0 \equiv 1$. </p></blockquote>
<blockquote><p><b>Example 8</b> For each $n \in {\mathbb N}^+$, the $2^n$ parity functions $(\chi_S)_{S \subseteq [n]}$ form a Fourier basis for $L^2(\{-1,1\}^n, \pi_{1/2}^{\otimes n})$. </p></blockquote>
<blockquote><p><b>Remark 9</b> A Fourier basis for $L^2(\Omega, \pi)$ always exists because you can extend the set $\{1\}$ to a basis and then perform the Gram&#8211;Schmidt process. On the other hand, Fourier bases are not unique. Even in the case of $L^2(\{-1,1\}, \pi_{1/2})$ there are two possibilities: the basis $\{1, \textit{id}\}$ and the basis $\{1, -\textit{id}\}$. </p></blockquote>
<blockquote><p><b>Example 10</b> <a name="eg3-ary-basis"></a> In the case of $\Omega = \{a,b,c\}$ with $\pi(a) = \pi(b) = \pi(c) = 1/3$, one possible Fourier basis (see the exercises) is \[ \phi_0 \equiv 1, \quad \begin{array}{l} \phi_1(a) = +\sqrt{2} \\ \phi_1(b) = -\sqrt{2}/2 \\ \phi_1(c) = -\sqrt{2}/2, \end{array} \quad \begin{array}{l} \phi_2(a) = 0 \\ \phi_2(b) = +\sqrt{6}/2, \\ \phi_2(c) = -\sqrt{6}/2. \end{array} \] </p></blockquote>
<p><p>
As mentioned, given a Fourier basis for $L^2(\Omega, \pi)$ you can construct a Fourier basis for any $L^2(\Omega^n, \pi^{\otimes n})$ by &#8220;taking all $n$-fold products&#8221;. To make this precise we need some notation. </p>
<blockquote><p><b>Definition 11</b> An $n$-dimensional <em>multi-index</em> is a tuple $\alpha \in {\mathbb N}^n$. We write \[ \mathrm{supp}(\alpha) = \{i : \alpha_i \neq 0\}, \quad \text{and} \quad \#\alpha = |\mathrm{supp}(\alpha)|. \] (We don&#8217;t use the notation $|\alpha|$ here as this traditionally denotes $\sum_{i=1}^n \alpha_i$.) We may write $\alpha \in {\mathbb N}_{&lt;m}^n$ when we want to emphasize that each $\alpha_i \in \{0, 1, \dots, m-1\}$. </p></blockquote>
<blockquote><p><b>Definition 12</b> <a name="defmulti-index-phi"></a> Given functions $\phi_0, \dots, \phi_{m-1} \in L^2(\Omega, \pi)$ and a multi-index $\alpha \in {\mathbb N}_{&lt; m}^n$, we define $\phi_\alpha \in L^2(\Omega^n, \pi^{\otimes n})$ by \[ \phi_\alpha(x) = \prod_{i=1}^n \phi_{\alpha_i}(x_i). \] </p></blockquote>
<p> Now we can show that products of Fourier bases are Fourier bases. </p>
<blockquote><p><b>Proposition 13</b> <a name="propfbasis-product"></a> Let $\phi_0, \dots, \phi_{m-1}$ be a Fourier basis for $L^2(\Omega, \pi)$. Then the collection $(\phi_\alpha)_{\alpha \in {\mathbb N}_{&lt;m}^n}$ is a Fourier basis for $L^2(\Omega^n,\pi^{\otimes n})$ (with the understanding that $\alpha = (0, 0, \dots, 0)$ indexes the constant function $1$). </p></blockquote>
<p> <br/><em>Proof:</em>  First we check orthonormality. For any multi-indices $\alpha, \beta \in {\mathbb N}_{&lt;m}^n$ we have \begin{align*} \langle \phi_\alpha, \phi_\beta \rangle &#038;= \mathop{\bf E}_{{\boldsymbol{x}} \sim \pi^{\otimes n}}[\phi_\alpha({\boldsymbol{x}})\cdot \phi_\beta({\boldsymbol{x}})] \\ &#038;= \mathop{\bf E}_{{\boldsymbol{x}} \sim \pi^{\otimes n}}\Bigl[\prod_{i = 1}^n \phi_{\alpha_i}({\boldsymbol{x}}_i) \cdot \prod_{i = 1}^n \phi_{\beta_i}({\boldsymbol{x}}_i)\Bigr] \\ &#038;= \prod_{i = 1}^n \mathop{\bf E}_{{\boldsymbol{x}}_i \sim \pi}[\phi_{\alpha_i}({\boldsymbol{x}}_i) \cdot \phi_{\beta_i}({\boldsymbol{x}}_i)] \tag{$\pi^{\otimes n}$ is a product distribution} \\ &#038;= \prod_{i = 1}^n \boldsymbol{1}_{\{\alpha_i = \beta_i\}} \tag{$\{\phi_0, \dots, \phi_{m-1}\}$ is orthonormal} \\ &#038;= \boldsymbol{1}_{\{\alpha = \beta\}}. \end{align*} This confirms that the collection $(\phi_\alpha)_{\alpha \in {\mathbb N}_{&lt;m}^n}$ is orthonormal, and consequently linearly independent. It is therefore also a basis because it has cardinality $m^n$, which we know is the dimension of $L^2(\Omega^n, \pi^{\otimes n})$ (see Fact <a href="#factl2-indic-basis">6</a>). $\Box$</p>
<p>
Given a product Fourier basis as in Proposition <a href="#propfbasis-product">13</a>, we can express any $f \in L^2(\Omega^n, \pi^{\otimes n})$ as a linear combination of basis functions. We will write $\widehat{f}(\alpha)$ for the &#8220;Fourier coefficient&#8221; on $\phi_\alpha$ in this expression. </p>
<blockquote><p><b>Definition 14</b> <a name="defgeneral-fourier-coefficient"></a> Having <em>fixed</em> a Fourier basis $\phi_{0}, \dots, \phi_{m-1}$ for $L^2(\Omega, \pi)$, every $f \in L^2(\Omega^n, \pi^{\otimes n})$ is uniquely expressible as \[ f = \sum_{\alpha \in {\mathbb N}_{&lt; m}^n} \widehat{f}(\alpha) \phi_\alpha. \] This is the <em>Fourier expansion</em> of $f$ with respect to the basis. The real number $\widehat{f}(\alpha)$ is called the <em>Fourier coefficient of $f$ on $\alpha$</em> and it satisfies \[ \widehat{f}(\alpha) = \langle f, \phi_\alpha \rangle. \] </p></blockquote>
<blockquote><p><b>Example 15</b> <a name="egand2-3ary"></a> Fix the Fourier basis as in Example <a href="#eg3-ary-basis">10</a>. Let $f : \{a,b,c\}^2 \to \{0,1\}$ be the function which is $1$ if and only if both inputs are $c$. Then you can check (see the exercises) that \begin{multline*} f = \tfrac19 &#8211; \tfrac{\sqrt{2}}{18} \phi_{(1,0)} &#8211; \tfrac{\sqrt{6}}{18} \phi_{(2,0)} &#8211; \tfrac{\sqrt{2}}{18} \phi_{(0,1)} &#8211; \tfrac{\sqrt{6}}{18} \phi_{(0,2)} \\<br />
+ \tfrac{1}{18}\phi_{(1,1)} + \tfrac{\sqrt{12}}{36} \phi_{(2,1)} + \tfrac{\sqrt{12}}{36} \phi_{(1,2)} + \tfrac{1}{6}\phi_{(2,2)}. \end{multline*}
</p></blockquote>
<p> The notation $\widehat{f}(\alpha)$ may seem poorly chosen because it doesn&#8217;t show the dependence on the basis. However the Fourier formulas we develop in the next section will have the property that <em>they are the same for every product Fourier basis</em>. We will show a basis-independent way of developing the formulas in Section 3 of this chapter. We also give a more abstract approach to general domains in a later chapter. </p>
]]></content:encoded>
			<wfw:commentRss>http://www.contrib.andrew.cmu.edu/~ryanod/?feed=rss2&#038;p=1305</wfw:commentRss>
		<slash:comments>3</slash:comments>
		</item>
		<item>
		<title>Chapter 8: Generalized domains</title>
		<link>http://www.contrib.andrew.cmu.edu/~ryanod/?p=1299</link>
		<comments>http://www.contrib.andrew.cmu.edu/~ryanod/?p=1299#comments</comments>
		<pubDate>Wed, 10 Apr 2013 13:21:34 +0000</pubDate>
		<dc:creator>Ryan O'Donnell</dc:creator>
				<category><![CDATA[Chapter 8: Generalized domains]]></category>
		<category><![CDATA[Chapter overview]]></category>

		<guid isPermaLink="false">http://www.contrib.andrew.cmu.edu/~ryanod/?p=1299</guid>
		<description><![CDATA[<p>So far we have studied functions $f : \{0,1\}^n \to {\mathbb R}$. What about, say, $f : \{0,1,2\}^n \to {\mathbb R}$? In fact, very little of what we&#8217;ve done so far depends on the domain being $\{0,1\}^n$; what it has mostly depended on is our viewing the domain as a product probability distribution. Indeed, much [...]]]></description>
			<content:encoded><![CDATA[<p>So far we have studied functions $f : \{0,1\}^n \to {\mathbb R}$. What about, say, $f : \{0,1,2\}^n \to {\mathbb R}$? In fact, very little of what we&#8217;ve done so far depends on the domain being $\{0,1\}^n$; what it has mostly depended on is our viewing the domain as a <em>product probability distribution</em>. Indeed, much of analysis of boolean functions carries over to the case of functions $f : \Omega_1 \times \cdots \times \Omega_n \to {\mathbb R}$ where the domain has a product probability distribution $\pi_1 \otimes \cdots \otimes \pi_n$. There are two main exceptions: the &#8220;derivative&#8221; operator $\mathrm{D}_i$ does not generalize to the case when $|\Omega_i| &gt; 2$ (though the Laplacian operator $\mathrm{L}_i$ does); and, the important notion of hypercontractivity (introduced in the next chapter) depends strongly on the probability distributions $\pi_i$.</p>
<p>
In this chapter we focus on the case where all the $\Omega_i$&#8217;s are the same, as are the $\pi_i$&#8217;s. (This is just to save on notation; it will be clear that everything we do holds in the more general setting.) Important classic cases include functions on the <em>$p$-biased hypercube</em> (Section 4) and functions on abelian groups (Section 5). For the issue of generalizing the <em>range</em> of functions &#8212; e.g., studying functions $f : \{0,1,2\}^n \to \{0,1,2\}$ &#8212; see the exercises.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.contrib.andrew.cmu.edu/~ryanod/?feed=rss2&#038;p=1299</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Hiatus</title>
		<link>http://www.contrib.andrew.cmu.edu/~ryanod/?p=1289</link>
		<comments>http://www.contrib.andrew.cmu.edu/~ryanod/?p=1289#comments</comments>
		<pubDate>Fri, 21 Dec 2012 19:13:50 +0000</pubDate>
		<dc:creator>Ryan O'Donnell</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://www.contrib.andrew.cmu.edu/~ryanod/?p=1289</guid>
		<description><![CDATA[<p>The last video lecture for my CMU course on Analysis of Boolean Functions has just been posted. Except for responding to comments, the blog will be going on hiatus for a little while. Next semester I&#8217;ll post the final chapters of the book; there will be between 4 and 7 of them, depending on how [...]]]></description>
			<content:encoded><![CDATA[<p>The last video lecture for my CMU course on Analysis of Boolean Functions has just been posted.  Except for responding to comments, the blog will be going on hiatus for a little while.  Next semester I&#8217;ll post the final chapters of the book; there will be between 4 and 7 of them, depending on how much energy I have.</p>
<p>Happy holidays&#8230;&#8230;..</p>
]]></content:encoded>
			<wfw:commentRss>http://www.contrib.andrew.cmu.edu/~ryanod/?feed=rss2&#038;p=1289</wfw:commentRss>
		<slash:comments>4</slash:comments>
		</item>
		<item>
		<title>CMU online course: Lecture 23 published</title>
		<link>http://www.contrib.andrew.cmu.edu/~ryanod/?p=1287</link>
		<comments>http://www.contrib.andrew.cmu.edu/~ryanod/?p=1287#comments</comments>
		<pubDate>Fri, 21 Dec 2012 19:09:26 +0000</pubDate>
		<dc:creator>Ryan O'Donnell</dc:creator>
				<category><![CDATA[CMU Online Course]]></category>

		<guid isPermaLink="false">http://www.contrib.andrew.cmu.edu/~ryanod/?p=1287</guid>
		<description><![CDATA[<p>The video for Lecture 23 of the online course is available on the course web page.</p> ]]></description>
			<content:encoded><![CDATA[<p>The <a href = "http://cc-web.isri.cmu.edu/Panopto/Pages/Viewer/Default.aspx?id=28c409ce-50b4-456a-a80d-a6566afc1789">video for Lecture 23</a> of the online course is available on the <a href = "http://www.cs.cmu.edu/~odonnell/aobf12/">course web page</a>.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.contrib.andrew.cmu.edu/~ryanod/?feed=rss2&#038;p=1287</wfw:commentRss>
		<slash:comments>7</slash:comments>
		</item>
		<item>
		<title>CMU online course: Lecture 22 published</title>
		<link>http://www.contrib.andrew.cmu.edu/~ryanod/?p=1285</link>
		<comments>http://www.contrib.andrew.cmu.edu/~ryanod/?p=1285#comments</comments>
		<pubDate>Tue, 18 Dec 2012 21:03:10 +0000</pubDate>
		<dc:creator>Ryan O'Donnell</dc:creator>
				<category><![CDATA[CMU Online Course]]></category>

		<guid isPermaLink="false">http://www.contrib.andrew.cmu.edu/~ryanod/?p=1285</guid>
		<description><![CDATA[<p>The video for Lecture 22 of the online course is available on the course web page.</p> ]]></description>
			<content:encoded><![CDATA[<p>The <a href = "http://cc-web.isri.cmu.edu/Panopto/Pages/Viewer/Default.aspx?id=8e0266f1-94af-4aec-af18-8384b3e69344">video for Lecture 22</a> of the online course is available on the <a href = "http://www.cs.cmu.edu/~odonnell/aobf12/">course web page</a>.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.contrib.andrew.cmu.edu/~ryanod/?feed=rss2&#038;p=1285</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Chapter 7 notes</title>
		<link>http://www.contrib.andrew.cmu.edu/~ryanod/?p=1279</link>
		<comments>http://www.contrib.andrew.cmu.edu/~ryanod/?p=1279#comments</comments>
		<pubDate>Fri, 14 Dec 2012 18:35:26 +0000</pubDate>
		<dc:creator>Ryan O'Donnell</dc:creator>
				<category><![CDATA[Chapter 7: Property testing, PCPPs, and CSPs]]></category>
		<category><![CDATA[Chapter notes]]></category>
		<category><![CDATA[PCP Theorem]]></category>
		<category><![CDATA[PCPP]]></category>
		<category><![CDATA[Unique Games]]></category>

		<guid isPermaLink="false">http://www.contrib.andrew.cmu.edu/~ryanod/?p=1279</guid>
		<description><![CDATA[<p>The study of property testing was initiated by Rubinfeld and Sudan [RS96] and significantly expanded by Goldreich, Goldwasser, and Ron [GGR98]; the stricter notion of local testability was introduced (in the context of error-correcting codes) by Friedl and Sudan [FS95]. The first local tester for dictatorship was given by Bellare, Goldreich, and Sudan [BGS95,BGS98] (as [...]]]></description>
			<content:encoded><![CDATA[<p>The study of property testing was initiated by Rubinfeld and Sudan <b>[RS96]</b> and significantly expanded by Goldreich, Goldwasser, and Ron <b>[GGR98]</b>; the stricter notion of local testability was introduced (in the context of error-correcting codes) by Friedl and Sudan <b>[FS95]</b>. The first local tester for dictatorship was given by Bellare, Goldreich, and Sudan <b>[BGS95,BGS98]</b> (as in Exercise <a href="http://www.contrib.andrew.cmu.edu/~ryanod/?p=1252#exBGS-test" target="_blank">8</a>); it was later rediscovered by Parnas, Ron, and Samorodnitsky <b>[PRS01,PRS02]</b>. The relevance of Arrow&#8217;s Theorem to testing dictatorship was pointed out by Kalai <b>[Kal02]</b>.<br />
<span id="more-1279"></span></p>
<p>
The idea of assisting testers by providing proofs grew out of complexity-theoretic research on interactive proofs and PCPs; see the early work Ergun, Kumar, and Rubinfeld <b>[EKR99]</b> and the references therein. The specific definition of PCPPs was introduced independently by Ben-Sasson&#8211;Goldreich&#8211;Harsha&#8211;Sudan&#8211;Vadhan <b>[BGH+04]</b> and by Dinur&#8211;Reingold <b>[DR04]</b> in 2004. Both of these works obtained the PCPP Theorem, relying on the fact that previous literature essentially already gave PCPP reductions of exponential (or greater) proof length): <b>[BGH+04]</b> observed that Theorem <a href="http://www.contrib.andrew.cmu.edu/~ryanod/?p=1189#thmpcpp-builder1" target="_blank">20</a> can be obtained from <b>[ALM+98]</b> (their proof is Exercise <a href="http://www.contrib.andrew.cmu.edu/~ryanod/?p=1252#exexponential-pcpp" target="_blank">18</a>), while <b>[DR04]</b> pointed out that the slightly easier Theorem <a href="http://www.contrib.andrew.cmu.edu/~ryanod/?p=1189#thmpcpp-builder0" target="_blank">19</a> can be extracted from <b>[BGS98]</b>. The proof we gave for Theorem <a href="http://www.contrib.andrew.cmu.edu/~ryanod/?p=1189#thmpcpp-for-any-ppty" target="_blank">16</a> is inspired by the presentation in <b>[Din07]</b>.</p>
<p>
The PCP Theorem and its stronger forms (the PCPP Theorem and Theorem <a href="http://www.contrib.andrew.cmu.edu/~ryanod/?p=1189#thmpcpp-builder3" target="_blank">21</a>) have a somewhat remarkable consequence. Suppose a researcher claims to prove a famous mathematical conjecture, say &#8220;$\mathsf{P} \neq \mathsf{NP}$&#8221;. To ensure maximum confidence in correctness, a journal might request the researcher submit a formalized proof, suitable for a mechanical proof-checking system. If the submitted formalized proof $w$ is a boolean string of length $n$, the proof-checker will be implementable by a circuit $C$ of size $O(n)$. Notice that the string property $\mathcal{C}$ decided by $C$ is nonempty if and only if there exists a (length-$n$) proof of $\mathsf{P} \neq \mathsf{NP}$. Suppose the journal applies Theorem <a href="http://www.contrib.andrew.cmu.edu/~ryanod/?p=1189#thmpcpp-builder3" target="_blank">21</a> to $C$ and requires the researcher submit the additional proof $\Pi$ of length $n \cdot \mathrm{polylog}(n)$. Now the journal can run a rather amazing testing algorithm which reads just $3$ bits of the submitted proof $(w,\Pi)$. If the researcher&#8217;s proof of $\mathsf{P} \neq \mathsf{NP}$ is correct then the test will accept with probability $1$. On the other hand, if the test accepts with probability at least $1-\gamma$ (where $\gamma$ is the rejection rate in Theorem <a href="http://www.contrib.andrew.cmu.edu/~ryanod/?p=1189#thmpcpp-builder3" target="_blank">21</a>) then $w$ must be $1$-close to the set of strings accepted by $C$. This doesn&#8217;t necessarily mean that $w$ is a correct proof of $\mathsf{P} \neq \mathsf{NP}$ &#8212; but it does mean that $\mathcal{C}$ is nonempty and hence a correct proof of $\mathsf{P} \neq \mathsf{NP}$ exists! By querying a larger constant number of bits from $(w,\Pi)$ as in Exercise <a href="http://www.contrib.andrew.cmu.edu/~ryanod/?p=1252#exlocal-test-to-standard" target="_blank">1</a>, say $\lceil 30/\gamma\rceil$ bits, the journal can become $99.99$&#37; convinced that indeed $\mathsf{P} \neq \mathsf{NP}$.</p>
<p>
CSPs are very widely studied in computer science; it is impossible to survey the topic here. In the case of boolean CSPs the monographs <b>[CKS01,KSTW01]</b> contain useful background regarding complexity theory and approximation algorithms. The notion of approximation algorithms and the derandomized $(\frac78,1)$-approximation algorithm for Max-E$3$-Sat (Proposition <a href="http://www.contrib.andrew.cmu.edu/~ryanod/?p=1230#prop78-for-3sat" target="_blank">37</a>, Exercise <a href="http://www.contrib.andrew.cmu.edu/~ryanod/?p=1252#excondit-expec" target="_blank">21</a>) are due to Johnson <b>[Joh74]</b>. Incidentally, there is also an efficient $(\frac78,1)$-approximation algorithm for Max-$3$-Sat <b>[KZ97]</b> but both the algorithm and its analysis are extremely difficult, the latter requiring computer assistance <b>[Zwi02]</b>. H&aring;stad&#8217;s hardness theorems are from <b>[Has01]</b>, building on <b>[Has96,Has99]</b>. That paper also proves $\mathsf{NP}$-hardness of $(\frac1p + \delta, 1-\delta)$-approximating Max-E$3$-Lin(mod $p$) (for $p$ prime) and of $(\frac78, 1)$-approximating Max-$\mathrm{CSP}(\{\mathrm{NAE}_4\})$, both of which are optimal. Using tools from <b>[TSSW00]</b>, H&aring;stad\xspace also showed $\mathsf{NP}$-hardness of $(\frac{11}{16} + \delta, \frac34)$-approximating Max-Cut which is still the best known such result. The best known inapproximability result for Unique-Games($q$) is $\mathsf{NP}$-hardness of $(\frac38 + q^{-\Theta(1)}, \tfrac{1}{2})$-approximation <b>[OW12]</b>. Khot&#8217;s influential Unique Games Conjecture was made in <b>[Kho02]</b>; the peculiar name has its origins in a work of Feige and Lov{&aacute;}sz <b>[FL92]</b>. The generic Theorem <a href="http://www.contrib.andrew.cmu.edu/~ryanod/?p=1230#thmug-hardness-from-dict-tests" target="_blank">41</a>, giving UG-hardness from Dictator-vs.-No-Notables tests, essentially appears in <b>[KKMO07]</b>. (We remark that the terminology &#8220;Dictator-vs.-No-Notables&#8221; is not standard.) If one is willing to assume the Unique Games Conjecture, there is an almost-complete theory of optimal inapproximability due to Raghavendra <b>[Rag09]</b> which will be discussed further in Chapter 14. Many more inapproximability results, with and without the Unique Games Conjecture, are known; for some recent surveys, see <b>[Kho05,Kho10a,Kho10]</b>.</p>
<p>
As mentioned, Exercise <a href="http://www.contrib.andrew.cmu.edu/~ryanod/?p=1252#exBGS-test" target="_blank">8</a> is due to <b>[BGS95,PRS01]</b>. The technique described in Exercise <a href="http://www.contrib.andrew.cmu.edu/~ryanod/?p=1252#excondit-expec" target="_blank">21</a> is known as the Method of Conditional Expectations. The trick in Exercise <a href="http://www.contrib.andrew.cmu.edu/~ryanod/?p=1252#exoddness-wlog" target="_blank">23</a> is closely related to the notion of &#8220;folding&#8221; from the theory of PCPs. The bug described in Exercise <a href="http://www.contrib.andrew.cmu.edu/~ryanod/?p=1252#exfix-hastad-bug" target="_blank">31</a> is rarely addressed in the literature; the trick used to overcome it appears in, e.g., <b>[ABH+05]</b>. </p>
]]></content:encoded>
			<wfw:commentRss>http://www.contrib.andrew.cmu.edu/~ryanod/?feed=rss2&#038;p=1279</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>CMU online course: Lecture 21 published</title>
		<link>http://www.contrib.andrew.cmu.edu/~ryanod/?p=1276</link>
		<comments>http://www.contrib.andrew.cmu.edu/~ryanod/?p=1276#comments</comments>
		<pubDate>Wed, 12 Dec 2012 02:58:12 +0000</pubDate>
		<dc:creator>Ryan O'Donnell</dc:creator>
				<category><![CDATA[CMU Online Course]]></category>

		<guid isPermaLink="false">http://www.contrib.andrew.cmu.edu/~ryanod/?p=1276</guid>
		<description><![CDATA[<p>The video for Lecture 21 of the online course is available on the course web page.</p> ]]></description>
			<content:encoded><![CDATA[<p>The <a href = "http://cc-web.isri.cmu.edu/Panopto/Pages/Viewer/Default.aspx?id=c832e7cc-095f-4b8e-870a-c35bcc757d89">video for Lecture 21</a> of the online course is available on the <a href = "http://www.cs.cmu.edu/~odonnell/aobf12/">course web page</a>.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.contrib.andrew.cmu.edu/~ryanod/?feed=rss2&#038;p=1276</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>
