Suppose $f : \{-1,1\}^n \to \{-1,1\}$ is a voting rule for a $2$-candidate election. Making the impartial culture assumption, the $n$ voters independently and uniformly randomly choose their votes ${\boldsymbol{x}} = ({\boldsymbol{x}}_1, \dots, {\boldsymbol{x}}_n)$. Now imagine that when each voter goes to the ballot box there is some chance that their vote is misrecorded.
Specifically, say that each vote is correctly recorded with probability $\rho \in [0,1]$ and is garbled — i.e., changed to a random bit — with probability $1-\rho$. Writing $\boldsymbol{y} = (\boldsymbol{y}_1, \dots, \boldsymbol{y}_n)$ for the votes that are finally recorded, we may ask about the probability that $f({\boldsymbol{x}}) = f(\boldsymbol{y})$; i.e., whether the misrecorded votes affected the outcome of the election. This has to do with the noise stability of $f$.
Definition 39 Let $\rho \in [0,1]$. For fixed $x \in \{-1,1\}^n$ we write $\boldsymbol{y} \sim N_\rho(x)$ to denote that the random string $\boldsymbol{y}$ is drawn as follows: for each $i \in [n]$ independently, \[ \boldsymbol{y}_i = \begin{cases} x_i & \text{with probability $\rho$,}\\ \text{uniformly random} & \text{with probability $1-\rho$.} \end{cases} \] We extend the notation to all $\rho \in [-1,1]$ as follows: \[ \boldsymbol{y}_i = \begin{cases} x_i & \text{with probability $\tfrac{1}{2} + \tfrac{1}{2} \rho$,}\\ -x_i & \text{with probability $\tfrac{1}{2} - \tfrac{1}{2} \rho$.} \end{cases} \] We say that $\boldsymbol{y}$ is $\rho$-correlated to $x$.
Definition 40 If ${\boldsymbol{x}} \sim \{-1,1\}^n$ is drawn uniformly at random and then $\boldsymbol{y} \sim N_\rho({\boldsymbol{x}})$, we say that $({\boldsymbol{x}}, \boldsymbol{y})$ is a $\rho$-correlated pair of random strings. This definition is symmetric in ${\boldsymbol{x}}$ and $\boldsymbol{y}$: it is equivalent to saying that independently for each $i \in [n]$, the pair of random bits $({\boldsymbol{x}}_i, \boldsymbol{y}_i)$ satisfies $\mathop{\bf E}[{\boldsymbol{x}}_i] = \mathop{\bf E}[\boldsymbol{y}_i] = 0$ and $\mathop{\bf E}[{\boldsymbol{x}}_i \boldsymbol{y}_i] = \rho$.
With these definitions in hand we can now define the important concept of noise stability, which measures the correlation between $f({\boldsymbol{x}})$ and $f(\boldsymbol{y})$ when $({\boldsymbol{x}},\boldsymbol{y})$ is a $\rho$-correlated pair.
Definition 41 For $f : \{-1,1\}^n \to {\mathbb R}$ and $\rho \in [-1,1]$, the noise stability of $f$ at $\rho$ is \[ \mathbf{Stab}_\rho[f] = \mathop{\bf E}_{{\substack{({\boldsymbol{x}}, \boldsymbol{y}) \\ \text{ $\rho$-correlated}}}}[f({\boldsymbol{x}}) f(\boldsymbol{y})]. \] If $f : \{-1,1\}^n \to \{-1,1\}$ we have \begin{align*} \mathbf{Stab}_\rho[f] &= \mathop{\bf Pr}_{\substack{({\boldsymbol{x}}, \boldsymbol{y}) \\ \text{ $\rho$-correlated}}}[f({\boldsymbol{x}}) = f(\boldsymbol{y})] \quad- \mathop{\bf Pr}_{\substack{({\boldsymbol{x}}, \boldsymbol{y}) \\ \text{ $\rho$-correlated}}}[f({\boldsymbol{x}}) \neq f(\boldsymbol{y})] \\ &= 2\mathop{\bf Pr}_{\substack{({\boldsymbol{x}}, \boldsymbol{y}) \\ \text{ $\rho$-correlated}}}[f({\boldsymbol{x}}) = f(\boldsymbol{y})] – 1. \end{align*}
In the voting scenario described above, the probability that the misrecording of votes doesn’t affect the election outcome is $\tfrac{1}{2} + \tfrac{1}{2} \mathbf{Stab}_\rho[f]$.
When $\rho$ is close to $1$ (i.e., the “noise” is small) it’s sometimes more natural to ask about the probability that reversing a small fraction of the votes reverses the outcome of the election.
Definition 42 For $f : \{-1,1\}^n \to \{-1,1\}$ and $\delta \in [0,1]$ we write $\mathbf{NS}_\delta[f]$ for noise sensitivity of $f$ at $\delta$, defined to be the probability that $f({\boldsymbol{x}}) \neq f(\boldsymbol{y})$ when ${\boldsymbol{x}} \sim \{-1,1\}^n$ is uniformly random and $\boldsymbol{y}$ is formed from ${\boldsymbol{x}}$ by reversing each bit independently with probability $\delta$. In other words, \[ \mathbf{NS}_\delta[f] = \frac12 – \frac12 \mathbf{Stab}_{1-2\delta}[f]. \]
Examples 43 The constant functions $\pm 1$ have noise stability $1$ for every $\rho$. The dictator functions $\chi_i$ satisfy $\mathbf{Stab}_\rho[\chi_i] = \rho$ for all $\rho$ (equivalently, $\mathbf{NS}_\delta[\chi_i] = \delta$ for all $\delta$). More generally, \[ \mathbf{Stab}_\rho[\chi_S] = \mathop{\bf E}_{{\substack{({\boldsymbol{x}}, \boldsymbol{y}) \\ \text{ $\rho$-correlated}}}}[{\boldsymbol{x}}^S \boldsymbol{y}^S] = \mathop{\bf E}\left[\prod_{i \in S} ({\boldsymbol{x}}_i \boldsymbol{y}_i)\right] = \prod_{i \in S} \mathop{\bf E}[{\boldsymbol{x}}_i \boldsymbol{y}_i] = \prod_{i \in S} \rho = \rho^{|S|}, \] where we used the fact that the bit pairs $({\boldsymbol{x}}_i,\boldsymbol{y}_i)$ are independent across $i$ to convert the expectation of a product to a product of an expectation.
There is no convenient expression for the noise stability of the majority function $\mathbf{Stab}_\rho[\mathrm{Maj}_n]$. However for fixed $\rho$, the noise stability tends to a nice limit as $n \to \infty$:
Theorem 44 For any $\rho \in [-1,1]$, \[ \lim_{\substack{n \to \infty \\ n \text{ odd}}} \mathbf{Stab}_\rho[\mathrm{Maj}_n] = \tfrac{2}{\pi} \arcsin \rho = 1 – \tfrac{2}{\pi} \arccos \rho. \] Taking $\rho = 1-2\delta$ and using $\arccos(1-z) = \sqrt{2z} + O(z^{3/2})$ we deduce \[ \lim_{\substack{n \to \infty \\ n \text{ odd}}} \mathbf{NS}_\delta[\mathrm{Maj}_n] = \tfrac{2}{\pi} \sqrt{\delta} + O(\delta^{3/2}). \]
We prove Theorem 44 in Chapter 5.
There is a simple Fourier formula for the noise stability of a boolean function; it’s one of the most powerful links between the combinatorics of boolean functions and their Fourier spectra. To determine it, we begin by introducing the most important operator in analysis of boolean functions: the noise operator, denoted $\mathrm{T}_\rho$ for historical reasons.
Definition 45 For $\rho \in [-1,1]$, the noise operator with parameter $\rho$ is the linear operator $\mathrm{T}_\rho$ on functions $f : \{-1,1\}^n \to {\mathbb R}$ defined by \[ \mathrm{T}_\rho f(x) = \mathop{\bf E}_{\boldsymbol{y} \sim N_\rho(x)}[f(\boldsymbol{y})]. \]
Proposition 46 For $f : \{-1,1\}^n \to {\mathbb R}$, the Fourier expansion of $\mathrm{T}_\rho f$ is given by \[ \mathrm{T}_\rho f = \sum_{S \subseteq [n]} \rho^{|S|} \widehat{f}(S)\,\chi_S = \sum_{k=0}^n \rho^k f^{=k}. \]
Proof: Since $\mathrm{T}_\rho$ is a linear operator, it suffices to verify that $\mathrm{T}_\rho \chi_S = \rho^{|S|} \chi_S$: \[ \mathrm{T}_\rho \chi_S(x) = \mathop{\bf E}_{\boldsymbol{y} \sim N_\rho(x)}[\boldsymbol{y}^S] = \prod_{i \in S} \mathop{\bf E}_{\boldsymbol{y} \sim N_\rho(x)}[\boldsymbol{y}_i] = \prod_{i \in S} (\rho x_i) = \rho^{|S|} \chi_S(x). \] Here we used the fact that for $\boldsymbol{y} \sim N_\rho(x)$ the bits $\boldsymbol{y}_i$ are independent and satisfy $\mathop{\bf E}[\boldsymbol{y}_i] = \rho x_i$. $\Box$
Some alternate ways of proving this proposition are outlined in the exercises.
The connection between $\mathrm{T}_\rho$ and noise stability is that \[ \mathbf{Stab}_\rho[f] = \mathop{\bf E}_{{\substack{{\boldsymbol{x}} \sim \{-1,1\}^n \\ \boldsymbol{y} \sim N_\rho({\boldsymbol{x}})}}}[f({\boldsymbol{x}}) f(\boldsymbol{y})] = \mathop{\bf E}_{{\boldsymbol{x}}} \left[f({\boldsymbol{x}}) \mathop{\bf E}_{\boldsymbol{y} \sim N_\rho({\boldsymbol{x}})}[f(y)]\right]; \] hence:
Fact 47 $\mathbf{Stab}_\rho[f] = \langle f, \mathrm{T}_\rho f \rangle$.
From Plancherel’s Theorem and Proposition 46 we deduce the Fourier formula for noise stability:
Theorem 48 For $f : \{-1,1\}^n \to {\mathbb R}$, \[ \mathbf{Stab}_\rho[f] = \sum_{S \subseteq [n]} \rho^{|S|}\widehat{f}(S)^2 = \sum_{k=0}^n \rho^k \cdot \mathbf{W}^{k}[f]. \] Hence for $f : \{-1,1\}^n \to \{-1,1\}$ we have
\begin{gather} \mathbf{Stab}_\rho[f] = \mathop{\bf E}_{\boldsymbol{S} \sim \mathscr{S}_{f}}[\rho^{|\boldsymbol{S}|}], \\ \mathbf{NS}_\delta[f] = \tfrac{1}{2} \sum_{k=0}^n (1 – (1-2\delta)^{k}) \cdot \mathbf{W}^{k}[f]. \end{gather}
Thus the noise stability of $f$ at $\rho$ is equal to the sum of its Fourier weights, attenuated by a factor which decreases exponentially with degree. A simple but important corollary is that dictators (and their negations) maximize noise stability:
Proposition 49 Let $\rho \in (0,1)$. If $f : \{-1,1\}^n \to \{-1,1\}$ is unbiased then $\mathbf{Stab}_\rho[f] \leq \rho$, with equality if and only if $f = \pm \chi_i$ for some $i \in [n]$.
Proof: For unbiased $f$ we have $\mathbf{W}^{0}[f] = 0$ and hence $\mathbf{Stab}_\rho[f] = \sum_{k \geq 1} \rho^k \mathbf{W}^{k}[f]$. Since $\rho^k < \rho$ for all $k > 1$, noise stability is maximized if all of $f$’s Fourier weight is on degree $1$. This occurs if and only if $f = \pm \chi_i$, by Exercise 1.19(a). $\Box$
For a fixed function $f$, it’s often interesting to see how $\mathbf{Stab}_\rho[f]$ varies as a function of $\rho$. From Theorem 48 we see that $\mathbf{Stab}_\rho[f]$ is a (univariate) polynomial with nonnegative coefficients; in particular, it’s an increasing function of $\rho$ on $[0,1]$. The derivatives of this polynomial at $0$ and $1$ have nice interpretations, as can be immediately deduced from Theorem 48:
Proposition 50 For $f : \{-1,1\}^n \to {\mathbb R}$, \begin{align*} \frac{d}{d \rho} \mathbf{Stab}_\rho[f]\Bigm|_{\rho = 0} &= \mathbf{W}^{1}[f], \\ \frac{d}{d \rho} \mathbf{Stab}_\rho[f]\Bigm|_{\rho = 1} &= \mathbf{I}[f]. \end{align*}
For $f : \{-1,1\}^n \to \{-1,1\}$ we have that $\mathbf{NS}_\delta[f]$ is an increasing function of $\delta$ on $[0, 1/2]$, and the second identity is equivalent to \[ \frac{d}{d \delta} \mathbf{NS}_\delta[f]\Bigm|_{\delta = 0} = \mathbf{I}[f]. \] We conclude this section by introducing a version of influences that also incorporates noise.
Definition 51 For $f : \{-1,1\}^n \to {\mathbb R}$, $\rho \in [0,1]$ and $i \in [n]$, the $\rho$-stable influence of $i$ on $f$ is \[ \mathbf{Inf}_i^{(\rho)}[f] = \mathbf{Stab}_\rho[\mathrm{D}_i f] = \sum_{S \ni i} \rho^{|S|-1} \widehat{f}(S)^2, \] with $0^0$ interpreted as $1$. We also define $\mathbf{I}^{(\rho)}[f] = \sum_{i=1}^n \mathbf{Inf}_i^{(\rho)}[f]$.
In the exercises you are asked to verify
Fact 52 $\mathbf{I}^{(\rho)}[f] = \frac{d}{d \rho} \mathbf{Stab}_\rho[f] = \sum_{k=1}^n k \rho^{k-1} \cdot \mathbf{W}^{k}[f]$.
The $\rho$-stable influence $\mathbf{Inf}_i^{(\rho)}[f]$ increases from $\widehat{f}(i)^2$ up to $\mathbf{Inf}_i[f]$ as $\rho$ increases from $0$ to $1$. For $0 < \rho < 1$ there isn’t an especially natural combinatorial interpretation for $\mathbf{Inf}_i^{(\rho)}[f]$ beyond $\mathbf{Stab}_\rho[\mathrm{D}_i f]$; however we will see later that the stable influences are technically very useful. One reason for this is that every function $f : \{-1,1\}^n \to \{-1,1\}$ has at most “constantly” many “stably-influential” coordinates:
Proposition 53 Suppose $f : \{-1,1\}^n \to {\mathbb R}$ has $\mathop{\bf Var}[f] \leq 1$. Given $0 < \delta, \epsilon \leq 1$, let $J = \{ i \in [n] : \mathbf{Inf}_i^{(1-\delta)}[f] \geq \epsilon\}$. Then $|J| \leq \frac{1}{\delta \epsilon}$.
Proof: Certainly $|J| \leq \mathbf{I}^{(1-\delta)}[f]/\epsilon$ so it remains to verify $\mathbf{I}^{(1-\delta)}[f] \leq 1/\delta$. Comparing Fact 52 with $\mathop{\bf Var}[f] = \sum_{k \neq 0} \mathbf{W}^{k}[f]$ term by term, it suffices to show that $(1-\delta)^{k-1} k \leq 1/\delta$ for all $k \geq 1$. This is an easy exercise. $\Box$
It’s good to think of the set $J$ in this proposition as the “notable” coordinates for function $f$. Had we used the usual influences in place of stable influences, we would not have been guaranteed a bounded number of “notable” coordinates (since, e.g., the parity function $\chi_{[n]}$ has all $n$ of its influences equal to $1$).
There is a simple Fourier formula for the noise stability of a boolean function; it’s one of the most powerful links between **the the**
Yi, you’re the master of typos, thanks!
In the second paragraph from top you have meant f(x) NOT EQUAL to f(y) presumably or?
Well, same difference I suppose. I think I probably wrote ‘equals’ because it’s preceding the definition of noise stability, which is larger the more likely they are to be equal. I agree that it’s moderately unexpected to see ‘equals’ there as opposed to ‘not equals’.
Very minor, but in the proof of P46, right-most entry, should be “rho^|S|”, shouldn’t it? (instead of “rho^S”)
Yep thanks, fixed!
In Definition 41, Theorem 44, and Definition 45, it should be $\rho \in [0, 1]$, shouldn’t it?
No, $\rho \in [-1,0)$ is allowed; see the second half of Definition 39. Thanks for writing though.
In theorem 44, I think the written taylor expansion is the one of $\arccos(1-z)$.
Thanks! It’s corrected in the book.
This is not a correction but a question concerning the stability notion of Boolean functions. The assumption here is that noise is applied randomly to the variables. Is there anything known about the stability wrt adversarial manipulations? For instance, can we say something about the difficulty of manipulating a voting rule? This has interesting applications in adversarial machine learning.
The Noise Stability of constant functions have NS=0, as it is the probability, that the outcomes are different, right?