Suppose $f : \{-1,1\}^n \to \{-1,1\}$ is a voting rule for a $2$-candidate election. Making the impartial culture assumption, the $n$ voters independently and uniformly randomly choose their votes ${\boldsymbol{x}} = ({\boldsymbol{x}}_1, \dots, {\boldsymbol{x}}_n)$. Now imagine that when each voter goes to the ballot box there is some chance that their vote is *misrecorded*.

Specifically, say that each vote is correctly recorded with probability $\rho \in [0,1]$ and is garbled — i.e., changed to a random bit — with probability $1-\rho$. Writing $\boldsymbol{y} = (\boldsymbol{y}_1, \dots, \boldsymbol{y}_n)$ for the votes that are finally recorded, we may ask about the probability that $f({\boldsymbol{x}}) = f(\boldsymbol{y})$; i.e., whether the misrecorded votes affected the outcome of the election. This has to do with the *noise stability* of $f$.

Definition 39Let $\rho \in [0,1]$. For fixed $x \in \{-1,1\}^n$ we write $\boldsymbol{y} \sim N_\rho(x)$ to denote that the random string $\boldsymbol{y}$ is drawn as follows: for each $i \in [n]$ independently, \[ \boldsymbol{y}_i = \begin{cases} x_i & \text{with probability $\rho$,}\\ \text{uniformly random} & \text{with probability $1-\rho$.} \end{cases} \] We extend the notation to all $\rho \in [-1,1]$ as follows: \[ \boldsymbol{y}_i = \begin{cases} x_i & \text{with probability $\tfrac{1}{2} + \tfrac{1}{2} \rho$,}\\ -x_i & \text{with probability $\tfrac{1}{2} - \tfrac{1}{2} \rho$.} \end{cases} \] We say that $\boldsymbol{y}$ is$\rho$-correlatedto $x$.

Definition 40If ${\boldsymbol{x}} \sim \{-1,1\}^n$ is drawn uniformly at random and then $\boldsymbol{y} \sim N_\rho({\boldsymbol{x}})$, we say that $({\boldsymbol{x}}, \boldsymbol{y})$ is a$\rho$-correlated pairof random strings. This definition is symmetric in ${\boldsymbol{x}}$ and $\boldsymbol{y}$: it is equivalent to saying that independently for each $i \in [n]$, the pair of random bits $({\boldsymbol{x}}_i, \boldsymbol{y}_i)$ satisfies $\mathop{\bf E}[{\boldsymbol{x}}_i] = \mathop{\bf E}[\boldsymbol{y}_i] = 0$ and $\mathop{\bf E}[{\boldsymbol{x}}_i \boldsymbol{y}_i] = \rho$.

With these definitions in hand we can now define the important concept of noise stability, which measures the correlation between $f({\boldsymbol{x}})$ and $f(\boldsymbol{y})$ when $({\boldsymbol{x}},\boldsymbol{y})$ is a $\rho$-correlated pair.

Definition 41For $f : \{-1,1\}^n \to {\mathbb R}$ and $\rho \in [-1,1]$, thenoise stability of $f$ at $\rho$is \[ \mathbf{Stab}_\rho[f] = \mathop{\bf E}_{{\substack{({\boldsymbol{x}}, \boldsymbol{y}) \\ \text{ $\rho$-correlated}}}}[f({\boldsymbol{x}}) f(\boldsymbol{y})]. \] If $f : \{-1,1\}^n \to \{-1,1\}$ we have \begin{align*} \mathbf{Stab}_\rho[f] &= \mathop{\bf Pr}_{\substack{({\boldsymbol{x}}, \boldsymbol{y}) \\ \text{ $\rho$-correlated}}}[f({\boldsymbol{x}}) = f(\boldsymbol{y})] \quad- \mathop{\bf Pr}_{\substack{({\boldsymbol{x}}, \boldsymbol{y}) \\ \text{ $\rho$-correlated}}}[f({\boldsymbol{x}}) \neq f(\boldsymbol{y})] \\ &= 2\mathop{\bf Pr}_{\substack{({\boldsymbol{x}}, \boldsymbol{y}) \\ \text{ $\rho$-correlated}}}[f({\boldsymbol{x}}) = f(\boldsymbol{y})] – 1. \end{align*}

In the voting scenario described above, the probability that the misrecording of votes doesn’t affect the election outcome is $\tfrac{1}{2} + \tfrac{1}{2} \mathbf{Stab}_\rho[f]$.

When $\rho$ is close to $1$ (i.e., the “noise” is small) it’s sometimes more natural to ask about the probability that reversing a small fraction of the votes reverses the outcome of the election.

Definition 42For $f : \{-1,1\}^n \to \{-1,1\}$ and $\delta \in [0,1]$ we write $\mathbf{NS}_\delta[f]$ fornoise sensitivity of $f$ at $\delta$, defined to be the probability that $f({\boldsymbol{x}}) \neq f(\boldsymbol{y})$ when ${\boldsymbol{x}} \sim \{-1,1\}^n$ is uniformly random and $\boldsymbol{y}$ is formed from ${\boldsymbol{x}}$ by reversing each bit independently with probability $\delta$. In other words, \[ \mathbf{NS}_\delta[f] = \frac12 – \frac12 \mathbf{Stab}_{1-2\delta}[f]. \]

Examples 43The constant functions $\pm 1$ have noise stability $1$ for every $\rho$. The dictator functions $\chi_i$ satisfy $\mathbf{Stab}_\rho[\chi_i] = \rho$ for all $\rho$ (equivalently, $\mathbf{NS}_\delta[\chi_i] = \delta$ for all $\delta$). More generally, \[ \mathbf{Stab}_\rho[\chi_S] = \mathop{\bf E}_{{\substack{({\boldsymbol{x}}, \boldsymbol{y}) \\ \text{ $\rho$-correlated}}}}[{\boldsymbol{x}}^S \boldsymbol{y}^S] = \mathop{\bf E}\left[\prod_{i \in S} ({\boldsymbol{x}}_i \boldsymbol{y}_i)\right] = \prod_{i \in S} \mathop{\bf E}[{\boldsymbol{x}}_i \boldsymbol{y}_i] = \prod_{i \in S} \rho = \rho^{|S|}, \] where we used the fact that the bit pairs $({\boldsymbol{x}}_i,\boldsymbol{y}_i)$ are independent across $i$ to convert the expectation of a product to a product of an expectation.

There is no convenient expression for the noise stability of the majority function $\mathbf{Stab}_\rho[\mathrm{Maj}_n]$. However for fixed $\rho$, the noise stability tends to a nice limit as $n \to \infty$:

Theorem 44For any $\rho \in [-1,1]$, \[ \lim_{\substack{n \to \infty \\ n \text{ odd}}} \mathbf{Stab}_\rho[\mathrm{Maj}_n] = \tfrac{2}{\pi} \arcsin \rho = 1 – \tfrac{2}{\pi} \arccos \rho. \] Taking $\rho = 1-2\delta$ and using $\arccos(1-z) = \sqrt{2z} + O(z^{3/2})$ we deduce \[ \lim_{\substack{n \to \infty \\ n \text{ odd}}} \mathbf{NS}_\delta[\mathrm{Maj}_n] = \tfrac{2}{\pi} \sqrt{\delta} + O(\delta^{3/2}). \]

We prove Theorem 44 in Chapter 5.

There is a simple Fourier formula for the noise stability of a boolean function; it’s one of the most powerful links between the combinatorics of boolean functions and their Fourier spectra. To determine it, we begin by introducing the most important operator in analysis of boolean functions: the *noise operator*, denoted $\mathrm{T}_\rho$ for historical reasons.

Definition 45For $\rho \in [-1,1]$, thenoise operator with parameter $\rho$is the linear operator $\mathrm{T}_\rho$ on functions $f : \{-1,1\}^n \to {\mathbb R}$ defined by \[ \mathrm{T}_\rho f(x) = \mathop{\bf E}_{\boldsymbol{y} \sim N_\rho(x)}[f(\boldsymbol{y})]. \]

Proposition 46For $f : \{-1,1\}^n \to {\mathbb R}$, the Fourier expansion of $\mathrm{T}_\rho f$ is given by \[ \mathrm{T}_\rho f = \sum_{S \subseteq [n]} \rho^{|S|} \widehat{f}(S)\,\chi_S = \sum_{k=0}^n \rho^k f^{=k}. \]

*Proof:* Since $\mathrm{T}_\rho$ is a linear operator, it suffices to verify that $\mathrm{T}_\rho \chi_S = \rho^{|S|} \chi_S$: \[ \mathrm{T}_\rho \chi_S(x) = \mathop{\bf E}_{\boldsymbol{y} \sim N_\rho(x)}[\boldsymbol{y}^S] = \prod_{i \in S} \mathop{\bf E}_{\boldsymbol{y} \sim N_\rho(x)}[\boldsymbol{y}_i] = \prod_{i \in S} (\rho x_i) = \rho^{|S|} \chi_S(x). \] Here we used the fact that for $\boldsymbol{y} \sim N_\rho(x)$ the bits $\boldsymbol{y}_i$ are independent and satisfy $\mathop{\bf E}[\boldsymbol{y}_i] = \rho x_i$. $\Box$

Some alternate ways of proving this proposition are outlined in the exercises.

The connection between $\mathrm{T}_\rho$ and noise stability is that \[ \mathbf{Stab}_\rho[f] = \mathop{\bf E}_{{\substack{{\boldsymbol{x}} \sim \{-1,1\}^n \\ \boldsymbol{y} \sim N_\rho({\boldsymbol{x}})}}}[f({\boldsymbol{x}}) f(\boldsymbol{y})] = \mathop{\bf E}_{{\boldsymbol{x}}} \left[f({\boldsymbol{x}}) \mathop{\bf E}_{\boldsymbol{y} \sim N_\rho({\boldsymbol{x}})}[f(y)]\right]; \] hence:

Fact 47$\mathbf{Stab}_\rho[f] = \langle f, \mathrm{T}_\rho f \rangle$.

From Plancherel’s Theorem and Proposition 46 we deduce the Fourier formula for noise stability:

Theorem 48For $f : \{-1,1\}^n \to {\mathbb R}$, \[ \mathbf{Stab}_\rho[f] = \sum_{S \subseteq [n]} \rho^{|S|}\widehat{f}(S)^2 = \sum_{k=0}^n \rho^k \cdot \mathbf{W}^{k}[f]. \] Hence for $f : \{-1,1\}^n \to \{-1,1\}$ we have

\begin{gather} \mathbf{Stab}_\rho[f] = \mathop{\bf E}_{\boldsymbol{S} \sim \mathscr{S}_{f}}[\rho^{|\boldsymbol{S}|}], \\ \mathbf{NS}_\delta[f] = \tfrac{1}{2} \sum_{k=0}^n (1 – (1-2\delta)^{k}) \cdot \mathbf{W}^{k}[f]. \end{gather}

Thus the noise stability of $f$ at $\rho$ is equal to the sum of its Fourier weights, attenuated by a factor which decreases exponentially with degree. A simple but important corollary is that dictators (and their negations) maximize noise stability:

Proposition 49Let $\rho \in (0,1)$. If $f : \{-1,1\}^n \to \{-1,1\}$ is unbiased then $\mathbf{Stab}_\rho[f] \leq \rho$, with equality if and only if $f = \pm \chi_i$ for some $i \in [n]$.

*Proof:* For unbiased $f$ we have $\mathbf{W}^{0}[f] = 0$ and hence $\mathbf{Stab}_\rho[f] = \sum_{k \geq 1} \rho^k \mathbf{W}^{k}[f]$. Since $\rho^k < \rho$ for all $k > 1$, noise stability is maximized if all of $f$’s Fourier weight is on degree $1$. This occurs if and only if $f = \pm \chi_i$, by Exercise 1.19(a). $\Box$

For a fixed function $f$, it’s often interesting to see how $\mathbf{Stab}_\rho[f]$ varies as a function of $\rho$. From Theorem 48 we see that $\mathbf{Stab}_\rho[f]$ is a (univariate) *polynomial* with nonnegative coefficients; in particular, it’s an increasing function of $\rho$ on $[0,1]$. The derivatives of this polynomial at $0$ and $1$ have nice interpretations, as can be immediately deduced from Theorem 48:

Proposition 50For $f : \{-1,1\}^n \to {\mathbb R}$, \begin{align*} \frac{d}{d \rho} \mathbf{Stab}_\rho[f]\Bigm|_{\rho = 0} &= \mathbf{W}^{1}[f], \\ \frac{d}{d \rho} \mathbf{Stab}_\rho[f]\Bigm|_{\rho = 1} &= \mathbf{I}[f]. \end{align*}

For $f : \{-1,1\}^n \to \{-1,1\}$ we have that $\mathbf{NS}_\delta[f]$ is an increasing function of $\delta$ on $[0, 1/2]$, and the second identity is equivalent to \[ \frac{d}{d \delta} \mathbf{NS}_\delta[f]\Bigm|_{\delta = 0} = \mathbf{I}[f]. \] We conclude this section by introducing a version of influences that also incorporates noise.

Definition 51For $f : \{-1,1\}^n \to {\mathbb R}$, $\rho \in [0,1]$ and $i \in [n]$, the$\rho$-stable influenceof $i$ on $f$ is \[ \mathbf{Inf}_i^{(\rho)}[f] = \mathbf{Stab}_\rho[\mathrm{D}_i f] = \sum_{S \ni i} \rho^{|S|-1} \widehat{f}(S)^2, \] with $0^0$ interpreted as $1$. We also define $\mathbf{I}^{(\rho)}[f] = \sum_{i=1}^n \mathbf{Inf}_i^{(\rho)}[f]$.

In the exercises you are asked to verify

Fact 52$\mathbf{I}^{(\rho)}[f] = \frac{d}{d \rho} \mathbf{Stab}_\rho[f] = \sum_{k=1}^n k \rho^{k-1} \cdot \mathbf{W}^{k}[f]$.

The $\rho$-stable influence $\mathbf{Inf}_i^{(\rho)}[f]$ increases from $\widehat{f}(i)^2$ up to $\mathbf{Inf}_i[f]$ as $\rho$ increases from $0$ to $1$. For $0 < \rho < 1$ there isn’t an especially natural combinatorial interpretation for $\mathbf{Inf}_i^{(\rho)}[f]$ beyond $\mathbf{Stab}_\rho[\mathrm{D}_i f]$; however we will see later that the stable influences are technically very useful. One reason for this is that every function $f : \{-1,1\}^n \to \{-1,1\}$ has at most “constantly” many “stably-influential” coordinates:

Proposition 53Suppose $f : \{-1,1\}^n \to {\mathbb R}$ has $\mathop{\bf Var}[f] \leq 1$. Given $0 < \delta, \epsilon \leq 1$, let $J = \{ i \in [n] : \mathbf{Inf}_i^{(1-\delta)}[f] \geq \epsilon\}$. Then $|J| \leq \frac{1}{\delta \epsilon}$.

*Proof:* Certainly $|J| \leq \mathbf{I}^{(1-\delta)}[f]/\epsilon$ so it remains to verify $\mathbf{I}^{(1-\delta)}[f] \leq 1/\delta$. Comparing Fact 52 with $\mathop{\bf Var}[f] = \sum_{k \neq 0} \mathbf{W}^{k}[f]$ term by term, it suffices to show that $(1-\delta)^{k-1} k \leq 1/\delta$ for all $k \geq 1$. This is an easy exercise. $\Box$

It’s good to think of the set $J$ in this proposition as the “notable” coordinates for function $f$. Had we used the usual influences in place of stable influences, we would not have been guaranteed a bounded number of “notable” coordinates (since, e.g., the parity function $\chi_{[n]}$ has all $n$ of its influences equal to $1$).

There is a simple Fourier formula for the noise stability of a boolean function; it’s one of the most powerful links between **the the**

Yi, you’re the master of typos, thanks!

In the second paragraph from top you have meant f(x) NOT EQUAL to f(y) presumably or?

Well, same difference I suppose. I think I probably wrote ‘equals’ because it’s preceding the definition of noise stability, which is larger the more likely they are to be equal. I agree that it’s moderately unexpected to see ‘equals’ there as opposed to ‘not equals’.

Very minor, but in the proof of P46, right-most entry, should be “rho^|S|”, shouldn’t it? (instead of “rho^S”)

Yep thanks, fixed!

In Definition 41, Theorem 44, and Definition 45, it should be $\rho \in [0, 1]$, shouldn’t it?

No, $\rho \in [-1,0)$ is allowed; see the second half of Definition 39. Thanks for writing though.

In theorem 44, I think the written taylor expansion is the one of $\arccos(1-z)$.

Thanks! It’s corrected in the book.

This is not a correction but a question concerning the stability notion of Boolean functions. The assumption here is that noise is applied randomly to the variables. Is there anything known about the stability wrt adversarial manipulations? For instance, can we say something about the difficulty of manipulating a voting rule? This has interesting applications in adversarial machine learning.