# §9.6: Highlight: The Kahn–Kalai–Linial Theorem

Recalling the social choice setting of Chapter 2.5, consider a $2$-candidate, $n$-voter election using a monotone voting rule $f : \{-1,1\}^n \to \{-1,1\}$. We assume the impartial culture assumption (that the votes are independent and uniformly random), but with a twist: one of the candidates, say $b \in \{-1,1\}$, is able to secretly bribe $k$ voters, fixing their votes to $b$. (Since $f$ is monotone, this is always the optimal way for the candidate to fix the bribed votes.) How much can this influence the outcome of the election?

This question was posed by Ben-Or and Linial in a 1985 work [BL85,BL90]; more precisely, they were interested in designing (unbiased) voting rules $f$ which minimize the effect of any bribed $k$-coalition.

Let’s first consider $k = 1$. If voter $i$ is bribed to vote for candidate $b$ (but all other votes remain uniformly random) this changes the bias of $f$ by $b\widehat{f}(i) = b \mathbf{Inf}_i[f]$. Here we used the assumption that $f$ is monotone (i.e., Proposition 2.20). This led Ben-Or and Linial to the question of which unbiased $f : \{-1,1\}^n \to \{-1,1\}$ has the least possible maximum influence:

Definition 26 Let $f : \{-1,1\}^n \to {\mathbb R}$. The maximum influence of $f$ is $\mathbf{MaxInf}[f] = \max \{\mathbf{Inf}_i[f] : i \in [n]\}.$

Ben-Or and Linial constructed the (nearly) unbiased $\mathrm{Tribes}_n : \{-1,1\}^n \to \{-1,1\}$ function (from Chapter 4.2) and noted that it satisfies $\mathbf{MaxInf}[\mathrm{Tribes}_n] = O(\frac{\log n}{n})$. They further conjectured that every unbiased function $f$ has $\mathbf{MaxInf}[f] = \Omega(\frac{\log n}{n})$. This conjecture was famously proved by Kahn, Kalai, and Linial [KKL88]:

Kahn–Kalai–Linial (KKL) Theorem For any $f : \{-1,1\}^n \to \{-1,1\}$, $\mathbf{MaxInf}[f] \geq \mathop{\bf Var}[f] \cdot \Omega\Bigl(\frac{\log n}{n}\Bigr).$

Notice that the theorem says something sensible even for very biased functions $f$; i.e., those with low variance. The variance of $f$ is indeed the right “scaling factor” since $\frac{1}{n} \mathop{\bf Var}[f] \leq \mathbf{MaxInf}[f] \leq \mathop{\bf Var}[f]$ holds trivially, by the Poincaré Inequality.

Before proving the KKL Theorem, let’s see an additional consequence for Ben-Or and Linial’s problem.

Proposition 27 Let $f : \{-1,1\}^n \to \{-1,1\}$ be monotone and assume $\mathop{\bf E}[f] \geq -.99$. Then there exists a subset $J \subseteq [n]$ with $|J| \leq O(n/\log n)$ which if “bribed to vote $1$” causes the outcome to be $1$ almost surely; i.e., $$\label{eqn:bribery-goal} \mathop{\bf E}[f_{\overline{J} \mid (1, \dots, 1)}] \geq .99.$$ Similarly, if $\mathop{\bf E}[f] \leq .99$ there exists $J \subseteq [n]$ with $|J| \leq O(n/\log n)$ such that $\mathop{\bf E}[f_{\overline{J} \mid (-1, \dots, -1)}] \leq -.99$.

Proof: By symmetry it suffices to prove the result regarding bribery by candidate $+1$. The candidate executes the following strategy: first, bribe the voter $i_1$ with the largest influence on $f_0 = f$; then bribe the voter $i_2$ with the largest influence on $f_1 = f^{(i_1 \mapsto 1)}$, then bribe the voter $i_3$ with the largest influence on $f_2 = f^{(i_1, i_2 \mapsto 1)}$, etc. For each $t \in {\mathbb N}$ we have $\mathop{\bf E}[f_{t+1}] \geq \mathop{\bf E}[f_t] + \mathbf{MaxInf}[f_t].$ If after $t$ bribes the candidate has not yet achieved \eqref{eqn:bribery-goal} we have $-.99 \leq \mathop{\bf E}[f_t] < .99$; thus $\mathop{\bf Var}[f_t] \geq \Omega(1)$ and the KKL Theorem implies that $\mathbf{MaxInf}[f_t] \geq \Omega(\frac{\log n}{n})$. Thus the candidate will achieve a bias of at least $.99$ after bribing at most $(.99 – (-.99))/\Omega(\frac{\log n}{n}) = O(n/\log n)$ voters. $\Box$

Thus in any monotone election scheme, there is always a candidate $b \in \{-1,1\}$ and a $o(1)$-fraction of the voters that $b$ can bribe such that the election becomes $99$%-biased in $b$’s favour. And if the election scheme was not terribly biased to begin with, then both candidates have this ability. For a more precise version of this result, and also a nonmonotone version, see the exercises. Note also that although the $\mathrm{Tribes}_n$ function is essentially optimal for standing up to a single bribed voter, it is quite bad at standing up to bribed coalitions: by bribing just a single tribe (DNF term) — about $\log n$ voters — the outcome can be completely forced to $\mathsf{True}$. Nevertheless, Proposition 27 is close to sharp: Ajtai and Linial [AL93] constructed an unbiased monotone function $f : \{-1,1\}^n \to \{-1,1\}$ such that bribing any set of at most $\epsilon n/\log^2 n$ voters changes the expectation by at most $O(\epsilon)$.

The remainder of this section is devoted to the proof of the KKL Theorem and some variants. As mentioned earlier, the proof quickly follows from summing Corollary 12 over all coordinates; but let’s give a more leisurely description. We’ll focus on the main case of interest: showing that $\mathbf{MaxInf}[f] \geq \Omega(\frac{\log n}{n})$ when $f$ is unbiased (i.e. $\mathop{\bf Var}[f] = 1$). If $f$’s total influence is at least, say, $.1 \log n$ then even the average influence is $\Omega(\frac{\log n}{n})$. So we may as well assume $\mathbf{I}[f] \leq .1 \log n$.

This leads us to the problem of characterizing (unbiased) functions with small total influence. (This is the same issue that arose at the end of Chapter 8.4 when studying sharp thresholds.) It’s helpful to think about the case that the total influence is very small — say $\mathbf{I}[f] \leq K$ where $K = 10$ or $K = 100$, though we eventually want to handle $K = .1 \log n$. Let’s think of $f$ as the indicator of a volume-$1/2$ set $A \subset \{-1,1\}^n$, so $\frac{\mathbf{I}[f]}{n}$ is the fraction of Hamming cube edges on the boundary of $A$. The edge-isoperimetric inequality (or Poincaré Inequality) tells us that $\mathbf{I}[f] \geq 1$: at least a $\frac{1}{n}$ fraction of the cube’s edges must be on $A$’s boundary, with dictators and negated-dictators being the minimizers. Now what can we say if $\mathbf{I}[f] \leq K$; i.e., $A$’s boundary has only $K$ times more edges than the minimum? Must $f$ be “somewhat similar” to a dictator or negated-dictator? Kahn, Kalai, and Linial showed that the answer is yes: $f$ must have a coordinate with influence at least $2^{-O(K)}$. This should be considered very large (and dictator-like), since a priori all of the influences could have been equal to $\frac{K}{n}$.

KKL Edge-Isoperimetric Theorem Let $f : \{-1,1\}^n \to \{-1,1\}$ be nonconstant and let $\mathbf{I}’[f] = \mathbf{I}[f]/\mathop{\bf Var}[f] \geq 1$ (which is just $\mathbf{I}[f]$ if $f$ is unbiased). Then $\mathbf{MaxInf}[f] \geq \left(\tfrac{9}{\mathbf{I}’[f]^2}\right)\cdot 9^{-\mathbf{I}’[f]}.$

This theorem is sharp for $\mathbf{I}’[f] = 1$ (cf. Exercise 1.19) and it’s nontrivial (in the unbiased case) for $\mathbf{I}[f]$ as large as $\Theta(\log n)$. This last fact lets us complete the proof of the KKL Theorem as originally stated:

Proof: We may assume $f$ is nonconstant. If $\mathbf{I}’[f] = \mathbf{I}[f]/\mathop{\bf Var}[f] \geq .1 \log n$ then we are done: the total influence is at least $.1 \mathop{\bf Var}[f] \cdot \log n$ and hence $\mathbf{MaxInf}[f] \geq .1 \mathop{\bf Var}[f] \cdot \frac{\log n}{n}$. Otherwise, the KKL Edge-Isoperimetric Theorem implies $\mathbf{MaxInf}[f] \geq \Omega\left(\tfrac{1}{\log^2 n}\right) \cdot 9^{-.1 \log n} = \widetilde{\Omega}(n^{-.1 \log 9}) = \Omega(n^{-.317}) \gg \mathop{\bf Var}[f] \cdot \Omega\left(\tfrac{\log n}{n}\right). \quad \Box$

(You are asked to be careful about the constant factors in the exercises.)

We now turn to proving the KKL Edge-Isoperimetric Theorem. The high-level idea is to look at the contrapositive: supposing all of $f$’s influences are small, we want to show its total influence must be large. The assumption here is that each derivative $\mathrm{D}_i f$ is a $\{-1,0,1\}$-valued function which is nonzero only on a “small” set. Hence “small-set expansion” implies that each derivative has “unusually large” noise sensitivity. (We are really just repeating Corollary 12 in words here.) In turn this means that for each $i \in [n]$, the Fourier weight of $f$ on coefficients containing $i$ must be quite “high up”. Since this holds for all $i$ we deduce that all of $f$’s Fourier weight must be quite “high up” — hence $f$ must have “large” total influence. We now make this story formal:

Proof of the KKL Edge-Isoperimetric Theorem: We treat only the case that $f$ is unbiased, leaving the general case to the exercises. The theorem is an immediate consequence of the following chain of inequalities: $3 \cdot 3^{-\mathbf{I}[f]} \ \stackrel{\text{(a)}}{\leq} \ 3\mathbf{Stab}_{1/3}[f] \ \stackrel{\text{(b)}}{\leq} \ \mathbf{I}^{(1/3)}[f] \ \stackrel{\text{(c)}}{\leq} \ \sum_{i=1}^n \mathbf{Inf}_i[f]^{3/2} \ \stackrel{\text{(d)}}{\leq} \ \mathbf{MaxInf}[f]^{1/2} \cdot \mathbf{I}[f].$ The key inequality is (c), which comes from summing Corollary 12 over all coordinates $i \in [n]$. Inequality (d) is immediate from $\mathbf{Inf}_i[f]^{3/2} \leq \mathbf{MaxInf}[f]^{1/2} \cdot \mathbf{Inf}_i[f]$. Inequality (b) is trivial from the Fourier formulas (recall Fact 2.52): $\mathbf{I}^{(1/3)}[f] = \sum_{|S| \geq 1} |S| (1/3)^{|S|-1}\widehat{f}(S)^2 \geq 3 \sum_{|S| \geq 1} (1/3)^{|S|} \widehat{f}(S)^2 = 3\mathbf{Stab}_{1/3}[f]$ (the last equality using $\widehat{f}(\emptyset) = 0$). Finally, inequality (a) is quickly proved using the spectral sample: for $\boldsymbol{S} \sim \mathscr{S}_{f}$ we have $$\label{eqn:finish-kkl1} 3\mathbf{Stab}_{1/3}[f] = 3 \sum_{S \subseteq [n]} (1/3)^{|S|} \widehat{f}(S)^2 = 3 \mathop{\bf E}[3^{-|\boldsymbol{S}|}] \geq 3 \cdot 3^{-\mathop{\bf E}[|\boldsymbol{S}|] } = 3 \cdot 3^{-\mathbf{I}[f]},$$ the inequality following from convexity of $s \mapsto 3^{-s}$. We remark that it’s essentially only this \eqref{eqn:finish-kkl1} that needs to be adjusted when $f$ is not unbiased. $\Box$

We end this chapter by deriving an even stronger version of the KKL Edge-Isoperimetric Theorem, and deducing Friedgut’s Junta Theorem (mentioned at the end of Chapter 3.1) as a consequence. The KKL Edge-Isoperimetric Theorem tells us that if $f$ is unbiased and $\mathbf{I}[f] \leq K$ then $f$ must look somewhat like a $1$-junta, in the sense of having a coordinate with influence at least $2^{-O(K)}$. Friedgut’s Junta Theorem shows that in fact $f$ must essentially be a $2^{O(K)}$-junta. To obtain this conclusion, you really just have to sum Corollary 12 only over the coordinates which have small influence on $f$. It’s also possible to get even stronger conclusions if $f$ is known to have particularly good low-degree Fourier concentration. In aid of this, we’ll start by proving the following somewhat technical-looking result:

Theorem 28 Let $f : \{-1,1\}^n \to \{-1,1\}$. Given $0 < \epsilon \leq 1$ and $k \geq 0$, define $\tau = \frac{\epsilon^2}{\mathbf{I}[f]^2}9^{-k}, \qquad J = \{j \in [n] : \mathbf{Inf}_j[f] \geq \tau\}, \qquad \text{so }|J| \leq (\mathbf{I}[f]^3/\epsilon^2) 9^k.$ Then $f$’s Fourier spectrum is $\epsilon$-concentrated on $\mathcal{F} = \{S : S \subseteq J\} \cup \{S : |S| > k\}.$ In particular, suppose $f$’s Fourier spectrum is also $\epsilon$-concentrated on degree up to $k$. Then $f$’s Fourier spectrum is $2\epsilon$-concentrated on $\mathcal{F}' = \{S : S \subseteq J, |S| \leq k\},$ and $f$ is $\epsilon$-close to a $|J|$-junta $h : \{-1,1\}^J \to \{-1,1\}$.

Proof: Summing Corollary 12 just over $i \not \in J$ we obtain $\sum_{i \not \in J} \mathbf{Inf}^{(1/3)}_i[f] \leq \sum_{i \not \in J} \mathbf{Inf}_i[f]^{3/2} \leq \max_{i \not \in J} \{\mathbf{Inf}_i[f]^{1/2}\} \cdot \sum_{i \not \in J} \mathbf{Inf}_i[f] \leq \tau^{1/2} \cdot \mathbf{I}[f] \leq 3^{-k} \epsilon,$ where the last two inequalities used the definitions of $J$ and $\tau$, respectively. On the other hand, \begin{multline*} \sum_{i \not \in J} \mathbf{Inf}^{(1/3)}_i[f] = \sum_{i \not \in J} \sum_{S \ni i} (1/3)^{|S|-1} \widehat{f}(S)^2 = \sum_{S} |S \cap {\overline{J}}| \cdot 3^{1-|S|} \widehat{f}(S)^2 \\ \geq \sum_{S\not \in \mathcal{F}} |S \cap {\overline{J}}| \cdot 3^{1-|S|} \widehat{f}(S)^2 \geq 3^{-k} \sum_{S\not \in \mathcal{F}} \widehat{f}(S)^2. \end{multline*} Here the last inequality used that $S \not \in \mathcal{F}$ implies $|S \cap {\overline{J}}| \geq 1$ and $3^{1-|S|} \geq 3^{-k}$. Combining these two deductions yields $\sum_{S \not \in \mathcal{F}} \widehat{f}(S)^2 \leq \epsilon$, as claimed.

As for the second part of the theorem, when $f$’s Fourier spectrum is $2\epsilon$-concentrated on $\mathcal{F}’$ it follows from Proposition 3.31 that $f$ is $2\epsilon$-close to the boolean-valued $|J|$-junta $\mathrm{sgn}(f^{\subseteq J})$. From Exercise 3.33 we may deduce that $f$ is in fact $\epsilon$-close to some $h : \{-1,1\}^J \to \{-1,1\}$. $\Box$

Remark 29 As you are asked to show in the exercises, by using Corollary 25 in place of Corollary 12, we can achieve junta size $(\mathbf{I}[f]^{2+\eta}/\epsilon^{1+\eta}) \cdot C(\eta)^k$ in Theorem 28 for any $\eta> 0$, where $C(\eta) = (2/\eta+1)^2$.

In Theorem 28 we may always take $k = \mathbf{I}[f]/\epsilon$, by the “Markov argument” Proposition 3.2. Thus we obtain as a corollary:

Friedgut’s Junta Theorem Let $f : \{-1,1\}^n \to \{-1,1\}$ and let $0 < \epsilon \leq 1$. Then $f$ is $\epsilon$-close to an $\exp(O(\mathbf{I}[f]/\epsilon))$-junta. Indeed, there is a set $J \subseteq [n]$ with $|J| \leq \exp(O(\mathbf{I}[f]/\epsilon))$ such that $f$’s Fourier spectrum is $2\epsilon$-concentrated on $\{S \subseteq J : |S| \leq \mathbf{I}[f]/\epsilon\}$.

As mentioned, we can get stronger results for functions which are $\epsilon$-concentrated up to degree much less than $\mathbf{I}[f]/\epsilon$. Width-$w$ DNFs, for example, are $\epsilon$-concentrated on degree up to $O(w \log(1/\epsilon))$ (by Theorem 4.22). Thus:

Corollary 30 Any width-$w$ DNF is $\epsilon$-close to a $(1/\epsilon)^{O(w)}$-junta.

Uniformly noise stable functions do even better. From Peres’s Theorem we know that linear threshold functions are $\epsilon$-concentrated up to degree $O(1/\epsilon^2)$. Thus Theorem 28 and Remark 29 imply:

Corollary 31 Let $f : \{-1,1\}^n \to \{-1,1\}$ be a linear threshold function and let $0 < \epsilon, \eta \leq 1/2$. Then $f$ is $\epsilon$-close to a junta on $\mathbf{I}[f]^{2+\eta} \cdot (1/\eta)^{O(1/\epsilon^2)}$ coordinates.

Assuming $\epsilon$ is a small universal constant we can take $\eta = 1/\log(O(\mathbf{I}[f]))$ and deduce that every LTF is $\epsilon$-close to a junta on $\mathbf{I}[f]^2 \cdot \mathrm{polylog}(\mathbf{I}[f])$ coordinates. This is essentially best possible since $\mathbf{I}[\mathrm{Maj}_n] = \Theta(\sqrt{n})$, but $\mathrm{Maj}_n$ is not even $.1$-close to any $o(n)$-junta. By virtue of Kane’s recent theorem on the uniform noise stability of PTFs, we can also get this conclusion for any constant-degree PTF. One more interesting fact we may derive is that every boolean function has a Fourier coefficient which is at least inverse-exponential in the square of its total influence:

Corollary 32 Assume $f : \{-1,1\}^n \to \{-1,1\}$ satisfies $\mathop{\bf Var}[f] \geq 1/2$. Then there exists $S \subseteq [n]$ with $0 < |S| \leq O(\mathbf{I}[f])$ such that $\widehat{f}(S)^2 \geq \exp(-O(\mathbf{I}[f]^2))$.

Proof: Taking $\epsilon = 1/8$ in Friedgut’s Junta Theorem we get a $J$ with $|J| \leq \exp(O(\mathbf{I}[f]))$ such that $f$ has Fourier weight at least $1-2\epsilon = 3/4$ on ${\mathcal{F} = \{S \subseteq J : S \leq 8\mathbf{I}[f]\}}$. Since $\widehat{f}(\emptyset)^2 = 1 – \mathop{\bf Var}[f] \leq 1/2$ we conclude that $f$ has Fourier weight at least $1/4$ on $\mathcal{F}’ = \mathcal{F} \setminus \{\emptyset\}$. But $|\mathcal{F}’| \leq |J|^{8\mathbf{I}[f]} = \exp(-O(\mathbf{I}[f]^2))$, so the result follows by the Pigeonhole Principle. (Here we used that $(1/4)\exp(-O(\mathbf{I}[f]^2)) = \exp(-O(\mathbf{I}[f]^2))$ because $\mathbf{I}[f] \geq \mathop{\bf Var}[f] \geq \frac12$.) $\Box$

Remark 33 Of course, if $\mathop{\bf Var}[f] < 1/2$ then $f$ has a large empty Fourier coefficient: $\widehat{f}(\emptyset)^2 \geq 1/2$. For a more refined version of Corollary 32, see the exercises.

It is an open question whether Corollary 32 can be improved to give a Fourier coefficient satisfying $\widehat{f}(S)^2 \geq \exp(-O(\mathbf{I}[f]))$; see the exercises and the discussion of the Fourier Entropy–Influence Conjecture in Chapter 10.

### 8 comments to §9.6: Highlight: The Kahn–Kalai–Linial Theorem

• LYT

A couple of tiny typos:

1. “they were interesting in” -> interested?
2. “a priori all of the influence” -> influences?

Regarding “We remark that it’s only this (2) that needs…”, maybe you mean to refer to the equation before (2) instead of (2)? It seems to me that (2) does not assume balancedness, whereas the line before does.

• Thanks LYT! Fixed the first two. For the last one — yeah, I know what you mean; as I was writing it I could see it wasn’t totally accurate. It’s more like you forget about the second-to-last inequality, and then adjust the last one. I’ll try to come up with a more accurate phrase eventually.

• Tim Black

A couple more little things:
In the last line of the proof of the KKL Theorem you have “$\geq O(\log^2 n) \cdots$”. Should it be $\frac{1}{O(\log^2 n)}$? In the next line: “asked and to be”.

• Fixed and fixed. Thanks for all of these comments, Tim, please keep them coming!

• Matt Franklin

Small typo at the end of the proof of Theorem 9.28 (p. 264 in book):
“From Exercise 3.31 …” should maybe be “From Exercise 3.34 …”.

• Thanks!

• Matt Franklin

There may be two small typos in the proof of Corollary 9.32 (p. 265 in book):
$S \leq 8 I[f]$ maybe should be $|S| \leq 8 I[f]$ (end of 1st sentence).
$exp(-O(I[f]^2))$ maybe should be $exp(O(I[f]^2))$ (middle of 3rd sentence).

• That’s two more — thank you very much!