# Chapter 6 notes

The ${\mathbb F}_2$-polynomial representation of a boolean function $f$ is often called its algebraic normal form. It seems to have first been explicitly introduced by Zhegalkin in 1927 [Zhe27].

For functions $f : {\mathbb Z}_n \to {\mathbb R}$, the idea of $\epsilon$-regularity as a pseudorandomness notion dates back to Chung and Graham [CG92], as does the equivalent combinatorial condition of Proposition 7. (In the context of quasirandom graphs, the ideas date further back to Thomason [Tho87a] and Chung–Graham–Wilson [CGW89].) The idea of treating functions with small (stable) influences as being ‘generic’ has its origins in the work of Kahn–Kalai–Linial [KKL88]. The notion was brought to the fore in work on hardness of approximation — implicitly by Håstad [Has96,Has99], and later more explicitly by Khot, Kindler, Mossel, and O’Donnell [KKMO07].

The notion of $\epsilon$-biased sets (and also $(\epsilon,k)$-wise independent distributions) was introduced by Naor and Naor [NN93] (see also the independent work of Peralta [Per90]). The construction in Theorem 30 is due to Alon, Goldreich, Håstad, and Peralta [AGHP92] (as is Exercise 23). As noted in [NN93], $\epsilon$-biased sets are closely related to error-correcting codes over ${\mathbb F}_2$; indeed, they are equivalent to linear error-correcting codes in which all pairs of codewords have relative distance in $[\tfrac{1}{2} - \tfrac{1}{2} \epsilon, \tfrac{1}{2} + \tfrac{1}{2} \epsilon]$. In particular, the construction in Theorem 30 is the concatenation of the well-known Reed–Solomon and Hadamard codes (see, e.g., [MS77] for definitions). The nonconstructive upper bound in Exercise 24 is essentially the Gilbert–Varshamov bound and is close to the known lower bound of $\Omega(\frac{n}{\epsilon^2 \log(1/\epsilon)})$ (assuming $\epsilon \geq 2^{-\Omega(n)}$), which follows from the work of McEliece, Rodemich, Rumsey, and Welch [MRRW77] (see [MS77]). Additionally, constructive upper bounds of $O(\frac{n}{\epsilon^3})$ and $O(\frac{n^{5/4}}{\epsilon^{5/2}})$ are known using tools from coding theory; see the work of Ben-Aroya and Ta-Shma [BT09] and Matthews and Peachey [MP11].
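One of the [AGHP92] constructions can be described very compactly: for $x, y$ ranging over a field ${\mathbb F}_{2^m}$, the $i$-th bit of the string $s^{x,y}$ is $\langle x^i, y \rangle$, the inner product of bit representations, giving an $\epsilon$-biased set of size $2^{2m}$ with $\epsilon \leq (n-1)/2^m$. The following toy Python sketch (the parameters $m = 4$, $n = 8$ and the irreducible polynomial $t^4 + t + 1$ are our own illustrative choices) builds all $2^{2m}$ strings and verifies the bias bound by brute force.

```python
# Sketch of an [AGHP92]-style epsilon-biased set: for x, y in F_{2^m},
# the i-th bit of string s^{x,y} is <x^i, y>.  Toy parameters:
M, POLY = 4, 0b10011        # F_{2^4} = F_2[t] / (t^4 + t + 1)
N = 8                       # string length n; bias bound is (n-1)/2^m

def gf_mul(a, b):
    """Multiply in F_{2^M}: carry-less product, then reduce mod POLY."""
    r = 0
    for i in range(M):
        if (b >> i) & 1:
            r ^= a << i
    for i in range(2 * M - 2, M - 1, -1):   # clear bits of degree >= M
        if (r >> i) & 1:
            r ^= POLY << (i - M)
    return r

def parity(v):
    return bin(v).count("1") & 1

def aghp_string(x, y):
    """The string s^{x,y}, packed into an int: bit i is <x^i, y>."""
    s, p = 0, 1             # p runs through x^0, x^1, x^2, ...
    for i in range(N):
        s |= parity(p & y) << i
        p = gf_mul(p, x)
    return s

S = [aghp_string(x, y) for x in range(2 ** M) for y in range(2 ** M)]

def bias(alpha):
    """|avg over s in S of (-1)^{alpha . s}| for a nonzero alpha."""
    return abs(sum((-1) ** parity(alpha & s) for s in S)) / len(S)

max_bias = max(bias(a) for a in range(1, 2 ** N))
assert 0 < max_bias <= (N - 1) / 2 ** M
```

The bound arises because, for a fixed nonzero $\alpha$, averaging over $y$ shows the bias equals the fraction of $x \in {\mathbb F}_{2^m}$ that are roots of the nonzero polynomial $\sum_i \alpha_i x^i$, which has degree at most $n - 1$.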

The probabilistic notion of correlation immunity — i.e., condition (14) of Corollary 14 — was first introduced by Siegenthaler [Sie84]; we further discuss his work below. Independently and shortly thereafter, Chor, Friedman, Goldreich, Håstad, Rudich, and Smolensky [CFG+85] introduced the definition of resilience and also connected it to $(0,k)$-regularity of the Fourier spectrum; i.e., they proved Corollary 14. (In the cryptography literature, Corollary 14 is called the Xiao–Massey Theorem after [XM88].) The work [CFG+85] also essentially contains Theorem 25 and the relevant function from Examples 16; cf. [MOS04].

The problem of constructing explicit $k$-wise independent distributions of small support arose in different guises in different areas — in the study of orthogonal arrays (in statistics), error-correcting codes, and algorithmic derandomization. Alon, Babai, and Itai [ABI85] gave the construction in Theorem 32 — in fact, the stronger one from Exercise 27 — based on the analysis of dual BCH codes in [MS77]. The lower bound from Exercise 28 is essentially due to Rao [Rao47]; see also the independent proofs in [CFG+85,ABI85].

Siegenthaler’s Theorem is from [Sie84]. His motivation was the study of stream ciphers in cryptography. In this application, a short random sequence of bits (“secret key”) is transformed via some scheme into a very long sequence of pseudorandom bits (“keystream”), which can then be used as a one-time pad for encryption. A basic component of most schemes is a linear feedback shift register (LFSR), which can efficiently generate long, fairly statistically-uniform sequences. However, due to its ${\mathbb F}_2$-linearity, it suffers from some simple cryptanalytic attacks. An early idea for combating this is to take $n$ independent LFSR streams and combine them via some function $f : {\mathbb F}_2^n \to {\mathbb F}_2$. Effective attacks are possible in such a scheme if $f$ is correlated with any of its input bits — or indeed (as Siegenthaler pointed out) with any pair, triple, etc., of input bits. This led Siegenthaler to define the probabilistic notion of correlation-immunity. Although $\chi_{[n]}$ is the maximally correlation-immune function, it is not suitable as an LFSR combining function precisely because of its ${\mathbb F}_2$-linearity; the same is true of any function of low ${\mathbb F}_2$-degree. Siegenthaler precisely captured this tradeoff between correlation-immunity and ${\mathbb F}_2$-degree in his theorem.
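The combining-function scheme just described can be sketched in a few lines. In the toy Python below (the LFSR widths, tap masks, and seeds are arbitrary choices of ours, not the parameters of any real cipher), three LFSR streams are combined by the Geffe-style function $f(x_1,x_2,x_3) = x_1x_2 \oplus x_2x_3 \oplus x_3$, which outputs $x_1$ when $x_2 = 1$ and $x_3$ otherwise. This $f$ agrees with its first input on $6$ of the $8$ possible inputs, so it is not even first-order correlation-immune; this is exactly the kind of correlation that enables the attacks mentioned above.

```python
from itertools import product

def lfsr_stream(seed, taps, width, nbits):
    """Toy Fibonacci LFSR: output the low bit of the state, feed back
    the parity of the tapped positions.  taps is a mask over the state."""
    state, out = seed, []
    for _ in range(nbits):
        out.append(state & 1)
        fb = bin(state & taps).count("1") & 1
        state = (state >> 1) | (fb << (width - 1))
    return out

def f(x1, x2, x3):
    """Geffe-style combiner x1*x2 XOR x2*x3 XOR x3 over F_2:
    selects x1 when x2 = 1, and x3 otherwise."""
    return (x1 & x2) ^ (x2 & x3) ^ x3

# Three independent toy LFSRs (arbitrary widths, taps, and seeds) ...
s1 = lfsr_stream(0b10011, 0b10010, 5, 64)
s2 = lfsr_stream(0b0110111, 0b1100000, 7, 64)
s3 = lfsr_stream(0b110001101, 0b100010000, 9, 64)

# ... combined bit-by-bit into one keystream.
keystream = [f(a, b, c) for a, b, c in zip(s1, s2, s3)]

# f is correlated with its first input: they agree on 6 of the 8
# possible inputs, so the keystream leaks information about s1.
agree = sum(f(a, b, c) == a for a, b, c in product((0, 1), repeat=3))
print(agree, "of 8")   # prints: 6 of 8
```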

Bent functions were named and first studied by Rothaus around 1966; however, he didn’t publish the notion until 1976 [Rot76], at which point there were already several works on the subject; see [Dil72]. Bent functions have applications in cryptography and coding theory; see, e.g., the survey [Car10]. The basic constructions presented in Section 3 are due to Rothaus; the class of bent functions described in Exercise 18 is called the Maiorana–McFarland family. Dickson’s Theorem is from [Dic01]; see also [MS77].
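The prototypical bent function is the inner-product function; for $n = 4$ it is $\mathrm{IP}(x) = x_1x_3 \oplus x_2x_4$, and bentness means every Fourier coefficient of $f = (-1)^{\mathrm{IP}}$ has magnitude exactly $2^{-n/2} = \tfrac{1}{4}$. The brute-force Python check below (our own illustrative snippet, not from the text) verifies this on all $16$ coefficients.

```python
from itertools import product

N = 4

def ip(x):
    """Inner-product function IP(x1,x2,x3,x4) = x1*x3 XOR x2*x4."""
    return (x[0] & x[2]) ^ (x[1] & x[3])

def fourier_coeff(S):
    """hat{f}(S) = 2^{-N} * sum_x f(x) chi_S(x), where f = (-1)^{IP}."""
    total = 0
    for x in product((0, 1), repeat=N):
        chi = (-1) ** sum(x[i] for i in S)
        total += (-1) ** ip(x) * chi
    return total / 2 ** N

# Enumerate all subsets S of {0,...,N-1} via bit masks.
coeffs = [fourier_coeff([i for i in range(N) if (mask >> i) & 1])
          for mask in range(2 ** N)]

# Bentness: every |hat{f}(S)| equals 2^{-N/2} = 1/4.
assert all(abs(abs(c) - 0.25) < 1e-12 for c in coeffs)
```

By Parseval, the $2^n$ squared coefficients sum to $1$, so having them all equal in magnitude forces each to be $\pm 2^{-n/2}$; the check confirms the spectrum is perfectly flat.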

Theorem 36 is from [MOS04]; there is an improved algorithm for learning $k$-juntas which runs in time roughly $n^{.6024 k} \mathrm{poly}(n)$, due to G. Valiant [Val12]. Avrim Blum offers a prize of 1000 dollars for solving the case of $k = \log \log n$ in $\mathrm{poly}(n)$ time [Blu03]. Theorem 42 is due to Kushilevitz and Mansour [KM93]. The Derandomized BLR Test and Theorem 44 (and Exercise 32) are due to Ben-Sasson, Sudan, Vadhan, and Wigderson [BSVW03].

The result of Exercise 11 is due to Muller [Mul54a]; deriving Exercise 30 from it and from [BEHW87] is folklore. The result of Exercise 12(a) is due to Bernasconi and Codenotti [BC99]; Exercise 13 is from [MS77]. In Exercise 25, part (a) is due to Freivalds [Fre79] and part (b) to Naor and Naor [NN93]. The Gowers norm and results of Exercise 34 are from [Gow01].
