Randomness Evaluation with the Discrete Fourier Transform Test Based on Exact Analysis of the Reference Distribution 7 Hiroki Okada and Ken Umeno 1 0 January 10, 2017 2 n a J H. Okada and K. Umeno are with the Department of Applied Mathematics and Physics, 8 Graduate School of Informatics, Kyoto University, Kyoto, JAPAN. ] R e-mail: [email protected], [email protected] C . Abstract s c [ In this paper, we study the problems in the discrete Fourier transform (DFT) test included in NIST SP 800-22 released by the National Institute of Standards and Tech- 1 v nology (NIST), which is a collection of tests for evaluating both physical and pseudo- 0 randomnumbergeneratorsforcryptographicapplications. Themostcrucialproblemin 6 the DFT test is that its reference distribution of the test statistic is not derived math- 9 1 ematically but rather numerically estimated; the DFT test for randomness is based on 0 a pseudo-random number generator (PRNG). Therefore, the present DFT test should . 1 not be used unless the reference distribution is mathematically derived. Here, we prove 0 that a power spectrum, which is a component of the test statistic, follows a chi-squared 7 1 distribution with 2 degrees of freedom. Based on this fact, we propose a test whose : reference distribution of the test statistic is mathematically derived. Furthermore, the v i results of testing non-random sequences and several PRNGs showed that the proposed X test is more reliable and definitely more sensitive than the present DFT test. r a Keywords: Computer security, random sequences, statistical analysis 1 Introduction Random numbers are used in many types of applications, such as cryptography, numerical simulations, andsoon. However, itisnoteasytogenerate“truly”randomnumbersequences. Pseudo-random number generators (PRNGs) generate the sequences by iterating some re- currence relation; therefore, the sequences are theoretically not “truly” random. The binary “truly” random sequence is defined as the sequence in which each element has a probability of exactly 1 of being “0” or “1” and in which the elements are statistically independent of 2 1 each other. It is also difficult to ascertain if the sequence is truly random; therefore, the randomness of the sequences is evaluated statistically. NIST SP 800-22 [1, 2] is one of the famous statistical test suites for randomness that was used for selecting the Advanced Encryption Standard (AES) algorithm. NIST SP 800-22 consists of fifteen tests, and every test is hypothesis testing, where the hypothesis is that the input sequence is truly random; if the hypothesis is not rejected in all the tests, it is implied that the input sequences are random. Among the tests included in NIST SP 800-22, the DFT test is of the greatest concern to us. This test detects periodic features of a random number sequence; input sequences are discrete Fourier transformed, and the test statistic is composed of the Fourier coefficients. In 2003, Kim et al. [3, 4] reported that the DFT test and the Lempel-Ziv test in the original NIST SP 800-22 [1] have crucial theoretical problems. Regarding the DFT test, it is reported that the test statistic does not follow the expected referencedistributionbecauseoftheproblemthattheDFTtestregardsFouriercoefficientsas independent stochastic variables although they are not. Kim et al. numerically estimated the distribution of the test statistic with pseudo-random numbers generated with a PRNG and proposed a new DFT test with the estimated distribution. In 2005, Hamano [5] theoretically scrutinized the distribution of the Fourier coefficients in the original DFT test. However, he couldnotderivethetheoreticaldistributionoftheteststatistic, buthedidmaketheproblems in the DFT test clearer. In 2005, because of these reports, in NIST SP 800-22 version 1.7, the Lempel-Ziv test was deleted, and the DFT test was revised according to the report of Kim et al. The DFT test has not subsequently been revised. In 2012, Pareschi et al. [6] reviewed three tests included in NIST SP 800-22, and they also numerically estimated the distribution of the test statistic. Consequently, they reported that the distribution estimated by Kim et al. is not sufficiently accurate. As stated above, several researchers have attempted to revise the DFT test. However, the distribution of the test statistic has still not been derived theoretically but rather numerically estimated. Inthispaper, wereviewtheproblemsintheDFTtest, andweprovethreefacts, whichare important for analyzing the reference distribution of the test statistic: Under the assumption that the input sequence is an ideal random number sequence, when j (cid:54)= 0, (cid:113) (cid:113) • The asymptotic distributions of both 2c (X) and 2s (X) are the standard normal n j n j distribution (N(0,1)) when n → ∞. (cid:113) (cid:113) • When n is sufficiently large, 2c (X) and 2s (X) are statistically independent of n j n j each other. • The asymptotic distribution of 2|S (X)|2 is a chi-squared distribution with 2 degrees n j of freedom (χ2) when n → ∞. 2 Here, X is an n-bit binary sequence, S (X) is the j-th discrete Fourier coefficient of X, and j c (X) and s (X) are the real and imaginary parts of S (X), and they are defined in (1), j j j (2) and (3) in Section 2, respectively. There is no information about these factors in NIST SP800-22, and, to the best of our knowledge, no researchers who have studied the DFT test have ever provided rigorous proofs. These factors are necessary for analyzing the reference 2 distribution of the test statistic. Furthermore, we propose a new DFT test based on the fact that χ2 is the asymptotic distribution of 2|S (X)|2. By comparing the results of several 2 n j PRNGs, we show that our test is more reliable and definitely more sensitive than the present DFT test. 2 Discrete Fourier Transform Test In this section, we explain the procedure of the original DFT test (DFTT ), released in original 2001 [1], before the revision in 2005 [2]. We also explain the problems reported by several researchers [4, 5]. The focus of this test is the peak heights in the discrete Fourier transform of the sequence. The purpose of this test is to detect periodic features in the tested sequence that would indicate a deviation from the assumption of randomness. The intention is to detect whether the number of peaks exceeding the 95 % threshold is significantly different than 5 %. 2.1 The procedure of the original DFT test 1) The zeros and ones of the input sequence E = {(cid:15) ,··· ,(cid:15) } are converted to values 0 n−1 of −1 and +1 to create the sequence X = {x ,··· ,x }, where x = 2(cid:15) − 1 (i ∈ 0 n−1 i i {0,...,n−1}). For simplicity, let n be even. 2) Apply a discrete Fourier transform (DFT) to X to produce Fourier coefficients {S (X)}n−1. The Fourier coefficient S (X) and its real and imaginary parts c (X) j j=0 j j and s (X) are defined as follows: j n(cid:88)−1 2πkj √ n(cid:88)−1 2πkj S (X) := x cos − −1 x sin (1) j k k n n k=0 k=0 n−1 (cid:88) 2πkj c (X) := x cos (2) j k n k=0 n−1 (cid:88) 2πkj s (X) := x sin (3) j k n k=0 n−1 3) Compute {|S (X)|}2 , where j j=0 |S (X)|2 = (c (X))2 +(s (X))2. j j j Because |S (X)| = |S (X)|, {|S (X)|}n−1 are discarded. j n−j j j=n 2 √ n−1 4) Compute a threshold value T = 3n. The 95% values {|S (X)|}2 are supposed 0.95 j j=0 to be < T . 0.95 3 According to SP800-22, 2|S (X)|2 is considered to follow χ2, and T is defined by the n j 2 0.95 following equation. P(|S (X)| < T ) = (cid:90) n2T02.95 1e−ydy j 0.95 2 2 0 = 1−e−T02.95 n := 0.95 √ (cid:112) ∴ T = −nln(0.05) (cid:39) 3n 0.95 √ Several researchers [4, 5] reported that this T = 3n was incorrect, and it was 0.95 (cid:112) accordingly revised as T = −nln(0.05) in the DFT test in the revised NIST 0.95 SP800-22 [2]. 5) Count (cid:110) n (cid:111) N = # |S (X)| | |S (X)| < T ,0 ≤ j ≤ −1 . 1 j j 0.95 2 n−1 If {|S (X)|}2 are mutually independent, then under the assumption of randomness, j j=0 N can be considered to follow B(n,0.95), where B is the binomial distribution. 1 2 According to the central limit theorem, when n is sufficiently large, the approximation to B(n,p) is given by the normal distribution N(np, np(1−p)). Therefore, when n is sufficiently large, under the assumption of randomness, (cid:16) n n(cid:17) N ∼ N 0.95 ,(0.95)(0.05) . 1 2 2 6) Compute a test static N −0.95n d = 1 2 . (cid:112) (0.95)(0.05)n 2 When n is sufficiently large, under the assumption of randomness, the test statistic d can be considered to follow N(0,1) (cid:18) (cid:19) |d| 7) Compute P-value; p = erfc √ . 2 If p < α, then conclude that the sequence is non-random, where α is a significance level of the DFT test. NIST recommends α = 0.01 [2]. Therefore, we also define α = 0.01. If p ≥ α, conclude that the sequence is random. 8) Perform1)to7)formsamplesequences{X ,X ,...,X }; mP-values{p ,p ,...,p } 1 2 m 1 2 m are computed. 4 9) (Second-level test I: Proportion of sequences passing a test) Count the number of sample sequences for which P-value ≥ α and define it as m . p Then,undertheassumptionofrandomness,m followsB(m,1−α),whichapproximates p N(m(1 − α),mα(1 − α)) when m is sufficiently large. Therefore, the proportion of (cid:16) (cid:17) sequences passing a test (= m /m) approximately follows N (1−α), α(1−α) . The p m range of acceptable m /m is determined using the significance interval defined as p (cid:114) (cid:114) α(1−α) m α(1−α) p 1−α−3 < < 1−α+3 . (4) m m m If the proportion falls outside of this interval, there is evidence that the data are non-random. 10) (Second-level test II: Uniform distribution of P-values) Uniformity may also be determined by applying a χ2 test and determining a P-value corresponding to the goodness-of-fit distributional test on the P-values obtained for an arbitrary statistical test (i.e., the P-value of the P-values). This is performed by computing (cid:88)10 (F −m/10)2 χ2 = i , m/10 i=1 where F is the number of P-values in sub-interval i. A P-value P is calculated such i T that (cid:18)9 χ2(cid:19) P = igamc , , T 2 2 where igamc is the complementary incomplete gamma function. If P ≥ α (:= 0.0001), (5) T II thesequencescanbeconsideredtobeuniformlydistributed,whereα isthesignificance II level for P . T 11) If the set of P-values {p ,p ,...,p } passes both 9) and 10), the physical or pseudo- 1 2 m random number generators that generated the input sequences are concluded to be ideal. 2.2 The fundamental problems of the original and present DFT tests Kim et al. [4] and Hamano [5] reported the following: • The test statistic d := √N1−0.95n2 does not follow N(0,1); (0.95)(0.05)n 2 5 (cid:0) (cid:1) • N does not follow N 0.95n,(0.95)(0.05)n . 1 2 2 Furthermore, Kim et al., using Secure Hash Generator (G-SHA1) [2] as a PRNG, estimated that (cid:16) n n(cid:17) N ∼ N 0.95 ,(0.95)(0.05) ; 1 2 4 N −0.95n d := 1 2 ∼ N(0,1), kim (cid:112)(0.95)(0.05)n 4 and DFTT was revised according to this report of Kim et al. [2]; the present DFT test, original denotedasDFTT , hasnotbeenrevisedsincethen. Therefore, thereferencedistribution present of the test statistic of DFTT is not mathematically derived. Furthermore, Pareschi et present al. reported that the numerical estimation is not sufficiently accurate; they numerically estimated that (cid:16) n n (cid:17) N ∼ N 0.95 ,(0.95)(0.05) ; 1 2 3.8 N −0.95n d := 1 2 ∼ N(0,1). pareschi (cid:112)(0.95)(0.05) n 3.8 Moreover, Pareschi et al. proposed that the DFT test with this test statistic (DFTT ) is pareschi more reliable. (The definition of the reliability of a test is discussed in Section 5.) Therefore, it can be considered that DFTT still has errors. First, DFTT and DFTT present present pareschi are performed based on a PRNG, whose randomness should be evaluated with a randomness test; they cannot be used unless the reference distribution is mathematically derived. n−1 As stated in step 5) in Section 2.1, {|S (X)|}2 are considered to be mutually indepen- j j=0 n−1 dent. However, {|S (X)|}2 are not mutually independent, and this problem is expected to j j=0 (cid:0) (cid:1) be the main factor for why N does not follow N 0.95n,(0.95)(0.05)n [4, 5]. Furthermore, 1 2 2 before considering this problem, it is also necessary to ensure that 2|S (X)|2 follows χ2. Al- n j 2 though 2|S (X)|2 is considered to follow χ2 in step 4) in Section 2.1, there is no information n j 2 about this in SP800-22, and no researchers studying the DFT test have ever provided rig- orous proofs to the best of our knowledge. We provide a proof for the DFT test in Section 3. 3 The asymptotic distribution of 2|S (X)|2 j n In this section, we analyze the asymptotic distribution of 2|S (X)|2. From the definition of n j |S (X)| in (1), j (cid:32)(cid:114) (cid:33)2 (cid:32)(cid:114) (cid:33)2 2 2 2 |S (X)|2 = c (X) + s (X) . j j j n n n 6 When j = 0, 2 (cid:32)(cid:80)n−1x (cid:33)2 |S (X)|2 = 2 k√=0 k . 0 n n Under the assumption that X is an ideal random number sequence, P(x = −1) = P(x = k k 1) = 1 and {x }n−1 are mutually independent, and E[x ] = 0,V[x ] = 1. Therefore, as 2 k k=0 k k (cid:16)(cid:80)n−1x (cid:17) a consequence of the central limit theorem, when n is sufficiently large, k√=0 k follows n N(0,1), and (cid:16)(cid:80)nk√=−01xk(cid:17)2 follows a chi-squared distribution with 1 degree of freedom (χ2). n 1 Thus, 2|S (X)|2 does not follow χ2. n 0 2 In the following, we consider the case when j (cid:54)= 0. Here, 2|S (X)|2 follows χ2 if the n j 2 following is true: (cid:113) (cid:113) • Both 2c (X) and 2s (X) follow N(0,1). n j n j (cid:113) (cid:113) • 2c (X) and 2s (X) are mutually independent. n j n j In the following 2 subsections, we prove the following Theorem 1, Theorem 2 and Theorem 3: Theorem 1: When n is sufficiently large, both (cid:113) (cid:113) 2c (X) and 2s (X) follow n j n j N(0,1). Theorem 2: When n is sufficiently large, (cid:113) (cid:113) 2c (X) and 2s (X) are mutu- n j n j ally independent. Theorem 3: 2|S (X)|2 follows χ2 when n is suffi- n j 2 ciently large. From the definition of χ2, Theorem 3 can be proven by combing Theorem 1 and Theorem 2. 2 (cid:113) 3.1 Proof of Theorem 1: The asymptotic distribution of 2c (X) n j In this subsection, we prove Theorem 1. Hamano [5] showed that the average, variance, skewness, and kurtosis of c (X) and N(0, n) are the same. However, it cannot be proven j 2 that N(0, n) is the asymptotic distribution of c (X) based only on these factors. 2 j (cid:113) (cid:113) (cid:113) 2c (X) is expressed as 2c (X) := 2 (cid:80)n−1x a , where a = cos 2πkj. Under n j n j n k=0 k k,j k,j n the assumption that X is an ideal random number sequence, the characteristic function of 7 (cid:113) 2c (X) denoted by φ(t) is expressed as follows: n j (cid:34) (cid:32)(cid:114) (cid:33)(cid:35) 2√ φ(t) = E exp −1tc (X) X j n (cid:34)n(cid:89)−1 (cid:32)(cid:114)2√ (cid:33)(cid:35) = E exp −1tx a X k k,j n k=0 n(cid:89)−1 (cid:34) (cid:32)(cid:114)2√ (cid:33)(cid:35) = E exp −1tx a x k k,j k n k=0 n−1 (cid:32)(cid:114) (cid:33) (cid:89) 2 = cos ta . k,j n k=0 n−1 (cid:32)(cid:114) (cid:33) (cid:88) 2 ∴ logφ(t) = logcos ta , k,j n k=0 where 1 (cid:88) E (·) := (·), X 2n X∈{−1,1}n 1 (cid:88) E (·) := (·). x k 2 x ∈{−1,1} k Using the Taylor expansion about a point t = 0, we obtain (cid:32)(cid:114) (cid:33) 2 1 1 logcos ta = − a2 t2 − a4 t4 +O(t6). n k,j n k,j 3n2 k,j n−1 n−1 1 (cid:88) 1 (cid:88) ∴ logφ(t) = − a2 t2 − a4 t4 +O(t6). n k,j 3n2 k,j k=0 k=0 Since n−1 n−1 (cid:88) n (cid:88) a2 = , a2l ≤ n (l ∈ {1,2,3,...}), k,j 2 k,j k=0 k=0 1 lim logφ(t) = − t2. ∴ lim φ(t) = e−1t2. 2 n→∞ 2 n→∞ (cid:113) Thus, N(0,1) is the asymptotic distribution of 2c (X). Likewise, it can be proven that n j (cid:113) N(0,1) is the asymptotic distribution of 2s (X). n j 8 (cid:113) 3.2 Proof of Theorem 2: Statistical independence of 2c (X) and n j (cid:113) 2s (X) n j In this subsection, we prove Theorem 2. Let us define a 2-dimensional stochastic variable Y as the following equation: (cid:32)(cid:114) (cid:114) (cid:33) 2 2 Y := (Y ,Y ) := c (X), s (X) . 1 2 j j n n UndertheassumptionthatX isanidealrandomnumbersequence, thecharacteristicfunction of Y denoted by ψ(t) is expressed as follows: √ ψ(t) = E [exp( −1tY (cid:62))] X (cid:34) (cid:32)(cid:114) (cid:33)(cid:35) 2√ = E exp −1(t c (X)+t s (X)) X 1 j 2 j n n−1 (cid:32)(cid:114) (cid:33) (cid:89) 2 = cos (t a +t b ) , 1 k,j 2 k,j n k=0 where 2πkj 2πkj t = (t ,t ), a = cos , b = sin . 1 2 k,j k,j n n Therefore, n−1 (cid:32)(cid:114) (cid:33) (cid:88) 2 logψ(t) = logcos (a t +b t ) . k,j 1 k,j 2 n k=0 Using the Taylor expansion about a point t = 0, we obtain (cid:32)(cid:114) (cid:33) 2 logcos (a t +b t ) k 1 k 2 n (a t +b t )2 (a t +b t )4 k,j 1 k,j 2 k,j 1 k,j 2 = − − +··· . n 3n2 Since n−1 n−1 n−1 n−1 (cid:88) (cid:88) n (cid:88) (cid:88) a2 = b2 = , a b = 0, albm ≤ n(l,m ≥ 0), k k 2 k k k k k=0 k=0 k=0 k=0 we obtain tt(cid:62) (cid:18) tt(cid:62)(cid:19) lim logψ(t) = − , ∴ lim ψ(t) = exp − . n→∞ 2 n→∞ 2 9 Therefore, when n is sufficiently large, the joint probability distribution function is described as follows: 1 (cid:18) y2 +y2(cid:19) f (y ,y ) = exp − 1 2 . Y1,Y2 1 2 2π 2 As we proved before, N(0,1) is the asymptotic distribution of both Y and Y . Thus, when 1 2 n is sufficiently large, the probability distribution functions of Y and Y are f (y ) = (cid:16) (cid:17) (cid:16) (cid:17) 1 2 Y1 1 √1 exp −y12 and f (y ) = √1 exp −y22 , respectively. Therefore, when n is sufficiently 2π 2 Y2 2 2π 2 large, the following equation is obtained: f (y ,y ) = f (y )f (y ). Y1,Y2 1 2 Y1 1 Y2 2 (cid:113) (cid:113) This means that 2c (X) and 2s (X) are mutually independent when n is sufficiently n j n j large. 4 The proposed DFT test In Section 3, we proved Theorem 3, stating that 2|S (X)|2(j (cid:54)= 0) follows χ2 when n is n j 2 n−1 sufficiently large. Therefore, if {|S (X)|}2 are mutually independent, we can consider that j j=1 N follows N (cid:0)0.95n,(0.95)(0.05)n(cid:1). However, {|S (X)|}n2−1 are not mutually independent. 1 2 2 j j=1 Therefore, it is necessary to mathematically analyze the distribution of the test statistic d n−1 under the condition that {|S (X)|}2 are not mutually independent. Hamano [5] attempted j j=1 n−1 tomathematicallyderivethedistributionoftheset{|S (X)|}2 , buthecouldnotdoso, and j j=1 we also could not derive this distribution. However, we rigorously proved that the asymptotic distribution of 2|S (X)|2 is χ2, and we develop the new DFT test (DFTT ) based on n j 2 proposed this fact. The reference distribution of the test statistic of DFTT is mathematically proposed derived, whereas that of DFTT is estimated with a PRNG. We explain the test statistic present of DFTT in the next subsection. proposed 4.1 The procedure of the proposed DFT test In the standard approach in NIST SP800-22, each sequence is analyzed; thus, m sequences give m P-values. However, DFTT generates n − 1 (n: length of a sequence) P- proposed 2 values. Therefore, more P-values are generated since n is generally larger than m. Since the numberofP-valuesshouldnotbetoolarge(seeSection5.3),beforeconductingDFTT , proposed it is necessary to adjust the length of the sequences and make them into more sets of short sequences(seealsoTable5),assumingthatthesetinputsequencesarecontinuouslygenerated by an RNG. Therefore, DFTT is theoretically not appropriate for the isolated set of proposed sequences. The procedure of the proposed DFT test is described as follows: 10