The complexity of bit retrieval


Veit Elser
Department of Physics
Cornell University

arXiv:1601.03428v1 [cs.DS] 13 Jan 2016

Abstract

Bit retrieval is the problem of reconstructing a binary sequence from its periodic autocorrelation, with applications in cryptography and x-ray crystallography. After defining the problem, with and without noise, we describe and compare various algorithms for solving it. A geometrical constraint satisfaction algorithm, relaxed-reflect-reflect, is currently the best algorithm for noisy bit retrieval.

1 Bit retrieval

The bit retrieval problem is like the problem of factoring integers, but with some modifications to the rules of arithmetic. These modifications are illustrated below, where the calculation of $13 \times 19$ in base-2 is contrasted with the rules of bit retrieval. There are two changes: (i) exponents in the binary expansion are periodic (columns “wrap around”), and (ii) the columns are summed without carrying. In both factoring and bit retrieval the problem is to reverse the process: find the binary sequences at the top, given their product at the bottom.

Periodic or ring-like arrangements of integers that are combined with the rules just described are elements of the polynomial ring $Z_N = \mathbb{Z}[x]/(x^N - 1)$, where $N$ is the period of the exponents ($N = 5$ in the example). Factoring elements in $Z_N$ is hard, even when we are told the coefficients of the factors are limited to 0 and 1. We will see shortly that in bit retrieval it makes more sense to instead limit the coefficients to $\pm 1$; we therefore define the set

$$S_N = \{\, s_0 + s_1 x + \cdots + s_{N-1} x^{N-1} \in Z_N : s_k = \pm 1,\ 0 \le k \le N-1 \,\}. \tag{1}$$

            1 1 0 1               0 1 1 0 1
      × 1 0 0 1 1               × 1 0 0 1 1
      ---------------           -----------
            1 1 0 1               0 1 1 0 1
          1 1 0 1                 1 1 0 1 0
    1 1 0 1                       1 0 1 1 0
    ---------------             -----------
    1 1 1 1 0 1 1 1               2 2 2 2 1

The rules of ordinary base-2 multiplication (left) are modified (right) so that exponents have period 5 and columns are summed without carries.
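The two modified rules can be made concrete in a few lines of Python. The sketch below is only an illustration, and the helper names `cyclic_product` and `bits` are ours rather than the paper's. It multiplies two coefficient lists in $\mathbb{Z}[x]/(x^N - 1)$, wrapping exponents mod $N$ and summing columns without carries, and applies this to the period-5 product of 13 and 19.

```python
def cyclic_product(a, b):
    """Multiply two elements of Z[x]/(x^N - 1), given as coefficient lists
    [c_0, c_1, ..., c_{N-1}] (lowest power first)."""
    N = len(a)
    assert len(b) == N
    c = [0] * N
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            c[(i + j) % N] += ai * bj   # exponents wrap around, no carrying
    return c


def bits(n, N):
    """Binary digits of n as a length-N coefficient list, lowest power first."""
    return [(n >> k) & 1 for k in range(N)]


thirteen = bits(13, 5)   # [1, 0, 1, 1, 0]
nineteen = bits(19, 5)   # [1, 1, 0, 0, 1]
product = cyclic_product(thirteen, nineteen)
print(product[::-1])     # highest power first: [2, 2, 2, 2, 1]
```

Reading the printed list from the highest power down gives the bottom row of the right-hand calculation.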
“Retrieval” is a reference to phase retrieval, an important special case where the two coefficient sequences are reflections of each other (one “ring” is the mirror of the other). This brings us to our first formulation of bit retrieval:

Definition 1.1. Bit retrieval is the problem where, given $a(x) \in Z_N$ known to have the form $a(x) = s(x)s(1/x)$ for some $s(x) \in S_N$, we must find $s'(x) \in S_N$ such that $s'(x)s'(1/x) = a(x)$.

The problem definition sidesteps the question of uniqueness. Clearly, if $s'(x)$ is a solution, then so are $\pm s'(x)x^r$ and $\pm s'(1/x)x^r$, for arbitrary $r$. We restricted the solutions of bit retrieval to be elements of $S_N$ so that they form orbits in this group of order $4N$. Another nice property of $\pm 1$ sequences that will be useful later (section 5) is that they have the same 2-norm.

The fastest known algorithm for bit retrieval, as defined above, was discovered by Howgrave-Graham and Szydlo [HGS] and was based on earlier work by Gentry and Szydlo [GS] that proposed an attack on the NTRU digital signature scheme. This algorithm has about the same complexity as factoring a number of $O(N \log N)$ bits; in fact, the first and hardest step of the algorithm is precisely the factorization of a number of that size. However, a seemingly small change can make bit retrieval much harder than this. Before we describe this change, we further develop the relationship to phase retrieval.

Bit retrieval is a highly idealized model of phase retrieval in crystallography. In that setting, the polynomial coefficients $s_0, \ldots, s_{N-1}$ are samples within one period of a periodic function (a 1D crystal), and the coefficients of the product $a(x) = s(x)s(1/x)$ their (periodic) autocorrelation:

$$a_k = \sum_{l=0}^{N-1} s_l s_{l-k} = \sum_{l=0}^{N-1} s_l s_{l+k} = a_{-k}. \tag{2}$$

The indices in (2) are all taken mod $N$. In crystallography one would refer to $s$ as the contrast because one acquires information about it through its action on radiation to produce diffraction patterns. Since the contrast elements are all $\pm 1$, the central autocorrelation is trivial: $a_0 = N$. There are $\lfloor N/2 \rfloor$ nontrivial autocorrelations as a result of the reflection symmetry in (2).

To complete the connection to phase retrieval, we start with the identity

$$\hat{a}_q = \frac{1}{\sqrt{N}} \sum_{k=0}^{N-1} e^{i 2\pi k q/N} a_k = \sqrt{N}\, |\hat{s}_q|^2, \tag{3}$$

where

$$\hat{x}_q = \frac{1}{\sqrt{N}} \sum_{k=0}^{N-1} e^{i 2\pi k q/N} x_k \tag{4}$$

defines the Fourier transform of a periodic sequence $x$. From (3) we see that the autocorrelations only give us the Fourier transform magnitudes $|\hat{s}|^2$, and without knowledge of the phases of $\hat{s}$ we are unable to invert the transform (4) to recover the signs $s$. Phase retrieval refers to the strategy of discovering the unknown phases of $\hat{s}$ by demanding consistency with additional information we have about the contrast $s$. In the case of bit retrieval, this translates to the observation that only very special sets of phases, when combined with the known magnitudes, produce contrast values comprising only $\pm 1$.

To a mathematician, the Fourier magnitudes $|\hat{s}|^2$ are algebraic numbers that when examined in great enough detail will reveal the $\pm 1$ coefficients of $s$, even without knowledge of the phases. By contrast, when these same numbers are measured in a diffraction experiment, they are known only to a finite precision. Given that we are retrieving elements from a finite set, how much imprecision or noise can be tolerated? The observation that the autocorrelation coefficients are always integers in the same congruence class mod 4 as $N$ (appendix 9.1) motivates the following set of symmetric polynomials as the smallest “quanta” of autocorrelation noise:

$$E_N = \{\, e(x) \in Z_N : e_0 = 0;\ e_k = e_{-k} = \pm 2,\ 1 \le k \le \lfloor N/2 \rfloor \,\}. \tag{5}$$

Definition 1.2. Noisy bit retrieval is the problem where, given a noisy autocorrelation $n(x)$ known to have the form $n(x) = s(x)s(1/x) + e(x)$, where $s(x) \in S_N$ and $e(x) \in E_N$, we must find elements $s'(x) \in S_N$ and $e'(x) \in E_N$ such that $n(x) = s'(x)s'(1/x) + e'(x)$.

In noisy bit retrieval we know that all of the $k \neq 0$ noisy autocorrelations $n_k$ are off by $\pm 2$, but we do not know whether the true autocorrelations are obtained by rounding up by 2 or down by 2.
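The definitions above are easy to check numerically on a small instance. The following Python sketch is illustrative only: the function names and the example sequence are our own choices, not taken from the paper. It computes the periodic autocorrelation (2), verifies $a_0 = N$, the reflection symmetry, the congruence mod 4 and the identity (3), and then builds one noisy autocorrelation of the form used in Definition 1.2.

```python
import cmath
import math


def autocorrelation(s):
    """Periodic autocorrelation a_k = sum_l s_l * s_{l+k}, indices mod N (eq. 2)."""
    N = len(s)
    return [sum(s[l] * s[(l + k) % N] for l in range(N)) for k in range(N)]


def fourier(x):
    """Transform of eq. (4): x_hat_q = N^(-1/2) * sum_k exp(i 2 pi k q / N) x_k."""
    N = len(x)
    return [sum(cmath.exp(2j * math.pi * k * q / N) * x[k] for k in range(N))
            / math.sqrt(N) for q in range(N)]


def add_noise(a, eps):
    """Return n = a + e with e_0 = 0 and e_k = e_{-k} = 2 * eps[k-1] (eq. 5).
    eps is a list of floor(N/2) signs, each +1 or -1."""
    N = len(a)
    n = list(a)
    for k in range(1, N // 2 + 1):
        n[k] += 2 * eps[k - 1]
        if (-k) % N != k:               # avoid adding twice when k = N/2
            n[(-k) % N] += 2 * eps[k - 1]
    return n


s = [1, 1, -1, 1, -1, -1, 1]            # an arbitrary example in S_7
N = len(s)
a = autocorrelation(s)
a_hat, s_hat = fourier(a), fourier(s)

central_is_N = a[0] == N
symmetric = all(a[k] == a[(-k) % N] for k in range(N))
same_class_mod_4 = all((ak - N) % 4 == 0 for ak in a)
identity_3 = all(abs(a_hat[q] - math.sqrt(N) * abs(s_hat[q]) ** 2) < 1e-9
                 for q in range(N))
n = add_noise(a, [1, -1, 1])            # one of the 2^3 noise patterns for N = 7
print(a, n, central_is_N, symmetric, same_class_mod_4, identity_3)
```

Because only the magnitudes $|\hat{s}_q|$ enter the right-hand side of (3), the check passes without any reference to the phases of `s_hat`.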
We will see in the next section how even this amount of noise in the data completely undermines the most efficient bit retrieval algorithms. The fastest known algorithms for noisy bit retrieval have complexity $2^{cN}$; algorithms and estimates of the constant $c$ are discussed in sections 5 and 6.

We would still like it to be true that the introduction of noise (5) does not, with high probability, sacrifice solution uniqueness. This is supported by the following theorem. Here we assume uniform probability distributions, both on the set of sign sequences $S_N$ and the noise $E_N$ we apply to their autocorrelations; “random” elements of these sets are elements sampled from the uniform distribution.

Theorem 1.1. Let $s$ and $e$ be random elements respectively of $S_N$ and $E_N$, and $n(x) = s(x)s(1/x) + e(x)$ the corresponding noisy autocorrelation. Let $s'$ be the same as $s$ but with a single one of its $N$ signs reversed. The probability that there exists an $e' \in E_N$ such that $s'(x)s'(1/x) + e'(x) = n(x)$ (the modified $s'$ is also compatible with $n$) is equal to $(3/4)^{(N-1)/2}$ for odd $N$ and $(1/2)(3/4)^{N/2-1}$ for even $N$.

Proof. First consider odd $N$. Without loss of generality we may assume the reversed sign is in position 0, so $s'_0 = -s_0$, and $s'_k = s_k$ for $k \neq 0$. Upon reversal, the autocorrelations change as follows:

$$a'_k - a_k = (s'_0 - s_0)(s_k + s_{-k}) = \pm 2 (s_k + s_{-k}), \quad 1 \le k \le (N-1)/2. \tag{6}$$

Each change arises from a pair of independent signs, and is therefore 0 with probability 1/2 and $\pm 4$ with probability 1/2. Since the noisy data $n$ has not changed, $e'_k - e_k = a_k - a'_k$. Whenever $a'_k - a_k = 0$, an unchanged $e'_k = e_k$ is compatible with the modified $s'$. However, whenever $a'_k - a_k = \pm 4$, both $e_k$ and $e'_k$ are determined (their values are limited to $\pm 2$) and in particular, only one choice of $e_k$ allows for $s'$ to be compatible with $n$. The net probability that there exists a compatible $e'_k$ is therefore $(1/2)(1) + (1/2)(1/2) = 3/4$ and the stated result follows from the independence of the $(N-1)/2$ outcomes for the different $k$. When $N$ is even, only the case $k = N/2$ is changed because the change in $a_{N/2}$ is $\pm 4$ with probability 1. The probability that there exists a compatible $e'_{N/2}$ therefore changes from 3/4 to 1/2.

We conjecture that uniqueness in the sense of the above theorem extends beyond the simple case of a single reversed sign. There is always non-uniqueness stemming from the invariance of the autocorrelation with respect to the order $4N$ group generated by cyclic shifts, reflection and sign reversal. But this is a small group and inconsequential if we view bit retrieval, in information-theoretic terms, as the decoding stage of a noisy communication channel. In the “noisy autocorrelator channel” an input signal of $N$ bits, in the form of signs $s$, is encoded with noise as $s \to a + e = n$. The stronger result suggested by the theorem is that the information capacity of this channel (an asymptotic property for large $N$) is the same as the entropy of the uniform distribution on the inputs.

Turning the (probabilistically qualified and symmetry amended) uniqueness conjecture into a theorem presents difficult challenges. There exist polynomials, for example ($N = 13$)

$$s(x) = 1 + x^2 + x^3 + x^4 + x^5 + x^6 + x^7 + x^{10} + x^{11} \tag{7}$$

$$s'(x) = 1 + x^2 + x^3 + x^4 + x^7 + x^9 + x^{10} + x^{11} + x^{12}, \tag{8}$$

that have the same autocorrelation and yet are not in the same orbit of the order $4N$ symmetry group. The equality of the autocorrelations is in this case explained by the fact that

$$s(x) = p(x)q(x) = (1 + x^2 + x^7)(1 + x^3 + x^4), \tag{9}$$

and $s'(x) = p(x)q(1/x)$. To prove the theorem one needs to bound this form of non-uniqueness, which exists even without noise. Though extremely rare, the phenomenon of factorizable solutions (contrast) is also known to occur in crystallography [PS].

By the standards of crystallography, the noise defined by (5) has an unrealistic dependence on $N$. Since the noise coefficients $e$ are $O(1)$, so will be the difference in the Fourier coefficients $\hat{e}$, between the true and noisy transforms, $\hat{a}$ and $\hat{n}$. By (3) this translates into $O(1/\sqrt{N})$ errors in the Fourier magnitudes $|\hat{s}|^2$.
Crystallography experiments are noisier, being content with an $O(1)$ signal-to-noise ratio and therefore noise amplitudes for $e$ and $\hat{e}$ in the bit retrieval model growing as $O(\sqrt{N})$. This brings us to yet a third problem:

Definition 1.3. Fixed-precision bit retrieval is the problem where, given precision $\eta > 0$ and a noisy autocorrelation transform $\hat{n}$ known to satisfy

$$\left|\, |\hat{s}_q|^2 - \hat{n}_q/\sqrt{N} \,\right| < \eta, \quad 0 \le q \le \lfloor N/2 \rfloor, \tag{10}$$

for some Fourier transformed sequence of signs $s$, we must find such a sign sequence.

This version of bit retrieval comes closest to the phase retrieval problem in crystallography. There the data naturally arrives via the Fourier transform and is always subject to noise. The order $N$ of the cyclic group in bit retrieval corresponds to the number of resolution elements, or voxels, in the representation of the contrast. That the symmetry group of the 3D problem is not the cyclic group of order $N$, but a direct product of three such groups having the same order, is probably largely irrelevant to the complexity of bit retrieval. Finally, although a strict two-valued contrast is a poor way to approximate a continuous contrast function (electron density), it is not a bad model for representing a dilute collection of equally scattering atoms at low resolution.

The fastest algorithms for solving fixed-precision bit retrieval, like noisy bit retrieval, have complexity $2^{cN}$, where the constant $c$ now depends on the noise parameter $\eta$. But unlike the noisy version, solutions in the fixed-precision version have extensive entropy, that is, grow in number exponentially with $N$. This can be argued non-rigorously as follows.

Take $N$ large and consider flipping a large random subset of $M$ signs, while keeping $M \ll N$. The Fourier transform changes as $\hat{s}_q \to \hat{s}_q + \Delta\hat{s}_q$ where, using a result of Freedman and Lane [FL], the $\Delta\hat{s}_q$ are independent complex-normal random variables with zero mean and variance $O(M/N)$.
The probability that the flips violate any of the corresponding inequalities in (10), in the limit of small variance, is an integral over the tail of a Gaussian distribution and depends on $\eta$ as $B_q \exp(-b_q N\eta^2/M)$ for some positive constants $B_q$ and $b_q$. The probability that no inequality is violated behaves as

$$\prod_{q=0}^{\lfloor N/2 \rfloor} \left( 1 - B_q \exp(-b_q N\eta^2/M) \right). \tag{11}$$

Now consider the limit $N \to \infty$ with $M/N$ held fixed and $M/N \ll \eta^2$. In this limit (11) approaches 1 for any of the sets of flipped signs which, for fixed $M/N$, have extensive entropy. Solutions therefore have extensive entropy for any $\eta > 0$. Crystallography with fixed $\eta$ can escape this source of non-uniqueness by keeping $N$ under a bound proportional to $1/\eta^2$.

Fixed-precision bit retrieval would reduce to noise-free bit retrieval if instead of fixing $\eta$ (as $N$ increases) we were allowed to take the limit $\eta \to 0$. In this limit, the Fourier transform of the noisy $\hat{n}$, after rounding the coefficients, is the autocorrelation $a$ of bit retrieval. We get a variant of noisy bit retrieval if instead we take limits such that $N\eta^2 = O(1)$, i.e. keeping $\eta$ just small enough to preserve solution uniqueness. The fixed-precision version lends itself naturally to geometrical constraint satisfaction algorithms, two of which we shall describe in detail. Unlike the algebraic algorithms developed for solving the noise-free problem, the geometric algorithms are easily adapted to solve any of the three problems. Not surprisingly, the complexity of the geometric algorithms is relatively insensitive to $\eta$.

2 Symmetry and noise

To appreciate the effect of noise on bit retrieval complexity, we focus in this section on instances where it is known that the signs $s$ have a reflection symmetry. We are then free to target the rotated polynomial $s'(x) = x^r s(x)$ that has the property $s'(x) = s'(1/x)$. To avoid complications in the presentation that do not alter the main ideas, we restrict ourselves to prime $N$ in this section.

Symmetric bit retrieval is very easy. Dropping the prime on our reflection symmetric signs, we define $b(x) \in Z_N$ with coefficients $b_k = (1 - s_k)/2 \in \{0, 1\}$. The coefficients $a'_k$ of the corresponding autocorrelation $a'(x) = b(x)b(1/x) = b(x)^2$ are related to the sign autocorrelations as follows:

$$a'_k = \frac{1}{4}\left( a_k + N - 2\sum_l s_l \right). \tag{12}$$

By (2) the sum of the signs is one of the square roots of the sum of the $a_k$'s. Exercising symmetry to always select the non-negative root, the transformed $a'(x) \in Z_N$ is known. We now observe there are exactly as many bits of information in the symmetric $b(x)$ as there are parity bits in the $a'(x)$ coefficients. This suggests reducing all the coefficients mod 2, so we are working in the ring $(\mathbb{Z}/2)[x]/(x^N - 1)$:

$$a'(x) = \left( b_0 + \sum_{k=1}^{(N-1)/2} b_k (x^k + x^{-k}) \right)^2 \tag{13}$$

$$\phantom{a'(x)} = b_0 + \sum_{k=1}^{(N-1)/2} b_k (x^{2k} + x^{-2k}). \tag{14}$$

Since $N$ is a prime greater than 2, there is a unique element $2^{-1}$ in the field of $N$ elements, so the map $k \mapsto 2k$ permutes the indices mod $N$ and (14) gives an explicit formula for bit retrieval:

$$b_{2^{-1}k} = a'_k \pmod 2, \quad \text{equivalently} \quad b_k = a'_{2k} \pmod 2. \tag{15}$$

By reducing the transformed autocorrelation coefficients mod 2 we have made our bit retrieval algorithm maximally vulnerable to noise. Indeed, with the $\pm 2$ uncertainty in $a_k$ of noisy bit retrieval, the parities of the $a'_k$'s are completely uncertain and so, it would seem, are the bits $b_k$. However, we next consider a more elaborate polynomial-time algorithm whose noise tolerance is somewhat better. Since sign reversal $s \to -s$ is the only symmetry remaining in reflection symmetric bit retrieval, it is not surprising that this special case of the problem can be reduced to a shortest lattice vector problem, which shares this symmetry.

When the signs $s$ have reflection symmetry, from (4) we see that $\hat{s}$ is purely real and we can write $\hat{s}_q = |\hat{s}_q|\, y_q$, where $y_q = \pm 1$. We then have the following equations relating the signs $x_k$ of the unknown sequence and the unknown signs $y_q$ of its Fourier transform:

$$\sqrt{N}\, |\hat{s}_q|\, y_q = x_0 + \sum_{k=1}^{(N-1)/2} 2\cos(2\pi kq/N)\, x_k, \quad 0 \le q \le (N-1)/2. \tag{16}$$

Since $y_{-q} = y_q$, there are just as many independent equations and Fourier signs, $M = (N+1)/2$, as there are unknown signs in the reflection-symmetric sequence we are attempting to retrieve.
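The noise-free symmetric case is simple enough to write out in full. The Python sketch below is a minimal illustration under stated assumptions: the input is the exact autocorrelation of a reflection-symmetric sign sequence of odd prime length, and the function names are ours. It applies (12) with the non-negative root for the sign sum, reduces mod 2, and reads each bit $b_k$ from the coefficient of $x^{2k}$, following (14).

```python
import math


def autocorrelation(s):
    """Periodic autocorrelation a_k = sum_l s_l * s_{l+k}, indices mod N (eq. 2)."""
    N = len(s)
    return [sum(s[l] * s[(l + k) % N] for l in range(N)) for k in range(N)]


def retrieve_symmetric(a):
    """Recover a reflection-symmetric sign sequence, up to overall sign, from its
    exact autocorrelation a. Assumes len(a) is an odd prime."""
    N = len(a)
    sign_sum = math.isqrt(sum(a))          # sum_k a_k = (sum_l s_l)^2; take root >= 0
    a_prime = [(ak + N - 2 * sign_sum) // 4 for ak in a]    # eq. (12)
    # In (Z/2)[x]/(x^N - 1) the coefficient of x^(2k) in b(x)^2 is b_k (eq. 14).
    b = [a_prime[(2 * k) % N] % 2 for k in range(N)]
    return [1 - 2 * bk for bk in b]        # invert b_k = (1 - s_k)/2


s = [1, -1, 1, 1, 1, 1, -1]                # symmetric: s[k] == s[-k], N = 7
recovered = retrieve_symmetric(autocorrelation(s))

t = [-1, -1, 1, -1, -1, 1, -1]             # symmetric with negative sign sum
recovered_t = retrieve_symmetric(autocorrelation(t))
print(recovered, recovered_t)
```

Because the non-negative root is always selected, a sequence with negative sign sum such as `t` comes back with all signs reversed, which is the one symmetry that remains in this setting. Changing any $a_k$ by $\pm 2$ flips the parity of the corresponding $a'_k$, so the routine gives no protection against the noise of Definition 1.2.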
The observation that (16) should hold for arbitrary levels of precision, in numerical approximations of the cosine functions and the data $\sqrt{N}\,|\hat{s}_q|$, leads to a polynomial time bit retrieval algorithm for sequences known to have reflection symmetry. Unlike the $q = 0$ equation, which only reveals the number of $+1$ signs (up to overall sign reversal), the other equations, individually, become nontrivial instances of the integer partitioning problem when their coefficients are multiplied by a large number $K = 2^P$ and then rounded to the nearest integer. Unlike the usual integer partitioning problem, here we require only that the partition produces a sum consistent with the round-off errors. Nevertheless, it is easy to produce arbitrarily good approximate integer equations because the round-off has a fixed bound while arbitrarily large $P$-bit approximations of the Fourier coefficients can be computed in time that grows as a polynomial in $P$.

It is straightforward to adapt the method of Lagarias and Odlyzko [LO], for solving low density subset sum problems, to solve symmetric bit retrieval. Low density in our context corresponds to setting the number of bits $P$, in the approximation of the coefficients, sufficiently large in comparison to the number of unknown signs, $2M$. However, rather than use just one of the $q \neq 0$ equations, and the information in just one of the Fourier magnitudes, we construct a lattice $\Lambda$ from information provided by all $M$ equations. The generators of $\Lambda$ are the rows of the following $2M \times 2M$ matrix:

$$G = \begin{bmatrix} \lfloor KD \rceil & 0 \\ \lfloor KC \rceil & I_{M \times M} \end{bmatrix}, \tag{17}$$

where $\lfloor \cdots \rceil$ denotes rounding to the nearest integer and the $M \times M$ blocks $C$ and $D$ are defined by

$$C_{kq} = \begin{cases} 1, & k = 0 \\ 2\cos(2\pi kq/N), & 1 \le k \le M-1, \end{cases} \tag{18}$$

$$D = \mathrm{diag}\left( -\sqrt{N}\,|\hat{s}_0|, \ldots, -\sqrt{N}\,|\hat{s}_{M-1}| \right). \tag{19}$$

By construction, $\Lambda$ has two short vectors:

$$v_1 = [\, w_0 \cdots w_{M-1}\ \ 1 \cdots 1 \,] = [\, 0 \cdots 0\ \ 1 \cdots 1 \,] \cdot G \tag{20}$$

$$v_s = [\, z_0 \cdots z_{M-1}\ \ s_0 \cdots s_{M-1} \,] = [\, y_0 \cdots y_{M-1}\ \ s_0 \cdots s_{M-1} \,] \cdot G, \tag{21}$$

where $w_0, \ldots, w_{M-1}$ and $z_0, \ldots, z_{M-1}$ are sets of small integers produced by round-off. Vector $v_1$ is small because each of the columns of the matrix $C$ has zero sum while $v_s$ is small by equations (16).

Each column of (21) represents one instance of the integer partitioning problem: assigning $M + 1$ signs to the same number of $P$-bit integers to produce a small sum. In the equivalent subset sum problem we must find a subset of $M + 1$ $P$-bit integers to produce a given target sum, again with neglect of the low order round-off bits. In base-2 arithmetic, a solution is checked by verifying that nearly $P$ column sums (low order bits excepted) are all even, where these were equally likely to have been either parity in a randomly guessed subset. Reasoning probabilistically, we conclude that $P$ must be at least as large as $M$ if we expect to recover a unique subset, or choice of signs in the equivalent integer partitioning problem. Fewer bits should suffice when the same set of signs is required to solve all $M - 1$ non-trivial ($q \neq 0$) integer partitioning problems represented by (21). In fact, the necessary number of bits would be bounded if the information provided by each partitioning problem is in some sense independent of the others.

The question of how to efficiently find a partition places different demands on the number of bits in our integer approximation of the symmetric bit retrieval problem. The Lagarias-Odlyzko algorithm, associated with a single one of our $M - 1$ partitioning problems, and assuming the specific non-random integers in $G$ are well modeled by average-case behavior, requires $P = O(M^2)$.
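The construction of the generator matrix (17) and of the short vector $v_s$ in (21) can be sketched in Python as follows. This is an illustration under our own assumptions rather than the implementation used for the experiments: it works in double-precision floating point, which restricts $P$ to a few tens of bits, it builds $D$ from a known sequence $s$ instead of from measured magnitudes, and the function names are hypothetical. Lattice reduction itself is not included.

```python
import math


def symmetric_lattice(s, P):
    """Generator matrix G of eq. (17), with K = 2**P, for a reflection-symmetric
    sign sequence s (s[k] == s[-k]) of odd length N = 2M - 1.
    Returns (G, y), where y lists the Fourier signs y_q of eq. (16)."""
    N = len(s)
    M = (N + 1) // 2
    K = 2 ** P
    # Block C of eq. (18): row index k, column index q.
    C = [[1.0 if k == 0 else 2.0 * math.cos(2.0 * math.pi * k * q / N)
          for q in range(M)] for k in range(M)]
    # For symmetric s, sqrt(N) * s_hat_q is real and equals sum_k C[k][q] * s[k].
    t = [sum(C[k][q] * s[k] for k in range(M)) for q in range(M)]
    y = [1 if tq >= 0 else -1 for tq in t]
    G = [[0] * (2 * M) for _ in range(2 * M)]
    for q in range(M):
        G[q][q] = round(-K * abs(t[q]))         # rounded block K*D, eq. (19)
    for k in range(M):
        for q in range(M):
            G[M + k][q] = round(K * C[k][q])    # rounded block K*C
        G[M + k][M + k] = 1                     # identity block
    return G, y


def row_combination(coeffs, G):
    """Integer combination of the rows of G: the row vector coeffs times G."""
    return [sum(c * row[j] for c, row in zip(coeffs, G)) for j in range(len(G[0]))]


s = [1, -1, 1, 1, 1, 1, -1]                     # symmetric example with N = 7
M = (len(s) + 1) // 2
G, y = symmetric_lattice(s, P=20)
v_s = row_combination(y + s[:M], G)             # eq. (21)
print(v_s)
```

The first $M$ entries of `v_s` are sums of $M + 1$ round-off errors of at most 1/2 each, so they stay bounded by $(M+1)/2$ while the entries of $G$ are of order $K$. The last $M$ entries are the signs $s_0, \ldots, s_{M-1}$ themselves.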
We have not attempted to extend the analysis of the algorithm to the generator matrix $G$, and instead have performed experiments with the symmetric Hadamard sequence instances, defined in appendix 9.1, that we believe to be among the hardest. In each experiment we apply the Mathematica implementation of the LLL lattice reduction algorithm [LLL] to $G$ and record a success when among the reduced basis we find a vector $a v_1 + b v_s$, where $b \neq 0$.

In Figure 1 we show the dependence of the bit length $P$ on successful retrieval of symmetric Hadamard sequences by LLL basis reduction of the generator matrix $G$ given in (17). At each $N$ for which such a sequence exists we plot the smallest $P$ for which the retrieval was successful. We see that $P$ appears to grow linearly with $N$. This behavior is below the quadratic growth required by LLL for solving a single random subset sum problem of the same size, but well above the probabilistically argued bounded bit-length required for solution uniqueness.

Whereas any polynomial growth of the required bit length of the Fourier magnitude “data” is consistent with a polynomial-time algorithm, that the data precision must grow at all eliminates the fixed-precision variant of bit retrieval. We will use the symmetric Hadamard instances to show that the lattice basis reduction algorithm also fails to solve noisy bit retrieval.

Symmetric Hadamard sequences (appendix 9.1) exist for all prime lengths $N \equiv 1 \pmod 4$ and are interesting because of their low autocorrelations, $a_k \in \{-3, 1\}$, $k \neq 0$. For any such sequence there exists an $e \in E_N$ such that the noisy autocorrelation $n = a + e$ has $n_k = -1$, $k \neq 0$. The smallness of the $k \neq 0$ autocorrelations, relative to $a_0 = N$, has by (3) the effect that the $q \neq 0$ Fourier magnitudes are nearly equal. Using the fact that the symmetric Hadamard sequences have the
