Certification of Boson Sampling Devices with Coarse-Grained Measurements Sheng-Tao Wang and L.-M. Duan Department of Physics, University of Michigan, Ann Arbor, Michigan 48109, USA (Dated: January 13, 2016) A boson sampling device could efficiently sample from the output probability distribution of noninteracting bosons undergoing many-body interference. This problem is not only classically intractable, but its solution is also believed to be classically unverifiable. Hence, a major difficulty inexperimentistoensureabosonsamplingdeviceperformscorrectly. Wepresentanexperimental friendly scheme to extract useful and robust information from the quantum boson samplers based on coarse-grained measurements. The procedure can be applied to certify the equivalence of boson samplingdeviceswhilerulingoutalternativefraudulentdevices. Weperformnumericalsimulations todemonstratethefeasibilityofthemethodandconsidertheeffectsofrealisticnoise. Ourapproach is expected to be generally applicable to other many-body certification tasks beyond the boson 6 sampling problem. 1 0 PACSnumbers: 03.67.Lx,03.67.Ac,05.30.Jp 2 n a In the last three decades, quantum computation has from a large-scale boson sampler. In other words, would J stimulated considerable excitement among physicists, any filtered information be able to verify the equivalence 1 computer scientists, and mathematicians, with the gen- of two identical boson sampling devices while excluding 1 eral belief that quantum computers could solve certain possible fraudulent ones? In light of rapid experimental tasks significantly faster than current electronic comput- advances,thisissuewillbeincreasinglyrelevantwhenthe ] h ers [1, 2]. Especially after the discovery of Shor’s fac- system scales up: as the probabilities of generic output p toring algorithm [3], theorists and experimentalists have eventsbecomeexponentiallysmall,samplingnoisedueto - t been teaming up to converge on the goal of demonstrat- limited measurement trials may conceal any distinctive n ingquantumsupremacy[4]. Recently,animportantstep information. a u made by Aaronson and Arkhipov [5] is to formulate the In this paper, we introduce an experimental friendly q boson sampling problem, which is intractable for classi- scheme to extract useful structures from a boson sam- [ cal computers but remarkably amenable to quantum ex- pling device based on coarse-grained measurements. Us- periments. A number of elegant experiments have since 1 ing standard statistical tools, we simulate the experi- v implemented the problem with linear optics on a small mental certification process and show that the coarse- 7 scale [6–13]. Pushing the experiments beyond classical grained information is able to provide a quantitative as- 2 capabilities would constitute a strong demonstration of 6 sessmenttothe degreesofmatchingbetweentwoalleged the quantum speedup and in addition lead to important 2 boson samplers. This is important when one needs to implicationsinthefoundationsofcomputerscience[2,5]. 0 verify the equivalence of two quantum samples drawn . 1 from the same boson sampling device or from different At the core of the hardness-of-simulation property lie 0 devices with identical processes. It will also be crucial the exponential cost of computing a matrix permanent 6 in situations wherein we can completely trust one device 1 [14]andtheexponentialnumberofpossibleoutputevents and need to validate another possibly fraudulent device : inbosonsampling. Theseposemajordifficultiesincerti- v against the reliable one. Our numerical simulation in fying the correctness of a boson sampler on a large scale i X addition demonstrates our scheme could tolerate a mod- [11, 12, 15–18]. The credibility of a certification pro- erate amount of experimental noise [22, 23] while strong r cess thus relies on gathering convincing circumstantial a noise invalidates the equivalence due to mismatched in- evidence while ruling out alternative explanations. Sev- terference processes. On a broader scale, our method is eral efficient schemes have been proposed to validate a not specific to boson sampling, but could be applicable boson sampler against the uniform sampler [15] making to other generic many-body certification problems. use of the information in the unitary process [11, 12, 16] and against the classical sampler of distinguishable par- Before proceeding to our proposed scheme, we briefly ticles[6,11,19]exploitingthebosoniccloudingbehavior introduce the boson sampling problem and clarify why [12]. It is also possible to depart from the computation- large sampling errors are involved without coarse grain- ally hard space and design an efficient test based on pre- ing. In a typical setup, there are N indistinguishable dictable forbidden events with special inputs and scat- bosons prepared in M input modes and allowed to co- tering process [17, 20, 21], assuming the device would herently interfere with one another in a (random) uni- be equally operational in general. An unsettled prob- tary process. We abstract away from interactions be- lem especially pertinent to experiments is whether one tween particles, so the resultant many-body interference would be able to extract useful and robust information is purely due to the bosonic statistics. To compute the 2 (a) State Space + + + + + + + bility + + B3 a + b B1 + o r + p B2 + + (b)2×10-4 Simulated Exp 1 (c)2×10-4 Simulated Exp 1 + + +Unobserved state observed states Simulated Exp 2 Simulated Exp 2 Theory Theory y1 y1 Figure 2. Schematic for the coarse-graining procedure. The abilit0 abilit0 filled symbols are bubble centers. The bubble structure is b b formed as follows: (1) Pick one set of experimental samples o o pr1 pr1 and sort the observed states in descending probability. (2) Consider only states that are not included in previous bub- 2 2 bles. Select a bubble center as the state with the highest 1.75 1.8 1.85 1.9 1.95 2 7.25 7.3 7.35 7.4 7.45 7.5 probability (if multiple states attain the same highest prob- state ×105 state ×105 ability, choose any of them as the bubble center). Form a newbubblebyenclosingstateswithL distancesmallerthan Figure 1. Original Distributions. (a) Theoretical distribu- 1 a cutoff radius to this bubble center (the cutoff radius may tion PQ of a boson sampling device with N = 5 particles be increased for subsequent bubbles to ensure each one has in M = 40 modes after a random unitary transformation. comparable sample size). (3) The bubble structure is estab- (b) and (c) shows the zoom-in distributions of two simu- lishedwithcorrespondingcentersafterallobservedstatesare lated experimental samples (PQ and PQ) with sample size S1 S2 included. (4)Othersamplesandtheoreticaldistributionsare N =10000 drawn from PQ. The theoretical distribution is m coarse grained with the same bubble structure. superimposed onto the simulated samples. features and may falsely accept some fraudulent devices. probabilityofanoutputevent,oneneedstocalculatethe Ourproposedschemetakestheabovefactorsintoconsid- permanentoftheassociatedN×N matrix[6,24], which eration, and coarse grain the states based on the L dis- 1 requiresanexponentialcostforagenericcomplexmatrix. tancemeasure. TheL distancebetweentwooccupation- The possible number of output events is D =(cid:0)M+N−1(cid:1), number-basis states i1s defined as L = (cid:80)M|ψ − φ |, N 1 i i i which grows exponentially with N (for M ≥ N). As where ψ (φ ) is the occupation number in the ith mode i i the system scales up, the number of measurement runs forthestate|ψ(cid:105)(|φ(cid:105)). Detailsofthecoarse-grainingpro- willbeunabletokeeppacewiththeexponentialgrowth. cedure are shown in Fig. 2. We note that this scheme is Thisgivesrisetolargesamplingerrorswithlimitedsam- not the only way to perform coarse graining. We expect ple size. Fig. 1 shows an example in which the sample that other procedures meeting the above considerations size is less than 1% of the Hilbert space dimension. Two may also work. Below, we show that useful and robust samplesdrawnfromthesamedevicecouldberatherdis- structures can be extracted from the samples using our similar (with a low average fidelity F ≈ 0.039±0.002, method. where F = (cid:12)(cid:12)(cid:112)PQ ·(cid:112)PQ(cid:12)(cid:12)). From these distributions, We simulate the experimental certification process S1 S2 it is not quite possible to assess whether the samples are with two different systems, one with trapped ions and drawn from the same bona fide boson sampling device. onewithHaar-distributedrandomunitaries. Forallsim- With a proper coarse-graining procedure, we show, ulations, we choose a fixed sample size of Nm = 10000, however, that a reliable comparison between two given whichisareasonabledetectioncountinexperiments,but outputsamplesisachievable. Thereareafewfactorsthat is nevertheless smaller than 1% of the Hilbert space di- a reasonable coarse-graining procedure should consider. mension in study. First, it should be constructed from experimental sam- Fortrappedions,thetransverselocalphononsareused ples. On a large scale, classical simulation is no longer as indistinguishable bosons with the Hamiltonian given feasible. Experiments may nevertheless pick out the im- by [26–28] portant output states with higher probabilities. Second, M M (cid:88) (cid:88) (cid:16) (cid:17) the procedure should be scalable: not only the measured H = (cid:126)w a†a + (cid:126)t a†a +a†a , (1) c x,i i i ij i j j i events but all possible events should be grouped into i i<j some bubbles, where the number of bubbles should not be subject to the exponential growth. Third, the filtered where wx,i =−(cid:80)Mj(cid:54)=it0/|zi0−zj0|3, tij =t0/|zi0−zj0|3, information should still carry some knowledge of the full and t = e2/(8π(cid:15) mω ). z denotes the axial equilib- 0 0 x i0 correlationsintheoutputs. Thisisbecausetheessenceof rium position of the ith ion with mass m and charge e. the many-body interference lies in the many-mode cor- w (w ) is the axial (transverse) trapping frequency. In z x relations: previous works [17, 20, 25] have shown that the simulation, we use experimentally relevant parame- few-particleobservablesmaynotcapturethefullbosonic tersω =2π×0.03MHzandω =2π×4MHzfor171Yb+ z x 3 (a) 0.1 (b) 0.3 Simulated Exp 1 Simulated Exp 1 Simulated Exp 2 0.25 Distinguishable 0.08 Theory Uniform y y 0.2 bilit0.06 bilit a a0.15 ob0.04 ob pr pr 0.1 0.02 0.05 0 0 0 5 10 15 20 25 0 5 10 15 20 25 bubbles bubbles (c)0.15 (d)0.15 Simulated Exp 1 Simulated Exp 1 Simulated Exp 2 Distinguishable Theory Uniform y 0.1 y 0.1 bilit bilit a a b b o o pr0.05 pr0.05 0 0 0 5 10 15 20 25 0 5 10 15 20 25 bubbles bubbles Figure 3. Coarse-grained probability distributions. All samples have a sample size N =10000. The simulated experimental m samples (cid:0)PQ and PQ(cid:1) are drawn from the boson sampler (PQ); distinguishable (PC) and uniform (cid:0)PU(cid:1) samples are drawn S1 S2 S S respectively from the classical (PC) and uniform (PU) samplers. Errors for the probabilities follow the standard deviations of the multinomial distribution. (a) and (b): probability distributions for the trapped-ion system with intermediate-time dynamics. The simulated system has N = 12 phonons in a M = 12 ion chain with one phonon on each ion as the input state. (c) and (d): probability distributions after a Haar-distributed random unitary transformation. The simulated system has N =5 particles in M =40 modes with |1,1,1,1,1,0,···,0(cid:105) as the input state. ions and consider N =12 phonons on M =12 ions. The sensitivetodetailsofthecoarse-grainingmethod,suchas total Hilbert space size is D = 1352078. The evolution the particular sample used to initiate the bubble struc- time is chosen at some intermediate time (τ = 100µs) ture. withinterestingmany-bodydynamics(seeSupplemental To quantify the degrees of matching between coarse- Material [29] for further results in the long time limit). graineddistributions, weemploythetwo-sampleχ2 test. For the random unitary process, we simulate N =5 par- Underthenullhypothesiswhereintwosamplesaredrawn ticlesinM =40modes,asettingcomparabletothecur- from the same distribution, the χ2 statistics follow the rentexperimentalregimewithlinearoptics[6–13]andin χ2-distributionwithdegreesoffreedom(df)equaltothe a limit where M > N2 to suppress collision events. A number of bubbles (N ) minus one. If the χ2 statistic Haar-distributedM×M randomunitarymatrixisused, B is large with a small associated p-value, one nominally with a Hilbert space dimension of D =1086008. rejects the null hypothesis. Therefore, during the certi- fication process, one should ideally accept the null if the Based on the coarse-graining procedure outlined in pair of samples come from the same boson sampling de- Fig. 2, we group the sample events into different bub- vice and reject it if one is sampled from an alternative bles according to one set of experimental samples. The distribution. Twotypesoferrorscanbeincurred, atype theoreticaldistributionsarealsosubjecttothesamebub- Ierror(falsepositive)relatedtofalselyrejectingthetrue ble structure. Fig. 3 shows an example of the coarse- null hypothesis and a type II error (false negative) asso- grained distributions. This extracted information is ro- ciated with the failure to reject a false null hypothesis. bust and reliable with small sampling errors. By visual Prior to the test, one sets a significance level α, which comparison, we can see that the boson sampling data PQ and PQ match closely with each other, but differ will be the type I error rate if both samples are from the S1 S2 same distribution. significantly from samples drawn from alternative distri- butions, such as the distinguishable (PC) and uniform In simulation, we could generate many sets of samples S (PU) samples [29]. Here, it is also possible to compare and repeat the certification process to better gauge the S the simulated data with the theoretical distribution PQ, error rates. Each time, we grouped the sample events which will not be directly obtainable when experiments into bubbles and compared one set of boson sampling surpass classical simulation capabilities. By repeating dataPQ withvariousothersamples(PQ,PC,andPU). S1 S2 S S the procedure, we observe that the comparisons are not A χ2 statistic and a p-value were computed for each χ2 4 Table I. The pass rates R between simulated experimental sample 1 and various other samples. The two-sample χ2 test is performed to assess whether they come from the same dis- tribution. Thesignificancelevelαissetat1%. Foreachpair ofgeneratedsamples,ifthep-valueisgreaterthanα,thecom- parisonpassesthetest. ThisisrepeatedforN =10000runs s and pass rates are recorded. Noisy samples for the trapped- ion system include a 1% (3%) timing error, whereas a 1% (3%)randomerrorisincludedintherandomunitarymatrix. pass rate (%) compared to experimental sample 1 Trapped Ions Random Unitary Figure4. Distributionsofthetwo-sampleχ2test-statisticsfor N 24.2 40.5 69.9 25.9 40.8 70.5 B therandomunitaryprocesswithN ≈40.8. N =10000sets B s Exp 2 99.1 99.2 99.1 99.1 99.2 99.1 ofsamplesaregeneratedinthesimulationanddistributionsof theχ2 statisticsbetweenthecorrespondingpairsareplotted. Exp 2 (1% Noise) 98.0 98.2 98.2 98.9 98.9 99.1 Thesolidcurveistheχ2-distributionwithdf=N −1. The Exp 2 (3% Noise) 71.1 77.1 75.9 98.3 98.4 98.6 B dashed line marks the cutoff χ2 value at α=1% [30]. Distinguishable 0 0 0 6.7 0.3 0.03 Uniform 0 0 0 0 0 0 test. This process was repeated for N = 10000 runs, s with the distributions of the χ2 statistics and p-values in experiments. In the case of trapped ions, we included recorded. Fig.4presentsthedistributionsofχ2 statistics a 1% (3%) systematic error in the timing (1µs (3µs) for the random unitary process with the average number shift in τ). This is also equivalent to a 1% (3%) error of bubbles NB ≈ 40.8. It can be seen clearly that the in the hopping amplitude t ∼ω2/ω [28], which trans- test statistics between PQ and PQ follow the χ2 distri- ij z x S1 S2 lates to a respective shift in the trapping frequencies in bution with df = NB −1, whereas those between PSQ1 experiments. For the random unitary process, we added and PC fall on the far tail of the χ2 distribution, there- 1%(3%)randomnoisetotheunitarymatrix(seeSupple- S fore offering a definitive answer regarding whether the mentalMaterialfordetails[29]). AsseeninTableI,with samples are from the same distribution. More quantita- smallnoise(∼1%)thetypeIerrorratesarekeptincheck tively,wecalculatethepassrateRatagivensignificance (<∼2%). When the noise becomes substantial, pass rates level α = 1%. Specifically, if the p-value is greater than may drop sharply if the unitary process changes consid- α, the comparison passes the test. Out of Ns tests, the erably. This also shows the sensitivity of our method to passratesinpercentagearereportedinTableI.Between strong noise and it could serve as a stringent certifica- two sets of boson sampling data PQ and PQ, the pass tion test. The dissimilar sensitivity to noise for the two S1 S2 rates R ≈ 99% for all cases, with the type I error rates systemsisduetothedifferentwaysnoiseisincludedand 1−R being very close to α as expected. Between PQ natures of the noise (systematic versus random). S1 and alternative samples (PSC and PSU), the pass rates re- Inconclusion,wehaveshownthatusefulandrobustin- flect type II error. They can reach a few percent for formationcanbeextractedfromthecoarse-grainedmea- smallerNB butdropto<1%asNB increases. Thisalso surementsforbosonsampling. Thecoarse-graineddistri- presents a tradeoff between the information obscured by butionscanbefurtherusedtocertifythebosonsampling the sampling noise without coarse-graining and the in- device. We expect this method to be handy when exper- formationonediscardsbyheavycoarse-graining. Ingen- iments progress beyond classical capabilities. It should eral, to reduce sampling noise, each bubble should have also be a useful tool for other generic many-body certifi- a minimum number of observed events. For the χ2 test cation problems. to work reliably, this minimum number is conventionally ThisworkissupportedbytheARL,theIARPALogiQ chosen to be 10. For a detection count of N = 10000, m program, and the AFOSR MURI program. a range of 20 <∼ NB <∼ 100 works well, with the require- ment that the smallest bubble includes no less than 10 events (one could group together very small bubbles). Noticeably, this only depends on the number of detec- tioncounts,anddoesnotscaleupwiththeHilbertspace [1] M. A. Nielsen and I. L. Chuang, Quantum computation dimension. Our simulations demonstrate, in particular, and quantum information (Cambridge university press, that one could conclusively certify the boson sampling 2010). [2] S. Aaronson, Quantum computing since Democritus devicewithnumberofmeasurementslessthan1%ofthe (Cambridge University Press, 2013). Hilbert space dimension. [3] P. W. Shor, “Polynomial-time algorithms for prime fac- Furthermore, we consider the effects of realistic noise torization and discrete logarithms on a quantum com- 5 puter,” SIAM J. Comput. 26, 1484 (1997). [16] S. Aaronson and A. Arkhipov, “Bosonsampling is far [4] T. D. Ladd, F. Jelezko, R. Laflamme, Y. Nakamura, fromuniform,”QuantumInfo.Comput.14,1383(2014). C. Monroe, and J. L. O’Brien, “Quantum computers,” [17] M.C.Tichy,K.Mayer,A.Buchleitner, andK.Mølmer, Nature 464, 45 (2010). “Stringentandefficientassessmentofboson-samplingde- [5] S.AaronsonandA.Arkhipov,“Thecomputationalcom- vices,” Phys. Rev. Lett. 113, 020502 (2014). plexity of linear optics,” in Proceedings of the Forty- [18] L. Aolita, C. Gogolin, M. Kliesch, and J. Eisert, “Re- thirdAnnualACMSymposiumonTheoryofComputing, liable quantum certification of photonic state prepara- STOC ’11 (ACM, New York, NY, USA, 2011) pp. 333– tions,” Nat Commun 6, 8498 (2015). 342. [19] M. Tillmann, S.-H. Tan, S. E. Stoeckl, B. C. Sanders, [6] M. A. Broome, A. Fedrizzi, S. Rahimi-Keshari, J. Dove, H. de Guise, R. Heilmann, S. Nolte, A. Szameit, and S. Aaronson, T. C. Ralph, and A. G. White, “Photonic P.Walther,“Generalizedmultiphotonquantuminterfer- boson sampling in a tunable circuit,” Science 339, 794 ence,” Phys. Rev. X 5, 041015 (2015). (2013). [20] M. C. Tichy, “Interference of identical particles from [7] J. B. Spring, B. J. Metcalf, P. C. Humphreys, entanglement to boson-sampling,” Journal of Physics W. S. Kolthammer, X.-M. Jin, M. Barbieri, A. Datta, B: Atomic, Molecular and Optical Physics 47, 103001 N. Thomas-Peter, N. K. Langford, D. Kundys, J. C. (2014). Gates, B. J. Smith, P. G. R. Smith, and I. A. Walm- [21] V. S. Shchesnovich, “Partial indistinguishability theory sley, “Boson sampling on a photonic chip,” Science 339, formultiphotonexperimentsinmultiportdevices,”Phys. 798 (2013). Rev. A 91, 013844 (2015). [8] M. Tillmann, B. Dakic, R. Heilmann, S. Nolte, A. Sza- [22] P. P. Rohde and T. C. Ralph, “Error tolerance of the meit, and P. Walther, “Experimental boson sampling,” boson-samplingmodelforlinearopticsquantumcomput- Nat Photon 7, 540 (2013). ing,” Phys. Rev. A 85, 022332 (2012). [9] A. Crespi, R. Osellame, R. Ramponi, D. J. Brod, E. F. [23] V. S. Shchesnovich, “Sufficient condition for the mode Galvao, N. Spagnolo, C. Vitelli, E. Maiorino, P. Mat- mismatch of single photons for scalability of the boson- aloni, and F. Sciarrino, “Integrated multimode interfer- sampling computer,” Phys. Rev. A 89, 022333 (2014). ometers with arbitrary designs for photonic boson sam- [24] S.Scheel,“Permanentsinlinearopticalnetworks,”eprint pling,” Nat Photon 7, 545 (2013). arXiv:quant-ph/0406127 (2004), quant-ph/0406127. [10] N.Spagnolo,C.Vitelli,L.Sansoni,E.Maiorino,P.Mat- [25] K. Mayer, M. C. Tichy, F. Mintert, T. Konrad, and aloni, F. Sciarrino, D. J. Brod, E. F. Galv˜ao, A. Crespi, A. Buchleitner, “Counting statistics of many-particle R. Ramponi, and R. Osellame, “General rules for quantum walks,” Phys. Rev. A 83, 062307 (2011). bosonic bunching in multimode interferometers,” Phys. [26] D. Porras and J. I. Cirac, “Bose-einstein condensation Rev. Lett. 111, 130503 (2013). andstrong-correlationbehaviorofphononsiniontraps,” [11] N. Spagnolo, C. Vitelli, M. Bentivegna, D. J. Brod, Phys. Rev. Lett. 93, 263602 (2004). A.Crespi,F.Flamini,S.Giacomini,G.Milani,R.Ram- [27] S.-L. Zhu, C. Monroe, and L.-M. Duan, “Trapped ion poni, P. Mataloni, R. Osellame, E. F. Galvao, and quantum computation with transverse phonon modes,” F.Sciarrino,“Experimentalvalidationofphotonicboson Phys. Rev. Lett. 97, 050505 (2006). sampling,” Nat Photon 8, 615 (2014). [28] C. Shen, Z. Zhang, and L.-M. Duan, “Scalable imple- [12] J.Carolan,M.D.A.,P.J.Shadbolt,N.J.Russell,N.Is- mentation of boson sampling with trapped ions,” Phys. mail, K. Wo¨rhoff, T. Rudolph, M. G. Thompson, J. L. Rev. Lett. 112, 050504 (2014). O’Brien, M.C.F., andA.Laing,“Ontheexperimental [29] SeeSupplementalMaterialforfurtherresultsonthelong- verificationofquantumcomplexityinlinearoptics,”Nat time trapped-ion system and additional discussions on Photon 8, 621 (2014). thedifferencesbetweentrappedionsandtherandomuni- [13] M. Bentivegna, N. Spagnolo, C. Vitelli, F. Flamini, tary process. N. Viggianiello, L. Latmiral, P. Mataloni, D. J. Brod, [30] Rigorously speaking, each set of generated samples may E.F.Galva˜o,A.Crespi,R.Ramponi,R.Osellame, and have different N since we do not strictly control it. B F.Sciarrino,“Experimentalscattershotbosonsampling,” Hence each χ2 statistic may have different df. However, Science Advances 1, e1400255 (2015). N does not vary much in each run, and we can see B [14] L. Valiant, “The complexity of computing the perma- that the χ2 statistics for a pair of experimental samples nent,” Theoretical Computer Science 8, 189 (1979). roughlyfollowtheχ2-distributionwithdf=N −1.The B [15] C.Gogolin,M.Kliesch,L.Aolita, andJ.Eisert,“Boson- pass rates in table I are calculated properly as p-values Sampling in the light of sample complexity,” ArXiv e- arecomputedforeachpairofsampleswiththeactualdf. prints (2013), arXiv:1306.3995 [quant-ph]. 6 SUPPLEMENTAL MATERIAL: CERTIFICATION OF BOSON SAMPLING DEVICES WITH COARSE-GRAINED MEASUREMENTS In this Supplemental Material, we include additional results for the trapped-ion system with long-time dynamics and for boson sampling data with experimental noise. Further analysis for the two-sample χ2 test is provided. RESULTS FOR THE LONG-TIME TRAPPED-ION SYSTEM (a) 0.1 (b)0.15 Simulated Exp 1 Simulated Exp 1 Simulated Exp 2 Distinguishable 0.08 Theory Uniform y y 0.1 bilit0.06 bilit a a ob0.04 ob pr pr0.05 0.02 0 0 0 5 10 15 20 25 0 5 10 15 20 25 bubbles bubbles Figure 5. Coarse-grained probability distributions for the trapped-ion system with long-time dynamics. All samples have a sample size N = 10000. The simulated experimental samples (cid:0)PQ and PQ(cid:1) are drawn from the boson sampler (PQ); m S1 S2 distinguishable (PC) and uniform (cid:0)PU(cid:1) samples are drawn respectively from the classical (PC) and uniform (PU) samplers. S S Errors for the probabilities follow the standard deviations of the multinomial distribution. The simulated system has N =12 phonons in a M =12 ion chain with one phonon on each ion as the input state. Inthemaintext,weincludedresultsforboththetrapped-ionsystemwithintermediate-timedynamics(τ =100µs) and for the random unitary process (Fig. 3 and Table I of the main text). In Fig. 5 and Table II here, we add the coarse-grained distributions and the pass rate results for the long-time trapped-ion system (at τ = 10ms). With these coarse-grained distributions, we would like to discuss some differences between the trapped-ion system and the random unitary process. In the prototypical boson sampling problem, the hardness-of-simulation argument is based on randomly selected unitaries [5]. This is because for some structured unitary process, fast approximation algorithms may exist. For the trapped-ion system, the phonon normal modes are fixed (fixed time-independent Hamiltonian in Eq. 1 of the main text), so we may be able to extract some distinctive structures from the output probability distributions. Nevertheless, we still expect the complex many-body dynamics in trapped ions to be classically intractable. From the coarse-grained distributions, we can see that the boson sampling data PQ for intermediate-time trapped S ions are conspicuously different from alternative samples, such as the distinguishable sample PC and the uniform S sample PU. On the other hand, the boson sampling data bear resemblance to the distinguishable samples PC for S S random unitary processes, and the long-time trapped-ion phonon distributions are instead in closer proximity to the uniform samples PU. The qualitative difference arises from the fact that the random unitary process has a Haar- S distributed random unitary matrix whereas the long-time trapped-ion unitary only has random phases (eigenvalues) Table II. The pass rates R between simulated experimental sample 1 and various other samples for the trapped-ion system with long-time dynamics. The two-sample χ2 test is performed to assess whether they come from the same distribution. The significancelevelαissetat1%. Foreachpairofgeneratedsamples,ifthep-valueisgreaterthanα,thecomparisonpassesthe test. This is repeated for N =10000 runs and pass rates are recorded. s pass rate (%) compared to experimental sample 1 Trapped Ions (Long time) Average No. of bubbles 24.3 39.0 66.5 Experimental Sample 2 98.9 99.0 99.2 Distinguishable Sample 0 0 0 Uniform Sample 5.7 1.3 0.2 7 (a) 0.1 (b)0.25 Simulated Exp 1 Simulated Exp 1 Noisy Exp 2 Noisy Exp 2 0.08 Theory 0.2 Theory y y bilit0.06 bilit0.15 a a ob0.04 ob 0.1 pr pr 0.02 0.05 0 0 0 5 10 15 20 25 0 5 10 15 20 25 bubbles bubbles Figure6. Coarse-grainedprobabilitydistributionsforthenoisysamples. (a)Trapped-ionsystemofintermediate-timedynamics witha1%timingerrorincludedinthenoisysample. ThesimulatedsystemhasN =12phononsinaM =12ionchainwithone phonononeachionastheinputstate. (b)Distributionsafterarandomunitarytransformation. A1%randomnoiseisaddedto theunitaryprocessinthenoisysample. ThesimulatedsystemhasN =5particlesinM =40modeswith|1,1,1,1,1,0,···,0(cid:105) as the input state. withafixedcrystalmodestructure. Wehavefurthertestedthisideabysimulatingthelong-timetrapped-iondynamics with a unitary matrix given by the fixed mode structure and random phases. Certification results do support the notion that this system produces output probabilities in closer proximity to the uniform samples (for example, the pass rate R ≈ 3% between the simulated samples and uniform samples and R ≈ 0% between the simulated samples anddistinguishablesamplesforN ≈39.1). Thisinadditionshowsourmethodissensitivetosomestructureshidden B in the many-body interference process. NOISY SAMPLES Fig.6presentssomecoarse-graineddistributionsincludingthenoisysamples. Visually,the1%noiseintheunitaries does not lead to substantial changes to the distributions. The noise we include for the intermediate-time trapped-ion system is a 1% systematic error in the total time (shift from τ = 100µs to τ = 101µs). For the random unitary process,inordertopreservetheunitarityoftheprocess,wefirstfindtheeffectiveHamiltonianfortheHaar-distributed randomunitaryprocess,andsubsequentlyadda1%randomnoisetoeachentryofthehermitianHamiltonianmatrix. Figure 7. Distributions of p-values and the two-sample χ2 test-statistics for the trapped-ion system with intermediate-time dynamics and N ≈40.5. A χ2 statistic and a p-value are calculated for each pair of samples, with a total N =10000 runs B s generated in the simulation. (a) If two samples come from the same distribution, p-values should be uniformly distributed in [0,1]. (b)Thesolidcurveistheχ2-distributionwithN −1degreesoffreedom. Thedashedlinemarksthecutoffχ2 valueat B 1% significance level. 8 TableIII.Meanandstandarddeviationofp-valuesforthetwo-sampleχ2 testbetweenonesimulatedexperimentalsampleand variousothersamples. Ap-valueiscalculatedforeachpairofgeneratedsamplesandthisisrepeatedforatotalofN =10000 s sets. Mean and standard deviation (in brackets) of those p-values are tabled below. If two samples come from the same distribution, p-values should be uniformly distributed in [0,1] with mean 0.5 and standard deviation 0.289. If not, p-values should be small. Noisy samples for the trapped-ion system include a 1% timing error, whereas a 1% random error is included intherandomunitarymatrix. Avalueof0inthetableindicatesthatthep-valueisextremelysmall(<10−323 atthemachine level). Simulated experimental sample 1 (p-values) Trapped Ions (Int. time) Trapped Ions (Long time) Random Unitary Average No. of bubbles 24.2 40.5 69.9 24.3 39.0 66.5 25.9 40.8 70.5 Experimental sample 2 0.50 0.51 0.51 0.50 0.50 0.51 0.50 0.50 0.50 (0.29) (0.29) (0.29) (0.29) (0.29) (0.29) (0.29) (0.29) (0.29) Noisy experimental sample 2 0.44 0.45 0.44 - - - 0.50 0.49 0.50 (0.29) (0.29) (0.29) - - - (0.29) (0.29) (0.29) Distinguishable sample 0 0 0 0 0 0 0.0053 2.1×10−4 7.9×10−5 - - - - - - (0.033) (0.0076) (0.0071) Uniform sample 0 0 0 0.0043 8.4×10−4 1.1×10−4 0 0 0 - - - (0.029) (0.0098) (0.0027) - - - FURTHER ANALYSIS ON TWO-SAMPLE χ2 TEST For the two-sample χ2 test, if the pair of samples comes from the same distribution, the χ2 statistics should follow the χ2-distribution with degrees of freedom being the number of bubbles minus one. Also, if the null hypothesis is correct, p-values should be uniformly distributed in [0,1]. In the main text, we plotted the distributions of the χ2 statistics between two boson sampling data and between one boson sampling data and a distinguishable sample. One could clearly see the distinction in comparison. Here, we include in Fig. 7 the plots for the noisy samples too, in the case of intermediate-time trapped-ion system with N ≈40.5. Without noise in experiment, it is evident that the χ2 B statistics follow the χ2-distribution and the p-values are almost uniformly distributed. With experimental noise, χ2 statisticsshifttolargervalues, withp-valuestiltedtowardssmallervalues. Therefore, typeIerrorisgoingtoincrease with noise in experiments. Nevertheless, with small noise (1% in this case), one can still definitively and correctly conclude that these samples are equivalent with >∼98% confidence level (as tabled in the main text). OtherthanthepassratesRtabledinthemaintext,hereweaddatableforthep-valuestoo,whichoffersadditional information regarding the error rates without comparing to a specific significance level α. The mean and standard deviation of p-values are reported in Table III for a set of N = 10000 runs. In the case of two boson sampling s data, the mean and standard deviation are very close to the theoretical values 0.5 and 0.289 respectively. On the other hand, p-values are distinctively smaller against alternative samples. Some far-fetched distributions even have p-values smaller than the machine level (<10−323) with the boson sampling data, which illustrates the effectiveness of our certification method. With an appropriate significance level α and suitable number of bubbles N , one could B minimize both type I and type II errors in the experimental certification process.