RECOGNITION PERFORMANCE OF A STRUCTURED LANGUAGE MODEL†

Ciprian Chelba, Frederick Jelinek
Center for Language and Speech Processing
The Johns Hopkins University, Baltimore, MD 21218, USA
{chelba,jelinek}@jhu.edu

† This work was funded by the NSF grant IRI-19618874 (STIMULATE).

ABSTRACT

A new language model for speech recognition inspired by linguistic analysis is presented. The model develops hidden hierarchical structure incrementally and uses it to extract meaningful information from the word history, thus enabling the use of extended distance dependencies, in an attempt to complement the locality of currently used trigram models. The structured language model, its probabilistic parameterization and its performance in a two-pass speech recognizer are presented. Experiments on the SWITCHBOARD corpus show an improvement in both perplexity and word error rate over conventional trigram models.

1. INTRODUCTION

The main goal of the present work is to develop and evaluate a language model that uses syntactic structure to model long-distance dependencies. The model we present is closely related to the one investigated in [1], but differs in a few important aspects:
• our model operates in a left-to-right manner, allowing the decoding of word lattices, as opposed to the model referred to above, in which only whole sentences could be processed, restricting its applicability to N-best list re-scoring; the syntactic structure is developed as a model component;
• our model is a factored version of the one in [1], thus enabling the calculation of the joint probability of words and parse structure; this was not possible in the previous case due to the huge computational complexity of that model.

The structured language model (SLM), its probabilistic parameterization and its performance in a two-pass speech recognizer (we evaluate the model in a lattice decoding framework) are presented. Experiments on the SWITCHBOARD corpus show an improvement in both perplexity (PPL) and word error rate (WER) over conventional trigram models.

2. STRUCTURED LANGUAGE MODEL

An extensive presentation of the SLM can be found in [2]. The model assigns a probability P(W,T) to every sentence W and its every possible binary parse T. The terminals of T are the words of W with POS tags, and the nodes of T are annotated with phrase headwords and non-terminal labels. Let W be a sentence of length n words to which we have prepended <s> and appended </s> so that w_0 = <s> and w_{n+1} = </s>. Let W_k = w_0 ... w_k be the word k-prefix of the sentence and W_k T_k the word-parse k-prefix. Figure 1 shows a word-parse k-prefix; h_0 .. h_{-m} are the exposed heads, each head being a pair (headword, non-terminal label), or (word, POS tag) in the case of a root-only tree.

[Figure 1. A word-parse k-prefix: the exposed heads h_{-m} = (<s>, SB), ..., h_{-1}, h_0 = (h_0.word, h_0.tag) dominate the tagged words (<s>, SB) ... (w_k, t_k); the words w_{k+1} ... </s> are not yet predicted.]

2.1. Probabilistic Model

The probability P(W,T) of a word sequence W and a complete parse T can be broken into:

P(W,T) = \prod_{k=1}^{n+1} [ P(w_k | W_{k-1}T_{k-1}) \cdot P(t_k | W_{k-1}T_{k-1}, w_k) \cdot \prod_{i=1}^{N_k} P(p_i^k | W_{k-1}T_{k-1}, w_k, t_k, p_1^k ... p_{i-1}^k) ]    (1)

where:
• W_{k-1}T_{k-1} is the word-parse (k-1)-prefix;
• w_k is the word predicted by the WORD-PREDICTOR;
• t_k is the tag assigned to w_k by the TAGGER;
• N_k - 1 is the number of operations the PARSER executes at sentence position k before passing control to the WORD-PREDICTOR (the N_k-th operation at position k is the null transition); N_k is a function of T;
• p_i^k denotes the i-th PARSER operation carried out at position k in the word string; the operations performed by the PARSER are illustrated in Figures 2-3 and they ensure that all possible binary branching parses, with all possible headword and non-terminal label assignments for the word sequence w_1 ... w_k, can be generated.

[Figure 2. Result of adjoin-left under NTlabel: the two topmost heads are combined into h'_0 = (h_{-1}.word, NTlabel), and h'_{-1} = h_{-2}.]
[Figure 3. Result of adjoin-right under NTlabel: the two topmost heads are combined into h'_0 = (h_0.word, NTlabel), and h'_{-1} = h_{-2}.]
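To make the decomposition in Eq. (1) easier to follow, here is a minimal Python sketch of how the joint log-probability of a word sequence and one of its parses could be accumulated, one position at a time. It is an illustration only: the `prefix` object, the three component models and their `logprob()` methods are hypothetical stand-ins for the WORD-PREDICTOR, TAGGER and PARSER described above (parameterized in Eqs. (2)-(4) below), not the authors' implementation.

```python
def joint_logprob(derivation, prefix, word_model, tag_model, parser_model):
    """Accumulate log P(W, T) following Eq. (1).

    `derivation` is a list of (word, tag, parser_actions) triples, one per
    position k = 1 .. n+1; `prefix` is a mutable word-parse prefix exposing
    exposed_heads(), shift(word, tag) and apply(action).  All objects here
    are illustrative stand-ins, not the SLM code.
    """
    logp = 0.0
    for word, tag, actions in derivation:
        h0, h1 = prefix.exposed_heads()                      # (h_0, h_{-1})
        logp += word_model.logprob(word, h0, h1)             # WORD-PREDICTOR, Eq. (2)
        logp += tag_model.logprob(tag, word, h0.tag, h1.tag) # TAGGER, Eq. (3)
        prefix.shift(word, tag)
        for action in actions:          # PARSER operations, ending with the null transition
            h0, h1 = prefix.exposed_heads()
            logp += parser_model.logprob(action, h0, h1)     # PARSER, Eq. (4)
            prefix.apply(action)        # adjoin-left / adjoin-right / null
    return logp
```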
Our model is based on three probabilities, estimated using deleted interpolation [7] and parameterized as follows:

P(w_k | W_{k-1}T_{k-1}) = P(w_k | h_0, h_{-1})    (2)
P(t_k | w_k, W_{k-1}T_{k-1}) = P(t_k | w_k, h_0.tag, h_{-1}.tag)    (3)
P(p_i^k | W_k T_k) = P(p_i^k | h_0, h_{-1})    (4)

It is worth noting that if the binary branching structure developed by the parser were always right-branching and we mapped the POS tag and non-terminal label vocabularies to a single type, then our model would be equivalent to a trigram language model.

Since the number of parses for a given word prefix W_k grows exponentially with k, |{T_k}| ~ O(2^k), the state space of our model is huge even for relatively short sentences, so we had to use a search strategy that prunes it. Our choice was a synchronous multi-stack search algorithm, which is very similar to a beam search.

The probability assignment for the word at position k+1 in the input sentence is made using:

P_SLM(w_{k+1} | W_k) = \sum_{T_k \in S_k} P(w_{k+1} | W_k T_k) \cdot \rho(W_k, T_k),
\rho(W_k, T_k) = P(W_k T_k) / \sum_{T_k \in S_k} P(W_k T_k)    (5)

which ensures a proper probability distribution over strings W*, where S_k is the set of all parses present in our stacks at the current stage k (a small sketch of this mixture is given at the end of this section).

An N-best EM [5] variant is employed to re-estimate the model parameters such that the PPL on the training data is decreased, that is, the likelihood of the training data under our model is increased. The reduction in PPL is shown experimentally to carry over to the test data.
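The word-level probability of Eq. (5) is a mixture over the parses that survive in the stacks. Below is a small sketch, under the assumption that each stack entry carries its prefix log-probability log P(W_k T_k) and a predictor callable returning log P(w_{k+1} | W_k T_k); both names are hypothetical, not part of the SLM code.

```python
import math

def slm_word_logprob(stack_entries, next_word):
    """log P_SLM(w_{k+1} | W_k) per Eq. (5): a mixture over the parses T_k
    kept in the stacks S_k, weighted by the normalized prefix probabilities
    rho(W_k, T_k).

    `stack_entries` is a list of (prefix_logprob, predictor) pairs, where
    prefix_logprob = log P(W_k T_k) and predictor(word) = log P(w | W_k T_k).
    Probabilities are exponentiated directly for clarity; a log-sum-exp
    formulation would be used in practice to avoid underflow.
    """
    log_norm = math.log(sum(math.exp(lp) for lp, _ in stack_entries))
    total = 0.0
    for prefix_logprob, predictor in stack_entries:
        rho = math.exp(prefix_logprob - log_norm)       # interpolation weight
        total += rho * math.exp(predictor(next_word))   # P(w_{k+1} | W_k T_k)
    return math.log(total)
```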
3. A* DECODER FOR LATTICES

3.1. A* Algorithm

The A* algorithm [8] is a tree search strategy that could be compared to depth-first tree traversal: pursue the most promising path as deeply as possible. To be more specific, let a set of hypotheses L = {h : x_1, ..., x_n}, x_i ∈ W*, to be scored using the function f(·), be organized as a prefix tree. We wish to obtain the hypothesis h* = argmax_{h ∈ L} f(h) without scoring all the hypotheses in L, if possible with a minimal computational effort.

To be able to pursue the most promising path, the algorithm needs to evaluate the possible continuations of a given prefix x = w_1, ..., w_p that reach the end of the lattice. Let C_L(x) be the set of complete continuations of x in L; they all reach the end of the lattice, see Figure 4. Assume we have an overestimate g(x.y) = f(x) + h(y|x) >= f(x.y) of the score of the complete hypothesis x.y, where "." denotes concatenation; imposing h(y|x) = 0 for empty y, we have g(x) = f(x) for every complete hypothesis x ∈ L. This means that the quantity defined as:

g_L(x) = \max_{y \in C_L(x)} g(x.y) = f(x) + h_L(x),    (6)
h_L(x) = \max_{y \in C_L(x)} h(y|x)    (7)

is an overestimate of the score of the most promising complete continuation of x in L: g_L(x) >= f(x.y) for all y ∈ C_L(x), and g_L(x) = f(x) for every complete x ∈ L.

[Figure 4. Prefix tree organization of a set of hypotheses; C_L(x) is the set of complete continuations of the prefix x = w_1, ..., w_p.]

The A* algorithm uses a potentially infinite stack (in fact the stack need not be larger than |L| = n) in which prefixes x are ordered in decreasing order of g_L(x). At each extension step the prefix x = w_1, ..., w_p at the top of the stack is popped, expanded with all possible one-symbol continuations of x in L, and all the resulting expanded prefixes, among which there may be complete hypotheses as well, are inserted back into the stack. The stopping condition is: whenever the popped hypothesis is a complete one, retain it as the overall best hypothesis h*.

3.2. A* for Lattice Decoding

There are a couple of reasons that make A* appealing for our problem:
• the algorithm operates with whole prefixes x, making it ideal for incorporating language models whose memory is the entire prefix;
• a reasonably good overestimate h(y|x) and an efficient way to calculate h_L(x) (see Eq. (6)) are readily available using the n-gram model, as we explain later.

The lattices we work with retain the following information after the first pass:
• the time alignment of each node;
• for each link connecting two nodes in the lattice: the word identity, the acoustic model score and the n-gram language model score.

The lattice has a unique starting node and a unique ending node. A lattice can be conceptually organized as a prefix tree of paths. When rescoring the lattice using a different language model than the one used in the first pass, we seek to find the complete path p = l_0 ... l_n maximizing:

f(p) = \sum_{i=0}^{n} [ \log P_{AM}(l_i) + LMweight \cdot \log P_{LM}(w(l_i) | w(l_0) ... w(l_{i-1})) - \log P_{IP} ]    (8)

where:
• log P_AM(l_i) is the acoustic model log-likelihood assigned to link l_i;
• log P_LM(w(l_i) | w(l_0) ... w(l_{i-1})) is the language model log-probability assigned to link l_i given the previous links on the partial path l_0 ... l_i;
• LMweight > 0 is a constant weight which multiplies the language model score of a link; its theoretical justification is unclear but experiments show its usefulness;
• log P_IP > 0 is the "insertion penalty"; again, its theoretical justification is unclear but experiments show its usefulness.

To be able to apply the A* algorithm we need to find an appropriate stack entry scoring function g_L(x), where x is a partial path and L is the set of complete paths in the lattice. Going back to the definition (6) of g_L(·), we need an overestimate g(x.y) = f(x) + h(y|x) >= f(x.y) for all complete continuations y = l_k ... l_n of x allowed by the lattice. We propose to use the heuristic:

h(y|x) = \sum_{i=k}^{n} [ \log P_{AM}(l_i) + LMweight \cdot (\log P_{NG}(l_i) + \log P_{COMP}) - \log P_{IP} ] + LMweight \cdot \log P_{FINAL} \cdot \delta(k < n)    (9)

A simple calculation shows that if

\log P_{NG}(l_i) + \log P_{COMP} >= \log P_{LM}(l_i), for all l_i,

is satisfied, then g_L(x) = f(x) + \max_{y \in C_L(x)} h(y|x) is an appropriate choice for the A* search.

The justification for the log P_COMP term is that it is supposed to compensate for the per-word difference in log-probability between the n-gram model NG and the superior model LM with which we rescore the lattice, hence log P_COMP > 0. Its expected value can be estimated from the difference in perplexity between the two models LM and NG.
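The search of Section 3.1, combined with the lattice heuristic of Section 3.2, can be rendered as a short priority-queue loop. The snippet below is a simplified sketch: `lattice.start`, `lattice.end` and `lattice.successors()` describe an assumed lattice API, `f_partial()` stands for the partial-path score of Eq. (8) under the rescoring language model, and `h_upper()` for the node-dependent overestimate h_L of Eq. (9); the stack-depth and stack-logP thresholds of Section 3.3 are omitted. None of these names come from the paper.

```python
import heapq
import itertools

def astar_best_path(lattice, f_partial, h_upper):
    """A*-style stack decoding over a lattice, as sketched in Section 3.1.

    Entries are kept in a max-heap ordered by g_L(x) = f(x) + h_L(x); the
    first complete path popped is returned as the best hypothesis h*.
    """
    tie = itertools.count()  # tie-breaker so the heap never compares paths
    heap = [(-(f_partial([]) + h_upper(lattice.start)), next(tie), [], lattice.start)]
    while heap:
        neg_g, _, path, node = heapq.heappop(heap)
        if node is lattice.end:
            return path                      # stopping condition: complete hypothesis popped
        for link, nxt in lattice.successors(node):
            new_path = path + [link]
            g = f_partial(new_path) + h_upper(nxt)   # g_L(x) = f(x) + h_L(x)
            heapq.heappush(heap, (-g, next(tie), new_path, nxt))
    return None                              # no path from start to end
```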
The log P_FINAL > 0 term is used for practical considerations, as explained in the next section.

The calculation of g_L(x) (6) is made very efficient once one realizes that the dynamic programming technique of the Viterbi algorithm [9] can be used. Indeed, for a given lattice L, the value of h_L(x) is completely determined by the identity of the ending node of x; a Viterbi backward pass over the lattice can store at each node the corresponding value h_L(x) = h_L(ending_node(x)), so that it is readily available during the A* search.
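A sketch of that backward pass follows, assuming the lattice nodes are available in reverse topological order and that a per-link score implementing the bracketed term of Eq. (9) is supplied; the function and argument names are illustrative, not taken from the paper.

```python
def backward_heuristic(nodes_reverse_topo, successors, link_score):
    """Viterbi-style backward pass: h_L(x) depends only on the node where the
    partial path x ends, so one sweep from the final node stores, at every
    node, the best heuristic score of any continuation to the lattice end.

    `nodes_reverse_topo` lists the nodes with the unique end node first and
    every node after all of its successors; `successors(node)` yields
    (link, next_node) pairs; `link_score(link)` returns the per-link term of
    Eq. (9), log P_AM + LMweight * (log P_NG + log P_COMP) - log P_IP.
    The constant LMweight * log P_FINAL bonus is omitted for brevity.
    """
    h = {nodes_reverse_topo[0]: 0.0}   # empty continuation scores 0
    for node in nodes_reverse_topo[1:]:
        h[node] = max(link_score(link) + h[nxt] for link, nxt in successors(node))
    return h                            # h[node] == h_L of any path ending at node
```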
3.3. Some Practical Considerations

In practice one cannot maintain a potentially infinite stack. We chose to control the stack depth using two thresholds: one on the maximum number of entries in the stack, called the stack-depth-threshold, and another on the maximum log-probability difference between the topmost and the bottommost hypotheses in the stack, called the stack-logP-threshold.

A gross overestimate used in connection with a finite stack may lure the search onto a suboptimal cluster of paths: the desired cluster of paths may fall out of the stack if the overestimate happens to favor a wrong cluster.

Also, longer partial paths, having shorter suffixes, benefit less from the per-word log P_COMP compensation, which means that they may fall out of a stack already full of shorter hypotheses whose scores are high due to compensation. This is the justification for the log P_FINAL term in the compensation function h(y|x): the variance var[log P_LM(l_i | l_0 ... l_{i-1}) - log P_NG(l_i)] is a finite positive quantity, so the compensation is likely to be closer to the expected value E[log P_LM(l_i | l_0 ... l_{i-1}) - log P_NG(l_i)] for longer continuations y than for shorter ones; introducing a constant log P_FINAL term is equivalent to an adaptive log P_COMP that depends on the length of the suffix y, with a smaller equivalent log P_COMP for long suffixes, for which E[log P_LM(l_i | l_0 ... l_{i-1}) - log P_NG(l_i)] is a better estimate of log P_COMP than it is for shorter ones.

Because the structured language model is computationally expensive, a strong limitation is placed on the width of the search, controlled by the stack-depth-threshold and the stack-logP-threshold. For an acceptable search width (runtime) one seeks to tune the compensation parameters to maximize performance measured in terms of WER. However, the correlation between these parameters and the WER is not clear, which makes the diagnosis of search problems extremely difficult. Our method for choosing the search and compensation parameters was to sample a few complete paths p_1, ..., p_N from each lattice, rescore them according to the function f(·) of Eq. (8), and then rank the path h* output by the A* search among the sampled paths. A correct A* search should result in an average rank of 0. In practice this does not happen, but one can trace the topmost sampled path p* (in the offending cases p* != h* and f(p*) > f(h*)) and check whether the search failed strictly because of insufficient compensation, in which case a prefix of the hypothesis p* is still present in the stack when A* returns, or because the path p* fell out of the stack during the search, in which case the compensation and the search width interact.

The method we chose for sampling paths from the lattice was an N-best search using the n-gram language model scores. This is appropriate for pragmatic reasons, since one prefers lattice rescoring to N-best list rescoring exactly because of the possibility of extracting a path that is not among the candidates proposed in the N-best list, as well as for practical reasons, since the sampled paths are among the "better" paths in terms of WER.

4. EXPERIMENTS

4.1. Experimental Setup

In order to train the structured language model (SLM) as described in [2] we need parse trees from which to initialize the parameters of the model. Fortunately, a part of the Switchboard (SWB) [6] data has been manually parsed at UPenn; let us refer to this corpus as the SWB-Treebank. The SWB training data used for speech recognition, SWB-CSR, differs from the SWB-Treebank in two respects:
• the SWB-Treebank is a subset of the SWB-CSR data;
• the SWB-Treebank tokenization is different from that of the SWB-CSR corpus; among other spurious small differences, the most frequent ones are of the type presented in Table 1.

  SWB-Treebank   SWB-CSR
  do n't         don't
  it 's          it's
  i 'm           i'm
  i 'll          i'll

  Table 1. SWB-Treebank / SWB-CSR tokenization mismatch

Our goal is to train the SLM on the SWB-CSR corpus.

4.1.1. Training Setup

The training of the SLM proceeded as follows:
• train the SLM on the SWB-Treebank, using the SWB-Treebank closed vocabulary, as described in [2]; this is possible because for this data we have parse trees from which we can gather initial statistics;
• process the SWB-CSR training data to bring it closer to the SWB-Treebank format. We applied the transformations suggested by Table 1; the resulting corpus will be called SWB-CSR-Treebank, although at this stage we only have words and no parse trees for it;
• transfer the SWB-Treebank parse trees onto the SWB-CSR-Treebank training corpus. To do so we parsed the SWB-CSR-Treebank using the SLM trained on the SWB-Treebank; the vocabulary for this step was the union of the SWB-Treebank and SWB-CSR-Treebank closed vocabularies; at this stage the SWB-CSR-Treebank is truly a "treebank";
• retrain the SLM on the SWB-CSR-Treebank training corpus, using the parse trees obtained at the previous step for gathering initial statistics; the vocabulary used at this step was the SWB-CSR-Treebank closed vocabulary.

4.1.2. Lattice Decoding Setup

To be able to run lattice decoding experiments we need to bring the lattices, which are in the SWB-CSR tokenization, to the SWB-CSR-Treebank format. The only operation involved in this transformation is splitting certain words into two parts, as suggested by Table 1. Each link whose word needs to be split is cut into two parts and an intermediate node is inserted into the lattice, as shown in Figure 5; the acoustic and language model scores of the initial link are copied onto the second new link. For all the decoding experiments we have carried out, the WER is measured after undoing the transformations highlighted above; the reference transcriptions for the test data were not touched, and the NIST SCLITE package was used for measuring the WER.

[Figure 5. Lattice processing: a link carrying word W with scores (AMlnprob, NGlnprob) is replaced, via a new intermediate node, by a link W_1 with scores (0, 0) followed by a link W_2 with scores (AMlnprob, NGlnprob), where W -> W_1 W_2.]
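The link-splitting transformation of Figure 5 is mechanical enough to sketch directly. The data classes and the `split_table` below (for example mapping "don't" to ("do", "n't")) are assumptions made for illustration; in particular, the paper does not specify which time stamp the new intermediate node receives.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

@dataclass
class Node:
    time: float

@dataclass
class Link:
    start: Node
    end: Node
    word: str
    am_lnprob: float   # acoustic model log-score retained from the first pass
    ng_lnprob: float   # first-pass n-gram language model log-score

def split_links(links: List[Link], split_table: Dict[str, Tuple[str, str]]) -> List[Link]:
    """Tokenization fix-up of Section 4.1.2 (Figure 5): a link whose word is
    listed in `split_table` is replaced by two links joined at a new
    intermediate node; the original acoustic and n-gram scores go on the
    second new link, while the first link carries zero scores."""
    out = []
    for link in links:
        if link.word in split_table:
            w1, w2 = split_table[link.word]
            mid = Node(time=link.start.time)   # time assignment is an assumption
            out.append(Link(link.start, mid, w1, 0.0, 0.0))
            out.append(Link(mid, link.end, w2, link.am_lnprob, link.ng_lnprob))
        else:
            out.append(link)
    return out
```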
4.2. Perplexity Results

As a first step we evaluated the perplexity performance of the SLM relative to that of a deleted interpolation 3-gram model trained under the same conditions. We worked on the SWB-CSR-Treebank corpus. The size of the training data was 2.29 Mwds; the size of the test data set aside for perplexity measurements was 28 Kwds (WS97 DevTest [4]). We used a closed vocabulary (test set words are included in the vocabulary) of size 22 Kwds. Similar to the experiments reported in [2], we built a deleted interpolation 3-gram model which was used as a baseline; we also linearly interpolated the SLM with the 3-gram baseline, showing a modest reduction in perplexity:

P(w_i | W_{i-1}) = \lambda \cdot P(w_i | w_{i-1}, w_{i-2}) + (1 - \lambda) \cdot P_SLM(w_i | W_{i-1})

The results are presented in Table 2.

  Language Model              L2R Perplexity
                              DEV set        TEST set
  lambda                      0.0    1.0     0.0    0.4    1.0
  3-gram + Initial SLM        23.9   22.5    72.1   65.8   68.6
  3-gram + Reestimated SLM    22.7   22.5    71.0   65.4   68.6

  Table 2. Perplexity results

4.3. Lattice Decoding Results

We proceeded to evaluate the WER performance of the SLM using the A* lattice decoder described previously. Before describing the experiments we need to make one point clear: there are two 3-gram language model scores associated with each link in the lattice:
• the language model score assigned by the model that generated the lattice, referred to as the LAT3-gram; this model operates on text in the SWB-CSR tokenization;
• the language model score obtained by rescoring each link in the lattice with the deleted interpolation 3-gram built on the data in the SWB-CSR-Treebank tokenization, referred to simply as the 3-gram; this is the model used in the experiments reported in the previous section.

The perplexity results show that interpolation with the 3-gram model is beneficial for our model. Note that the interpolation

P(l) = \lambda \cdot P_{LAT3-gram}(l) + (1 - \lambda) \cdot P_{SLM}(l)

between the LAT3-gram model and the SLM is illegitimate due to the tokenization mismatch.

As explained previously, because the SLM's memory extends over the entire prefix, we need to apply the A* algorithm to find the overall best path in the lattice. The parameters controlling the A* search were set to: log P_COMP = 0.5, log P_FINAL = 2, LMweight = 12, log P_IP = 10, stack-depth-threshold = 30, stack-logP-threshold = 100; see (8) and (9). The parameters controlling the SLM were the same as in [2]. The results for different interpolation coefficient values are shown in Table 3.

  Language Model       WER (%)
  Search               A*                  Viterbi
  lambda               0.0    0.4    1.0   1.0
  LAT-3gram + SLM      42.4   40.3   41.6  41.3

  Table 3. Lattice decoding results

The structured language model achieved an absolute improvement of 1% WER over the baseline; the improvement is statistically significant at the 0.002 level according to a sign test. In the 3-gram case (lambda = 1.0), the A* search loses 0.3% relative to the Viterbi search due to the finite stack and the heuristic look-ahead.
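For completeness, the linear interpolation used in Sections 4.2 and 4.3 can be stated directly in code; the snippet only restates the formula, with hypothetical probability inputs.

```python
def interpolate(lmbda, p_trigram, p_slm):
    """P(w_i | W_{i-1}) = lambda * P(w_i | w_{i-1}, w_{i-2})
       + (1 - lambda) * P_SLM(w_i | W_{i-1}), per Section 4.2."""
    return lmbda * p_trigram + (1.0 - lmbda) * p_slm

# In Tables 2 and 3, lambda = 1.0 is the pure 3-gram and lambda = 0.0 the
# pure SLM; lambda = 0.4 gave the best test-set perplexity and the best WER.
```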
4.4. Search Evaluation Results

For tuning the search parameters we applied the N-best lattice sampling technique described in Section 3.3. As a by-product, the WER performance of the structured language model on N-best list rescoring (N = 25) was 40.9%. The average rank of the hypothesis found by the A* search among the N-best hypotheses, after rescoring them with the structured language model interpolated with the trigram, was 1.07 (the minimum achievable value is 0). There were 585 offending sentences, out of a total of 2427 test sentences, in which the A* search led to a hypothesis whose score was lower than that of the top hypothesis among the N-best (the 1-best). In 310 of these cases a prefix of the rescored 1-best was still in the stack when A* returned (inadequate compensation), and in the other 275 cases the 1-best hypothesis was lost during the search due to the finite stack size.

One interesting experimental observation was that even though in the 585 offending cases the score of the 1-best was higher than that of the hypothesis found by A*, the WER of those hypotheses, taken as a set, was higher than that of the set of A* hypotheses.

5. CONCLUSIONS

We believe we have presented an original approach to language modeling that takes into account the hierarchical structure of natural language. Our experiments showed an improvement in both perplexity and word error rate over current language modeling techniques, demonstrating the usefulness of syntactic structure for improved language models. Similar experiments on the Wall Street Journal corpus are reported in [3], showing that the improvement holds even when the WER is much lower.

6. ACKNOWLEDGMENTS

The authors would like to thank Sanjeev Khudanpur for his insightful suggestions. Thanks also to Bill Byrne for making the SWB lattices available, to Vaibhava Goel for making the N-best decoder available, and to Vaibhava Goel, Harriet Nock and Murat Saraclar for useful discussions about lattice rescoring.

REFERENCES

[1] C. Chelba, D. Engle, F. Jelinek, V. Jimenez, S. Khudanpur, L. Mangu, H. Printz, E. S. Ristad, R. Rosenfeld, A. Stolcke, and D. Wu. Structure and performance of a dependency language model. In Proceedings of Eurospeech, volume 5, pages 2775-2778, Rhodes, Greece, 1997.
[2] Ciprian Chelba and Frederick Jelinek. Exploiting syntactic structure for language modeling. In Proceedings of COLING-ACL, volume 1, pages 225-231, Montreal, Canada, 1998.
[3] Ciprian Chelba and Frederick Jelinek. Structured language modeling for speech recognition. In Proceedings of NLDB99 (to appear), Klagenfurt, Austria, 1999.
[4] CLSP. WS97. In Proceedings of the 1997 CLSP/JHU Workshop on Innovative Techniques for Large Vocabulary Continuous Speech Recognition, Baltimore, July-August 1997.
[5] A. P. Dempster, N. M. Laird, and D. B. Rubin. Maximum likelihood from incomplete data via the EM algorithm. Journal of the Royal Statistical Society, Series B, 39:1-38, 1977.
[6] J. J. Godfrey, E. C. Holliman, and J. McDaniel. SWITCHBOARD: telephone speech corpus for research and development. In Proceedings of the IEEE Conference on Acoustics, Speech and Signal Processing, volume 1, pages 517-520, San Francisco, March 1992.
[7] Frederick Jelinek and Robert Mercer. Interpolated estimation of Markov source parameters from sparse data. In E. Gelsema and L. Kanal, editors, Pattern Recognition in Practice, pages 381-397, 1980.
[8] N. Nilsson. Problem Solving Methods in Artificial Intelligence, pages 266-278. McGraw-Hill, New York, 1971.
[9] A. J. Viterbi. Error bounds for convolutional codes and an asymptotically optimum decoding algorithm. IEEE Transactions on Information Theory, IT-13:260-267, 1967.
