Who’s Afraid of George Kingsley Zipf?

Charles Yang∗
Department of Linguistics & Computer Science
University of Pennsylvania
[email protected]
June 2009

∗ For helpful comments, I would like to thank Virginia Valian, Julie Anne Legate, Sam Gutmann, Bob Frank, Mark Liberman and the audience at the 2009 Schultink Lecture, the University of Groningen, where these materials were first presented. Special thanks to Erwin Chan and Costas Lignos for their help with Table 3 and Figure 4.

Abstract

We explore the statistical distributions of natural language known as Zipf’s law and develop a new approach to assess the properties of the underlying grammar given a sample of linguistic production. Focusing on language acquisition, we show that the item- or usage-based approach to language and language learning fails to provide adequate statistical tests of linguistic productivity, and that even very young children’s grammar is abstract, systematic, and fully generative.

1 Introduction

Einstein was a very late talker. As legend has it, the first thing the young Einstein ever uttered, at the age of three, was “The soup is too hot”. Apparently the boy genius had nothing interesting to say before that.

The credibility of such tales aside (similar stories with other famous subjects abound), they do contain a kernel of truth: a child doesn’t have to say something, anything, just because he can. And this poses a challenge for the study of child language, when children’s linguistic production is often the only, and certainly the most accessible, data on hand. Language use is the composite of linguistic, cognitive and perceptual factors, many of which, in the child’s case, are still in the process of development and maturation; and it is difficult to draw inferences about the learner’s linguistic knowledge from his linguistic behavior. This much has been well appreciated ever since Chomsky [1] drew the competence/performance distinction. The pioneering work on child language that soon followed, most notably Roger Brown’s landmark study [2], recognized such potential gaps between what the child knows and what the child says and proposed to interpret child language in terms of adult-like grammatical devices. (See, in particular, Brown’s critique of the Pivot Grammar hypothesis [3], which bears more than a passing resemblance to some contemporary theorizing of child language reviewed here.)

This tradition is now challenged by the item- or usage-based approach to language [4–7]. This change reflects a current trend in the study of language, as can be seen in these pages [8–12], which emphasizes the storage of specific linguistic forms and constructions at the expense of general combinatorial linguistic principles and overarching points of language variation [1, 13]. Child language, especially in the early stages, is claimed to consist of specific item-based schemas, rather than a productive linguistic system as previously conceived. Consider, for instance, three case studies in Box 1 [5], which have been cited as evidence for the item-based view in numerous places.

Box 1: Production Evidence for Item Based Approach to Language Learning

• The Verb Island Hypothesis [4]. In a longitudinal study of early child language, it is noted that “of the 162 verbs and predicate terms used, almost half were used in one and only one construction type, and over two-thirds were used in either one or two construction types ...”. Hence, “the 2-year-old child’s syntactic competence is comprised totally of verb-specific constructions with open nominal slots”, rather than abstract and productive syntactic rules.

• Limited morphological inflection. A study of child Italian [14] finds that 47% of all verbs used by 3 children (1;6 to 3;0) were used in 1 person-number agreement form, and an additional 40% were used with 2 or 3 forms, where six forms are possible (3 person × 2 number). Only 13% of all verbs appeared in 4 or more forms.

• Unbalanced determiner usage. [15] notes that when children began to use the determiners a and the with nouns, “there was almost no overlap in the sets of nouns used with the two determiners, suggesting that the children at this age did not have any kind of abstract category of Determiners that included both of these lexical items”. This observation is held to contradict the earliest study [16], which maintains that child determiner use is adult-like by the age of 2;0.
So far as we can tell, however, this purported evidence for item-based learning is given entirely on the basis of informal inspection rather than rigorous empirical tests. For all the examples and observations from child language, not a single statistical test can be found in Tomasello’s work [4], where the Verb Island Hypothesis and related ideas about item-based learning are first put forward. In general, theoretical conclusions are only convincing when the null or alternative hypothesis can be statistically rejected; that is, the observations in Box 1 must be shown to be inconsistent with the expectation from a fully productive grammar. In this note, we provide a statistical analysis of what such an alternative hypothesis would be. We demonstrate that children’s language use shows exactly the opposite of the item-based view, and that the hypothesis of early productivity is in fact supported. More broadly, we direct cognitive scientists to certain statistical properties of natural language that are widely known but not widely appreciated, and discuss the challenges these properties pose for the theory of language and language learning. Our point of departure is a name that ought to strike fear in every living soul: George Kingsley Zipf.

2 Zipfian Presence

Under the so-called Zipf’s law [17], the distributions of words follow a curious pattern: relatively few words are used frequently, indeed very frequently, while most words are rare, with many occurring only once in even large samples of texts. More precisely, the frequency of a word tends to be approximately inversely proportional to its rank in frequency. Let f be the frequency of the word w with rank r in a set of N words; then:

\[ f = \frac{C}{r}, \quad \text{where } C \text{ is some constant} \]

In the Brown corpus [18], for instance, the word with rank 1 is “the”, which has a frequency of about 70,000, and the word with rank 2 is “of”, with a frequency of about 36,000: almost exactly as Zipf’s law entails. The Zipfian characterization of word frequency can be visualized by plotting the log of word frequency against the log of word rank. By taking the log on both sides of the equation above (log f = constant − log r), a perfect Zipfian fit would be a straight line with the slope -1.
Indeed, Zipf’s law has been observed in vocabulary studies across languages and genres, and the log-log slope fit is consistently in the close neighborhood of -1.0 [19]. The top line in Figure 1 plots word rank and frequency on a log-log scale based on the Brown corpus: the Zipfian fit is excellent.

[Figure 1: log(freq) plotted against log(rank). Top line: words. Bottom line: pseudowords.]

Figure 1. Zipfian distribution of words and pseudowords in the Brown corpus [18]. The lower line is plotted by taking “words” to be any sequence of letters between e’s [23]. The two straight dotted lines are linear functions with the slope -1, which illustrate the goodness of the Zipfian fit.

There has been a good deal of controversy over the interpretation of Zipf’s law, which shows up not only in the context of words but also in other physical and social systems. It is now clear that the observation of a Zipfian distribution alone is of no inherent interest or significance, as certain random generating processes can produce outcomes that follow Zipf’s law [20–22]. As noted in [23], if we redefine “words” as the letter sequences between any two occurrences of some letter, say, “e”, rather than between spaces as in the case of written text, the resulting distribution may fit Zipf’s law even better. This is illustrated by the lower line in Figure 1, which follows the Zipfian straight line at least as well as real words.

It is often the case that we are not particularly concerned with the actual frequencies of words but with their probability of occurrence. Zipf’s law allows us to estimate the probability p_r of the word n_r, whose rank is r among N words in a linguistic sample:

\[ p_r = \frac{C/r}{\sum_{i=1}^{N} C/i} = \frac{1}{r H_N}, \quad \text{where } H_N = \sum_{i=1}^{N} \frac{1}{i} \text{ is the } N\text{th Harmonic Number} \tag{1} \]

Zipf’s law as applied to the distribution of words has been well known and studied. Yet relatively little attention has been given to the combinatorics of words under a grammar and, more importantly, to how one might draw inferences about the grammar given the distribution of word combinatorics. We turn to these questions immediately.
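As an illustration of equation (1) and of the log-log slope test described in this section, the following is a minimal Python sketch. The function names are hypothetical and chosen only for this example, and the slope is estimated by ordinary least squares, which is an assumption about the fitting method rather than something the text specifies.

```python
import math


def harmonic_number(n):
    """Return the n-th Harmonic Number H_n = 1 + 1/2 + ... + 1/n."""
    return sum(1.0 / i for i in range(1, n + 1))


def zipf_probabilities(n):
    """Return p_r = 1 / (r * H_n) for ranks r = 1..n, as in equation (1)."""
    h_n = harmonic_number(n)
    return [1.0 / (rank * h_n) for rank in range(1, n + 1)]


def log_log_slope(frequencies):
    """Least-squares slope of log(frequency) against log(rank).

    Frequencies must be positive; they are ranked from most to least frequent.
    """
    ranked = sorted(frequencies, reverse=True)
    xs = [math.log(rank) for rank in range(1, len(ranked) + 1)]
    ys = [math.log(freq) for freq in ranked]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    return covariance / variance


# A perfectly Zipfian vocabulary of 1000 word types.
probabilities = zipf_probabilities(1000)
slope = log_log_slope(probabilities)
```

For an exactly Zipfian input the probabilities sum to one and the fitted slope is -1 up to floating-point error; real word counts would be passed to `log_log_slope` in the same way.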
3 The Unbearable Lightness of Productivity

Claims of item-based learning are established on the assumption that linguistic productivity entails usage diversity in linguistic production. Take the notion of “overlap” in the case of determiner use in early child language (Box 1), which follows the logic of the Verb Island hypothesis [4]. If the child has fully productive use of the syntactic category determiner, then one might expect her to use determiners with any noun for which they are appropriate. Since the determiners “the” and “a” have (virtually) identical syntactic distributions, a linguistically productive child that uses “a” with a noun is expected to automatically transfer the use of that noun to “the”. Thus, determiner-noun overlap is defined as the percentage of nouns that appear with both determiners out of those that appear with either. The low overlaps in children’s determiner use [15] are taken to support the item-based view of child language. However, using a similar metric, Valian and colleagues [24] find that the overlap measures for young children and their mothers are not significantly different, and they are both very low. Indeed, when the metric is applied to the Brown corpus (see Box 3 for methods), we obtain an overlap of 25.2%, which is actually lower than those of some children reported in [15]. It would follow that the language of the Brown corpus, which draws from various genres of professional print materials, is less productive and more item-based than that of a toddler, which seems absurd.

The reason for these seemingly paradoxical findings lies in the Zipfian distribution of syntactic categories and the generative capacity of natural language grammar. Consider a fully productive rule “DP → D N”, where “D → a | the” and “N → cat | book | desk | ...”. We use this rule for its simplicity and for the readily available data for empirical tests, but one can easily substitute for it rules such as “VP → V DP”, “VP → V in Construction_x”, or “V_inflection → V_stem + Person + Number + Tense”. All such cases can be analyzed with the methods provided here.

Suppose a linguistic sample contains S determiner-noun pairs, which consist of D and N distinct determiners and nouns.
(In the present case D = 2, for “a” and “the”.) The full productivity of the DP rule, by definition, means that the two categories combine independently. Two observations can be made about the distributions of the two categories and their combinations.

First, nouns (and open class words in general) will follow Zipf’s law: nouns in the Brown corpus, for instance, show a log-log slope of -0.97 (see Box 3 for methods). This means that in any given sample, relatively few nouns occur often but most can be expected to occur only once or less than once, and these necessarily have zero overlap with determiners.

Second, while the combinations of D and N are syntactically interchangeable, N’s tend to favor one of the two determiners, a consequence of linguistic pragmatics and conventions. For instance, we say “the bathroom” more often than “a bathroom” but “a bath” more often than “the bath”, even though all four DPs are perfectly grammatical. As noted earlier, about 75% of nouns in the Brown corpus occur with either “the” or “a” but not both. Even the remaining 25% which do occur with both show strong biases: only a further 25% (297) are used with “a” and “the” equally frequently. Overall, for nouns that appear with both determiners at least once, the frequency ratio of the more favored over the less favored determiner is 2.86:1. These general patterns hold for child and adult speech data as well. In the six child-adult pairs (thus 12 samples) we examined from the CHILDES database (Box 3), the average percentage of balanced nouns among those that appear with both “the” and “a” is 22.8%, and the more favored vs. less favored determiner has an average frequency ratio of 2.54:1. Even though these ratios deviate from the perfect 2:1 ratio under the strict version of Zipf’s law (the more favored is even more dominant over the less), they clearly point to the considerable asymmetry in category combination usage. As a result, even when a noun appears several times in a sample, there is still a significant chance that it has been paired with a single determiner in all instances.
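The overlap measure defined in this section (nouns appearing with both determiners, out of nouns appearing with either) can be sketched in a few lines of Python. This is only an illustration: the function name and the toy sample are hypothetical, and the input is assumed to be a list of (determiner, noun) pairs that have already been extracted.

```python
from collections import defaultdict


def determiner_noun_overlap(pairs):
    """Fraction of nouns seen with both "a" and "the" among nouns seen with either."""
    determiners_by_noun = defaultdict(set)
    for determiner, noun in pairs:
        determiners_by_noun[noun].add(determiner)
    if not determiners_by_noun:
        return 0.0
    with_both = sum(
        1 for seen in determiners_by_noun.values() if {"a", "the"} <= seen
    )
    return with_both / len(determiners_by_noun)


# Toy sample: only "bath" occurs with both determiners, so overlap is 1 of 3 nouns.
sample = [
    ("the", "bathroom"),
    ("a", "bath"),
    ("the", "bath"),
    ("a", "cookie"),
    ("the", "bathroom"),
]
overlap = determiner_noun_overlap(sample)
```

Note that a noun used many times with a single determiner, such as “bathroom” here, still contributes zero to the overlap, which is the asymmetry the text describes.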
Together, these consequences of Zipf’s law ensure that the average determiner-noun overlap must be relatively low unless the sample size S is very large. Box 2 gives the theoretical analysis of overlap informally sketched above, and Figure 2 gives an illustration.

Box 2. Calculating Expected Overlap in Determiner Noun Usage

Let O(N, S) be the overlap value of N nouns in a sample of S determiner-noun pairs. Consider a noun n_r which has the rank r out of N. Following equation (1), it has a probability of p_r = 1/(r H_N) of being drawn at any single trial in S; its expected number of occurrences in S is thus simply S p_r. Its expected probability of being used with more than one determiner is 1 minus its expected probability of being used with exactly one determiner in all of the S p_r trials. Obviously, if a noun is expected to be sampled once or less, it will have an overlap of zero. Let the expected overlap of n_r be O(r, N, S).

\[ O(N,S) = \frac{1}{N} \sum_{r=1}^{N} O(r,N,S) \tag{2} \]

\[ O(r,N,S) = \begin{cases} 1 - \sum_{i=1}^{D} d_i^{\,S p_r} & \text{if } S p_r > 1, \text{ where } d_i = \frac{1}{i H_D},\ p_r = \frac{1}{r H_N} \\ 0 & \text{otherwise} \end{cases} \tag{3} \]

The probability of determiners d_i (i = 1, 2) in (3) also follows Zipf’s law (1), i.e., d_1 = 2/3 and d_2 = 1/3.(a)

(a) Although the empirical frequencies of determiners deviate somewhat from the strict Zipfian ratio of 2:1, numerical results show that the 2:1 ratio is a very accurate surrogate for a wide range of actual ratios in the calculation of (2) and (3). This is because most of the average overlap value comes from the relatively few high-frequency nouns; see Figure 2.

[Figure 2: expected overlap (vertical axis, 0 to 1) plotted against noun rank (horizontal axis, 0 to 50).]

Figure 2. Expected overlap values for nouns ordered by rank, for N = 50 nouns in a sample size of S = 100. Word frequencies are assumed to follow the Zipfian distribution. As can be seen, few nouns have high probabilities of occurring with both determiners, but most are (far) below chance. The average overlap is 20.6%.

We now study determiner-noun overlap in child language and compare the results with the theoretical expectations. The empirical methods for data analysis are given in Box 3, and the results are summarized in Table 1.
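The calculation in Box 2 can be sketched directly in Python. This is a minimal illustration under the reading of equation (3) in which the expected count S p_r appears as the exponent of each determiner probability; the function names are hypothetical. Under that reading the Figure 2 setting (N = 50, S = 100) should come out close to the 20.6% average reported in the caption.

```python
def harmonic_number(n):
    """Return the n-th Harmonic Number H_n."""
    return sum(1.0 / i for i in range(1, n + 1))


def expected_overlap(n_nouns, sample_size, n_determiners=2):
    """Average expected overlap O(N, S) following equations (2) and (3).

    Noun and determiner probabilities are both assumed to be strictly Zipfian.
    """
    h_n = harmonic_number(n_nouns)
    h_d = harmonic_number(n_determiners)
    determiner_probs = [1.0 / (i * h_d) for i in range(1, n_determiners + 1)]

    total = 0.0
    for rank in range(1, n_nouns + 1):
        expected_count = sample_size / (rank * h_n)  # S * p_r
        if expected_count > 1:
            # 1 minus the chance that all occurrences share one determiner.
            total += 1.0 - sum(d ** expected_count for d in determiner_probs)
        # Nouns expected at most once contribute zero overlap.
    return total / n_nouns


# The setting illustrated in Figure 2: N = 50 nouns, S = 100 pairs.
figure2_overlap = expected_overlap(50, 100)
```

Only the sample size S and the number of distinct nouns N are needed as inputs, which is what makes the comparison with empirical overlap values possible.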
Box 3. Empirical Studies of Overlap in Language Production

a. We consider the data from Adam, Eve, Sarah, Naomi, Nina, and Peter [25]. These are all and only the children in the CHILDES database with substantial longitudinal data that starts at the very beginning of syntactic development (i.e., the one or two word stage), so that the item-based stage, if it exists, could be observed.

b. We first removed the extraneous annotations from the child text and then applied an open source implementation of a rule-based part-of-speech tagger [26] (available at http://gposttl.sourceforge.net/): words are now associated with their part-of-speech (e.g., preposition, singular noun, past tense verb, etc.). For languages such as English, which has relatively salient cues for part-of-speech (e.g., rigid word order, low degree of morphological syncretism), such taggers can achieve high accuracy at over 97%, which is sufficient for our purposes. For comparison, we also consider the Brown corpus [18], which has been manually tagged.

c. With POS tagged datasets, we extracted adjacent determiner-noun pairs such that D is either “a” or “the”, and N has been tagged as a singular noun. Words that are marked as unknown are discarded. As is standard in child language research, repetitions count only once toward the tally. For instance, when the child says “I made a queen. I made a queen. I made a queen”, “a queen” is counted once for the sample.

d. For an additional test, we have pooled together the first 100, 300, and 500 determiner-noun tokens of the six children and created three hypothetical children from the very earliest age of language acquisition, which would presumably reflect the least productive knowledge of determiner usage.

e. For each learner, the theoretical expectation of overlap is calculated based on the equations in Box 2, that is, only with the sample size S and the number of distinct nouns N in determiner-noun pairs.
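Step (c) of Box 3 can be illustrated with a short Python sketch. Several details here are assumptions made only for the example and are not stated in the text: utterances are represented as lists of (word, tag) tuples, singular nouns carry the Penn Treebank style tag "NN", and "repetitions" are treated as immediately repeated identical utterances. The function name is hypothetical and this is not the tagger's own interface.

```python
def extract_determiner_noun_pairs(tagged_utterances, noun_tag="NN"):
    """Collect adjacent ("a"/"the", singular noun) pairs from tagged utterances.

    Assumption: an utterance identical to the one just before it is a
    repetition and is skipped, so it counts only once toward the tally.
    Words with any other tag (including unknown words) are never collected.
    """
    pairs = []
    previous = None
    for utterance in tagged_utterances:
        if utterance == previous:
            continue  # immediate repetition, already counted
        previous = utterance
        for (word, _tag), (next_word, next_tag) in zip(utterance, utterance[1:]):
            if word.lower() in ("a", "the") and next_tag == noun_tag:
                pairs.append((word.lower(), next_word.lower()))
    return pairs


# The example from Box 3: three repetitions of the same utterance.
queen = [("I", "PRP"), ("made", "VBD"), ("a", "DT"), ("queen", "NN")]
plural = [("the", "DT"), ("dogs", "NNS")]  # plural noun, not collected
pairs = extract_determiner_noun_pairs([queen, queen, queen, plural])
```

With this input only a single ("a", "queen") pair is collected, matching the counting convention described for the repeated “I made a queen” example.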
Child               Sample     a & the      a or the          Overlap      Overlap       S/N
                    size (S)   noun types   noun types (N)    (expected)   (empirical)
Naomi (1;1-5;1)        884         60           349             19.0         19.8        2.53
Eve (1;6-2;3)          831         61           283             22.7         21.6        2.94
Sarah (2;3-5;1)       2453        187           640             26.4         29.2        3.83
Adam (2;3-4;10)       3729        252           780             32.0         32.3        4.78
Peter (1;4-2;10)      2873        194           480             43.0         40.4        5.99
Nina (1;11-3;11)      4542        308           660             47.2         46.7        6.88
First 100              600         53           243             19.6         21.8        2.47
First 300             1800        141           483             26.7         29.1        3.73
First 500             3000        219           640             32.3         34.2        4.68
Brown corpus         20650       1175          4664             23.8         25.2        4.43

Table 1. Empirical and expected determiner-noun overlaps in child speech. The Brown corpus is included for comparison. Results include the data from six individual children and the first 100, 300, 500 determiner-noun pairs from all children pooled together, which reflect the earliest stages of language acquisition. The expected values in column 5 are calculated using only the sample size S and the number of nouns N (columns 2 and 4 respectively), following the analytic results in Box 2.

The theoretical expectations and the empirical measures of overlap agree extremely well (columns 5 and 6 in Table 1). Neither a paired t-test nor a paired Wilcoxon test shows a significant difference between the two sets of values. Perhaps a more revealing test is linear regression: a perfect agreement between the two sets of values would have a slope of 1.0, and the actual slope is 1.08 (adjusted R² = 0.9705). In other words, the determiner usage data from child language is consistent with the productive rule “DP → D N”.

The empirical studies also reveal considerable individual variation in the overlap values, and it is instructive to understand why. As the Brown corpus results show (Table 1, last row), sample size S, the number of nouns N, or the language user’s age alone is not predictive of the overlap value. The variation can be formally analyzed. Given N nouns in a sample of S, a greater overlap value will be obtained when more nouns are expected to occur more than once, that is, when S p_r > 1. In other words, only words whose occurrence probabilities are greater than 1/S can contribute to the overlap value; Zipf’s law allows us to express this probability cutoff line in terms of ranks, following equation (1).
The approximation below uses a well-known result from Euler’s summation formula.

\[ S \cdot \frac{1}{r H_N} = 1 \]

\[ r = \frac{S}{H_N} \approx \frac{S}{\ln N} \tag{4} \]

That is, only nouns whose ranks are lower than S/(ln N) can be expected to have non-zero overlap. The total overlap is thus a monotonically increasing function of S/(N ln N) which, given the slow growth of ln N, is approximately S/N, a term that must be positively correlated with overlap measures. This is confirmed in the strongest terms: S/N is a near perfect predictor for the empirical values of overlap (last two columns of Table 1): r = 0.986, p < 0.00001.

We now briefly explore the question of whether the determiner usage data from children can be accounted for by the item-based approach to language learning. Our effort is hampered by the lack of concrete models for the item-based learning approach, a point that Tomasello concedes [4, p274]. Analytical results like those in Box 2 cannot be similarly obtained. A plausible approach can be constructed based on a central tenet of item-based learning, that the child does not form grammatical generalizations but rather memorizes specific and itemized combinations. Similar approaches such as construction grammar [10], usage-based [27] and exemplar-based models [28] make a similar commitment to the role of verbatim memory. To this end, we consider a type of learning model that memorizes determiner-noun pairs in the input, and these pairs are then sampled jointly, following the commitment of item-based learning, rather than independently (which would be the productive rule-based view).
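The contrast between joint and independent sampling described above can be sketched as follows. This is a toy illustration, not the simulation reported in the paper: the store of memorized pairs, the function names, and the fixed random seed are all hypothetical, and the store is deliberately built so that no noun is memorized with both determiners.

```python
import random
from collections import Counter


def overlap_of(pairs):
    """Fraction of sampled nouns that occur with more than one determiner."""
    determiners_by_noun = {}
    for determiner, noun in pairs:
        determiners_by_noun.setdefault(noun, set()).add(determiner)
    if not determiners_by_noun:
        return 0.0
    both = sum(1 for seen in determiners_by_noun.values() if len(seen) > 1)
    return both / len(determiners_by_noun)


def sample_joint(store, size, rng):
    """Item-based view: draw whole memorized D-N pairs by their joint frequency."""
    items = list(store)
    weights = [store[item] for item in items]
    return rng.choices(items, weights=weights, k=size)


def sample_independent(store, size, rng):
    """Rule-based view: draw D and N separately from their marginal frequencies."""
    determiner_counts, noun_counts = Counter(), Counter()
    for (determiner, noun), count in store.items():
        determiner_counts[determiner] += count
        noun_counts[noun] += count
    determiners = rng.choices(
        list(determiner_counts), weights=list(determiner_counts.values()), k=size
    )
    nouns = rng.choices(
        list(noun_counts), weights=list(noun_counts.values()), k=size
    )
    return list(zip(determiners, nouns))


# Hypothetical memorized store: each noun was heard with one determiner only.
toy_store = Counter({
    ("the", "bathroom"): 9,
    ("a", "bath"): 6,
    ("the", "dog"): 4,
    ("a", "cookie"): 3,
})
rng = random.Random(0)
joint_overlap = overlap_of(sample_joint(toy_store, 200, rng))
independent_overlap = overlap_of(sample_independent(toy_store, 200, rng))
```

A learner that only replays memorized pairs can never produce a noun with a determiner it was not stored with, so its overlap on this store is zero, whereas independent combination of D and N produces overlap as soon as a noun is sampled more than once.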
Child        Sample     Overlap         Overlap           Overlap
             size (S)   (BIG learner)   (small learner)   (empirical)
Eve             831        16.0            17.8              21.6
Naomi           884        16.6            18.9              19.8
Sarah          2453        24.5            27.0              29.2
Peter          2873        25.6            28.8              40.4
Adam           3729        27.5            28.5              32.3
Nina           4542        28.6            41.1              46.7
First 100       600        13.7            17.2              21.8
First 300      1800        22.1            25.6              29.1
First 500      3000        25.9            30.2              34.2

Table 2. Is the full productivity data in child language consistent with item-based learning? Two variants of learners are considered. One type, the BIG learner, is designed to mimic the long term commitment to memory; it stores a large set of determiner-noun pairs, which consists of a sample of 1.1 million child directed utterances from the CHILDES database (methods as described in Box 3). The other variant, the small learner, only memorizes the adult utterances present in each child’s transcript. For both learners, we draw an independent and random sample from these stored D-N pairs with respect to their joint empirical frequencies; this is contrasted with the rule-based model in which D and N are drawn independently. The sample size matches those in each child’s production (Table 1, column 2). The overlap values are then calculated as the percentage of nouns that appear with both “a” and “the” over those that appear with either. The results are given in Table 2, averaged over 1000 trials per child.

Both sets of overlap values from item-based learning (columns 3 and 4) are significantly different from the empirical measures (column 5): p < 0.005 for both the paired t-test and the paired Wilcoxon test. This suggests that children’s use of determiners does not follow the predictions of the item-based learning approach. Naturally, our evaluation here is tentative, since the proper test can be carried out only when the theoretical predictions of item-based learning are made clear. And that is exactly the point: the advocates of item-based learning not only rejected the alternative hypothesis without adequate statistical tests, but also accepted the favored hypothesis without adequate statistical tests. Intuition is no substitute for theoretical analysis or statistical validation.
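The paired t-test mentioned above can be illustrated with the overlap values listed in Table 2. The sketch below computes only the paired t statistic by hand in plain Python; the function and variable names are hypothetical, and obtaining an exact p-value would require a t distribution (for example from a statistics library), which is left out here.

```python
import math


def paired_t_statistic(first, second):
    """t statistic for the mean of the pairwise differences first[i] - second[i]."""
    differences = [a - b for a, b in zip(first, second)]
    n = len(differences)
    mean = sum(differences) / n
    variance = sum((d - mean) ** 2 for d in differences) / (n - 1)
    return mean / math.sqrt(variance / n)


# Columns of Table 2, in row order (Eve, Naomi, Sarah, Peter, Adam, Nina,
# First 100, First 300, First 500).
big_learner = [16.0, 16.6, 24.5, 25.6, 27.5, 28.6, 13.7, 22.1, 25.9]
small_learner = [17.8, 18.9, 27.0, 28.8, 28.5, 41.1, 17.2, 25.6, 30.2]
empirical = [21.6, 19.8, 29.2, 40.4, 32.3, 46.7, 21.8, 29.1, 34.2]

t_big = paired_t_statistic(empirical, big_learner)
t_small = paired_t_statistic(empirical, small_learner)
```

With nine pairs there are 8 degrees of freedom, for which the two-sided critical value at the 0.005 level is about 3.83; both statistics exceed it, in line with the p < 0.005 reported in the text.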
4 An Itemized Look at Verbs

The formal analysis in section 3 can be generalized to the study of child verb syntax and morphology (Box 1). Unfortunately, the acquisition data in support of the Verb Island Hypothesis [4] and the item-based nature of early morphology [14] is not available in the public domain.

But there is no escape from the Zipfian grasp: the combinatorics of verbs and their morphological and syntactic associates are similarly lopsided in their usage distribution, as in the case of determiners. Consider first the kind of verbal syntax distributions attributed to the Verb Island Hypothesis. We focus on constructions that involve a transitive verb and its nominal objects, including pronouns and noun phrases. Following the definition of “sentence frame” in Tomasello’s original Verb Island study [4, p242], each unique lexical item in the object position counts as a unique construction for the verb. Figure 3 shows the construction frequencies of the top 15 transitive verbs in 1.1 million child directed utterances.

[Figure 3: log(freq) plotted against log(rank).]

Figure 3. Rank and frequency of verb-object constructions based on 1.1 million child-directed utterances. Processing methods are as described in Box 3, except that here we focus on adjacent verb-nominal pairs in part-of-speech tagged texts. The verbs are the top 15 most frequent transitive verbs: put, tell, see, want, let, give, take, show, got, ask, make, eat, like, bring and hear. For each verb, we counted its top 10 most frequent constructions, which are defined as the verb followed by a unique lexical item in the object position (e.g., “ask him” and “ask John” are different constructions). For each of the 10 ranks, we tallied the construction frequencies for all 15 verbs: the