Corpus Linguistics

Corpus linguistics is the study of language data on a large scale – the computer-aided analysis of very extensive collections of transcribed utterances or written texts. This textbook outlines the basic methods of corpus linguistics, explains how the discipline of corpus linguistics developed, and surveys the major approaches to the use of corpus data. It uses a broad range of examples to show how corpus data has led to methodological and theoretical innovation in linguistics in general. Clear and detailed explanations lay out the key issues of method and theory in contemporary corpus linguistics. A structured and coherent narrative links the historical development of the field to current topics in 'mainstream' linguistics. Practical activities and questions for discussion at the end of each chapter encourage students to test their understanding of what they have read, and an extensive glossary provides easy access to definitions of all technical terms used in the text.

Tony McEnery is Professor of English Language and Linguistics at Lancaster University.

Andrew Hardie is Lecturer in Corpus Linguistics at Lancaster University.

CAMBRIDGE TEXTBOOKS IN LINGUISTICS

General editors: P. Austin, J. Bresnan, B. Comrie, S. Crain, W. Dressler, C. Ewen, R. Lass, D. Lightfoot, K. Rice, I. Roberts, S. Romaine, N. V. Smith

Corpus Linguistics: Method, Theory and Practice

In this series:

S. C. LEVINSON Pragmatics
G. BROWN and G. YULE Discourse Analysis
R. HUDDLESTON Introduction to the Grammar of English
R. LASS Phonology
B. COMRIE Tense
W. KLEIN Second Language Acquisition
A. J. WOODS, P. FLETCHER and A. HUGHES Statistics in Language Studies
D. A. CRUSE Lexical Semantics
A. RADFORD Transformational Grammar
M. GARMAN Psycholinguistics
G. G. CORBETT Gender
H. J. GIEGERICH English Phonology
R. CANN Formal Semantics
J. LAVER Principles of Phonetics
F. R. PALMER Grammatical Roles and Relations
M. A. JONES Foundations of French Syntax
A. RADFORD Syntactic Theory and the Structure of English: A Minimalist Approach
R. D. VAN VALIN, JR, and R. J. LAPOLLA Syntax: Structure, Meaning and Function
A. DURANTI Linguistic Anthropology
A. CRUTTENDEN Intonation Second edition
J. K. CHAMBERS and P. TRUDGILL Dialectology Second edition
C. LYONS Definiteness
R. KAGER Optimality Theory
J. A. HOLM An Introduction to Pidgins and Creoles
G. G. CORBETT Number
C. J. EWEN and H. VAN DER HULST The Phonological Structure of Words
F. R. PALMER Mood and Modality Second edition
B. J. BLAKE Case Second edition
E. GUSSMAN Phonology: Analysis and Theory
M. YIP Tone
W. CROFT Typology and Universals Second edition
F. COULMAS Writing Systems: An Introduction to their Linguistic Analysis
P. J. HOPPER and E. C. TRAUGOTT Grammaticalization Second edition
L. WHITE Second Language Acquisition and Universal Grammar
I. PLAG Word-Formation in English
W. CROFT and A. CRUSE Cognitive Linguistics
A. SIEWIERSKA Person
A. RADFORD Minimalist Syntax: Exploring the Structure of English
D. BÜRING Binding Theory
M. BUTT Theories of Case
N. HORNSTEIN, J. NUÑES and K. GROHMANN Understanding Minimalism
B. C. LUST Child Language: Acquisition and Growth
G. G. CORBETT Agreement
J. C. L. INGRAM Neurolinguistics: An Introduction to Spoken Language Processing and its Disorders
J. CLACKSON Indo-European Linguistics: An Introduction
M. ARIEL Pragmatics and Grammar
R. CANN, R. KEMPSON and E. GREGOROMICHELAKI Semantics: An Introduction to Meaning in Language
Y. MATRAS Language Contact
D. BIBER and S. CONRAD Register, Genre and Style
L. JEFFRIES and D. MCINTYRE Stylistics
R. HUDSON An Introduction to Word Grammar
M. L. MURPHY Lexical Meaning
J. M. MEISEL First and Second Language Acquisition
T. McENERY and A. HARDIE Corpus Linguistics: Method, Theory and Practice

Corpus Linguistics
Method, Theory and Practice

TONY MCENERY AND ANDREW HARDIE
Lancaster University

CAMBRIDGE UNIVERSITY PRESS
Cambridge, New York, Melbourne, Madrid, Cape Town, Singapore, São Paulo, Delhi, Tokyo, Mexico City

Cambridge University Press
The Edinburgh Building, Cambridge CB2 8RU, UK

Published in the United States of America by Cambridge University Press, New York

www.cambridge.org
Information on this title: www.cambridge.org/9780521547369

© Tony McEnery and Andrew Hardie 2012

This publication is in copyright. Subject to statutory exception and to the provisions of relevant collective licensing agreements, no reproduction of any part may take place without the written permission of Cambridge University Press.

First published 2012

Printed in the United Kingdom at the University Press, Cambridge

A catalogue record for this publication is available from the British Library

Library of Congress Cataloguing in Publication data
McEnery, Tony, 1964–
Corpus linguistics : method, theory and practice / Tony McEnery, Andrew Hardie.
p. cm. – (Cambridge textbooks in linguistics)
Includes index.
ISBN 978-0-521-83851-1 (hardback)
1. Corpora (Linguistics) I. Hardie, Andrew. II. Title.
P128.C68M38 2011
410.1'88–dc23 2011026519

ISBN 978-0-521-83851-1 Hardback
ISBN 978-0-521-54736-9 Paperback

Cambridge University Press has no responsibility for the persistence or accuracy of URLs for external or third-party internet websites referred to in this publication, and does not guarantee that any content on such websites is, or will remain, accurate or appropriate.

Contents

List of figures
List of tables
Acknowledgements
Preface

1 What is corpus linguistics?
  1.1 Introduction
  1.2 Mode of communication
  1.3 Corpus-based versus corpus-driven linguistics
  1.4 Data collection regimes
  1.5 Annotated versus unannotated corpora
  1.6 Total accountability versus data selection
  1.7 Monolingual versus multilingual corpora
  1.8 Summary
  Further reading
  Practical activities
  Questions for discussion

2 Accessing and analysing corpus data
  2.1 Introduction
  2.2 Are corpora the answer to all research questions in linguistics?
  2.3 Corpus annotation
  2.4 Introducing concordances
  2.5 A historical overview of corpus analysis tools
  2.6 Statistics in corpus linguistics
  2.7 Summary
  Further reading
  Practical activities
  Questions for discussion

3 The web, laws and ethics
  3.1 Introduction
  3.2 The web and legal issues
  3.3 Ethical issues
  3.4 Summary
  Further reading
  Practical activity
  Questions for discussion

4 English Corpus Linguistics
  4.1 Introduction
  4.2 University College London (UCL)
  4.3 Lancaster University
  4.4 University of Birmingham
  4.5 Université Catholique de Louvain
  4.6 University of Nottingham
  4.7 Northern Arizona University and the USA
  4.8 Summary
  Further reading
  Practical activities
  Questions for discussion

5 Corpus-based studies of synchronic and diachronic variation
  5.1 Introduction
  5.2 Diachronic change from Old English to Modern English
  5.3 Diachronic variation in contemporary Modern English
  5.4 The multi-dimensional approach to variation
  5.5 Corpora and variationist sociolinguistics
  5.6 Summary
  Further reading
  Practical activities
  Questions for discussion

6 Neo-Firthian corpus linguistics
  6.1 Introduction
  6.2 Collocation
  6.3 Discourse
  6.4 Semantic prosody and semantic preference
  6.5 Lexis and grammar
  6.6 Corpus-as-theory versus corpus-as-method
  6.7 Summary: Sinclair's contribution to corpus linguistics
  Further reading
  Practical activities
  Questions for discussion

7 Corpus methods and functionalist linguistics
  7.1 Introduction
  7.2 Functionalism in linguistics: a brief overview
  7.3 Corpus-based research from a functionalist perspective
  7.4 Corpora and typology
  7.5 Corpora and cognitive approaches to linguistics
  7.6 Corpora in the analysis of metaphor
  7.7 Summary
  Further reading
  Practical activities
  Questions for discussion