
Open Access Meets Discoverability: Citations to Articles Posted to Academia.edu

23 Pages·2016·2.13 MB·English


RESEARCH ARTICLE

Open Access Meets Discoverability: Citations to Articles Posted to Academia.edu

Yuri Niyazov1, Carl Vogel2, Richard Price1*, Ben Lund1, David Judd1, Adnan Akil1, Michael Mortonson1, Josh Schwartzman1, Max Shron2

1 Academia.edu, San Francisco, California, United States of America, 2 Polynumeral, New York, New York, United States of America

* [email protected]

PLOS ONE | DOI:10.1371/journal.pone.0148257 | February 17, 2016

OPEN ACCESS

Citation: Niyazov Y, Vogel C, Price R, Lund B, Judd D, Akil A, et al. (2016) Open Access Meets Discoverability: Citations to Articles Posted to Academia.edu. PLoS ONE 11(2): e0148257. doi:10.1371/journal.pone.0148257

Editor: Pablo Dorta-González, Universidad de Las Palmas de Gran Canaria, SPAIN

Received: August 31, 2015
Accepted: January 15, 2016
Published: February 17, 2016

Copyright: © 2016 Niyazov et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

Data Availability Statement: Data from the "Open Access Meets Discoverability" paper are available at https://github.com/polynumeral/academia-citations.

Funding: Academia.edu paid its employees, contractors and an external consultancy (Polynumeral) to perform this study. Authors Yuri Niyazov, Richard Price, Ben Lund, David Judd, Adnan Akil, Michael Mortonson and Josh Schwartzman are employed by Academia.edu. Authors Carl Vogel and Max Shron are employed by Polynumeral. Academia.edu provided support in the form of salaries for authors YN, RP, BL, DJ, AA, MM and JS, but did not have any additional role in the study design, data collection and analysis, decision to publish, or preparation of the manuscript. The specific roles of these authors are articulated in the “author contributions” section. Polynumeral provided support in the form of salaries for authors CV and MS, but did not have any additional role in the study design, data collection and analysis, decision to publish, or preparation of the manuscript. The specific roles of these authors are articulated in the “author contributions” section.

Competing Interests: The authors of this manuscript have read the journal’s policy and have the following competing interests: Academia.edu paid its employees, contractors and an external consultancy (Polynumeral) to perform this study. Authors Yuri Niyazov, Richard Price, Ben Lund, David Judd, Adnan Akil, Michael Mortonson and Josh Schwartzman are employed by Academia.edu. Authors Carl Vogel and Max Shron are employed by Polynumeral. There are no patents, products in development or marketed products to declare. This does not alter the authors’ adherence to all the PLOS ONE policies on sharing data and materials.

Abstract

Using matching and regression analyses, we measure the difference in citations between articles posted to Academia.edu and other articles from similar journals, controlling for field, impact factor, and other variables. Based on a sample size of 31,216 papers, we find that a paper in a median impact factor journal uploaded to Academia.edu receives 16% more citations after one year than a similar article not available online, 51% more citations after three years, and 69% after five years. We also found that articles also posted to Academia.edu had 58% more citations than articles only posted to other online venues, such as personal and departmental homepages, after five years.

Introduction

Academia.edu is a website where researchers can post their articles and discover and read articles posted by others. It combines the archival role of repositories like ArXiv, SSRN, or PubMed with social networking features, such as profiles, news feeds, recommendations, and the ability to follow individuals and topics. The site launched in 2008 and as of January 2016 has approximately 30 million registered users who have uploaded approximately 8.5 million articles. Registration on the site is free and users can freely download all papers posted to the site.

There is a large body of research on the citation advantage of open access articles, and researchers are still debating the size and causes of the advantage. Some studies have found that open access articles receive substantially more citations than pay-for-access articles, even after controlling for characteristics of the articles and their authors [1, 2]. Other studies using experimental and quasi-experimental methods have concluded that any measured citation advantage is mostly due to selection bias and other unobserved differences between free and paid articles [3–5].

Both the supportive and critical studies have focused on the accessibility of articles: once found, can the article be obtained for free? They have given less consideration to the discoverability of articles: how easily can the article be found? This makes sense; the methods researchers often use to find articles don’t privilege open access over paid sources or vice versa. Google Scholar, for example, returns both free and paid sources, as do many library databases.

Academia.edu, on the other hand, has unique features for discovering articles, making it an interesting venue for analyzing a citation advantage. Users are notified when authors they follow post articles to the site. They can then share those articles with their followers. A user can tag an article with a subject like “High Energy Physics” and users following that subject will be notified about the paper.

A number of users have reported to the Academia.edu team that they observed increased citations after posting their articles to the site [6, 7]. Motivated by those anecdotal reports, a formal statistical analysis was conducted of the citation advantage associated with posting an article.

We find that a typical article posted on Academia.edu receives approximately 16% more citations compared to similar articles not available online in the first year after upload, rising to 51% after three years, and 69% after five years. We also find that a typical article posted on Academia.edu receives more citations than an article available online on a non-Academia.edu venue, such as a personal homepage, a departmental homepage, or a journal site. A typical paper posted only to Academia.edu receives 15% fewer citations than an article uploaded to a non-Academia.edu site in the first year, but 19% more after three years, and 35% after five years.

Our study is observational, requiring us to carefully account for possible sources of selection bias. We find that the citation advantage persists even after controlling for a number of possible selection biases.

Background

The Open Access Citation Advantage

Even though Academia.edu differs from traditional venues for open access, the hypotheses and methods in this paper overlap with research on the open access citation advantage. The term “open access” typically refers to articles made freely available according to specific Open Access policies of academic journals: for example “Gold Open Access” policies where authors or institutions pay the journal to make an article freely available, or “Green Open Access” where an author may archive a free version of their article online. Sometimes, though, “open access” is used more loosely to refer to any manner by which articles are made freely available online. Some authors use the term “free access” for this broader definition, to distinguish it from Green and Gold Open Access policies. Our study does not rely on these distinctions, and we will use the terms “open access” and “free access” interchangeably to refer to the broader definition of freely downloadable articles.

Many researchers, beginning with [8], have found that free-access articles tend to have more citations than pay-for-access articles. This citation advantage has been observed in a number of studies, spanning a variety of academic fields including computer science [8], physics [9], and biology and chemistry [1].

The estimated size of the citation advantage varies across and even within studies, but is often measured to be between 50% and 200% more citations for open access articles [10]. The variety of estimates is unsurprising, since both open access and citation practices vary widely across disciplines, and citations accumulate at different rates for different articles published in different venues. Different statistical methods also lead to different estimates. Some studies have simply compared unconditional means of citations for samples of free and paid articles [8], while others, such as [1], measured the advantage in a regression analysis with a battery of controls for characteristics of the articles and their authors.

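To see why an unconditional comparison of means and a comparison that conditions on field can disagree, consider the following minimal sketch. The citation counts, the two fields "A" and "B", and the helper name `advantage` are all hypothetical and invented for illustration; none of them come from the studies cited above.

```python
from statistics import mean

# Hypothetical citation counts, invented for illustration only.
# Each record is (field, access, citations).
articles = [
    ("A", "free", 20), ("A", "free", 22), ("A", "free", 24),
    ("A", "paid", 21), ("A", "paid", 23),
    ("B", "free", 2),
    ("B", "paid", 1), ("B", "paid", 2), ("B", "paid", 3), ("B", "paid", 2),
]


def advantage(records):
    """Relative difference in mean citations, free vs. paid (0.5 means 50% more)."""
    free = [c for _, access, c in records if access == "free"]
    paid = [c for _, access, c in records if access == "paid"]
    return mean(free) / mean(paid) - 1


# Comparison that ignores field entirely.
unconditional = advantage(articles)

# The same comparison made separately inside each field.
within_field = {
    field: advantage([r for r in articles if r[0] == field])
    for field in sorted({r[0] for r in articles})
}

print(f"unconditional: {unconditional:.0%}")
print({field: f"{value:.0%}" for field, value in within_field.items()})
```

In this toy data the free articles average roughly 96% more citations overall, yet the advantage is zero inside each field. The whole gap comes from free articles being concentrated in the high-citation field, which is the kind of difference a regression with field controls is meant to absorb.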
Critiques of the Citation Advantage

Other studies have presented evidence against an open access citation advantage, arguing that although there is correlation between open access and more citations, open access does not cause more citations. (See, e.g., [11] and [12] for critical reviews of the citation advantage literature.)

Kurtz et al. [13], in a framework adopted by several subsequent authors (e.g., [3, 11, 14]), put forth three postulates to explain the correlation between open access and increased citations:

1. The Open Access postulate. Since open access articles are easier to obtain, they are easier to read and cite.

2. The Early View postulate. Open access articles tend to be available online prior to their publication. They can therefore begin accumulating citations earlier than paid-access articles published at the same time. When comparing citations at fixed times since publication, the open-access articles will have more citations, because they have been available for longer.

3. The Selection Bias postulate. If more prominent authors are more likely to provide open access to their articles, or if authors are more likely to provide access to their “highest quality” articles, then open access articles will have more citations than paid-access articles.

Kurtz et al. [13], and later [14], concluded that the Early View and Selection Bias effects were the main drivers of the correlation between open-access and increased citations. A lack of causal open-access effect was further supported in other studies, such as the randomized trials in [3] and [4], and the instrumental variables regressions in [5].

But even these studies are not conclusive. For example, Kurtz et al. [13] point out that their conclusions may be specific to their sample: articles published in the top few astronomy journals. The experimental treatment in [3] and [4] was to make randomly-chosen articles free to download on the publisher’s website. How easily researchers could determine these articles were available for free is unclear. And, while the instrumental variable analysis of [5] found evidence of selection bias in open access, they still estimated a statistically and practically significant citation advantage even after controlling for that bias.

Regardless of the validity or generality of their conclusions, these studies do establish that any citation advantage analysis must take into account the effects of time and selection bias on citation differentials.

Sources of Selection Bias in Academia.edu Citations

Like most citation advantage studies, ours is observational, not experimental. Articles are not uploaded to Academia.edu randomly. Authors choose to register as users on the site, and then choose which of their articles to upload. When making comparisons to articles not posted to the site, this creates several potential sources of bias in unconditional citation comparisons.

1. Self-selection of disciplines. Academia.edu users may be more likely to come from particular disciplines. Since the citation frequency differs across disciplines, a citation advantage estimate that doesn’t control for academic discipline might over- or underestimate the true advantage.

2. Self-selection of authors. Researchers who post papers on Academia.edu might differ from those who do not. Users might skew younger, or be more likely to work at lesser-known institutions. If so, we would expect to find that papers posted to the site tend to have fewer citations than those not. Or users might skew in the other direction, having more established reputations, or coming from better-known institutions, in which case we could overestimate the actual advantage. Furthermore, users who post papers may also be generally more proactive about distributing and marketing their work, both through Academia.edu and other venues online and off. If this were true, it would also cause us to overestimate the actual advantage.

3. Self-selection by article quality. Even if Academia.edu users were not systematically different than non-users, there might be systematic differences between the papers they choose to post and those they do not. As [13] and others have hypothesized, users may be more likely to post their most promising, “highest quality” articles to the site, and not post articles they believe will be of more limited interest.

4. Self-selection by type of article. Academic journals publish content besides original research or scholarship: book reviews, errata, responses to recently published articles, conference abstracts, editorials, etc. These other types of content typically receive fewer citations than research articles. If Academia.edu users are less likely to post these other types of content to the site, then we might overestimate the advantage relative to an off-Academia group that contains more “non-research” content.

5. Self-selection by article availability. A user may be more likely to post a paper to the site if they have already made it available through other venues, such as their personal website or institutional or subject-specific repositories. In this case, a citation advantage estimated for Academia.edu papers might be measuring, in part or whole, a general open access effect from the articles’ availability at these other venues.

Many of these factors cannot be observed directly or completely, and their aggregate effect on citation advantage estimates is difficult to predict. We have collected data and employed matching and regression strategies to mitigate each of the above potential biases, and continue to find a substantive citation advantage to articles posted to Academia.edu.

Materials and Methods

We rely on data from several sources: (1) articles from the Academia.edu website, (2) citation counts and free-access status from Google Scholar, (3) journal rankings from SCIMago/Scopus, and (4) journal research fields from the Australian Research Council. All data and code used in the analysis are available for download at https://github.com/polynumeral/academia-citations.

On-Academia and Off-Academia Articles

Our analysis is a comparison of citations between articles posted to Academia.edu and articles not posted. We refer to these two samples as the “On-Academia” sample and the “Off-Academia” sample. Articles comprising each sample were selected in the following way.

On-Academia Sample: The articles in our analysis were uploaded to Academia.edu between 2009 and 2012, inclusive. We chose to start at 2009 because this was the first full year that the site was active. We stopped at 2012 so that all articles in the sample are at least two years old and have had time to accumulate citations. We restrict our sample to articles that were posted to the site in the same year they were published. We refer to this as the “P=U” (Published=Uploaded) restriction. This ensures that all of the articles are exposed to any citation advantage effect starting from their publication. It also mitigates bias from authors favoring their, ex post, most-cited articles when uploading to the site.

Our analysis relies on information from Google Scholar and CrossRef. The latter is a database containing journals, articles, authors, and Digital Object Identifiers (DOIs). Therefore, we restricted the on-Academia sample to articles that could be matched by title and author to both Google Scholar results and CrossRef entries.

Off-Academia Sample: Using the CrossRef database, we selected a random subset of articles published in the same years as articles in the on-Academia sample, but which had not been posted to Academia.edu.

Table 1. Sample size of papers, by cohort.

Year     Off-Academia   On-Academia
2009     4,600          149
2010     5,768          490
2011     6,989          2,236
2012     8,368          2,616
Total    25,725         5,491

doi:10.1371/journal.pone.0148257.t001

Citation Counts

For all articles in both the on- and off-Academia samples, we obtained citation counts from Google Scholar between April and August 2014.

Table 1 shows the number of articles in each cohort and sample. The on-Academia sample each year is a subset of papers posted to the site that year. We excluded papers uploaded to the site that were published in an earlier year, and papers that could not be matched to a Google Scholar search result or a CrossRef entry based on their titles and authors. Users manually enter a paper’s title when they upload it to the site, and what they enter may differ from the paper’s canonical title. (For example, a user may add “forthcoming in PLoS” to the title.) This sort of discrepancy was a common reason for a failure to match. We do not believe that failure to match a paper is related to its citations, and therefore these exclusions should not bias our results.

Articles in the sample come from 5,725 different journals, but there is a concentrated representation of journals. Table 2 lists the ten journals with the highest number of articles in our sample. The most-represented journal, Analytical Chemistry, comprises 4.6% of the sample, and the top ten journals comprise 12%.

Table 2. Journals with the largest number of articles in the sample.

Journal                                                   # Articles   % Total
Analytical Chemistry                                      1,422        4.56%
Biological and Pharmaceutical Bulletin                    329          1.05%
Analytical Methods: advancing methods and applications    316          1.01%
Analytical Biochemistry                                   303          0.97%
Bioconjugate Chemistry                                    285          0.91%
Applied Mechanics and Materials                           282          0.90%
PLoS One                                                  194          0.62%
Applied Physics Letters                                   179          0.57%
AAPS PharmSciTech                                         164          0.53%
Anesthesia and Analgesia                                  155          0.50%

doi:10.1371/journal.pone.0148257.t002

Journal Impact Factors and Divisions

We used the impact factor of an article’s journal as a matching variable and regression predictor. Journal impact factors were obtained from SCIMago Journal and Country Rank, which uses citation data from Scopus [15]. The metric we refer to as the “impact factor” is the “Cites per Doc, 2 year” metric on the SCIMago site. A journal’s impact factor in, for example, 2012 is calculated as the average number of citations received in 2012 by papers that were published in the journal in 2010 and 2011. We matched each article to its journal’s impact factor in the year the article was published. This ensures that the impact factor was not affected by the article itself, only articles published in the journal in prior years. The journals in our sample with the highest impact factors are listed in Table 3.

Table 3. Top ten journals in sample by impact factor. Impact factor is averaged by year.

Journal                                        Impact Factor
Chemical Reviews                               41.92
Annual Review of Immunology                    39.88
Chemical Society Reviews                       31.76
Annual Review of Biochemistry                  31.52
Annual Review of Astronomy and Astrophysics    28.48
Nature Reviews Neuroscience                    28.34
Nature Materials                               28.26
Progress in Polymer Science                    26.7
Nature                                         25.87
Lancet Oncology                                25.48

doi:10.1371/journal.pone.0148257.t003

We also obtained data on the journals’ fields of research from the Australian Research Council’s Excellence in Research for Australia report [16]. The report contains data on academic journals that includes labels for their Fields of Research, defined using a hierarchical taxonomy from the Australian New Zealand Standard Research Classification [17]. Field of Research is the second level of taxonomy, and the journals in our sample cover around 200 different Fields.
We instead rely on the first level of the taxonomy, the “Division” of the journal, which describes broad disciplines of research. There are 22 Divisions in the taxonomy and a journal can be labelled with up to three different Divisions. Multidisciplinary journals, which cover more than three Fields of Research, are labelled with a 23rd Division label of “Multidisciplinary.”

All of the analyses in the paper were also conducted with the “Field of Research” labels, using text analysis and dimension reduction techniques to account for the large number of labels and high correlations amongst them. These analyses gave nearly identical results to those based on the Division labels, so we use the latter since they are easier to interpret.

Table 4 provides summary data about the Divisions in our sample: the share of articles in the full and on- and off-Academia samples in each discipline, and the median impact factor of journals in our sample in each Division. Nearly a third of articles in our sample are in Medical and Health Sciences journals, while Engineering and Biological Sciences each represent a fifth of articles. The columns add up to more than 100% because journals can be labeled with up to three disciplines.

Table 4. Journal Divisions, defined according to the taxonomy in [17]. Share of articles in the full sample, the on-Academia sample, and the off-Academia sample in each Division, and the median impact factor of sample articles in the Division. Journals can be labelled with between one and three disciplines.

Division                                      % All    % On     % Off    Med. IF
Medical and Health Sciences                   33.0%    18.6%    36.1%    2.58
Engineering                                   22.9%    12.0%    25.3%    2.77
Biological Sciences                           20.6%    19.6%    20.8%    2.55
Chemical Sciences                             18.7%    6.3%     21.4%    3.79
Psychology and Cognitive Sciences             7.7%     17.5%    5.6%     2.46
Physical Sciences                             7.2%     8.3%     7.0%     2.41
Mathematical Sciences                         7.1%     5.0%     7.5%     1.36
Multidisciplinary                             5.1%     11.5%    3.7%     3.20
Information and Computing Sciences            4.9%     5.2%     4.8%     1.95
Earth Sciences                                4.0%     8.7%     2.9%     2.28
Studies in Human Society                      3.7%     9.8%     2.4%     1.15
Agricultural and Veterinary Sciences          3.7%     4.6%     3.5%     2.16
Environmental Sciences                        3.4%     5.3%     3.0%     2.48
Commerce, Management, Tourism and Services    2.8%     4.4%     2.5%     1.30
Technology                                    2.2%     1.9%     2.3%     1.96
Education                                     1.8%     4.5%     1.2%     1.12
Economics                                     1.6%     1.9%     1.5%     1.15
Language, Communication and Culture           1.4%     4.6%     0.8%     0.63
Philosophy and Religious Studies              1.4%     4.2%     0.8%     0.64
History and Archaeology                       1.3%     4.8%     0.5%     0.92
Built Environment and Design                  0.9%     1.7%     0.8%     1.84
Creative Arts and Writing                     0.5%     1.4%     0.3%     0.76
Law and Legal Studies                         0.4%     0.8%     0.3%     0.77

doi:10.1371/journal.pone.0148257.t004

Document Types

We include in our analysis only articles with original research, analysis or scholarship, or survey articles. We exclude book reviews, editorials, errata, and other “non-research” content. Our procedure for obtaining on- and off-Academia articles provided 37,266 articles. From this sample, we removed any articles not identified to be original research.

To identify the type of each article, we used Amazon Mechanical Turk (MTurk), a crowdsourcing marketplace. Common uses of MTurk in academic research include collecting survey data, performing online experiments, and classifying data to train and validate machine learning algorithms. An appendix with a more complete description of the document classification process, including the worker questionnaire and accuracy statistics, is available at this paper’s Github repo, https://github.com/polynumeral/academia-citations/. The repo also includes underlying data on worker responses.

We provided DOI links to articles in our sample to over 300 MTurk workers. The workers were asked to fill out an online form based on information from the abstract or full text at the DOI link. They were finally asked to classify the article as one of the following types:

1. A summary of a meeting or conference
2. An Editorial or Commentary
3. A response to a recent article in the same journal
4. An article with original research, analysis or scholarship, or a broad survey of research on a topic
5. A Book Review, Software Review, or review of some other recent work or performance
6. An Erratum, Correction, or Retraction of an earlier article
7. Something else

Workers might fail to categorize an article, giving one of these reasons: the link was broken, there was no abstract or text available on the site, the article was in a foreign language, or they otherwise couldn’t tell. Some workers’ results were excluded if they exhibited suspicious patterns, such as giving all articles the same classification, or completing a large number of tasks in an unreasonably short time. Their tasks were then resubmitted so that each article had three independent reviews.

Each article was reviewed by three different workers. Our sample only includes articles that all three workers identified as “original research” (option 4). Of the original 37,266 articles, this left 31,216 “original research” articles. Relying on a majority, 2-of-3 vote to classify articles would have resulted in 35,311 “original research” articles. Unanimity is a conservative classification rule, but given that false positive classification of “original research” articles could upwardly bias our result, we consider it appropriate.

Online Availability

In the last section, we considered several potential sources of selection bias in the on-Academia sample. One was that users might be more likely to upload articles to the site if they have also made those articles available elsewhere online. To examine this possibility, we collected data on whether each paper in our sample was freely available from non-Academia sources. For the on-Academia articles, this would mean they were available from at least two online sources.

To determine whether a paper was available elsewhere, we searched for its title on Google Scholar, and checked whether the results contained a link to a non-paywalled full-text article.
This method is subject to false negatives, but since the failure to match a title, or to correctly identify a full-text article on a non-Academia site, should be independent of whether the article is also posted to Academia, we expect its error rate to be similar for both on- and off-Academia articles.

Table 5 lists the number of articles searched, and the percentage with free access to full text on non-Academia.edu sites. We find that papers in the on-Academia.edu sample are more likely to be available online than papers in the off-Academia sample. This indicates that there may be some self-selection by availability in our data. Our regression analyses control for online availability, mitigating potential bias from the discrepancy.

Table 5. Share of sample articles freely available from non-Academia.edu sites.

                                     Off-Academia   On-Academia
a. Full-text available elsewhere     9,487          3,652
b. Articles searched                 25,725         5,491
c. Share (a ÷ b)                     36.9%          66.5%

doi:10.1371/journal.pone.0148257.t005

The use of a binary indicator for online availability does conceal some potentially useful information about the article’s availability: for example, how many different venues it may be available on, or what those specific venues are. Such metrics are difficult to measure accurately, but could be interesting. Indeed, this paper argues that venue-specific effects can be meaningful. Nonetheless, we do not believe this unmeasured information will contribute to any substantial bias, for several reasons. The primary one is that we find a significant citation advantage amongst articles that are not online on any non-Academia venue, an effect generally larger than the average online advantage we measure with the binary variable. Were we to use a richer metric for online availability, those articles would not be affected, and their Academia advantage would remain roughly the same.

Table 6. Citations summary statistics.

Sample          Min.   1st Qu.   Median   Mean    3rd Qu.   Max.
off-Academia    0      2         5        10.19   12        1237
on-Academia     0      3         7        12.77   15        721

doi:10.1371/journal.pone.0148257.t006

Quantifying the Citation Advantage

Our general empirical strategy is to estimate the distribution of the citation count of article i, published in journal j at time t, conditional on it being posted to Academia.edu, and compare this distribution to that of the same article, but conditional on it not being posted to the site. Denoting the number of citations as a random variable Y, we are interested in the distributions

$$P^{1}_{ijt}(y) = \mathrm{Prob}(Y \le y \mid j, t, \text{on-Academia})$$

$$P^{0}_{ijt}(y) = \mathrm{Prob}(Y \le y \mid j, t, \text{off-Academia}).$$

We can compute the change in an article’s citations associated with posting to Academia.edu, $\Delta_{ijt}$, by comparing summary statistics of these distributions. For example, the difference in means

$$\Delta_{ijt} = E^{1}_{ijt}(Y) - E^{0}_{ijt}(Y),$$

or medians,

$$\Delta_{ijt} = \mathrm{Med}^{1}_{ijt}(Y) - \mathrm{Med}^{0}_{ijt}(Y).$$

One approach would be to directly estimate these summary statistics by computing average or median citations within each journal × year group. Unfortunately, many of these groups contain too few articles to accurately estimate summary statistics. Instead, we use journal-specific covariates to represent journals, most prominently the journal’s impact factor. This leads to two approaches: a non-parametric matching analysis, and a regression analysis.

Properties of Citation Count Distributions

Citation counts are non-negative integers with a highly right-skewed distribution. This can be seen in Table 6 and Fig 1, the latter of which also shows that the modal article has one or no citations. Our matching analysis accounts for this aspect of the data by comparing quantiles of on- and off-Academia citation counts. Our regression analysis applies several parametric models that accommodate right-skewed count data.

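The matching idea described above, using impact factor as a stand-in for the journal and comparing medians rather than means, can be sketched as follows. This is a minimal illustration under stated assumptions and not the authors' code: the function names, the `n_bins` parameter, the choice of the "inclusive" quantile method, and the toy data are all hypothetical, and the sketch bins a single cohort only.

```python
from bisect import bisect_right
from statistics import median, quantiles


def bin_edges(on_impact_factors, n_bins=10):
    """Interior cut points that split the on-Academia impact factors
    into n_bins quantile bins (n_bins=10 gives decile bins)."""
    return quantiles(on_impact_factors, n=n_bins, method="inclusive")


def median_differences(on, off, n_bins=10):
    """on, off: lists of (impact_factor, citations) pairs for one cohort.

    Returns {bin index: median on-Academia citations minus median
    off-Academia citations}, skipping bins where either sample is empty.
    """
    edges = bin_edges([impact for impact, _ in on], n_bins)
    groups = {}
    for label, sample in (("on", on), ("off", off)):
        for impact, cites in sample:
            # Bin index = number of cut points at or below this impact factor.
            b = bisect_right(edges, impact)
            groups.setdefault(b, {"on": [], "off": []})[label].append(cites)
    return {
        b: median(g["on"]) - median(g["off"])
        for b, g in sorted(groups.items())
        if g["on"] and g["off"]
    }


# Toy data for a single hypothetical cohort, two bins instead of ten.
on_sample = [(1.0, 3), (1.5, 5), (3.0, 10), (4.0, 20)]
off_sample = [(0.9, 1), (1.2, 2), (1.4, 4), (2.8, 6), (3.5, 8), (5.0, 9)]
diffs = median_differences(on_sample, off_sample, n_bins=2)
print(diffs)
```

Computing the bin edges from the on-Academia sample means each bin holds an equal share of on-Academia articles, while the off-Academia articles fall wherever their impact factors place them. Medians are used because a handful of very highly cited articles would dominate a mean on data this skewed.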
Results

Matching by Impact Factor

Our first analysis compares citations of on- and off-Academia articles grouped by cohort and their journals’ impact factors. This is effectively a matching strategy with year and impact factor as the covariates; the purpose is to provide a relatively simple non-parametric estimator of the difference while controlling for important covariates. The regression analyses in the subsequent sections will expand on this analysis with a larger array of controls.

To match on-Academia articles to off-Academia articles, we computed decile bins of impact factors amongst the on-Academia articles in a cohort. Therefore, each impact factor bin represents 10% of articles in the on-Academia sample for that year. We then grouped the off-Academia articles into those bins, and compared samples within each bin.

Fig 1. Distributions of citations (x-axis is truncated at 100).
doi:10.1371/journal.pone.0148257.g001

Fig 2 shows boxplots of citations to on- and off-Academia articles in each cohort and impact factor bin. (Bornmann et al. [18], among others, advocate using boxplots to compare citation differences across samples.) The figure shows that older papers have more citations, and that articles published in higher impact factor journals have more citations. Furthermore, we find that the median number of citations to on-Academia articles is consistently higher than that of off-Academia articles across cohorts and impact factor bins. Table 7 provides the medians and
