ebook img

A novel triangle mapping technique to study the h-index based citation distribution. PDF

0.7 MB·English
Save to my drive
Quick download
Download
Most books are stored in the elastic cloud where traffic is expensive. For this reason, we have a limit on daily download.

Preview A novel triangle mapping technique to study the h-index based citation distribution.

A novel triangle mapping technique to study the h-index based citation SUBJECTAREAS: distribution COMPUTATIONAL BIOLOGYAND BIOINFORMATICS Chun-TingZhang BIOPHYSICS SYSTEMSBIOLOGY DepartmentofPhysics,TianjinUniversity,Tianjin300072,China. BIOTECHNOLOGY Theh-indexhasreceivedwideattentioninrecentyears.Theareaunderthecitationfunctionisdividedby Received theh-indexintothreeparts,representingh-squared,excessandh-tailcitations.Theh-indexbyitselfdoes 16November2012 notcarryinformationforexcessandh-tailcitations,whichcanplayanevenmoredominantrolethanh- indexindeterminingthecitationcurve,andthereforeitisnecessarytoexaminetherelationsamongthem.A Accepted trianglemappingtechniqueisproposedheretomapthethreepercentagesofthesecitationsontoapoint 3December2012 withinaregulartriangle.Byviewingthedistributionofmappingpoints,shapesofthecitationfunctionscan bestudiedinaperceivableform.Asanexample,thedistributionofthemappingpointsfor100mostprolific Published economistsisstudiedbythistechnique. 3January2013 Theh-index,proposedbyHirsch1forevaluatingtheacademicimpactofindividualresearchers,hasreceived wideattentioninrecentyears.Thecitationsreceivedbyallpapersofagivenresearchercanbecharacterized Correspondenceand byacitationdistributionfunction,wherethey-axiscorrespondstothecitationsreceivedbyapaper,whereas requestsformaterials thex-axisrepresentsthepaperrankarrangedindescendingorderofcitations(Fig.1).Thedistributionofcitation shouldbeaddressedto versepaperrankiscalledthecitationdistributionfunctionorcurve,denotedbyC(x),inwhichthepaperxreceives C.T.Z.(ctzhang@tju. C(x)citations.Theh-indexwassimplydefinedasC(h)~h1.Theareaunderthecitationdistributioncurveis edu.cn) dividedbytheh-indexintotwoparts:thoseoftheh-core2andtheh-tail3.Theformerisfurtherdividedinto anothertwoparts:thoseofexcesscitations4andh-squaredcitations1.Asaconsequence,thetotalcitationsare divided into three different parts: h-squared, excess and h-tail citations (Fig. 1). Indeed, the h-index lacks information for the excess and the h-tail citations, keeping only the citations related to the h-index (h2). Theoretically,onlywhenh2 isdominantamongthethreeparts,theh-indexcanproperlyreflecttheacademic performanceofthescientistunderstudy,otherwise,theh-indexleadstobiasedevaluation.Thequestionthat whethertheh-indexdominatesthecitationsornotdependsontheshapeofcitationdistributionfunction. AspointedoutbyBornmannandco-workers5,foranisohindexgroup(scientistshavingthesameh-index), theirassociatedcitationdistributionfunctionsmaydisplayquitedifferentshapes.Therefore,tostudyhowto applyh-indexfairly,itisnecessarytostudytheshapeofthecitationdistributionfunctions,andthecurrentstudy aimstoaddressthisquestionbyusingatrianglemappingtechnique.Oneoftheadvantagesofcitationtriangle method is that the comparison of different shapes of the citation distribution functions can be performed intuitively.Byviewingthedistributionofmappingpointswithinthetriangle,theshapesofthecitationdistri- butionfunctionscanbestudiedwithaperceivablemanner.Basedonthismethod,weareabletostudythedegree withwhichtheh-indexisapplicableproperly.Itishopedthatthetechniquepresentedhereisusefulforusingthe h-indextoevaluateacademicperformanceinamoreunbiasedway. Results Wehereproposeanoveltrianglemappingtechniquetostudytherelationsamongh-squared,excessandh-tail citations.Foraregulartriangle,thesumofthedistancesfromanyinteriorpointtothethreesidesisequaltoa constant,theheightofthetriangle.Notethatthesumofthepercentagesforh2,e2andt2isalsoaconstant,which equalsto1.Basedonthischaracteristic,percentagesforthese3kindsofcitationsaremappedontoapointina regulartriangle(Fig.2A).RefertotheMethodsectionfordetails. Firstofall,letusconsidertwoconcreteexamples.AccordingtothecitationinformationprovidedbyDodson6, C ~1700,h2~625(h~25),e2~477(e~21:84),andwefindH~0:37,E~0:28andT~0:35.Therefore,the total mappingpointcorrespondingtoDodsonissituatedattheregionNo.4,wheretheh-indexisapplicable(Fig.2B). SCIENTIFICREPORTS |3:1023|DOI:10.1038/srep01023 1 www.nature.com/scientificreports tox~{0:456,y~{0:183.Itsmappingpointissituatedatthesmall triangleNo.5,ane-indexabsolutelydominatedregions.Accordingto Bornmannetal5andColeandCole8,ScientistAiscalledperfectionist- typescientist,whohasratherfewbutveryhighlycitedpublications. ForscientistB,E50.39,H50.48andT50.13,correspondingto x~{0:150, y~0:147. Its mapping point is situated at the small triangleNo.2,aboundaryregionbetweenh-indexande-indexdomi- nated regions. According to references5,8, ScientistB is called a pro- lific-type scientist, who publishes a large number of high-impact papers. For scientist C, E 5 0.10, H 5 0.33 and T 5 0.57, corres- pondingtox~0:271,y~{0:003.Itsmappingpointissituatedatthe smalltriangleNo.8,at-indexdominatedregion.ScientistCiscalleda mass producer5,8, who publishes a larger number of papers that are lowlycited.Itcanbeseenfromtheaboveanalysis,thelocationsofthe mapping points carry the information of the types of scientists. Therefore,thetrianglemappingtechniqueisparticularlyusefulwhen theacademicimpactofalargenumberofscientistsisstudied.Inthat case,clusteringanalysiscanbeperformedbasedonthemappingpoint Figure1|Thecitationdistributioncurve. They-axiscorrespondstothe locations,andthereforescientistscanbeclassifiedaccordingtotheir citationsreceivedbyapaper,whereasthex-axisrepresentsthepaperrank academicperformance. arrangedindescendingorderofcitations.Theareaunderthecitation Recently, Baum introduced a new parameter, called Excess-Tail distributioncurveisdividedbytheh-indexintothreeparts:h2,e2(excess) Ratio9,denotedbyR,whereR~E=T~e2=t2.Baumfoundthatfor andt2(h-tail). mostcaseshestudied,Rv1,evenRvv1:Onlyforfewcases,Rw1. The shapes of citation distribution functions for Rw1 are peaked, ThesecondexampleisforthechemistBerniAlder,whereh2~2500 (h~50), e2~12996 (e~114) and C ~184004, and we find total H~0:14,E~0:71andT~0:15.Therefore,themappingpointcor- respondingtoAlderissituatedattheregionNo.5,wherethee-index isabsolutelydominant(Fig.2B).ThisexampleshowsthatAlder’sh- indexseverelyunder-estimateshisacademicimpact,andinthiscase, thee-indexshouldbeusedtogetherwiththeh-indexforafairevalu- ation4. Inwhatfollows,letusapplythecitationtrianglemethodtostudy thecasesofcitationsofthe100mostprolificeconomists7.Thedata usedtoderivethecorrespondingh-index,e-indexandt-indexwere kindly provided by Dr. Tol. As a consequence, we calculated the coordinatesx andy ofeachmapping pointcorresponding toeach economist.Thedistributionofthe100pointsisshowedinFig.3A.As wecanseethatonlytwopointsaresituatedattheregionNo.3,i.e.,an h-indexdominantregion.Meanwhile,only11points(11%)aresitu- atedabovethehorizontallineH~1=3ory~0,wheretheh-index can be properly applicable. Accordingly, for the remaining cases (89%), where Hv1=3 or yv0, the h-index should be used jointly withthee-index,eventhet-index.Theaverageh-indexande-index overthe100pointsare19and28.14,respectively,correspondingto theaverage H and E being 0.26 and 0.48 (averagex~{0:13, y~ {0:07).Overall,tohaveafairandaccurateevaluation,theh-index shouldbeusedtogetherwiththee-indexeventhet-indexformostof the100mostprolificeconomists. Theh-indexcapturesonlytheinformationofthecitationfunction partially.However,theabovedistributionofthe100mappingpoints withinthetriangleprovidesmoreinformationabouttheshapesofthe corresponding citation functions. For example, the mapping points withinthesmalltriangleNo.5indicatethattheircitationdistribution functions are peaked on the beginning part. On the contrary, the mapping points within the small triangle No. 9 indicate that their citationdistributionfunctionsareflatwithalongtail.Inbothcases, theh-indexseemsnotappropriateincapturingthemaininformation Figure2|Thecitationtrianglemethodinstudyingh-indexbased ofcitationfunction.Tocomplementtheh-index,Bormannandco- citations. (A)AregulartriangleABCwithitsheightbeingequalto1and workers5 introduced three parameters: the h2upper, h2center and centersituatedatO.ADescartescoordinatesystemx{yissetupwithits h2lower,whichcorrespondtoE,HandT,respectively,inthispaper. originatO.Thethreesidesofthetrianglearedenotedbyh,eandt,andthe Inotherwords,thetrianglemappingtechniqueprovidesanintuitive distancesofaninteriorpointP(x,y)tothemareequaltoH,EandT, representationoftheh2upper,h2centerandh2lower.Bornmannand respectively.Therefore,thepointP(x,y)isthemappingpointforthethree co-workers5studiedtheshapesofthecitationdistributionfunctionsof realnumbersH,EandT.(B)Theregulartriangleisdividedintonine threescientists,A,BandC,belongingtoanisohindexgroupwithh5 smallerregulartriangles.TheintervalsofH,EandTforeachofthe9 14.ForscientistA,E50.82,H50.15andT50.03,corresponding smallertrianglesareshowninTable1. SCIENTIFICREPORTS |3:1023|DOI:10.1038/srep01023 2 www.nature.com/scientificreports Figure3|Distributionsofmappingpointsinthecitationtriangle. (A)Thedistributionofthe100mappingpointsforeachofthe100mostprolific economists. Notethatonly11points(11%)aresituatedattheregionswheretheh-indexcanbeapplicable(Hw1=3),indicatingthattheh-index shouldbeusedjointlywiththee-index,eventhet-index,fortheremaining89economists.(B)Anexampletodemonstratethatthepowerparameter lisoneofthekeyfactors,whichdeterminesthepositionofthemappingpoint.GivenC ~512andN~100,startingfromtheregionNo.3(theh-index 1 dominatedregion)withl~0:4,themappingpointmovestotheregionNo.6(thee-indexdominatedregion)withl~1:6.Interestingly,thetrack ofthemappingpointsformsaclockwiserotatingcurve. whereasforRv1theshapesofthecitationfunctionsareflatwitha asimplemathematicalmodelforthecitationdistributioncurveC(x)4 longtail.Therefore,theExcess-Tailratioisanappropriateparameter C CðxÞ~ 1,C ~C(1)w0, x§1, lw0: ð1Þ tocapturetheoverallshapesofthecitationfunctions.Accordingto xl 1 eq.(12),Rw1orRv1correspondstoxv0,orxw0,respectively. ThetotalcitationsreceivedbyNpapers,C ,is total N Discussion C ~ðC(x)dx~ C1 (cid:2)N1{l{1(cid:3): ð2Þ Inwhatfollows,wewanttoexplorethekeyfactorsthatdeterminethe total 1{l shapeofthecitationdistributionfunction.Aspreviously,weassume 1 SCIENTIFICREPORTS |3:1023|DOI:10.1038/srep01023 3 www.nature.com/scientificreports Table1|IntervalsofH,EandTforeachofthe9regions(smalltriangles)withinthecitationtriangle No. H E T Featureremark 1 Hw2=3 EzTv1=3 EzTv1=3 Theh-indexabsolutelydominatedregion 2 Hw1=3 Ew1=3 Tv1=3 Boundarybetweenh-indexande-indexdominatedregions 3 Hw1=3 Ev1=3 Tv1=3 Theh-indexdominatedregion 4 Hw1=3 Ev1=3 Tw1=3 Boundarybetweenh-indexandt-indexdominatedregions 5 HzTv1=3 Ew2=3 HzTv1=3 Thee-indexabsolutelydominatedregion 6 Hv1=3 Ew1=3 Tv1=3 Thee-indexdominatedregion 7 Hv1=3 Ew1=3 Tw1=3 Boundarybetweene-indexandt-indexdominatedregions 8 Hv1=3 Ev1=3 Tw1=3 Thet-indexdominatedregion 9 HzEv1=3 HzEv1=3 Tw2=3 Thet-indexabsolutelydominatedregion Basedoneq.(1),itwasshownthat4,10 receivedbythepapersintheh-coreareequaltoh2ze2.Theh-indexdividesthetotal citationsofascientistintotwoparts:thefirstpartisoftheh-core,whereasthesecond hlz1~C1: ð3Þ oneisoftheh-tail3.Forconvenience,wedefinethesquarerootofcitationsreceivedby However,weshouldhavehvN,whichleadsto ablylpaallppearpseinrsthoefah-stcaiielnatsistht,eCtt-oitnald,eisx.cTohmepreofsoerde,otfhtehnreuempbaerrtso:fht2o,tae2lcaintadtito2n,si.ree.,ceived lnC lwl0~lnN1{1: ð4Þ Ctotal~h2ze2zt2, ð9Þ Meanwhile,wehave4 whereh,eandtaretheh-,e-andt-index,respectively.Letting e2~ 1 (cid:4)C {lClz21(cid:5): ð5Þ H~h2=Ctotal,E~e2=Ctotal,T~t2=Ctotal, ð10Þ l{1 1 1 wehave Usingeqs.(1)–(5),wefind HzEzT~1: ð11Þ H~ h2 ~(1{l)|C111{zll,lwl ,l=1: ð6Þ Fsiodresanisyerqeuguallatrottrhiaenhgeleig,hthteosfutmheotfritahnegdleis.tCanocnessidfreormaraengyulianrtetrriioarngploeinAtBtCotwheiththirtese C N1{l{1 0 heightequalto1(Fig.2A).LetthecenterofthetrianglebedenotedbyO,andanx2y total coordinatesystemissetupasshowninFig.2A.Basedoneq.(11)andthefeatureof 1{l theregulartriangle,thesetofthreerealnumbersH,EandTismappedontoapoint e2 l|C1zl{1 P(x,y)withinthetriangle,asshowninFig.2A.Simplecalculationshowsthat E~Ctotal~ N1{1l{1 ,lwl0,l=1: ð7Þ (x~(T{E)=pffi3ffiffi~(1{H{2E)=pffi3ffiffi, ð12Þ y~H{1=3: Therefore,theconditionunderwhichtheh-indexcanbedominant shouldsatisfyHw1=3,or Thetrianglecanbedividedinto9smallertriangles(regions)asshowninFig.2B.We 1{l denotethembyNo.1throughNo.9,respectively.Eachregionischaracterizedbya (1{l)|C1zl 1 specialintervalofthethreerealnumbersH,EandT,respectively.Forexample,atthe N1{l{11 w3: ð8Þ regionNo.1,Hw2=3andEzTv1=3,indicatingthath2isabsolutelydominantat thisregionascomparedwithe2andt2.Similarly,attheregionNo.5,Ew2=3and To have an intuitive picture, we consider some numerical HzTv1=3,indicatingthate2isabsolutelydominantascomparedwithh2andt2.At examples as follows. Taking C ~512, N~100 and letting theregionNo.9,Tw2=3andHzEv1=3,indicatingthatt2isabsolutelydominant l~0:4, 0:5, 0:6, 0:7, 0:8, 0:9, 1:1,11:2, 1:3, 1:4, 1:5, 1:6, respect- ascomparedwithh2ande2.Furthermore,attheregionNo.3,Hw1=3,Ev1=3, Tv1=3,so,itiscalledanh-indexdominantregion;attheregionNo.6,Ew1=3, ively, we calculate the values of H and E for each case. Using eq. Hv1=3,Tv1=3,so,itiscalledane-indexdominantregion;andattheregionNo.8, (12),wefind12mappingpointsinthetriangle,asshowninFig.3B.It Tw1=3,Hv1=3,Ev1=3,so,itiscalledat-indexdominantregion.Finally,the isinterestingtoseethatwiththeincreaseofthelvalue,thetrackof regionNo.2istheboundaryregionbetweentheh-indexande-indexdominant themappingpointsformsaclockwiserotatingcurve.Thisexample regions,theregionNo.4istheboundaryregionbetweentheh-indexandt-index dominantregions,andtheregionNo.7istheboundaryregionbetweenthee-index showsthatthepowerparameterlisoneofthekeyfactorstodeter- andt-indexdominantregions.Theabovedescriptionhassymmetryofaregular minetheshapeofthecitationfunction.GivenC1 andN,thereisa triangle.ThetotaldescriptionissummarizedinTable1. thresholdofl,whenlislessthanthisthreshold,theh-indexcanno ThethreerealnumbersH,EandTarethepercentagesofcitationsassociatedwith longerbeproperlyapplicable.Infact,l??,h?1. theh-,e-andt-index,respectively.Ingeneral,Hshouldbegreaterthan1/3(oryw0), wheretheh-indexisproperlyapplicable,otherwise,ifHv1=3(oryv0),theh-index The main contribution of this paper is to propose the citation under-evaluatestheacademicimpactoftheresearcherconcerned.Therefore,thefour trianglemethod,bywhichtheshapesofcitationdistributionfunc- regionsNo.1,No.2,No.3andNo.4aretheregionswheretheh-indexcanbeproperly tionscanbestudiedinaperceivableform.Basedonthedistribution applied(Hw1=3).TheregionsNo.2,No.5,No.6andNo.7aretheregionswherethee- ofmappingpoints,applicabilityandlimitationoftheh-indexcanbe indexcanbeproperlyapplied(Ew1=3),whereasthoseofNo.4,No.7,No.8andNo.9 aretheregionswherethet-indexcanbeproperlyapplied(Tw1=3).Insummary,the studied. Generally, the h-index is not properly applicable in the h-indexcanonlybeproperlyappliedintheregionsNo.1,No.2,No.3andNo.4 e-index or t-index dominated regions. In those cases, the h-index (Hw1=3oryw0);andtheh-indexshouldbejointlyappliedtogetherwiththee-index shouldbejointlyappliedtogetherwiththee-indexort-index.The ort-indexintheremainingregionsNo.5throughNo.9(Hv1=3oryv0). proposedmappingtechniqueprovidesaplatformtostudytheaca- demicimpact ofa group of scientists, because some mathematical 1. Hirsch,J.E.Anindextoquantifyanindividual’sscientificresearchoutput.PNatl methods,suchasclusteringanalysis,canbeusedtostudythedistri- AcadSciUSA102,16569–16572(2005). butionofmappingpoints,andtheacademicimpactofthesescien- 2. Rousseau,R.NewdevelopmentsrelatedtotheHirschindex.ScienceFocus1, tistscanthenbeclassifiedandcompared. 23–25(2006). 3. Ye,F.Y.&Rousseau,R.Probingtheh-core:aninvestigationofthetail-coreratio forrankdistributions.Scientometrics84,431–439(2010). Methods 4. Zhang,C.T.Thee-Index,Complementingtheh-IndexforExcessCitations.Plos Theh-indexwasproposedbyHirschin20051.Thesetofhpapersofascientistwas One4,e5429(2009). calledtheh-core2,inwhichatleasthcitationswerereceivedbyeachofthehpapers. 5. Bornmann,L.,Mutz,R.&Daniel,H.D.Thehindexresearchoutput Thee-indexwasproposedbyZhang4,whichwasdefinedasthesquarerootofexcess measurement:Twoapproachestoenhanceitsaccuracy.JInformetr4,407–414 citationsoverthoseusedforcalculatingtheh-index.Therefore,thetotalcitations (2010). SCIENTIFICREPORTS |3:1023|DOI:10.1038/srep01023 4 www.nature.com/scientificreports 6. Dodson,M.V.Citationanalysis:Maintenanceofh-indexanduseofe-index. Authorcontributions BiochemBiophResCo387,625–626(2009). CTZdesignedthestudy,performedmostoftheexperiments,analyzeddataandwrotethe 7. Tol,R.S.J.Theh-indexanditsalternatives:Anapplicationtothe100mostprolific manuscript.Allauthorsreviewedthemanuscript. economists.Scientometrics80,317–324(2009). 8. Cole,S.&Cole,J.R.ScientificOutputandRecognition-StudyinOperationof RewardSysteminScience.AmSociolRev32,377–390(1967). Additionalinformation 9. Baum,J.TheExcess-TailRatio:CorrectingJournalImpactFactorsforCitation Competingfinancialinterests:Theauthordeclaresnocompetingfinancialinterests. Quality.SSRN,http://ssm.com/abstract52038102(2012). License:ThisworkislicensedunderaCreativeCommons 10.Egghe,L.&Rousseau,R.AninformetricmodelfortheHirsch-index. Attribution-NonCommercial-NoDerivs3.0UnportedLicense.Toviewacopyofthis Scientometrics69,121–129(2006). license,visithttp://creativecommons.org/licenses/by-nc-nd/3.0/ Howtocitethisarticle:Zhang,C.T.Anoveltrianglemappingtechniquetostudythe h-indexbasedcitationdistribution.Sci.Rep.3,1023;DOI:10.1038/srep01023(2013). Acknowledgements IthankDr.F.GaoandDr.K.SongandforhelpsinpreparingFigures2–3.Iamgratefulto Dr.RichardTolforkindlyprovidingthedatausedtocalculatethee-indexandt-indexfor eachofthe100mostprolificeconomists. SCIENTIFICREPORTS |3:1023|DOI:10.1038/srep01023 5

See more

The list of books you might like

Most books are stored in the elastic cloud where traffic is expensive. For this reason, we have a limit on daily download.