RESEARCH ARTICLE

Valence, arousal, familiarity, concreteness, and imageability ratings for 292 two-character Chinese nouns in Cantonese speakers in Hong Kong

Lydia T. S. Yee1,2,3*

1 Department of Psychology, The Education University of Hong Kong, Tai Po, Hong Kong, 2 Centre for Psychosocial Health, The Education University of Hong Kong, Tai Po, Hong Kong, 3 Centre for Brain and Education, The Education University of Hong Kong, Tai Po, Hong Kong

*[email protected]

Abstract

Words are frequently used as stimuli in cognitive psychology experiments, for example, in recognition memory studies. In these experiments, it is often desirable to control for the words’ psycholinguistic properties because differences in such properties across experimental conditions might introduce undesirable confounds. In order to avoid confounds, studies typically check to see if various affective and lexico-semantic properties are matched across experimental conditions, and so databases that contain values for these properties are needed. While word ratings for these variables exist in English and other European languages, ratings for Chinese words are not comprehensive. In particular, while ratings for single characters exist, ratings for two-character words, which often have different meanings than their constituent characters, are scarce. In this study, ratings for 292 two-character Chinese nouns were obtained from Cantonese speakers in Hong Kong. Affective variables, including valence and arousal, and lexico-semantic variables, including familiarity, concreteness, and imageability, were rated in the study. The words were selected from a film subtitle database containing word frequency information that could be extracted and listed alongside the resulting ratings. Overall, the subjective ratings showed good reliability across all rated dimensions, as well as good reliability within and between the different groups of participants who each rated a subset of the words. Moreover, several well-established relationships between the variables found consistently in other languages were also observed in this study, demonstrating that the ratings are valid. The resulting word database can be used in studies where control for the above psycholinguistic variables is critical to the research design.

OPEN ACCESS

Citation: Yee LTS (2017) Valence, arousal, familiarity, concreteness, and imageability ratings for 292 two-character Chinese nouns in Cantonese speakers in Hong Kong. PLoS ONE 12(3): e0174569. https://doi.org/10.1371/journal.pone.0174569

Editor: Christos Papadelis, Boston Children’s Hospital / Harvard Medical School, UNITED STATES

Received: January 22, 2017

Accepted: March 12, 2017

Published: March 27, 2017

Copyright: © 2017 Lydia T. S. Yee. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

Data Availability Statement: All relevant data are within the paper and its Supporting Information files.

Funding: This work was supported by grants awarded to L.T.S.Y. by the Education University of Hong Kong. The grants were: RG64/2014-2015R, DRG/2014-15, and DRG/2015-16. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
PLOS ONE | https://doi.org/10.1371/journal.pone.0174569 March 27, 2017

Competing interests: The author has declared that no competing interests exist.

Introduction

Words are characterized by a number of affective and lexico-semantic variables. Examples of affective variables include valence and arousal, which refer to the extent of pleasant or unpleasant feeling, and the amount of physiological response, that a word evokes respectively [1]. Valence and arousal can be rated subjectively, and their normative values are derived from rating studies [2]. On the other hand, lexico-semantic variables can be objective or subjective. An example of an objective property is word frequency, which can be assessed by how frequently a word appears in print [3,4], or more recently, in television and film subtitles [5]. Commonly studied subjective lexico-semantic variables include concreteness, imageability, and familiarity. Concreteness is the extent to which a word is tangible [6]. Imageability refers to the ease of generating a mental image for a word [6]. Familiarity refers to how frequently a word is seen, heard, or used in everyday life [7]. Like valence and arousal, values for these subjective lexico-semantic variables are obtained from rating studies and used to create population norms.

Words are commonly used as experimental stimuli in cognitive psychology studies. For example, in memory research, words are frequently used to study recollection and familiarity, processes that support recognition judgments [8,9]. While affective and lexico-semantic psycholinguistic properties of words are active areas of research on their own, they are nuisance variables that must be carefully controlled for in memory studies. This is because in memory studies, the research focus is not on the properties of the words but on memory processing.
More importantly, psycholinguistic properties are known to influence memory processes. For example, research has shown that high frequency words are better recalled than low frequency words [10], while low frequency words are better recognized [11,12]. Concrete words [11], words that can easily form an image in the mind [13], and words that are high in valence or arousal [14], are also found to be better remembered. Therefore, these variables must be controlled for when words are used as experimental stimuli. This is accomplished by setting ranges for psycholinguistic variables when words are selected from databases, e.g., limiting the selection to neutral words rated between 3.5 and 4.5 on a 7-point scale, and also counterbalancing the words’ appearance across experimental conditions and/or across participants. In the methods section of research articles, it is common practice to report word length, number of syllables, and word frequency of the word stimuli selected, and how the stimuli are counterbalanced [15–18]. Although less commonly reported, affective and lexico-semantic psycholinguistic properties should be listed in the methods sections as well, since they too might influence the results, as illustrated above.
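The two controls described above (restricting stimuli to a rating range and counterbalancing their assignment) can be sketched in a few lines of Python. This is a minimal illustration, not a procedure taken from any of the cited studies: the `Word` record, the example words and their valence values, and the rotation scheme in `counterbalance` are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Word:
    text: str
    valence: float  # mean rating on a 7-point scale (made-up values)


def select_neutral(words, low=3.5, high=4.5):
    """Keep words whose mean valence lies within [low, high]."""
    return [w for w in words if low <= w.valence <= high]


def counterbalance(words, n_conditions, participant_index):
    """Assign words to conditions by rotation (one assumed scheme).

    Shifting the assignment by the participant index means that, across
    n_conditions participants, every word appears once in every condition.
    """
    assignment = {c: [] for c in range(n_conditions)}
    for i, w in enumerate(words):
        assignment[(i + participant_index) % n_conditions].append(w.text)
    return assignment


# Hypothetical word pool with made-up valence ratings.
pool = [
    Word("table", 4.0), Word("party", 6.2), Word("chair", 3.9),
    Word("funeral", 1.4), Word("window", 4.3), Word("paper", 4.1),
]

neutral = select_neutral(pool)
p0 = counterbalance(neutral, n_conditions=2, participant_index=0)
p1 = counterbalance(neutral, n_conditions=2, participant_index=1)
print(p0)
print(p1)
```

With two conditions, the words a first participant sees in condition 0 are the words a second participant sees in condition 1, so item-specific effects are spread evenly over conditions.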
To help researchers generate suitable stimuli for experiments, it is therefore critical to have reliable and comprehensive psycholinguistic databases. Ratings for a wide range of psycholinguistic variables are available for English words and words in other European languages. Typically, ratings for affective and lexico-semantic variables are collected in separate studies, as they are often considered as different research interests. In English, a comprehensive database for affective variables is Bradley & Lang [2], which was recently vastly expanded [19]. The Bradley & Lang database has been used to build similar affective databases for other languages including Spanish [20], Italian [21], European Portuguese [22], and German [23]. Lexico-semantic databases are also available for many languages, including English [24–26], Italian [27], Spanish [28], and French [29]. Recently, there is increased interest in collecting both affective and lexico-semantic ratings within the same study, e.g., [23,30–34], because this would permit relationships between affective and lexico-semantic variables to be explored.

By comparison, there are few studies on affective or lexico-semantic characteristics of Chinese words. Although lexico-semantic ratings for single characters exist [35], to the author’s knowledge, no extensive work has been done for two-character words. Ratings for single characters are often uninformative on ratings for two-character words. This is because in Chinese, when two characters combine to form a word, the word often conveys a different meaning than either of its constituent characters. For example, the character “電” means “electricity”, while the character “腦” means “brain”. However, when the two characters combine to form the word “電腦”, it means “computer”. Therefore, ratings for two-character words cannot be simply derived from ratings of their constituent characters and have to be collected separately. There are only a few studies on psycholinguistic properties of two-character words, and no study has considered both affective and lexico-semantic variables, both objective and
subjective ones, within the same study. While Ho et al. [36] and Wang et al. [37] provided affective ratings for 160 and 1500 two-character words respectively, lexico-semantic variables were not collected in these studies. More recently, Yao et al. [38] collected both affective and lexico-semantic ratings for 1100 words, but did not provide objective word frequency information for the words. As illustrated above, word frequency information is also crucial for designing psychology experiments. Although theoretically one can retrieve word frequency information from the databases from which Yao et al. [38] selected their words, namely the Chinese Affective Words System [37] and the Modern Chinese Dictionary of Commonly Used Words, these databases are not readily accessible online, thus hindering easy access to word frequency and subjective affective and lexico-semantic ratings simultaneously.

The aim of the current study was to fill this gap by establishing a database for two-character Chinese words that can be used in cognitive psychology experiments. Ratings for 292 nouns on valence, arousal, familiarity, concreteness, and imageability were collected from Cantonese speakers in Hong Kong (Cantonese is the variety of Chinese spoken in Hong Kong). These five variables were chosen because they are known to influence memory, as reviewed above. Words were selected from a word frequency database [39], which permitted word frequency information to be extracted and listed in the database created. Inter-rater reliability was calculated in order to provide a measure of quality for the ratings. Also, relationships between the various psycholinguistic variables were explored. It was expected that relationships that are well-established in other languages would be found (e.g., the strong positive correlation between concreteness and imageability [6,32,40]), which would serve as further evidence for the validity of the ratings.
Methods

Participants

A total of 164 young adults, recruited from the undergraduate student body of the Education University of Hong Kong via convenience sampling, participated in the study. All participants reported themselves to be native speakers of Cantonese (the Chinese dialect spoken in Hong Kong), use Traditional Chinese characters (the more widely used format of written Chinese in Hong Kong, compared to Simplified Chinese characters) for the majority of their day-to-day text communications including reading and writing, and have normal or corrected-to-normal vision. They either received extra course credit or HKD$20 for their participation. The study procedures were approved by the Human Research Ethics Committee of the Education University of Hong Kong. Written informed consent was obtained from participants before the study began.

Materials

The word pool was generated from the word and character database of [39]. This database contains Chinese single characters and multi-character words along with frequency information, which is established from a corpus of film and television subtitles that contained 46.8 million characters and 33.5 million words. Using film and movie subtitles to assess word frequency is demonstrated to be a valid method of obtaining word frequency information [5,41], and has been used to assess word frequency for several other languages as well [42–45]. 293 two-character nouns were selected from the low-/medium-frequency section of the database, with a log frequency that ranged between 1.81 (1.91 words per million) and 2.71 (15.23 words per million). Word selection was restricted to the low to medium frequency range, because words in this frequency range were needed for a subsequent experiment reported elsewhere. Specific names (e.g., place names or person names), items that were not composed of two characters, or items that were in other parts of speech (e.g., connectives or verbs), were excluded.
The final word pool had a mean log frequency of 2.29 (9.87 words per million) and standard deviation of 0.20 (1.22 words per million). Since the words in the Cai & Brysbaert [39] database were written in Simplified Chinese characters, conversion to Traditional Chinese characters was made by MS Word, and then manually checked for accuracy. The number of strokes of each character of the two-character word was counted by an online tool (https://name.longwin.com.tw/nos.php).

The 293 words were divided into five lists. The first list contained 65 words and the other four lists contained 57 words each. Each list would go on to be rated by a separate group of participants (see the Procedure section below). Five lists were set up because the word pool was large. If participants were to rate all words, long rating sessions would be needed, which would likely result in fatigue and non-compliance in task performance (e.g., making the same response for all ratings). For each list, five words were chosen at random to repeat within the list, so as to obtain a measure of internal rating consistency. Next, from the 65 unique words in the first list, eight words were chosen at random to be repeated in the other four lists, in order to establish inter-rater reliability between groups of participants. As a result, each list contained 70 words in total, unique and repeated words combined. Apart from the words that were planned to repeat within or between the five lists, one word was repeated in another list by mistake. This repeated word was removed from analyses, resulting in ratings for 292 unique words in total.

Procedure

Each participant received one of the five lists, and rated each of the 70 words in the list along five dimensions: valence, arousal, familiarity, concreteness, and imageability. Participants rated all 70 words in one dimension first, before moving on to the next dimension, until ratings for all five dimensions were completed. This design was adopted to minimize the possibility that ratings of different dimensions would influence each other. The order of dimensions was counterbalanced between participants. Within participants, for each dimension, the order of word presentation was randomized. Participants were instructed to rate at their own pace.
They were also told that there were no correct or incorrect answers, and that they should respond with their first impression. Each rating session lasted for about 25 minutes, including the time required for giving instructions.

Participants sat about 60 cm away from a computer monitor and made responses using a keyboard. The study was programmed in E-prime. Words were presented in the font PMingLiU (新細明體) with a font size of 24. At the beginning of the study, participants were given a general overview of the aim of the study. Next, the definitions of the five dimensions were explained. For each dimension, two words that had received a low and a high rating in previous rating studies were provided as examples. They also served as anchors for the rating process. These example words were not part of the word list to be rated. Instructions were presented in a self-paced manner, and participants were encouraged to ask clarification questions. Before the rating of each dimension began, participants were again reminded of its definition and the two anchor examples.

Measures

A 5-point Likert scale was used for the ratings of all dimensions. A horizontal bar with five evenly spaced response options remained on the screen throughout the rating process to remind participants about the direction of the scale. The study was performed in small groups of one to four participants, depending on the sign-up rate. Participants were instructed to press a 6th key if they did not know the meaning of a word, and those words were excluded from analysis for that particular participant.

For the valence (愉悅度) dimension, instructions were adapted from [2]. Participants were asked to judge the degree to which they find the words to be pleasant, on a 5-point scale with 1 being unpleasant (不悦) and 5 being pleasant (愉快). The examples provided were the following: most people find the word “rejection” (拒絕) to be unpleasant, so they will choose 1 or 2. By contrast, most people find the word “romance” (浪漫) to be pleasant, so they will choose 4 or 5.
For the arousal (激烈度) dimension, instructions were adapted from [2]. Participants were asked to judge the degree to which they find the words to be arousing, on a 5-point scale with 1 being calm (平靜) and 5 being excited (激動). The examples provided were the following: most people find the word “leisure” (悠閒) to be calm, so they will choose 1 or 2. By contrast, most people find the word “nightmare” (惡夢) to be arousing, so they will choose 4 or 5.

For both the valence and the arousal dimensions, besides the horizontal bar that represented a 5-point scale, we adopted the rating scales of the Self-Assessment Manikin (SAM) [2,46,47] to facilitate rating. SAM is a scale made up of pictorial depictions of different levels of valence and arousal. The original 9-point scale was adapted to a 5-point scale, using the 1st, 3rd, 5th, 7th, and 9th figure to represent the five levels. Increasing levels of valence were represented by a man changing from frowning to smiling, while increasing levels of arousal were represented by a man changing from sleepy to being widely awake, with an increasingly large “explosion” inside his body. Also, the language used in [2] was adopted: for the valence scale, participants were asked to choose 1 when they were “completely unhappy, annoyed, unsatisfied, melancholic, despaired, or bored” (p.2) and to choose 5 when they were “happy, pleased, satisfied, contented, hopeful” (p.2). If they felt “completely neutral, neither happy nor sad” (p.2), they should choose 3. For the arousal scale, participants were asked to choose 1 when they were “completely relaxed, calm, sluggish, dull, sleepy, or unaroused” (p.2) and to choose 5 when they were “stimulated, excited, frenzied, jittery, wide-awake, or aroused” (p.2). If they were “not excited nor at all calm” (p.2), they should choose 3.
For the familiarity (熟悉度) dimension, instructions were adapted from [7,26,40]. Participants were asked to judge the degree to which they find a word to be familiar, in terms of how often they encounter a word in everyday life, on a 5-point scale with 1 being unfamiliar (陌生) and 5 being familiar (熟悉). The examples provided were the following: most people encounter the word “Yucca” (絲蘭, a kind of plant) infrequently and find it less familiar, so they will choose 1 or 2. By contrast, most people encounter the word “newspaper” (報紙) frequently and find it familiar, so they will choose 4 or 5.

For the concreteness (具體性) dimension, instructions were adapted from [6]. Participants were asked to judge the degree to which they found the words to be concrete, on a 5-point scale with 1 being abstract (抽象) and 5 being concrete (具體). The examples provided were the following: most people find the word “how” (如何) to be abstract, so they will choose 1 or 2. By contrast, most people find the word “banana” (香蕉) to be concrete, so they will choose 4 or 5.

For the imageability (想象度) dimension, instructions were adapted from [6,48]. Participants were instructed to judge the ease with which they can form an image of the word in their mind, on a 5-point scale with 1 being difficult to form an image in the mind (難以在腦海裡形成影像), and 5 being easy to form an image in the mind (容易在腦海裡形成影像). The examples provided were the following: most people find it difficult to come up with a mental image for the word “taxation rate” (稅額), so they will choose 1 or 2. By contrast, most people find it easy to come up with a mental image for the word “beef” (牛肉), so they will choose 4 or 5.
Supplementary material

The resulting ratings are listed in the Supporting Information section. In an Excel spreadsheet, the 292 words are sorted in descending order by their log frequency. English translations (performed online using Google Translate, Bing Translate, PROMT (Online-Translator.com), and Language Weaver (www.reverso.net)) of the words are provided for reference. The five variables are ordered from left to right in the following manner: valence, arousal, familiarity, concreteness, and imageability. For each word and for each rating dimension, the mean (M) rating for the word and its standard deviation (SD) are listed. For each word, the number of strokes of each character, as well as of the word as a whole, is also listed. This property might be useful to researchers who are interested in the visual complexity of Chinese words.

Word frequency for each word is expressed in words per million (wpm), log wpm, and the Zipf value. wpm and log wpm are widely used frequency measures in psycholinguistic studies. Like log wpm, the Zipf scale is a logarithmic scale. It is derived using the formula log10 of the frequency per million words + 3, which can be conceptualized as log10 of the frequency per billion words. The Zipf value is provided alongside the widely used wpm and log wpm measures because the Zipf scale is more intuitive [44]: it typically varies between 1 and 7 like a Likert scale, and low frequency words that have a negative value after the log10 frequency per million transform (low frequency words may have less than one occurrence per million words, which results in negative log values) have a more intuitive value of 1 or 2 after a log10 frequency per billion transform.
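The relationship between the three frequency measures can be written out directly. The sketch below follows the formula stated above (Zipf = log10 of frequency per million words + 3); the function names are illustrative and the input values are arbitrary examples, not entries from the database.

```python
import math


def log_wpm(wpm):
    """log10 of the frequency per million words; negative when wpm < 1."""
    return math.log10(wpm)


def zipf(wpm):
    """Zipf value: log10(frequency per million words) + 3.

    Equivalent to log10 of the frequency per billion words,
    because 1 wpm corresponds to 1000 occurrences per billion words.
    """
    return math.log10(wpm) + 3.0


# Arbitrary example frequencies in words per million.
for wpm in (0.05, 1.0, 15.0):
    print(f"wpm={wpm:>6}: log wpm={log_wpm(wpm):+.2f}, Zipf={zipf(wpm):.2f}")
```

A word that occurs less than once per million words gets a negative log wpm but still a positive Zipf value, which is the convenience the scale is meant to provide.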
Results and discussion

Data trimming

Before the ratings were analyzed, the raw data was examined for outliers in four steps. In the first step, for each participant, the proportion of responses that had a reaction time of 300 ms or below was calculated. Six participants were found to have made more than 15% of their responses in less than 300 ms. They were removed from further analyses, as the high proportion of aberrantly quick responses suggested that these participants did not perform the task properly. A 300 ms response is considered as aberrantly quick, as per standard practice in previous studies [49,50]. Also, as in these two previous studies, we did not set an upper limit for reaction time because speed was not emphasized in the instructions.

In the second step, participants who reported an aberrantly large number of words that they did not know were excluded from further analysis. Considering that the majority (83%) of participants did not have any such responses, those who reported that they did not know more than 10 words out of a total of 70 were considered as aberrant. Eight participants were excluded by this criterion. Three additional participants were excluded because they had more than 10 such responses for all rating dimensions combined. Although within a single dimension, the number of words that they reported that they did not know was small, those words were not consistent across dimensions, casting doubts on these participants’ compliance with task instructions.

After the first two steps, 147 participants remained. Of these participants, only 17 reported a word/words that they did not know. In these 17 participants, they reported an average of 0.82 ± 0.58 words (mean ± S.D.) per rating dimension that they did not know. This number is considered reasonable. Next, for these remaining participants, trials under 300 ms were excluded from analysis. This led to a removal of 523 ratings out of a total of 51,450 ratings across all dimensions across all participants.
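The participant-level and trial-level exclusions described so far could be implemented along the following lines. This is a simplified sketch under stated assumptions, not the author's analysis script: the trial records, the rating code 6 for "word not known", and the use of a single combined count of unknown words per participant are all assumptions made for illustration. The text gives the threshold both as "300 ms or below" and "under 300 ms"; the sketch uses "300 ms or below" throughout.

```python
from collections import defaultdict

FAST_RT_MS = 300            # responses at or below this are treated as too quick
MAX_FAST_PROPORTION = 0.15  # participants above this share of quick responses are dropped
MAX_UNKNOWN = 10            # participants reporting more unknown words are dropped
UNKNOWN = 6                 # assumed code for the "do not know this word" key


def trim(trials):
    """Apply participant- and trial-level exclusions.

    trials: list of dicts with keys "participant", "rating", "rt_ms".
    Returns (kept_trials, excluded_participants).
    """
    by_participant = defaultdict(list)
    for t in trials:
        by_participant[t["participant"]].append(t)

    kept, excluded = [], set()
    for pid, rows in by_participant.items():
        n_fast = sum(r["rt_ms"] <= FAST_RT_MS for r in rows)
        n_unknown = sum(r["rating"] == UNKNOWN for r in rows)
        if n_fast / len(rows) > MAX_FAST_PROPORTION or n_unknown > MAX_UNKNOWN:
            excluded.add(pid)
            continue
        # For retained participants, drop quick trials and unknown-word trials.
        kept.extend(
            r for r in rows
            if r["rt_ms"] > FAST_RT_MS and r["rating"] != UNKNOWN
        )
    return kept, excluded


def make(pid, rts, ratings):
    """Build hypothetical trial records for one participant."""
    return [{"participant": pid, "rating": r, "rt_ms": rt}
            for rt, r in zip(rts, ratings)]


trials = (
    make("A", [250] + [900] * 9, [3] * 10)             # 10% quick: kept
    + make("B", [200, 250] + [900] * 8, [3] * 10)      # 20% quick: excluded
    + make("C", [900] * 12, [UNKNOWN] * 11 + [4])      # 11 unknown words: excluded
)
kept, excluded = trim(trials)
print(sorted(excluded), len(kept))
```

Participant A is retained but loses the one quick trial, while B and C are removed entirely.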
In the third step, the mean and standard deviation of the ratings were calculated for each word. Responses that were outside the range of ±2.5 S.D. were removed from further analysis, resulting in a further removal of 585 ratings across all dimensions across all participants.

In the last step, internal reliability was calculated for each subject, using the five words that were rated twice within participants. As in [48], participants with consistency that was lower than 0.2, as indicated by the Pearson correlation coefficient, were removed from further analysis. Four participants were removed by this criterion. The mean and standard deviation of consistency in the remaining 143 participants was 0.74 ± 0.18, demonstrating good consistency.

Therefore, the final dataset consisted of ratings from 143 participants (87% of participants tested). Each word received at least 22 and at most 33 valid ratings. For valence, arousal, familiarity, concreteness, and imageability, each word received 27.8 ± 3.10, 28.3 ± 2.99, 27.3 ± 2.74, 28.2 ± 2.97, and 28.4 ± 3.02 (mean ± S.D.) valid ratings respectively.

Reliability

First, rating consistency within each sample of participants (25 or more participants per sample, who rated the same word list) was assessed. Adopting the same approach as [32], the inter-rater reliability of the ratings was assessed by calculating the intra-class correlation coefficients (ICCs). The ICCs were calculated for the five dimensions separately. For each dimension, an ICC was obtained for each sample of participants. Then, a mean ICC for each dimension was obtained by averaging the ICCs of the five samples of participants. The mean and standard deviation of the ICCs for each dimension are: valence: 0.95 ± 0.02; arousal: 0.86 ± 0.04; familiarity: 0.77 ± 0.08; concreteness: 0.84 ± 0.04; and imageability: 0.81 ± 0.07. Overall, the ICCs were high for all five dimensions, demonstrating high inter-rater reliability in our samples. In order to provide a standardized measure of variability, the coefficient of variation (CV) was also computed for each rating dimension. The CVs were 3%, 5%, 10%, 5%, and 8% for valence, arousal, familiarity, concreteness, and imageability respectively. Therefore, both the unstandardized and standardized measures of variability indicate that valence was the most consistently rated dimension, while familiarity was the least consistently rated, although still reaching a satisfactory level. The finding that valence has a more consistent rating than arousal is consistent with previous studies for many languages [20,22,32,51,52].

Next, rating consistency among the five different samples of participants was assessed by calculating the rating consistency of the eight words that repeated across the different samples of participants. An ICC was calculated for each dimension. The ICCs were 0.99, 0.98, 0.82, 0.98, and 0.98 for valence, arousal, familiarity, concreteness, and imageability respectively, demonstrating highly consistent ratings for the five samples of participants.

Relationship between variables

Pearson correlations were calculated for all possible pair-wise combinations of variables. There are several notable relationships (Table 1). First, concreteness had a strong positive correlation with imageability (r = .88, p < 0.001). The same strong positive correlation has been found in English [6,40], Spanish [32], French [30], European Portuguese [53], and very recently, Chinese [38]. The convergence between several languages indicates that, across these different languages, in general it is easier to form a mental image for concrete words than for abstract words. This finding is consistent with the dual-coding theory [54,55], which postulates that two systems, one verbal-based and the other imagery-based, are involved in the representation of the semantics of a stimulus. According to this theory, concrete words are processed faster than abstract words, a phenomenon known as the concreteness effect [55], because while abstract words are coded by the verbal-based system only, concrete words can be coded by both the verbal- and the imagery-based systems. The additional imagery-based coding provides an additional form of representation to facilitate processing.

Table 1. Correlation between variables.

| | Valence | Arousal | Familiarity | Concreteness | Imageability | Word frequency (log wpm) | Number of strokes |
|---|---|---|---|---|---|---|---|
| Valence | – | -.20** | .38*** | -.12* | -.01 | .03 | -.00 |
| Arousal | | – | -.11 | -.02 | .02 | -.01 | .10 |
| Familiarity | | | – | .34*** | .41*** | .16** | .07 |
| Concreteness | | | | – | .88*** | .21*** | .02 |
| Imageability | | | | | – | .09 | -.00 |
| Word frequency (log wpm) | | | | | | – | -.13* |
| Number of strokes | | | | | | | – |

Significance: * p < 0.05, ** p < 0.01, *** p < 0.001. Pearson correlation coefficients for all pair-wise combinations of valence, arousal, familiarity, concreteness, imageability, word frequency (measured in log words per million (wpm)), and number of strokes (of both characters of the word summed together).

https://doi.org/10.1371/journal.pone.0174569.t001

It is important to highlight that, although concreteness and imageability are highly correlated, they are not identical constructs, as several authors advocated in recent studies [53,56–58]. Guasch [32] found that while words tend to co-vary in these two dimensions, there are also words that are high in concreteness but low in imageability (e.g., tuberculosis), and words that are low in concreteness but high in imageability (e.g., issue). Here in this study, it was found that concreteness and imageability each had overlapping but different patterns of correlations with other affective and lexico-semantic variables. Specifically, although both concreteness and imageability correlate with familiarity, only concreteness is found to further correlate with valence and word frequency, further revealing that concreteness and imageability are overlapping but not synonymous constructs. Their overlap is likely because one of the decision criteria for concreteness is whether a word describes an object that exists in real life, which naturally corresponds to whether the word can arouse visual imagery. This criterion therefore overlaps with the decision criterion for imageability, and explains their shared variance. On
the other hand, their non-overlapping variances invite further research into what contributes to the decision criteria when people make these rating judgments. For example, Connell & Lynott [56] proposed that whether a word represents a typical exemplar of a concrete category (objects, materials, etc.) also contributes to its concreteness rating.

Both concreteness and imageability correlate positively with familiarity. The same correlations were found in previous studies: for the moderate positive correlation between concreteness and familiarity (r = .34, p < 0.001), a similar correlation was found in [38]; for the moderate positive correlation between imageability and familiarity (r = .41, p < 0.001), similar-sized correlations have been obtained in [26,31,32,38]. The implication of these correlations is that more concrete and more imageable words tend to be perceived as more familiar. Besides subjective familiarity, concreteness also had a moderate positive correlation with objective word frequency (r = .21, p < 0.001). Previous studies have also identified a positive correlation between concreteness and printed frequency [6,59]. Together, they illustrate that more concrete words not only are perceived to be more familiar, but they also objectively occur more frequently in the language.

Concreteness was found to correlate with valence negatively (r = -.12, p < 0.05). This effect was also observed in [60], and the size of the correlation is highly similar (r = -.11 in [60]). It means that words that are subjectively perceived to be more pleasant are also perceived to be more abstract. This finding highlights the importance of taking concreteness into consideration when investigating the processing of emotional words, and vice versa. Indeed, it has been shown that concreteness influences the processing of emotional words [58,61,62]. Broadly speaking, this finding demonstrated that affective variables correlate with lexical variables, and highlights the importance of investigating the relationship between the two in future studies.
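For readers who want to reproduce this kind of pair-wise analysis on the published word-level means, a minimal sketch of the Pearson computation is given here. It uses only the Python standard library and does not compute p-values; the five rows of ratings are made-up numbers chosen for illustration, not values from the database.

```python
import math
from itertools import combinations


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def pairwise_correlations(columns):
    """Correlate every pair of named columns (upper triangle only)."""
    return {
        (a, b): pearson_r(columns[a], columns[b])
        for a, b in combinations(columns, 2)
    }


# Made-up mean ratings for five hypothetical words.
ratings = {
    "concreteness": [4.6, 2.1, 3.8, 1.9, 4.2],
    "imageability": [4.5, 2.4, 3.9, 2.0, 4.4],
    "valence": [3.0, 3.6, 2.8, 3.9, 3.1],
}

table = pairwise_correlations(ratings)
for (a, b), r in table.items():
    print(f"{a} x {b}: r = {r:+.2f}")
```

Only the upper triangle is computed, which mirrors the layout of the correlation table in the paper.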
Familiarity correlated positively with valence (r = .38), consistent with [31], where they further demonstrated that the correlation was driven by a response bias, in that participants were more willing to say that they were familiar with positive words. Familiarity also correlated positively with word frequency (r = .16), consistent with several previous studies [24,26,31], although the size of the correlation is smaller than in these studies, which had r’s of .35 or above. The small correlation is likely due to the restricted frequency range of the words included in this study, making it difficult to find a large correlation. In addition, subjective familiarity for low frequency words is likely influenced by word prevalence, a measure of the number of people in the population who know the word [63].

Lastly, we examined whether there is a quadratic relationship between valence and arousal, as has been widely reported. Bradley & Lang [2] observed a quadratic relationship between valence and arousal, where highly positive and negative words had higher ratings on arousal. Since then, the same effect has been observed many times in rating studies in different languages, including Finnish [51], Spanish [20,32,64,65], German [34,66], European Portuguese [22], and Chinese [36]. A step-wise regression analysis with a linear and a quadratic term was performed, with valence and its square as the independent variables and arousal as the dependent variable. Although a linear model was significant (R = 0.20, F(1,290) = 11.61, p < 0.005), it only accounted for 3.8% of the variance. By contrast, the quadratic model explained an additional 42.4% of the variance (R = 0.68, F(2,289) = 124.20, p < 0.001), hence demonstrating a better fit for the quadratic model. Fig 1 shows this typical U-shaped function between valence and arousal.
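The comparison between the linear and the quadratic model amounts to fitting arousal on valence with and without a squared term and comparing the variance explained. The sketch below illustrates that comparison on synthetic U-shaped data; it is not the author's analysis and the data are invented. To stay within the standard library, the least-squares fit is obtained by solving the normal equations with a small Gauss-Jordan routine, which is adequate for a three-parameter model but is not a general-purpose regression tool.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination with pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def r_squared(x, y, degree):
    """R^2 of a least-squares polynomial fit of y on x (with intercept)."""
    design = [[xi ** p for p in range(degree + 1)] for xi in x]
    k = degree + 1
    xtx = [[sum(row[i] * row[j] for row in design) for j in range(k)] for i in range(k)]
    xty = [sum(row[i] * yi for row, yi in zip(design, y)) for i in range(k)]
    beta = solve(xtx, xty)
    fitted = [sum(b * v for b, v in zip(beta, row)) for row in design]
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1 - ss_res / ss_tot


# Synthetic data: arousal is lowest for neutral valence (3) and rises at both ends.
valence = [1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0, 4.5, 5.0]
noise = [0.05, -0.03, 0.02, -0.04, 0.01, 0.03, -0.02, 0.04, -0.05]
arousal = [2 + 0.5 * (v - 3) ** 2 + e for v, e in zip(valence, noise)]

r2_linear = r_squared(valence, arousal, degree=1)
r2_quadratic = r_squared(valence, arousal, degree=2)
print(f"linear R^2 = {r2_linear:.3f}")
print(f"quadratic R^2 = {r2_quadratic:.3f}")
print(f"additional variance explained = {r2_quadratic - r2_linear:.3f}")
```

On a symmetric U-shaped pattern the linear term explains almost nothing, and nearly all of the explained variance comes from adding the squared term, which is the qualitative pattern reported above.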
It is of particular interest to compare the current study with Yao et al. [38], since both studies acquired ratings on valence, arousal, familiarity, concreteness, and imageability for Chinese words, and only differed in two aspects: Yao et al. [38] did not provide word frequency information, while the current study did not collect context availability ratings. Words in the two studies only overlapped to a small extent: among the 292 words rated in the current study, only 26 were rated by Yao et al. Despite the small overlap, the highly statistically significant relationships between variables found in the current study (p’s < 0.01) were all found in Yao et al. [38], including the quadratic relationship between valence and arousal, and the correlations between concreteness and familiarity, between concreteness and imageability, between familiarity and imageability, and between familiarity and valence, indicating that these relationships are likely stable for Chinese words in general.

Fig 1. Distribution of the mean ratings (provided by at least 22 participants) for the 292 words in the valence and arousal dimensions.
https://doi.org/10.1371/journal.pone.0174569.g001

Potential uses of the current database

As mentioned in the Introduction, the motivation for conducting this study was to create a database that can be used to generate sets of Chinese words that are matched on various psycholinguistic variables. To show that the database can be used for this purpose, analyses were conducted in which random samples of words were drawn from the database, split into two halves, and tested to see if the two halves were indeed matched on the various psycholinguistic variables. This analysis approach mimicked the typical way the database would be used to generate stimuli for experiments. Seven separate analyses were conducted, using varying sample sizes of 120, 140, 160, 180, 200, 220, and 240 words. These sample sizes were chosen because they are representative of the number of stimuli typically required for small-scale memory experiments [e.g., Experiment 1 in Hennessee et al. [67] presented 180 words in total; Hoppstädter et al. [68,69] presented 200 words and 100 words in their studies respectively; Ozubko et al. [70] presented 120 words in total]. Each sample was split into two sets and compared against each other (instead of splitting into three or more sets) because admittedly, the database has only 292 words and is unlikely to have enough stimuli for more complex experimental designs. The small number of words in the database is currently its major limitation. Now, a sample size of 120 words will be used as an example to illustrate how the analyses were