
Human Perception of Visual Information: Psychological and Computational Perspectives PDF

297 Pages·2021·6.909 MB·English


Bogdan Ionescu · Wilma A. Bainbridge · Naila Murray (Editors)

Human Perception of Visual Information: Psychological and Computational Perspectives

Editors:
Bogdan Ionescu, Politehnica University of Bucharest, Bucharest, Romania
Wilma A. Bainbridge, University of Chicago, Chicago, IL, USA
Naila Murray, NAVER Labs Europe, Meylan, France

ISBN 978-3-030-81464-9
ISBN 978-3-030-81465-6 (eBook)
https://doi.org/10.1007/978-3-030-81465-6

© The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Switzerland AG 2022

This work is subject to copyright. All rights are solely and exclusively licensed by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed.

The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use.

The publisher, the authors, and the editors are safe to assume that the advice and information in this book are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors give a warranty, expressed or implied, with respect to the material contained herein or for any errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

This Springer imprint is published by the registered company Springer Nature Switzerland AG. The registered company address is: Gewerbestrasse 11, 6330 Cham, Switzerland.

Preface

There is one thing the photograph must contain, the humanity of the moment.
—Robert Frank

Computational models of objective visual properties such as semantic content and geometric relationships have made significant breakthroughs using the latest achievements in machine learning and large-scale data collection. There has also been limited but important work exploiting these breakthroughs to improve computational modelling of subjective visual properties such as interestingness, affective values and emotions, aesthetic values, memorability, novelty, complexity, visual composition and stylistic attributes, and creativity. Researchers who apply machine learning to model these subjective properties are often motivated by the wide range of potential applications of such models, including for content retrieval and search, storytelling, targeted advertising, education and learning, and content filtering. The performance of such machine learning-based models leaves significant room for improvement and indicates a need for fundamental breakthroughs in our approach to understanding such highly complex phenomena.

Largely in parallel to these efforts in the machine learning community, recent years have witnessed important advancements in our understanding of the psychological underpinnings of these same subjective properties of visual stimuli. Early focuses in the vision sciences were on the processing of simple visual features like orientations, eccentricities, and edges. However, utilizing new neuroimaging techniques such as functional magnetic resonance imaging, breakthroughs through the 1990s and 2000s uncovered specialized processing in the brain for high-level visual information, such as image categories (e.g., faces, scenes, tools, objects) and more complex image properties (e.g., real-world object size, emotions, aesthetics).
Recent work in the last decade has leveraged machine learning techniques to allow researchers to probe the specific content of visual representations in the brain. In parallel, the widespread advent of the Internet has allowed for large-scale crowd-sourced experiments, allowing psychologists to go beyond small samples with limited, controlled stimulus sets to study images at a large scale. With the combination of these advancements, psychology is now able to take a fresh look at age-old questions like what we find interesting, what we find beautiful, what drives our emotions, how we perceive spaces, or what we remember.

The field of machine learning, and Artificial Intelligence more broadly, enjoys a long tradition of seeking inspiration from investigations into the psychology and neuroscience of human and non-human intelligence. For example, deep learning neural networks in Computer Vision were originally inspired by the architecture of the human visual system, with its many layers of neurons thought to apply filters at each stage. Psychology and neuroscience also rely heavily on developments from Artificial Intelligence, both for parsing down the Big Data collected from the brain and behavior, as well as for understanding the underlying mechanisms. For example, object classification deep neural networks such as VGG-16 are now frequently used as stand-ins for the human visual system to predict behavior or even activity in the brain. Given the progress made in machine learning and psychology towards more successfully modelling subjective visual properties, we believe that the time is ripe to explore how these advances can be mutually enriching and lead to further progress.

To that end, this book showcases complementary perspectives from psychology and machine learning on high-level perception of images and videos. It is an interdisciplinary volume that brings together experts from psychology and machine learning in an attempt to bring these two, at first glance, different fields into conversation, while at the same time providing an overview of the state of the art in both fields.
The book contains 10 chapters arranged in 5 pairs, with each pair describing state-of-the-art psychological and computational approaches to describing and modelling a specific subjective perceptual phenomenon.

In Chap. 1, Lauer and Võ review recent studies that use diverse methodologies like psychophysics, eye tracking, and neurophysiology to help better capture human efficiency in real-world scene and object perception. The chapter focuses in particular on which contextual information humans take advantage of most and when. Further, they explore how these findings could be useful in advancing computer vision and how computer vision could mutually further understanding of human visual perception. In Chap. 2, Constantin et al. consider the related phenomenon of interestingness prediction from a computational point of view and present an overview of traditional fusion mechanisms, such as statistical fusion, weighted approaches, boosting, random forests, and randomized trees. They also include an investigation of a novel, deep learning-based system fusion method for enhancing performance of interestingness prediction systems.

In Chap. 3, Bradley et al. review recent research related to photographic images that depict affectively engaging events, with the goal of assessing the extent to which specific pictures reliably engage emotional reactions across individuals. In particular, they provide preliminary analyses that encourage future investigations aimed at constructing normative biological image databases that, in addition to evaluative reports, provide estimates of emotional reactions in the body and brain for use in studies of emotion and emotional dysfunction. On the computational side, in Chap. 4, Zhao et al. introduce image emotion analysis from a computational perspective with a focus on summarizing recent advances. They revisit key computational problems with emotion analysis and present in detail aspects such as emotion feature extraction, supervised classifier learning, and domain adaptation.
Their discussion concludes with the presentation of the relevant datasets for evaluation and the identification of open research directions.

In Chap. 5, Chamberlain sets out the history of empirical aesthetics in cognitive science and the state of the research field at present. The chapter outlines recent work on inter-observer agreement in aesthetic preference before presenting empirical work that argues the importance of objective (characteristics of stimuli) and subjective (characteristics of context) factors in shaping aesthetic preference. Valenzise et al. explore machine learning approaches to modelling computational image aesthetics in Chap. 6. They overview the several interpretations that aesthetics has received over time and introduce a taxonomy of aesthetics. They discuss computational advances in aesthetics prediction, from early methods to deep neural networks, and overview the most popular image datasets. Open challenges are identified and discussed, including dealing with the intrinsic subjectivity of aesthetic scores and providing explainable aesthetic predictions.

Bainbridge, in Chap. 7, draws from neuroimaging and other research to describe our current state-of-the-art understanding of memorability of visual information. Such research has revealed that the brain is sensitive to memorability both rapidly and automatically during late perception.
These strong consistencies in memory across people may reflect the broad organizational principles of our sensory environment and may reveal how the brain prioritizes information before encoding items into memory. In Chap. 8, Bylinskii et al. examine the notion of memorability with a computational lens, detailing the state-of-the-art algorithms that accurately predict image memorability relative to human behavioral data, using image features at different scales from raw pixels to semantic labels. Beyond prediction, they show how recent Artificial Intelligence approaches can be used to create and modify visual memorability, and preview the computational applications that memorability can power, from filtering visual streams to enhancing augmented reality interfaces.

In Chap. 9, Akcelik et al. review recent research that aims to quantify visual characteristics and design qualities of built environments, in order to relate more abstract aspects of an urban space to quantifiable design features. Uncovering these relationships may provide the opportunity to establish a causal relationship between design features and psychological feelings such as walkability, preference, visual complexity, and disorder. Lastly, in Chap. 10, Medina Ríos et al. review research that uses machine learning approaches to study how people perceive urban environments according to subjective dimensions like beauty and danger. Then, with a specific focus on Global South cities, they present a study on perception of urban scenes by people and machines. They use their findings from this study to discuss implications for the design of systems that use crowd-sourced subjective labels for machine learning and inference on urban environments.

We have edited this book to appeal to undergraduate and graduate students, academic and industrial researchers, and practitioners who are broadly interested in cognitive underpinnings of subjective visual experiences, as well as computational approaches to modelling and predicting them.
The authors of this book provide overviews of the current state of the art in their respective fields of study; therefore, chapters are largely accessible to researchers who may not be familiar with either prevailing computational, and particularly machine learning, practice, or with research practice in cognitive science. As such, we believe that researchers from both worlds will have much to learn from these chapters.

We are indebted to all the authors for their contributions, and hope that readers of this book will enjoy reading the fruits of their hard work as much as we have. Finally, we thank our editor, Springer, who gave us the opportunity to bring this project to life.

Bogdan Ionescu (Bucharest, Romania)
Wilma A. Bainbridge (Chicago, IL, USA)
Naila Murray (Meylan, France)

Contents

The Ingredients of Scenes that Affect Object Search and Perception
Tim Lauer and Melissa L.-H. Võ

Exploring Deep Fusion Ensembling for Automatic Visual Interestingness Prediction
Mihai Gabriel Constantin, Liviu-Daniel Ştefan, and Bogdan Ionescu

Affective Perception: The Power Is in the Picture
Margaret M. Bradley, Nicola Sambuco, and Peter J. Lang

Computational Emotion Analysis From Images: Recent Advances and Future Directions
Sicheng Zhao, Quanwei Huang, Youbao Tang, Xingxu Yao, Jufeng Yang, Guiguang Ding, and Björn W. Schuller

The Interplay of Objective and Subjective Factors in Empirical Aesthetics
Rebecca Chamberlain

Advances and Challenges in Computational Image Aesthetics
Giuseppe Valenzise, Chen Kang, and Frédéric Dufaux

Shared Memories Driven by the Intrinsic Memorability of Items
Wilma A. Bainbridge

Memorability: An Image-Computable Measure of Information Utility
Zoya Bylinskii, Lore Goetschalckx, Anelise Newman, and Aude Oliva

The Influence of Low- and Mid-Level Visual Features on the Perception of Streetscape Qualities
Gaby N. Akcelik, Kathryn E. Schertz, and Marc G. Berman

Who Sees What? Examining Urban Impressions in Global South Cities
Luis Emmanuel Medina Ríos, Salvador Ruiz-Correa, Darshan Santani, and Daniel Gatica-Perez

The Ingredients of Scenes that Affect Object Search and Perception

Tim Lauer and Melissa L.-H. Võ

1 Introduction

What determines where we attend and what we perceive in a visually rich environment? Since we typically cannot process everything that is in our field of view at once, certain information needs to be selected for further processing. Models of attentional control often distinguish two aspects: Bottom-up attention (sometimes referred to as "exogenous attention") focuses on stimulus characteristics that may stand out to us, while top-down (or "endogenous") attention focuses on goal-driven influences and knowledge of the observer (e.g., Henderson et al., 2009; Itti & Koch, 2001). In this chapter, we focus on top-down guidance of attention and object perception in scene context; particularly, on top-down guidance that is rooted in generic scene knowledge—or scene grammar as we will elaborate on later—and is abstracted away from specific encounters with a scene, but stored in long-term memory.
Suppose that you are looking for cutlery in a rented accommodation. You would probably search in the kitchen or in the living room but certainly not in the bathroom. Once in the kitchen, you would probably readily direct your attention to the cabinets—it would not be worthwhile to inspect the fridge or the oven. Despite having a specific goal, certain items may attract your attention, such as a bowl of fruits or colorful flowers on the kitchen counter. If you found forks, you might expect to find the knives close by. While viewing the kitchen, you would probably not have a hard time recognizing various kitchen utensils, even if they were visually small, occluded or otherwise difficult to identify. In this example, one benefits from context information, from prior experience with kitchens of all sorts. That is, in the real

T. Lauer · M. L.-H. Võ
Goethe University Frankfurt, Frankfurt am Main, Germany
e-mail: [email protected]

B. Ionescu et al. (eds.), Human Perception of Visual Information, https://doi.org/10.1007/978-3-030-81465-6_1
