
Hanspeter A. Mallot John S. Allen PDF

276 Pages·2006·7.69 MB·English


Preface

Our knowledge of the world surrounding us is mediated by our senses. In visual perception, we sense the light emanating from objects in the environment and infer from this light a wealth of information about the environment. We are able to "see" the depth and shape of objects, the color of surfaces, the segmentation of the scene into distinct objects, or the mood of a partner in a conversation. All this is based on images, i.e., two-dimensional distributions of intensities, which as such do not at all contain depth, shape, or moods. The fact that we are able to see all this is among the most amazing and fascinating abilities of our brain.

In this book, the performance of the perceptual apparatus is discussed on the level of information processing. Contributions from psychophysics and computational neuroscience are given the same weight as theories and algorithms developed for machine vision and photogrammetry. Indeed, this combination is the very idea of computational vision. In the tradition of Bela Julesz and David Marr, most of the book is devoted to early vision, i.e., stages of visual processing that do not require top-down inferences from "higher" stages. However, in biological organisms as well as in robots, vision has to serve a purpose. Aspects of behavior-oriented vision covered in the book include eye movement and visual navigation.

The book is based on courses given since 1987 at the universities of Mainz, Bochum, and Tübingen. I tried to keep it readable for students of psychology and the neurosciences as well as for students with a physics or computer vision background. The mathematical material is selected so as to give a survey of the various techniques employed in computational vision. Most of the ideas are introduced simultaneously as "prose" text, by formal equations, and in figures. As an additional refresher of college mathematics, a glossary of mathematical terms has been compiled.
I am grateful to my colleagues and students at the Max-Planck-Institut for Biological Cybernetics in Tübingen and the Department of Neural Computing at the Ruhr-Universität Bochum who have supported this work in many ways. Gerd-Jürgen Giefing, Walter Gillner, Holger Krapp and Roland Hengstenberg have provided figures that are included in the book. Valuable comments on the draft have been given by Matthias Franz, Karl Gegenfurtner, Sabine Gillner, Heiko Neumann, by the translator, John S. Allen, and by my wife, Bärbel Foese-Mallot.

Tübingen, April 2000
Hanspeter Mallot

Part I Fundamentals

Among the many possible definitions of the concept of "vision," the present text concentrates on information processing; that is, on the reconstruction of the characteristics of the environment from images. The first chapter is intended to establish a conceptual framework: vision does not occur simply because we open our eyes and turn them toward the world. Rather, vision is an active process, similar in many ways to the testing of hypotheses using data provided by the senses. A series of demonstrations will show that the relationship between what is present and what is seen is not a simple one.

Before we turn to the active aspects of seeing and the related processing of information in the later parts of the book, we will (in Chapter 2) examine the most important characteristics of the generation of images. Clearly, the nature and properties of the available data, that is, of the images, are of great importance to their subsequent interpretation. Camera analogies and perspective are very important here, but reflections from surfaces and eye movements will also be discussed.

Chapter 1 Introduction

1.1 Biological information processing

1.1.1 The perception-action cycle

Living beings relate to their surroundings in many kinds of ways. The interactions may be considered physically and chemically and may be studied as exchanges of matter and energy.
Although all life forms have this type of energetic and material basis, such an approach falls short in the study of sensory and effector processes. For example, the physical nature of the sensation of light from a printed page is related only very indirectly to the text: the wavelength, the contrast, the size and the intensity (in fact, all of the physically relevant quantities) can be varied within wide limits without changing the meaning of the stimulus for the reader. Based on observations of this type, a perception-action cycle, characterized by the flow of information rather than a flow of material or energy, is considered to be an essential characteristic of behaving organisms, or "agents" (see Figure 1.1). The idea that perception and action are coupled via an accessible part of the world called the environment dates back to Jakob von Uexküll (1926). The relationship between perception and behavior also plays an increasing role in computer vision and robotics (Brooks 1986); to make clear the distinction from older approaches which are directed only toward perception, the term "behavior-oriented approach" is used.

The concept of information used in this book is closely related to that of the perception-action cycle: information is what is transported along the paths indicated by arrows in Figure 1.1 (Tembrock 1992). The meaning of a stimulus is ultimately determined by the behavior which it elicits in the organism, that is, by the role of that stimulus in the perception-action cycle (von Uexküll 1926, Gibson 1979, Dusenbery 1992). In this framework, therefore, information has a meaning which is to some extent independent of the quantity of information as measured by Shannon's (1948) theory of communication. Very meager quantities of information, as measured according to Shannon's theory, can be sufficient to generate and guide complex behaviors.
Figure 1.1 The perception-action cycle and the associated transfer of information. (Diagram labels: Organism; Central nervous system; Information processing; Cognition; Senses; Effectors; Homeostasis; Acquisitive behavior; Behavior, narrowly defined: locomotion, manipulation, feeding, reproductive and social behavior, etc.; Environment.) The senses (vision, hearing, smell, taste, touch, posture and balance as well as proprioception) provide information about the outside world and about the inner conditions of the organism. Through the effectors (motor system, glands) and the behaviors which they generate, the organism enters into a relationship with the environment. The effectors are related to the senses by three feedback loops: through internal regulation (homeostasis), sensory-motor and acquisitive behavior (for example, eye movement), and through alterations to the environment, which are effected through other behaviors.

This is demonstrated, for example, by the thought experiments of Braitenberg (1984), which examine the behavior of small wheeled vehicles with two point sensors, and two wheels driven or braked depending on the sensory input. Depending on whether the sensors are connected to the motors on the same or the opposite sides, and whether the sensory signal drives or brakes the motors, different "behaviors" can be generated, which appear to the outside observer to be based on significant amounts of information processing. The relationship of the behavior of such vehicles to their internal circuits has been intensively studied in what is called artificial life research, using simulated and real robots (cf. Langton 1995).

The concept of information processing, as introduced here for biological vision, is applied in entirely similar ways in robotics and in computer vision (Marr 1982, Ballard & Brown 1982, Horn 1986).
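Braitenberg's two-sensor, two-wheel vehicles described above can be sketched in a few lines. This is only an illustrative sketch, not Braitenberg's own formulation: the sensor geometry, the inverse-square intensity model, the base speed, the gain, and the function names `sensor_reading`, `step`, and `simulate` are all assumptions made for this example.

```python
import math

def sensor_reading(px, py, source):
    """Assumed intensity model: falls off with squared distance to the source."""
    d2 = (source[0] - px) ** 2 + (source[1] - py) ** 2
    return 1.0 / (1.0 + d2)

def step(state, source, crossed, excitatory, dt=0.1):
    """Advance the vehicle by one time step.

    state is (x, y, heading); heading is in radians, counter-clockwise.
    """
    x, y, heading = state
    # Hypothetical geometry: sensors 0.3 units out, 0.5 rad to the left and right.
    left = sensor_reading(x + 0.3 * math.cos(heading + 0.5),
                          y + 0.3 * math.sin(heading + 0.5), source)
    right = sensor_reading(x + 0.3 * math.cos(heading - 0.5),
                           y + 0.3 * math.sin(heading - 0.5), source)
    # Wiring: each motor listens to the sensor on the same or the opposite side.
    to_left_motor, to_right_motor = (right, left) if crossed else (left, right)
    # The signal either drives (adds to) or brakes (subtracts from) a base speed.
    sign = 1.0 if excitatory else -1.0
    v_left = max(0.0, 1.0 + sign * 2.0 * to_left_motor)
    v_right = max(0.0, 1.0 + sign * 2.0 * to_right_motor)
    # Differential-drive kinematics with an assumed axle length of 0.5.
    speed = 0.5 * (v_left + v_right)
    turn_rate = (v_right - v_left) / 0.5
    heading += turn_rate * dt
    return (x + speed * math.cos(heading) * dt,
            y + speed * math.sin(heading) * dt,
            heading)

def simulate(crossed, excitatory, source=(2.0, 2.0), steps=100):
    """Run the vehicle from the origin, initially facing along the x axis."""
    state = (0.0, 0.0, 0.0)
    for _ in range(steps):
        state = step(state, source, crossed, excitatory)
    return state
```

With the light source ahead and to the left, the first step of this sketch turns the vehicle toward the source when the connections are crossed and driving, and away from it when they are uncrossed and driving; braking connections reverse both tendencies. The four wirings differ only in two Boolean flags, yet the resulting paths look quite different to an outside observer.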
The present description will therefore give technical and biological problems equal weight.

The central organ of information processing is the brain. Before we examine vision, the central subject matter of this book, we need to summarize the major tasks of the brain. These tasks define the areas of study in biological information processing.

Sensation and Perception: The brain interprets and integrates all sensory modalities (vision, hearing, smell, taste, touch, posture and balance, proprioception). Disturbances to the senses can occur as a result of poor brain function, even if the sense organs are perfectly normal. Examples of such disturbances include amblyopia and various agnosias.

Behavior: With the assistance of effectors (muscles, glands), the brain controls the behavior of the organism and regulates its internal conditions. Behavior is linked via the environment to the sense organs, and so a feedback loop arises. The brain serves as the control device in this loop. In experiments on perception, the presence or absence of feedback via the perception-action cycle is generally of great importance (open-loop vs. closed-loop experiments, for example in navigation using visual cues).

Memory: Most researchers distinguish short-term or working memory, which holds information needed during processing, from long-term memory. One subdivision of long-term memory serves to store explicit items such as events, faces and objects, or rules ("declarative memory"). Another subdivision, called "procedural" or "non-declarative" memory, contains implicit knowledge supporting associations, habits and skills.

Higher functions (cognition, motivation): The concept of cognition comprises abilities which require internal models or representations of the environment. Examples are latent learning (playful, non-goal-directed learning of facts and regularities which are used later) and problem solving. Behavior guided by cognition is not organized solely in stereotypes of stimulus and reaction but also depends on the organism's current goal.
The knowledge needed to realize these goals is stored in a declarative memory.

The study and description of these abilities as information processing tasks is the aim of theoretical neurobiology. The present book discusses mainly perception: that is, more or less, the upper left part of the information flow diagram of Fig. 1.1. Two aspects of visually guided behavior covered in this book are eye movements (Section 2.4) and visual navigation (Chapter 11).

1.1.2 How is information processing studied?

The biological sciences employ a number of quite different types of explanations. In the context of this book, it is important to distinguish between the anatomy of the central nervous system, and the behavioral and perceptual competences which it supports. Even when physiological explanations are not available, inquiry into the information sources used and into the logic of their evaluation and integration is appropriate and fruitful. This type of explanation is called the computational theory of competence; it was pioneered by David Marr (1982). In order to provide a framework for the information processing approach, we will summarize and slightly extend the approach developed by Marr (1982).

1. "Hardware" implementation: This level primarily concerns itself with the anatomy and optics of the eye, as well as the anatomy of the visual pathways and related neural networks. In the study of artificial systems, it is usually assumed that the hardware is general and can be applied to any type of task. In biological systems, on the other hand, evolutionary adaptation has led to an interdependence between structure and function. Therefore, it is in principle possible to infer function from structure, as is attempted in neural network research (cf. Arbib 1995).

2. Representation and algorithms: How are images (i.e., distributions of intensity or neural activity) and the information derived from them represented, and which available neuronal operations can be interpreted as "computation"? The experimental approach to the study of these questions is primarily that of electrophysiology.

3. Computational theory of competence: This is information processing in the narrow sense. The central issue is what must be calculated in order to derive the desired information from the spatio-temporal stimulus distribution. The definition of vision as inverse optics, which will be presented in what follows, has taken root here. On the level of computational theory, the brain and the computer are confronted with the same problems.

4. Behavioral approach: The computational theory of competence does not, for the most part, examine to what ends certain information is derived from the data. Rather, the goal of information processing is assumed to be the reconstruction of the most complete possible description of the scene being viewed. In biology, as also in robotics and computer vision, this assumption is usually not at all correct. Rather, the primary interest is in information which is relevant to behavior; that is, information which is actually required in order to function in a given environment (von Uexküll 1926). The "ecological" approach of J. J. Gibson (1950) expands on this concept, though it is not highly formalized. In the fields of behavioral and sensory ecology, an attempt is made to quantify, in terms of evolutionary fitness or reproductive success, the advantage that an organism gains from a particular perceptual or behavioral competence (Krebs & Davies 1993; Dusenbery 1992).

The levels listed above are not completely independent. It is nonetheless useful for heuristic purposes to distinguish them from one another.
This book stresses the computational theory approach, which is very similar for computer and biological vision.

1.2 Information in images

1.2.1 What information can images contain?

In order for information to be acquired through the sense of vision, the light which emanates from a light source must itself contain such information. The following possibilities exist in principle:

1. Light source: The presence or disappearance of light sources provides very simple, though biologically very important information. Mussels close their shells when a shadow is cast on them, fireflies and deep-sea fishes have developed their own bioluminescent light sources to attract mates or prey, and honeybees navigate according to the position of the sun. Directional point sensors used in many robots are sufficient for detecting light sources and determining their direction.*

2. Physical interactions of a ray of light with the visible world attach certain information about the world to the ray. We first consider only a single ray of light.
   (a) Opaque surfaces: reflection. The light returned from a surface contains information on surface properties such as reflectivity ("color"), albedo ("brightness") or smoothness ("glossy" or "dull") as well as on the orientation of the surface.
   (b) Transparent media: scattering. Scattering can reveal physical characteristics of the medium or the position (distance) of the light source. As an example, consider "aerial perspective", i.e. the bluish appearance of mountain ranges at great distances.
   (c) Boundaries between optical media: refraction. Refraction is of minor importance as a source of information, but plays a very important role in the generation of images.

3. Information in the distribution of incident rays of light
   (a) Temporal: Information on movements of the observer (egomotion), on movements of objects, changes in the light source, etc. is conveyed by temporal, or spatio-temporal distributions of intensities.
   (b) Spatial: All rays of light incident on a given point are called the ambient optical array of this point (Gibson 1950). In a sense, the optical array characterizes the image which ideally could be observed at the point. Additional information is available with binocular vision.

* Darwin (1859, Chapter 6) suggested that such "point sensors" are actually the starting point of the evolution of eyes: "... I can see no very great difficulty (...) in believing that natural selection has converted the simple apparatus of an optic nerve merely coated with pigment and invested by transparent membrane, into an optical instrument as perfect as is possessed by any member of the great Articulate class." See also Halder et al. (1995) for recent evidence on the evolution of different eye types from one common ancestral organ.

Figure 1.2 Images are two-dimensional distributions of intensities or gray values. Left ("Image"): Gray scale image. Right ("Intensity distribution I(x, y)"): Representation of the same image as a gray scale "landscape."

Spatio-temporal distributions of the intensity of light, i.e. sequences of images, represent by far the most important source of visual information. They make it possible to see forms and movements. The representation of images as intensity maps (or "landscapes") is illustrated in Fig. 1.2. Image and intensity distribution contain exactly the same information (up to the limit of the pixel resolution, which differs in the two parts of Figure 1.2). Our perceptual system nonetheless is much better able to interpret the image than the intensity landscape. Attempting to envision, by examining the intensity landscape, the object which is "visible" without effort in the image gives an idea of the complexity of the problem which must be solved in visual information processing. This is the problem of computer vision.
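The claim that an image and its intensity landscape carry the same information can be made concrete with a small sketch. The 3x3 gray values and the helper names `to_landscape` and `from_landscape` are made up for illustration; the only point is that the conversion between the two representations is lossless in both directions.

```python
# A tiny gray scale image: rows of intensities I(x, y) in the range 0..1.
# The values are arbitrary illustration data.
image = [
    [0.00, 0.25, 0.50],
    [0.25, 0.50, 0.75],
    [0.50, 0.75, 1.00],
]

def to_landscape(img):
    """List the image as (x, y, height) points of an intensity 'landscape'."""
    return [(x, y, img[y][x])
            for y in range(len(img))
            for x in range(len(img[0]))]

def from_landscape(points, width, height):
    """Rebuild the image array from the landscape points."""
    img = [[0.0] * width for _ in range(height)]
    for x, y, intensity in points:
        img[y][x] = intensity
    return img

# Converting back and forth loses nothing.
restored = from_landscape(to_landscape(image), 3, 3)
```

Here `restored` equals `image` exactly. What differs between the two forms is how easily they can be interpreted, not how much they contain.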
1.2.2 What information can be derived from images?

The sources of information which have been described are confronted with a set of "perceptual needs" which the human visual system can generally satisfy. A few examples are given in what follows:

Figure 1.3 Necker's inversion of the cube shows the ambiguity of the three-dimensional interpretation of line drawings. The complete cube a. can be perceived as in b. or in c. Neither of the two interpretations can be maintained at will for any length of time.

Surface colors. Reflectivity (that is, color as a surface property) and albedo (brightness of the surface) are usually perceived independently of the spectral content and intensity of the light source. Perceiving surfaces unchanged under a wide range of different illuminants is an example of a perceptual constancy,* a very common but often unnoticed phenomenon of perception. If photographs are taken under artificial light using daylight-balanced color film, the resulting yellowish colors are not incorrect in terms of the physical spectra of the photographed light. If yellow predominates in the illuminant, the light reflected from the surfaces will also have more yellow components than under white illumination. A human observer, however, does not see this. Unlike the photographic process, the visual system can adapt to the spectrum of the light source, and can therefore see the same colors under different lighting conditions. The word "color," then, describes a perception which corresponds more closely to the properties of the surface than to the physical spectrum of the light reflected from that surface (cf. Chapter 5).
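The difference between a film that records the physical spectrum and a system that adapts to the illuminant can be illustrated with a minimal sketch. It swaps in a simple per-channel normalization (dividing each color channel by its mean over the scene, in the spirit of von Kries scaling with a gray-world assumption). That technique, the three-channel light model, and all numbers and function names below are assumptions for this example; the text does not claim that the visual system works this way.

```python
def illuminate(surfaces, illuminant):
    """Light reaching the eye: surface reflectance times illuminant, per channel."""
    return [tuple(s * l for s, l in zip(pixel, illuminant)) for pixel in surfaces]

def adapt(image):
    """Assumed adaptation rule: divide each channel by its mean over the scene."""
    n = len(image)
    means = [sum(pixel[c] for pixel in image) / n for c in range(3)]
    return [tuple(pixel[c] / means[c] for c in range(3)) for pixel in image]

# Hypothetical (R, G, B) reflectances of three surfaces.
surfaces = [(0.8, 0.2, 0.2), (0.2, 0.7, 0.3), (0.5, 0.5, 0.9)]

# Hypothetical illuminants: neutral daylight and a yellowish artificial light.
daylight = (1.0, 1.0, 1.0)
artificial = (1.0, 0.8, 0.5)

raw_day = illuminate(surfaces, daylight)
raw_art = illuminate(surfaces, artificial)      # differs from raw_day
adapted_day = adapt(raw_day)
adapted_art = adapt(raw_art)                    # matches adapted_day up to rounding
```

The raw values play the role of the unadapted photograph: they change with the illuminant. After normalization the two scenes agree up to floating-point rounding, because the illuminant factor cancels in each channel.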
The distinction between dull and glossy reflections is similarly made independently of the angular extent of the light source.

Depth. An important task of vision is to discern the distance of individual points from the observer, as well as the shape of surfaces and objects from their images. A large number of depth cues can make this possible, some of which will be discussed in detail in the course of this work. Besides stereopsis and motion parallax, which depend on there being more than one image, there are also depth cues in individual images, called "pictorial" depth cues. Examples are the three-dimensional interpretation of line drawings as wire-frame objects (Figure 1.3), shadows and shading (Figure 1.4), texture gradients (Figure 8.1) and occlusion. Many of these cues, each taken alone, allow no unambiguous perception of depth. The visual system therefore uses "plausible" assumptions about the environment; if these prove to be false, optical illusions occur.

* A constancy is the perception that an object remains unchanged when viewed from different positions or under different illuminations. In addition to color constancy, there is also constancy of perceived size and form in spite of perspective changes in the image. The term "constancy" is related to the notion of "invariance" used in the pattern recognition literature. It is distinguished from "invariance" in that changes are not simply ignored, but are taken into account.
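The ambiguity of line drawings such as the Necker cube (Figure 1.3) can be reproduced numerically. The sketch below assumes an orthographic (parallel) projection along the z axis and arbitrary viewing angles; the function names are hypothetical. It builds a rotated wire-frame cube and its depth-reversed counterpart and shows that both yield the same image, although a different corner is nearest to the observer.

```python
import math

def rotate(point, angle_x, angle_y):
    """Rotate a 3-D point about the x axis, then about the y axis."""
    x, y, z = point
    y, z = (y * math.cos(angle_x) - z * math.sin(angle_x),
            y * math.sin(angle_x) + z * math.cos(angle_x))
    x, z = (x * math.cos(angle_y) + z * math.sin(angle_y),
            -x * math.sin(angle_y) + z * math.cos(angle_y))
    return (x, y, z)

def project(points):
    """Orthographic projection: keep x and y, discard depth z."""
    return [(x, y) for x, y, z in points]

def depth_reversed(points):
    """Mirror the object in depth; the result is again a cube."""
    return [(x, y, -z) for x, y, z in points]

def nearest_vertex(points):
    """Index of the vertex closest to an observer looking along +z."""
    return min(range(len(points)), key=lambda i: points[i][2])

# The eight corners of a cube, seen from an arbitrary oblique direction.
corners = [(x, y, z) for x in (-1, 1) for y in (-1, 1) for z in (-1, 1)]
cube_a = [rotate(p, 0.4, 0.5) for p in corners]
cube_b = depth_reversed(cube_a)

same_image = project(cube_a) == project(cube_b)
different_front = nearest_vertex(cube_a) != nearest_vertex(cube_b)
```

Both `same_image` and `different_front` are true in this sketch: a single drawing is consistent with two three-dimensional objects, so the image alone cannot decide between the two interpretations.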
