Modeling non-standard retinal in/out function using computer vision variational methods

Elaa Teftef (1), Maria-Jose Escobar (2), Aland Astudillo (2), Carlos Carvajal (1), Bruno Cessac (4), Adrian Palacios (3), Thierry Viéville (1), Frédéric Alexandre (1) *

(1) Inria Mnemosyne/Cortex, http://team.inria.fr/mnemosyne, France.
(2) Universidad Técnica Federico Santa María, Electronics Engineering Department, Chile.
(3) Universidad de Valparaíso, Centro Interdisciplinario de Neurociencia de Valparaíso, Chile.
(4) Inria Neuromathcomp, France.

Project-Teams Mnemosyne
Research Report n° 8217 — January 2013 — 27 pages
HAL Id: hal-00783091, https://hal.inria.fr/hal-00783091v2, submitted on 1 Feb 2013.

Abstract: We propose a computational approach using a variational specification of the visual front-end, in which ganglion cells with the properties of retinal Konio cells (K-cells) are considered as a network, yielding a mesoscopic view of the retinal process. The variational framework is implemented as a simple diffusion mechanism in a two-layered non-linear filtering architecture with feedback, as observed in the synaptic layers of the retina. We discuss its biological plausibility, and capture functionalities such as (i) stimulus-adapted response; (ii) non-local noise reduction (i.e., segmentation); and (iii) visual event detection, taking several visual cues into account: contrast and local texture, color or edge channels, and motion in natural images. These functionalities could be implemented in biological tissues.

We use computer vision methods to propose an effective link between the observed functions and their possible implementation in the retinal network, based on a two-layer network with a non-separable local spatio-temporal convolution as input, and recurrent connections performing non-linear diffusion before prototype-based visual event detection.

The numerical robustness of the proposed model has been checked experimentally on real natural images. Finally, we discuss, on the basis of experimental biological and computational results, the generality of our description.

Key-words: No keywords

* Supported by the CONICYT/ANR KEOpS project and the CORTINA associated team.
RESEARCH CENTRE BORDEAUX – SUD-OUEST, 351 Cours de la Libération, Bâtiment A29, 33405 Talence Cedex

Résumé (translated from French): We propose here a functional approach to the description of the properties of the so-called Konio retinal cells, at the network level, using a variational specification, which yields a mesoscopic view of the computation performed by the retina. The variational framework is implemented as a simple non-linear diffusion mechanism, a filtering mechanism with feedback, followed by a layer of units tuned to a statistical element of the scene, as observed in the synaptic layers of the retina for these cells. We propose to capture the following functionalities: (i) adaptation of the response to the statistics of natural stimuli, (ii) non-local noise reduction (related to image segmentation), and (iii) detection of visual events, taking several visual cues into account: local contrast, texture or color, these landmarks being generalizable to motion-computation channels. These functionalities can be implemented in biological tissues, as discussed here. We use computer vision methods to propose a functional description of the computation performed at the level of the retina. The numerical robustness of the proposed model has been verified experimentally on real natural images. We discuss, on the basis of biological and computational experimental results, the generality of our description.
Mots-clés : none

1 Introduction

Recently it has been proposed that the retina, an accessible part of the brain, sustains more complex behaviors than previously expected (Masland & Martin, 2007), including spatial, temporal and motion recognition (Schwartz & Michael, 2008; Olveczky, Baccus, & Meister, 2003). The retina corresponds to a multi-stream device ranging from phototransduction to early visual processing and neural coding, at different spatial and temporal scales. At the application level, a system computing parallel information streams for, e.g., visual prostheses is likely to need an important post-processing stage before visual signals feed the nervous system (Barriga-Rivera & Suaning, 2011). For the latter, an adequate computational architecture should have at least a two-layer network with a non-separable local spatio-temporal convolution as input, and recurrent connections performing non-linear diffusion before prototype-based visual event detection.

The present article is organized as follows. In Section 2, we review key biological facts about the early dorsal and ventral K-cell visual streams, which compute sophisticated spatial and temporal pattern recognition, as reviewed in Section 3, based on (Gollisch & Meister, 2010; Litke et al., 2004; Schwartz & Michael, 2008; Olveczky et al., 2003; Sterling, 2004). In Section 4, we implement the computational principles raised by the study of the retinal K-cells (Hendry & Reid, 2000; Yoonessi & Yoonessi, 2011) using a variational specification of the visual front-end, based on a network of retinal ganglion cells. In Section 5, we implement a numerical model to study real image sequences; finally, model predictions are presented in order to verify their biological validity.
2 From standard to non-standard early-vision front-end

Standard front-end retinal streams

The retinal output corresponds to several different spatio-temporal processings of the incoming image sequence, based on a diversity of ganglion cell (Gc) types, the output of the retina to the nervous system. Such streams are embodied as separate strata that span the retina at the Gc surface.

It is generally accepted that (i) the parvo (P-cells) and (ii) magno (M-cells) streams compute, respectively, (i) color image details and contrast in central vision and (ii) monochrome intensity temporal variation, including at very low contrast, in peripheral vision, while (iii) the konio streams (K-cells), projecting to the LGN (middle layer), receive information from blue cones, sending information to the V1 color blobs (Hendry & Reid, 2000), and from small bistratified Gc having a surround sensitive to yellow (G. Field et al., 2007). Here we focus on a subset of K-cells, beyond their color property, as detailed in the sequel. For other types of Gc see, e.g., (Callaway, 1998).

At the modeling level, information streams (i) and (ii) are well represented by LN models, i.e., a one-layer spatio-temporal filtering followed by a static non-linearity, as represented in the right part of Fig. 2, this unit being tuned by a gain-control mechanism, as reviewed, e.g., in (Wohrer, 2008). Qualitatively, such non-linear filtering tends to remove spatial correlations in the visual world, producing a less redundant output, as required to transmit information in the optic nerve fibers of limited spatial capacity (see (Pitkow & Meister, 2012) for a discussion), resulting in a sparse coding for retinal spike trains in response to the statistics of natural images (Simoncelli, 2003).

Figure 1: Schematic representation of projections of the K-cells, from (Masland & Martin, 2007) and (Nassi & Callaway, 2009).
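As a minimal sketch of such an LN unit, one layer of linear spatio-temporal filtering followed by a static non-linearity can be written as below. The difference-of-Gaussians spatial kernel, causal exponential temporal kernel and half-wave rectification are illustrative assumptions, not the calibrated retinal filters reviewed in (Wohrer, 2008):

```python
import numpy as np

def _gauss1d(sigma):
    """Normalized 1D Gaussian kernel, truncated at 3 sigma."""
    x = np.arange(-int(3 * sigma), int(3 * sigma) + 1)
    k = np.exp(-x ** 2 / (2.0 * sigma ** 2))
    return k / k.sum()

def _blur(frame, sigma):
    """Separable Gaussian blur; assumes the kernel is shorter than the frame."""
    k = _gauss1d(sigma)
    frame = np.apply_along_axis(np.convolve, 0, frame, k, mode='same')
    return np.apply_along_axis(np.convolve, 1, frame, k, mode='same')

def ln_response(stimulus, sigma_center=1.0, sigma_surround=3.0,
                tau=3.0, threshold=0.0):
    """Minimal LN unit over a (T, H, W) stimulus:
    (L) center-surround (difference-of-Gaussians) spatial filtering,
    (L) causal exponential temporal filtering (delays over past frames),
    (N) static non-linearity: half-wave rectification above a threshold."""
    spatial = np.stack([_blur(f, sigma_center) - _blur(f, sigma_surround)
                        for f in stimulus])
    t = np.arange(stimulus.shape[0])
    k = np.exp(-t / tau)
    k /= k.sum()
    # keep only the causal part of the full convolution
    temporal = np.apply_along_axis(
        lambda v: np.convolve(v, k)[:len(v)], 0, spatial)
    return np.maximum(temporal - threshold, 0.0)
```

The gain-control mechanism tuning this unit is omitted here; it would typically modulate the threshold or slope of the non-linearity from the local input statistics.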
Considering retinal output K-streams

While the standard magno/parvo Gc downstreams correspond to the standard visual dorsal/ventral streams (leftward drawing) after a relay in the ventral/dorsal LGN layers (rightward drawing), K-cell projections are more miscellaneous (leftward drawing) and interact with the parvo/magno streams in the LGN and via V1 layers 2/3 (rightward drawing). Connections from the retina are feed-forward, while all other brain connections include feed-backs. In a purposive view of visual perception, the parietal cortex performs, from such input, an unconscious, efficient, specialized and rapid processing of the whole visual field, and prepares actions adapted to the characteristics of the environment. In addition, K-cells connect via the LGN to important multi-sensorial areas like the amygdala lateral nucleus (ALN), involved in the control of emotion (Ciocchi et al., 2010), thus in direct link with survival actions.

Importance of the K-streams

Quantitatively, 80% of the Gc are midget cells projecting towards the P-cells in the thalamus, because of the need for fine spatial resolution to perceive visual details. Only 10% are parasol Gc projecting towards the M-cells in the thalamus, and 10% are bistratified Gc (K-cells) projecting towards the superior colliculus and the LGN layers (G. D. Field et al., 2007). Though such non-standard retinal cells are proportionally few in primates (whereas they represent up to 75% in phylogenetically earlier mammals), they constitute a robust and well-identified cell mapping of the visual field (Gauthier et al., 2009). This lower proportion is easily explained: having large receptive fields (RF) of about 10 deg, covering the whole retinal field requires fewer units.

In the human eye, about 130 million photoreceptors, of which 120 million are rods and 10 million are cones, concentrate onto about 1.5 million Gc, including the roughly 10^5 K-cells considered here.
The key computational aspect of K-streams

The current computational assumption here considers that the K-cell streams provide rough but fast event- or object-detection of visual events (Hendry & Reid, 2000; Masland & Martin, 2007), in order to induce survival goal-directed actions in time (V. Lamme & Roelfsema, 2000), and to drive higher-level iterative processes with prior information (Callaway, 1998; Koivisto, Railo, Revonsuo, Vanni, & Salminen-Vaparanta, 2011). Given natural image sequences, we interpret these two functionalities as image segmentation for visual object detection and natural statistical recognition, including temporal pattern recognition. On the other hand, K-cells participate in blind-sight (V. A. Lamme, 2001; Cowey, 2010), i.e., vision that is not consciously seen, including slowly-moving target detection and rough localization of stimuli. This is typically information not for "seeing" but for visually guided behavior.

There is some evidence that such computation starts in the retina:

(i) Some retinal Gc are able to produce sophisticated spatial and temporal pattern recognition on their own (Gollisch & Meister, 2010; Werblin, 2011), including motion detection, directional selectivity, local edge detection, object motion and looming detection.

(ii) During phylogenetic evolution, the eye was not a simple feed-forward system: feed-backs from the remainder of the nervous system have been able to produce adaptive learning of sophisticated visual functions "optimizing the transfer of information" (Sterling, 2004).

(iii) Fast stimulus categorization develops roughly 150 ms after stimulus onset (Thorpe, Fize, & Marlot, 1996). In that respect, visual processing needs to be performed using only a few computation steps (Viéville & Crahay, 2004).
(iv) From an information-processing point of view, regarding downstream computation, it is optimal to take into account the whole spatial and temporal resolution of the photo-receptors before the necessary compression occurring in the optic nerve, in order to implement such complex non-linear filtering (Simoncelli, 2003).

These are the key elements that differentiate the K-cells from the parvo/magno streams. In a nutshell, the K-stream provides fast, a-priori event-detection assumptions to the remainder of the visual system (V. Lamme & Roelfsema, 2000).

3 Retinal architecture and biological constraints

What are the biological constraints regarding what can be computed in the retina (i.e., the computational characteristics)? In order to derive a mesoscopic model of the K-cells' input/output function, let us collect known facts about the computational properties of the retina, as schematized in Fig. 2. Here are the three major features we propose to take into account:

1. Not one but two layers: It is clear that, by no means, can a simple feed-forward filtering layer, even a non-linear one, compute complex visual cues such as those reviewed above. On the other hand, it is known that a two-layer non-linear feed-forward network, i.e., one with a "hidden" layer, is a universal approximator of any function. For a static input/output relationship, i.e., a unique image as input with a unique related output, a neural network (e.g., the retina) with two layers is thus a sufficient architecture for generic computations (Hornik, Stinchcombe, & White, 1989). For a dynamic input, i.e., an image sequence, this basic idea has to be generalized, and we propose the following.

Figure 2: left: Schematic representation of the retinal architecture, after (Wohrer, 2008) and (Gollisch & Meister, 2010). Excitatory connections are in red, inhibitory in blue. Gap junctions are in red with a black contour, neighboring cells of the same type being generally linked through gap junctions. The direct pathway from light receptors to Gc is a two-layer process defining two successive non-linear filtering stages with different excitatory-inhibitory patterns. The present mesoscopic model is compatible with such an architecture, but requires only a subset of it (highlighted in yellow). This pathway is modulated by the interstitial cells in each layer. Here, we have emphasized that non-linearities are present on each link. right: The basic view of cell-unit processing: (i) spatial filtering (gray arrow) specified by the wired connectivity, combined with (ii) a band-pass temporal filter, thus including delays (blue rectangle), and (iii) followed by a thresholding operation.

2. Non-separable spatio-temporal filtering: At a given time, we consider that the retina input state is a function of the short-term visual information and that the output is a causal function of this 2D+T image volume. Quantitatively, we may consider the last 150 ms interval as temporal depth (Li, 1992), with a minimal visual event-time of about 10 ms: this thus corresponds to a dozen "frames" when time is discretized.

- This means that the spatio-temporal filtering is local, i.e., only spatial connections from neighboring cells are taken into account, while in the temporal domain the filtering is only due to delays in the biological substrate and connections. This seems to be fairly true for the outer plexiform layer (OPL) (i.e., for horizontal cells), which corresponds to the filtering mechanism considered here. This is less obvious for amacrine cells in the IPL (Werblin, 2011), likely involved in mechanisms not addressed here.
- This means that the spatio-temporal filters are constrained: not all "weight" values can be used, but only values with signs compatible with the excitatory/inhibitory connections, and value ranges taking into account the nature of the connection (e.g., gap junction versus synaptic junction). All synapses have a delay, increasing as a function of the connection length.

- This means that the output at a given time is a static function of a 2D+T volume, thus computing local motion in the image sequence, but no long-term motion cue. Therefore the previous argument, that a two-layer architecture is sufficient for complex function approximation, still stands for dynamic stimuli in this restrained framework.

- This also means that the spatio-temporal filtering is non-separable: in other words, we do not consider a sequence of 2D images (thus spatially filtering each image and then temporally filtering the result) but the filtering of the 2D+T volume. In fact, as made explicit before (Gollisch & Meister, 2010; Dong, 2001), the retinal structure induces a separable spatio-temporal filtering if we consider a cell "alone" in a standard parvo-stream, but a different one if we consider interactions between cells, and also consider that the retina is adapted to natural image stimuli, as suggested by (Dong, 2001).

3. Recurrent connections: The general architecture definitely shows that the hard-wiring provides feed-forward connections and recurrent (horizontal and feed-back) connections, as made explicit in Fig. 2.

- Recurrent connections as local diffusion. These connections are local, continuous and with tiny delays (i.e., based on gap-junctions on local non-spiking, very short dendrite/axon connections) (Sterling, 2004). Qualitatively, this corresponds to local diffusion of the state value, here mainly corresponding to the membrane voltage value.

- Static non-linearities everywhere. Another key aspect is the fact that each connection is subject to a static non-linearity with one main characteristic: a threshold corresponding to a rectification of the signal. The rectification is not related to action-potential spiking (except for the Gc output), but to bio-chemical membrane mechanisms.
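As a minimal sketch under stated assumptions, the three features above — a non-separable spatio-temporal input filtering, recurrent local diffusion of the state, and static rectifications on each link — can be combined as follows. The 3D kernel, diffusion rate, iteration count and thresholds are all illustrative; the diffusion step is linear (standing in for the model's non-linear diffusion) and borders are handled periodically for brevity:

```python
import numpy as np

def nonseparable_filter(volume, kernel):
    """Causal 2D+T filtering of a (T, H, W) volume with a full 3D kernel.
    A genuinely 3D kernel cannot be factored as k_t(t) * k_xy(x, y),
    hence 'non-separable'; padding in time uses past frames only."""
    T, H, W = volume.shape
    kt, kh, kw = kernel.shape
    out = np.zeros_like(volume)
    pad = np.pad(volume, ((kt - 1, 0),           # causal temporal padding
                          (kh // 2, kh // 2),
                          (kw // 2, kw // 2)))
    for t in range(T):
        for i in range(H):
            for j in range(W):
                out[t, i, j] = np.sum(pad[t:t + kt, i:i + kh, j:j + kw]
                                      * kernel)
    return out

def diffuse(state, steps=5, rate=0.2):
    """Recurrent lateral (gap-junction-like) connections modeled as local
    diffusion of the membrane state: explicit 4-neighbor Laplacian steps,
    periodic borders for brevity. Stable for rate <= 0.25."""
    for _ in range(steps):
        lap = (np.roll(state, 1, 0) + np.roll(state, -1, 0)
               + np.roll(state, 1, 1) + np.roll(state, -1, 1)
               - 4 * state)
        state = state + rate * lap
    return state

def rectify(x, threshold=0.0):
    """Static non-linearity on each link: half-wave rectification."""
    return np.maximum(x - threshold, 0.0)

def two_layer_retina(volume, kernel):
    """Layer 1: non-separable spatio-temporal filtering + rectification.
    Layer 2: per-frame recurrent diffusion + rectification."""
    layer1 = rectify(nonseparable_filter(volume, kernel))
    return np.stack([rectify(diffuse(frame)) for frame in layer1])
```

A prototype-based event-detection stage, as proposed in the text, would then read out the diffused layer-2 activity; it is not sketched here.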
