Comparison of objective functions for estimating linear-nonlinear models

Tatyana O. Sharpee
Computational Neurobiology Laboratory, The Salk Institute for Biological Studies, La Jolla, CA 92037
[email protected]

arXiv:0801.0311v1 [q-bio.NC] 2 Jan 2008

Abstract

This paper compares a family of methods for characterizing neural feature selectivity with natural stimuli in the framework of the linear-nonlinear model. In this model, the neural firing rate is a nonlinear function of a small number of relevant stimulus components. The relevant stimulus dimensions can be found by maximizing one of the family of objective functions, Rényi divergences of different orders [1, 2]. We show that maximizing one of them, the Rényi divergence of order 2, is equivalent to least-square fitting of the linear-nonlinear model to neural data. Next, we derive reconstruction errors in relevant dimensions found by maximizing Rényi divergences of arbitrary order in the asymptotic limit of large spike numbers. We find that the smallest errors are obtained with the Rényi divergence of order 1, also known as the Kullback-Leibler divergence. This corresponds to finding relevant dimensions by maximizing mutual information [2]. We numerically test how these optimization schemes perform in the regime of low signal-to-noise ratio (small number of spikes and increasing neural noise) for model visual neurons. We find that optimization schemes based on either least-square fitting or information maximization perform well even when the number of spikes is small. Information maximization provides slightly, but significantly, better reconstructions than least-square fitting. This makes the problem of finding relevant dimensions, together with the problem of lossy compression [3], one of the examples where information-theoretic measures are no more data limited than those derived from least squares.

1 Introduction

The application of system identification techniques to the study of sensory neural systems has a long history. One family of approaches employs the dimensionality reduction idea: while inputs are typically very high-dimensional, not all dimensions are equally important for eliciting a neural response [4, 5, 6, 7, 8]. The aim is then to find a small set of dimensions {ê_1, ê_2, ...} in the stimulus space that are relevant for the neural response, without imposing, however, a particular functional dependence between the neural response and the stimulus components {s_1, s_2, ...} along the relevant dimensions:

P(spike|s) = P(spike)\, g(s_1, s_2, \ldots, s_K). \qquad (1)

If the inputs are Gaussian, the last requirement is not important, because relevant dimensions can be found without knowing a correct functional form for the nonlinear function g in Eq. (1). However, for non-Gaussian inputs a wrong assumption for the form of the nonlinearity g will lead to systematic errors in the estimate of the relevant dimensions themselves [9, 5, 1, 2]. The larger the deviations of the stimulus distribution from a Gaussian, the larger will be the effect of errors in the presumed form of the nonlinearity g on estimating the relevant dimensions. Because inputs derived from a natural environment, either visual or auditory, have been shown to be strongly non-Gaussian [10], we will concentrate here on system identification methods suitable for either Gaussian or non-Gaussian stimuli.

To find the relevant dimensions for neural responses probed with non-Gaussian inputs, Hunter and Korenberg proposed an iterative scheme [5] where the relevant dimensions are first found by assuming that the input-output function g is linear. Its functional form is then updated given the current estimate of the relevant dimensions. The inverse of g is then used to improve the estimate of the relevant dimensions.
This procedure can be improved not to rely on inverting the nonlinear function g by formulating the optimization problem exclusively with respect to the relevant dimensions [1, 2], where the nonlinear function g is taken into account in the objective function to be optimized. A family of objective functions suitable for finding relevant dimensions with natural stimuli has been proposed based on Rényi divergences [1] between the probability distributions of stimulus components along the candidate relevant dimensions, computed with respect to all inputs and to those associated with spikes. Here we show that the optimization problem based on the Rényi divergence of order 2 corresponds to least-square fitting of the linear-nonlinear model to neural spike trains. The Kullback-Leibler divergence also belongs to this family and is the Rényi divergence of order 1. It quantifies the amount of mutual information between the neural response and the stimulus components along the relevant dimension [2]. The optimization scheme based on information maximization has been previously proposed and implemented on model [2] and real cells [11]. Here we derive asymptotic errors for optimization strategies based on Rényi divergences of arbitrary order, and show that relevant dimensions found by maximizing the Kullback-Leibler divergence have the smallest errors in the limit of large spike numbers compared to maximizing other Rényi divergences, including the one which implements least squares. We then show in numerical simulations on model cells that this trend persists even for very low spike numbers.

2 Variance as an Objective Function

One way of selecting a low-dimensional model of the neural response is to minimize a χ² difference between spike probabilities measured and predicted by the model after averaging across all inputs s:

\chi^2[v] = \int ds\, P(s) \left[ \frac{P(spike|s)}{P(spike)} - \frac{P(spike|s\cdot v)}{P(spike)} \right]^2, \qquad (2)

where dimension v is the relevant dimension for a given model described by Eq. (1) [multiple dimensions could also be used, see below]. Using Bayes' rule and rearranging terms, we get:

\chi^2[v] = \int ds\, P(s) \left[ \frac{P(s|spike)}{P(s)} - \frac{P(s\cdot v|spike)}{P(s\cdot v)} \right]^2 = \int ds\, \frac{[P(s|spike)]^2}{P(s)} - \int dx\, \frac{[P_v(x|spike)]^2}{P_v(x)}. \qquad (3)

In the last integral the averaging has been carried out with respect to all stimulus components except for those along the trial direction v, so that the integration variable is x = s·v. The probability distributions P_v(x) and P_v(x|spike) represent the result of this averaging across all presented stimuli and across those that lead to a spike, respectively:

P_v(x) = \int ds\, P(s)\, \delta(x - s\cdot v), \qquad P_v(x|spike) = \int ds\, P(s|spike)\, \delta(x - s\cdot v), \qquad (4)

where δ(x) is a delta function. In practice, both of the averages (4) are calculated by binning the range of projection values x and computing histograms normalized to unity. Note that if multiple spikes are sometimes elicited, the probability distribution P_v(x|spike) can be constructed by weighting the contribution from each stimulus according to the number of spikes it elicited.

If neural spikes are indeed based on one relevant dimension, then this dimension will explain all of the variance, leading to χ² = 0. For all other dimensions v, χ²[v] > 0. Based on Eq. (3), in order to minimize χ² we need to maximize

F[v] = \int dx\, P_v(x) \left[ \frac{P_v(x|spike)}{P_v(x)} \right]^2, \qquad (5)

which is a Rényi divergence of order 2 between the probability distributions P_v(x|spike) and P_v(x). Rényi divergences are part of a family of f-divergence measures that are based on a convex function of the ratio of the two probability distributions (instead of a power α, as in a Rényi divergence of order α) [12, 13, 1]. For an optimization strategy based on Rényi divergences of order α, the relevant dimensions are found by maximizing:

F^{(\alpha)}[v] = \frac{1}{\alpha - 1} \int dx\, P_v(x) \left[ \frac{P_v(x|spike)}{P_v(x)} \right]^\alpha. \qquad (6)
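To make Eqs. (4)-(6) concrete, the following Python sketch estimates P_v(x) and P_v(x|spike) by the histogram procedure described above and evaluates F^(α)[v]. This is an illustrative sketch, not code from the paper: the function name, the bin count, and the small floor `eps` on P_v(x) are our own choices, and for α = 1 it returns the Kullback-Leibler limit of Eq. (7) below.

```python
import numpy as np

def renyi_objective(stimuli, spike_counts, v, alpha=2.0, n_bins=32, eps=1e-12):
    """Estimate F^(alpha)[v] of Eq. (6) from normalized histograms (Eq. (4)).

    stimuli      : (N, D) array, one stimulus frame per row
    spike_counts : (N,) array, number of spikes elicited by each frame
    v            : (D,) candidate relevant dimension
    """
    x = stimuli @ (v / np.linalg.norm(v))        # projections x = s·v
    edges = np.linspace(x.min(), x.max(), n_bins + 1)

    # P_v(x): histogram over all presented stimuli, normalized to unity
    p_all, _ = np.histogram(x, bins=edges)
    p_all = p_all / p_all.sum()

    # P_v(x|spike): each stimulus weighted by the number of spikes it elicited
    p_spk, _ = np.histogram(x, bins=edges, weights=spike_counts)
    p_spk = p_spk / p_spk.sum()

    ratio = p_spk / (p_all + eps)                # g(x) = P_v(x|spike)/P_v(x)
    if np.isclose(alpha, 1.0):                   # formal limit: KL divergence, Eq. (7)
        pos = p_spk > 0
        return np.sum(p_spk[pos] * np.log(ratio[pos]))
    return np.sum(p_all * ratio**alpha) / (alpha - 1.0)
```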
By comparison, when the relevant dimension(s) are found by maximizing information [2], the goal is to maximize the Kullback-Leibler divergence, which can be obtained by taking a formal limit α → 1:

I[v] = \int dx\, P_v(x)\, \frac{P_v(x|spike)}{P_v(x)} \ln \frac{P_v(x|spike)}{P_v(x)} = \int dx\, P_v(x|spike) \ln \frac{P_v(x|spike)}{P_v(x)}. \qquad (7)

Returning to the variance optimization, the maximal value of F[v] that can be achieved by any dimension v is:

F_{max} = \int ds\, \frac{[P(s|spike)]^2}{P(s)}. \qquad (8)

It corresponds to the variance in the firing rate averaged across different inputs (see Eq. (9) below). Computation of the mutual information carried by the individual spike about the stimulus relies on similar integrals. Following the procedure outlined for computing mutual information [14], one can use Bayes' rule and the ergodic assumption to compute F_max as a time average:

F_{max} = \frac{1}{T} \int dt \left[ \frac{r(t)}{\bar{r}} \right]^2, \qquad (9)

where the firing rate r(t) = P(spike|s)/Δt is measured in time bins of width Δt using multiple repetitions of the same stimulus sequence. The stimulus ensemble should be diverse enough to justify the ergodic assumption [this could be checked by computing F_max for increasing fractions of the overall dataset size]. The average firing rate r̄ = P(spike)/Δt is obtained by averaging r(t) in time.

The fact that F[v] ≤ F_max can be seen either by simply noting that χ²[v] ≥ 0, or from the data processing inequality, which applies not only to the Kullback-Leibler divergence, but also to Rényi divergences [12, 13, 1]. In other words, the variance in the firing rate explained by a given dimension, F[v], cannot be greater than the overall variance in the firing rate, F_max. This is because we have averaged over all of the variations in the firing rate that correspond to inputs with the same projection value on the dimension v and differ only in projections onto other dimensions.

Optimization schemes based on Rényi divergences of different orders have very similar structure. In particular, the gradient can be evaluated in a similar way:

\nabla_v F^{(\alpha)} = \frac{\alpha}{\alpha-1} \int dx\, P_v(x|spike) \left[ \langle s|x, spike\rangle - \langle s|x\rangle \right] \frac{d}{dx} \left[ \left( \frac{P_v(x|spike)}{P_v(x)} \right)^{\alpha-1} \right], \qquad (10)

where ⟨s|x, spike⟩ = ∫ ds s δ(x − s·v) P(s|spike)/P_v(x|spike), and similarly for ⟨s|x⟩. The gradient is thus given by a weighted sum of spike-triggered averages ⟨s|x, spike⟩ − ⟨s|x⟩ conditional upon projection values of stimuli onto the dimension v for which the gradient is being evaluated. The similarity of the structure of both the objective functions and their gradients for different Rényi divergences means that similar numerical algorithms can be used for optimization of Rényi divergences of different orders. Examples of possible algorithms have been described [1, 2, 11] and include a combination of gradient ascent and simulated annealing.
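A finite-bin version of the gradient (10) can be assembled from the same histograms. The sketch below is again only illustrative: it approximates d/dx with `np.gradient`, uses per-bin sample means for ⟨s|x⟩ and ⟨s|x, spike⟩, and assumes α ≠ 1 (in the Kullback-Leibler limit the bracketed derivative becomes d/dx ln g(x) with prefactor 1).

```python
import numpy as np

def renyi_gradient(stimuli, spike_counts, v, alpha=2.0, n_bins=32, eps=1e-12):
    """Finite-bin approximation of the gradient in Eq. (10); assumes alpha != 1."""
    v = v / np.linalg.norm(v)
    x = stimuli @ v
    edges = np.linspace(x.min(), x.max(), n_bins + 1)
    idx = np.clip(np.digitize(x, edges) - 1, 0, n_bins - 1)
    dx = edges[1] - edges[0]

    p_all = np.bincount(idx, minlength=n_bins).astype(float)
    p_spk = np.bincount(idx, weights=spike_counts, minlength=n_bins)
    p_all /= p_all.sum()
    p_spk /= p_spk.sum()
    g = p_spk / (p_all + eps)                    # g(x) = P_v(x|spike)/P_v(x)

    # conditional averages <s|x> and <s|x,spike> within each bin
    n_dim = stimuli.shape[1]
    s_all = np.zeros((n_bins, n_dim))
    s_spk = np.zeros((n_bins, n_dim))
    for b in range(n_bins):
        in_bin = idx == b
        if in_bin.any():
            s_all[b] = stimuli[in_bin].mean(axis=0)
            w = spike_counts[in_bin]
            if w.sum() > 0:
                s_spk[b] = w @ stimuli[in_bin] / w.sum()

    dgdx = np.gradient(g ** (alpha - 1.0), dx)   # d/dx [g(x)^(alpha-1)]
    weights = alpha / (alpha - 1.0) * p_spk * dgdx
    return (weights[:, None] * (s_spk - s_all)).sum(axis=0)
```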
Here are a few facts common to this family of optimization schemes. First, as was proved in the case of information maximization based on the Kullback-Leibler divergence [2], the merit function F^(α)[v] does not change with the length of the vector v. Therefore v·∇_v F = 0, as can also be seen directly from Eq. (10), because v·⟨s|x, spike⟩ = x and v·⟨s|x⟩ = x. Second, the gradient is 0 when evaluated along the true receptive field. This is because for the true relevant dimension according to which spikes were generated, ⟨s|s_1, spike⟩ = ⟨s|s_1⟩, a consequence of the fact that relevant projections completely determine the spike probability. Third, merit functions, including variance and information, can be computed with respect to multiple dimensions by keeping track of stimulus projections on all the relevant dimensions when forming the probability distributions (4). For example, in the case of two dimensions v_1 and v_2, we would use

P_{v_1,v_2}(x_1, x_2|spike) = \int ds\, \delta(x_1 - s\cdot v_1)\, \delta(x_2 - s\cdot v_2)\, P(s|spike),
P_{v_1,v_2}(x_1, x_2) = \int ds\, \delta(x_1 - s\cdot v_1)\, \delta(x_2 - s\cdot v_2)\, P(s), \qquad (11)

to compute the variance with respect to the two dimensions as F[v_1, v_2] = ∫ dx_1 dx_2 [P(x_1, x_2|spike)]² / P(x_1, x_2).

If multiple stimulus dimensions are relevant for eliciting the neural response, they can always be found (provided a sufficient number of responses has been recorded) by optimizing the variance according to Eq. (11) with the correct number of dimensions. In practice this involves finding a single relevant dimension first, and then iteratively increasing the number of relevant dimensions considered while adjusting the previously found relevant dimensions. The amount by which relevant dimensions need to be adjusted is proportional to the contribution of subsequent relevant dimensions to neural spiking (the corresponding expression has the same functional form as that for relevant dimensions found by maximizing information, cf. Appendix B of [2]). If stimuli are either uncorrelated, or correlated but Gaussian, then the previously found dimensions do not need to be adjusted when additional dimensions are introduced. All of the relevant dimensions can be found one by one, by always searching only for a single relevant dimension in the subspace orthogonal to the relevant dimensions already found.

3 Illustration for a model simple cell

Here we illustrate how relevant dimensions can be found by maximizing variance (equivalent to least-square fitting), and compare this scheme with that of finding relevant dimensions by maximizing information, as well as with those based upon computing the spike-triggered average. Our goal is to reconstruct relevant dimensions of neurons probed with inputs of arbitrary statistics. We used stimuli derived from a natural visual environment [11] that are known to strongly deviate from a Gaussian distribution. All of the studies have been carried out with respect to model neurons. The advantage of doing so is that the relevant dimensions are known. The example model neuron is taken to mimic properties of simple cells found in the primary visual cortex. It has a single relevant dimension, which we will denote as ê_1. As can be seen in Fig. 1(a), it is phase and orientation sensitive. In this model, a given stimulus s leads to a spike if the projection s_1 = s·ê_1 reaches a threshold value θ in the presence of noise: P(spike|s)/P(spike) ≡ g(s_1) = ⟨H(s_1 − θ + ξ)⟩, where a Gaussian random variable ξ with variance σ² models additive noise, and the function H(x) = 1 for x > 0, and zero otherwise. The parameters θ for the threshold and the noise variance σ² determine the input-output function. In what follows we will measure these parameters in units of the standard deviation of stimulus projections along the relevant dimension. In these units, the signal-to-noise ratio is given by σ.

Figure 1 shows that it is possible to obtain a good estimate of the relevant dimension ê_1 by maximizing either information, as shown in panel (b), or variance, as shown in panel (c). The final value of the projection depends on the size of the dataset, as will be discussed below. In the example shown in Fig. 1 there were ≈ 50,000 spikes with an average probability of spike ≈ 0.05 per frame, and the reconstructed vector has a projection v̂_max·ê_1 = 0.98 when maximizing either information or variance. Having estimated the relevant dimension, one can proceed to sample the nonlinear input-output function. This is done by constructing histograms for P(s·v̂_max) and P(s·v̂_max|spike) of projections onto the vector v̂_max found by maximizing either information or variance, and taking their ratio. Because of Bayes' rule, this yields the nonlinear input-output function g of Eq. (1).
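The model simple cell of this section is easy to simulate: since ξ is Gaussian, ⟨H(s_1 − θ + ξ)⟩ reduces to the normal CDF Φ((s_1 − θ)/σ). The sketch below uses Gaussian white stimuli and a random unit vector for ê_1 purely for brevity (the paper uses natural stimuli and a Gabor-like relevant dimension), and treats g as the per-frame spike probability.

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(0)

# Hypothetical stand-ins: Gaussian white stimuli and a random unit-norm e1.
n_frames, n_dim = 50_000, 100
stimuli = rng.standard_normal((n_frames, n_dim))
e1 = rng.standard_normal(n_dim)
e1 /= np.linalg.norm(e1)

theta, sigma = 2.0, 0.5                  # threshold and noise, in units of std(s·e1)
s1 = stimuli @ e1
# <H(s1 - theta + xi)> with Gaussian xi of variance sigma^2 is the normal CDF:
g_true = norm.cdf((s1 - theta) / sigma)
spikes = (rng.random(n_frames) < g_true).astype(float)

# Sample the nonlinearity along a dimension (here the true one) by the
# histogram ratio; by Bayes' rule this is proportional to P(spike|x).
edges = np.linspace(-4, 4, 33)
p_all, _ = np.histogram(s1, bins=edges, density=True)
p_spk, _ = np.histogram(s1, bins=edges, weights=spikes, density=True)
g_est = np.where(p_all > 0, p_spk / np.maximum(p_all, 1e-12), 0.0)
```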
In Fig. 1(d) the spike probability of the reconstructed neuron, P(spike|s·v̂_max) (crosses), is compared with the probability P(spike|s_1) used in the model (solid line). A good match is obtained.

In actuality, reconstructing even just one relevant dimension from neural responses to correlated non-Gaussian inputs, such as those derived from a real-world environment, is not an easy problem. This fact can be appreciated by considering the estimates of the relevant dimension obtained from the spike-triggered average (STA), shown in panel (e). Correcting the STA by the second-order correlations of the input ensemble, through multiplication by the inverse covariance matrix, results in a very noisy estimate, shown in panel (f).

Figure 1: Analysis of a model visual neuron with one relevant dimension, shown in (a). Panels (b) and (c) show normalized vectors v̂_max found by maximizing information and variance, respectively; (d) the probability of a spike P(spike|s·v̂_max) (blue crosses: information maximization, red crosses: variance maximization) is compared to P(spike|s_1) used in generating spikes (solid line). Parameters of the model are σ = 0.5 and θ = 2, both given in units of the standard deviation of s_1, which is also the unit for the x-axis in panels (d) and (h). The spike-triggered average (STA) is shown in (e). An attempt to remove correlations according to the reverse correlation method, C_apriori^{-1} v_sta (decorrelated STA), is shown in panel (f) and, with regularization (see text), in panel (g). In panel (h), the spike probabilities as a function of stimulus projections onto the dimensions obtained as the decorrelated STA (blue crosses) and the regularized decorrelated STA (red crosses) are compared to the spike probability used to generate spikes (solid line).

The decorrelated STA has a projection value of 0.25. An attempt to regularize the inverse of the covariance matrix results in a closer match to the true relevant dimension [15, 16, 17, 18, 19] and has a projection value of 0.8, as shown in panel (g). While it appears to be less noisy, the regularized decorrelated STA can have systematic deviations from the true relevant dimensions [9, 20, 2, 11]. The preferred orientation is less susceptible to distortions than the preferred spatial frequency [19]. In this case regularization was performed by setting aside 1/4 of the data as a test dataset, and choosing a cutoff on the eigenvalues of the input covariance matrix that would give the maximal information value on the test dataset [16, 19].
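For comparison, the STA-based estimates of panels (e)-(g) amount to a few lines. The sketch below is a generic reverse-correlation implementation under the same data conventions as above, with the eigenvalue cutoff `n_keep` left as a free parameter rather than selected by test-set information as in the text.

```python
import numpy as np

def decorrelated_sta(stimuli, spike_counts, n_keep=None):
    """Spike-triggered average, optionally decorrelated by a regularized
    inverse of the a-priori stimulus covariance (panels (e)-(g)).

    n_keep=None applies the full C^{-1} (the noisy estimate of panel (f));
    a finite n_keep keeps only the leading eigenvalues of C. In the paper
    the cutoff is chosen to maximize information on held-out data, which
    is not reimplemented here.
    """
    sta = spike_counts @ stimuli / spike_counts.sum()   # v_sta
    C = np.cov(stimuli, rowvar=False)                   # a-priori covariance
    evals, evecs = np.linalg.eigh(C)                    # ascending eigenvalues
    if n_keep is not None:
        evals, evecs = evals[-n_keep:], evecs[:, -n_keep:]
    v = evecs @ ((evecs.T @ sta) / evals)               # (regularized) C^{-1} v_sta
    return v / np.linalg.norm(v)
```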
4 Comparison of Performance with Finite Data

In the limit of infinite data the relevant dimensions can be found by maximizing variance, information, or other objective functions [1]. In a real experiment, with a dataset of finite size, the optimal vector v̂ found by any of the Rényi divergences will deviate from the true relevant dimension ê_1. In this section we compare the robustness of optimization strategies based on Rényi divergences of various orders, including least-squares fitting (α = 2) and information maximization (α = 1), as the dataset size decreases and/or neural noise increases.

The deviation from the true relevant dimension, δv = v̂ − ê_1, arises because the probability distributions (4) are estimated from experimental histograms and differ from the distributions found in the limit of infinite data size. The effects of noise on the reconstruction can be characterized by taking the dot product between the relevant dimension and the optimal vector for a particular data sample: v̂·ê_1 = 1 − δv²/2, where both v̂ and ê_1 are normalized, and δv is by definition orthogonal to ê_1. Assuming that the deviation δv is small, we can use a quadratic approximation to expand the objective function (obtained with finite data) near its maximum. This leads to an expression δv = −[H^{(α)}]^{−1} ∇F^{(α)}, which relates the deviation δv to the gradient and Hessian of the objective function evaluated at the vector ê_1. The superscript (α) denotes the order of the Rényi divergence used as an objective function. Similarly to the case of optimizing information [2], the Hessian of the Rényi divergence of arbitrary order, when evaluated along the optimal dimension ê_1, is given by

H^{(\alpha)}_{ij} = -\alpha \int dx\, P(x|spike)\, C_{ij}(x) \left[ \frac{P(x|spike)}{P(x)} \right]^{\alpha-3} \left[ \frac{d}{dx} \left( \frac{P(x|spike)}{P(x)} \right) \right]^2, \qquad (12)

where C_ij(x) = ⟨s_i s_j|x⟩ − ⟨s_i|x⟩⟨s_j|x⟩ are covariance matrices of inputs sorted by their projection x along the optimal dimension.

When averaged over possible outcomes of N trials, the gradient is zero for the optimal direction. In other words, there is no specific direction towards which the deviations δv are biased. Next, in order to measure the expected spread of optimal dimensions around the true one ê_1, we need to evaluate ⟨δv²⟩ = Tr[⟨∇F^{(α)} ∇F^{(α)T}⟩ (H^{(α)})^{−2}], and therefore need to know the variance of the gradient of F averaged across different equivalent datasets. Assuming that the probability of generating a spike is independent for different bins, we find that ⟨∇F_i^{(α)} ∇F_j^{(α)}⟩ = B^{(α)}_{ij}/N_spike, where

B^{(\alpha)}_{ij} = \alpha^2 \int dx\, P(x|spike)\, C_{ij}(x) \left[ \frac{P(x|spike)}{P(x)} \right]^{2\alpha-4} \left[ \frac{d}{dx} \left( \frac{P(x|spike)}{P(x)} \right) \right]^2. \qquad (13)

Therefore the expected error in the reconstruction of the optimal filter is inversely proportional to the number of spikes:

\hat{v} \cdot \hat{e}_1 \approx 1 - \frac{1}{2} \langle \delta v^2 \rangle = 1 - \frac{\mathrm{Tr}'[BH^{-2}]}{2 N_{spike}}, \qquad (14)

where we omitted the superscripts (α) for clarity. Tr′ denotes the trace taken in the subspace orthogonal to the relevant dimension (deviations along the relevant dimension have no meaning [2], which mathematically manifests itself in the dimension ê_1 being an eigenvector of the matrices H and B with zero eigenvalue). Note that when α = 1, which corresponds to the Kullback-Leibler divergence and information maximization, A ≡ H^{(α=1)} = B^{(α=1)}. The asymptotic errors in this case are completely determined by the trace of the Hessian of information, ⟨δv²⟩ ∝ Tr′[A^{−1}], reproducing the previously published result for maximally informative dimensions [2]. Qualitatively, the expected error ∼ D/(2N_spike) increases in proportion to the dimensionality D of the inputs and decreases as more spikes are collected. This dependence is in common with the expected errors of relevant dimensions found by maximizing information [2], as well as with methods based on computing the spike-triggered average, both for white noise [1, 21, 22] and for correlated Gaussian inputs [2].
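Equations (12)-(14) can be evaluated numerically from the same binned quantities used earlier. The sketch below is our own rough implementation (the binning, the finite-difference derivative, and the pseudo-inverse standing in for Tr′ are all our choices); it returns the predicted deficit 1 − v̂·ê_1 along a given unit vector e1.

```python
import numpy as np

def predicted_projection_deficit(stimuli, spike_counts, e1, alpha=2.0,
                                 n_bins=32, eps=1e-12):
    """Rough numerical evaluation of Tr'[B H^{-2}]/(2 N_spike), Eqs. (12)-(14)."""
    x = stimuli @ e1
    edges = np.linspace(x.min(), x.max(), n_bins + 1)
    idx = np.clip(np.digitize(x, edges) - 1, 0, n_bins - 1)
    dx = edges[1] - edges[0]

    p_all = np.bincount(idx, minlength=n_bins).astype(float)
    p_spk = np.bincount(idx, weights=spike_counts, minlength=n_bins)
    p_all /= p_all.sum()
    p_spk /= p_spk.sum()
    g = p_spk / (p_all + eps)
    dg = np.gradient(g, dx)                       # d/dx of the ratio g(x)

    n_dim = stimuli.shape[1]
    H = np.zeros((n_dim, n_dim))
    B = np.zeros((n_dim, n_dim))
    for b in range(n_bins):
        members = stimuli[idx == b]
        if len(members) < 2 or p_spk[b] == 0:
            continue
        C = np.cov(members, rowvar=False)         # C_ij(x) within the bin
        H -= alpha * p_spk[b] * C * g[b] ** (alpha - 3) * dg[b] ** 2
        B += alpha**2 * p_spk[b] * C * g[b] ** (2 * alpha - 4) * dg[b] ** 2

    # Tr': work in the subspace orthogonal to e1 (zero eigenvalue of H and B)
    P = np.eye(n_dim) - np.outer(e1, e1)
    H_inv = np.linalg.pinv(P @ H @ P)
    return np.trace(P @ B @ P @ H_inv @ H_inv) / (2.0 * spike_counts.sum())
```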
Next we examine which of the Rényi divergences provides the smallest asymptotic error (14) for estimating relevant dimensions. Representing the covariance matrix as C_ij(x) = γ_ik(x) γ_jk(x) (the exact expression for the matrices γ will not be needed), we can express the Hessian matrix H and the covariance matrix of the gradient B as averages with respect to the probability distribution P(x|spike):

B = \int dx\, P(x|spike)\, b(x)b^T(x), \qquad H = \int dx\, P(x|spike)\, a(x)b^T(x), \qquad (15)

where the gain function g(x) = P(x|spike)/P(x), and the matrices b_ij(x) = α γ_ij(x) g′(x)[g(x)]^{α−2} and a_ij(x) = γ_ij(x) g′(x)/g(x). The Cauchy-Schwarz inequality for scalar quantities states that ⟨b²⟩/⟨ab⟩² ≥ 1/⟨a²⟩, where the average is taken with respect to some probability distribution. A similar result can also be proven for matrices under a Tr operation as in Eq. (14). Applying the matrix version of the Cauchy-Schwarz inequality to Eq. (14), we find that the smallest error is obtained when

\mathrm{Tr}'[BH^{-2}] = \mathrm{Tr}'[A^{-1}], \quad \text{with} \quad A = \int dx\, P(x|spike)\, a(x)a^T(x). \qquad (16)

The matrix A corresponds to the Hessian of the merit function for α = 1: A = H^{(α=1)}. Thus, among the various optimization strategies based on Rényi divergences, the Kullback-Leibler divergence (α = 1) has the smallest asymptotic errors. Least-square fitting corresponds to optimization based on the Rényi divergence with α = 2, and is expected to have larger errors than optimization based on the Kullback-Leibler divergence (α = 1) implementing information maximization. This result agrees with recent findings that the Kullback-Leibler divergence is the best distortion measure for performing lossy compression [3].

Below we use numerical simulations with model cells to compare the performance of the information (α = 1) and variance (α = 2) maximization strategies in the regime of relatively small numbers of spikes. We are interested in the range 0.1 ≲ D/N_spike ≲ 1, where the asymptotic results do not necessarily apply. The results of simulations are shown in Fig. 2 as a function of D/N_spike, as well as with varying neural noise levels. To estimate sharper (less noisy) input/output functions with σ = 1.5, 1.0, 0.5, 0.25, we used larger numbers of bins (16, 21, 32, 64, respectively). Identical numerical algorithms, including the number of bins, were used for maximizing variance and information. The relevant dimension for each simulated spike train was obtained as an average of 4 jackknife estimates, each computed by setting aside 1/4 of the data as a test set. Results are shown after 1000 line optimizations (D = 900), and performance on the test set was checked after every line optimization. As can be seen, generally good reconstructions with projection values ≳ 0.7 can be obtained by maximizing either information or variance, even in the severely undersampled regime D ≲ N_spike. We find that reconstruction errors are comparable for both the information and variance maximization strategies, and are better than or equal to (at very low spike numbers) those of STA-based methods. Information maximization achieves significantly smaller errors than least-square fitting when we analyze the results for all simulations across the four different model cells and spike numbers (p < 10⁻⁴, paired t-test).

Figure 2: Projection of the vector v̂_max obtained by maximizing information (red filled symbols) or variance (blue open symbols) on the true relevant dimension ê_1, plotted as a function of the ratio between the stimulus dimensionality D and the number of spikes N_spike, with D = 900. Simulations were carried out for model visual neurons with one relevant dimension from Fig. 1(a) and the input-output function of Eq. (1) described by threshold θ = 2.0 and noise standard deviation σ = 1.5, 1.0, 0.5, 0.25 for groups labeled A (△), B (▽), C (○), and D (□), respectively. The left panel also shows results obtained using the spike-triggered average (STA, gray) and the decorrelated STA (dSTA, black). In the right panel, we replot the results for information and variance optimization together with those for the regularized decorrelated STA (RdSTA, green open symbols). All error bars show standard deviations.
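The jackknife averaging used in these simulations is straightforward to express in code. In the sketch below, `find_dimension` is a hypothetical stand-in for whichever optimizer (information or variance maximization) is being tested, assumed to monitor its own performance on the held-out quarter of the data.

```python
import numpy as np

def jackknife_dimension(stimuli, spike_counts, find_dimension, n_folds=4, seed=0):
    """Average n_folds jackknife estimates, each setting aside 1/n_folds of the
    data as a test set (the protocol of the simulations above).

    find_dimension(train_s, train_n, test_s, test_n) -> (D,) is a hypothetical
    stand-in for the optimizer being evaluated.
    """
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(stimuli)), n_folds)
    v_avg = np.zeros(stimuli.shape[1])
    for k in range(n_folds):
        test = folds[k]
        train = np.concatenate([folds[j] for j in range(n_folds) if j != k])
        v = find_dimension(stimuli[train], spike_counts[train],
                           stimuli[test], spike_counts[test])
        v_avg += v / np.linalg.norm(v)
    return v_avg / np.linalg.norm(v_avg)
```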
5 Conclusions

In this paper we compared the accuracy of a family of optimization strategies, based on Rényi divergences, for analyzing neural responses to natural stimuli. Finding relevant dimensions by maximizing one of the merit functions, the Rényi divergence of order 2, corresponds to fitting the linear-nonlinear model in the least-square sense to neural spike trains. The advantage of this approach over the standard least-square fitting procedure is that it does not require the nonlinear gain function to be invertible. We derived the errors expected for relevant dimensions computed by maximizing Rényi divergences of arbitrary order in the asymptotic regime of large spike numbers. The smallest errors were achieved not in the case of (nonlinear) least-square fitting of the linear-nonlinear model to the neural spike trains (Rényi divergence of order 2), but with information maximization (based on the Kullback-Leibler divergence). Numerical simulations of both the information and variance maximization strategies showed that both algorithms perform well even when the number of spikes is very small. With small numbers of spikes, reconstructions based on information maximization also had slightly, but significantly, smaller errors than those of least-square fitting. This makes the problem of finding relevant dimensions, together with the problem of lossy compression [23, 3], one of the examples where information-theoretic measures are no more data limited than those derived from least squares. It remains possible, however, that other merit functions based on non-polynomial divergence measures could provide even smaller reconstruction errors than information maximization.

References

[1] L. Paninski. Convergence properties of three spike-triggered average techniques. Network: Comput. Neural Syst., 14:437-464, 2003.
[2] T. Sharpee, N. C. Rust, and W. Bialek. Analyzing neural responses to natural signals: Maximally informative dimensions. Neural Computation, 16:223-250, 2004. See also physics/0212110, and a preliminary account in Advances in Neural Information Processing 15, edited by S. Becker, S. Thrun, and K. Obermayer, pp. 261-268 (MIT Press, Cambridge, 2003).
[3] P. Harremoës and N. Tishby. The information bottleneck revisited or how to choose a good distortion measure. Proc. of the IEEE Int. Symp. on Information Theory (ISIT), 2007.
[4] E. de Boer and P. Kuyper. Triggered correlation. IEEE Trans. Biomed. Eng., 15:169-179, 1968.
[5] I. W. Hunter and M. J. Korenberg. The identification of nonlinear biological systems: Wiener and Hammerstein cascade models. Biol. Cybern., 55:135-144, 1986.
[6] R. R. de Ruyter van Steveninck and W. Bialek. Real-time performance of a movement-sensitive neuron in the blowfly visual system: coding and information transfer in short spike sequences. Proc. R. Soc. Lond. B, 265:259-265, 1988.
[7] V. Z. Marmarelis. Modeling methodology for nonlinear physiological systems. Ann. Biomed. Eng., 25:239-251, 1997.
[8] W. Bialek and R. R. de Ruyter van Steveninck. Features and dimensions: Motion estimation in fly vision. q-bio/0505003, 2005.
[9] D. L. Ringach, G. Sapiro, and R. Shapley. A subspace reverse-correlation technique for the study of visual neurons. Vision Res., 37:2455-2464, 1997.
[10] D. L. Ruderman and W. Bialek. Statistics of natural images: scaling in the woods. Phys. Rev. Lett., 73:814-817, 1994.
[11] T. O. Sharpee, H. Sugihara, A. V. Kurgansky, S. P. Rebrik, M. P. Stryker, and K. D. Miller. Adaptive filtering enhances information transmission in visual cortex. Nature, 439:936-942, 2006.
[12] S. M. Ali and S. D. Silvey. A general class of coefficients of divergence of one distribution from another. J. R. Statist. Soc. B, 28:131-142, 1966.
[13] I. Csiszár. Information-type measures of difference of probability distributions and indirect observations. Studia Sci. Math. Hungar., 2:299-318, 1967.
[14] N. Brenner, S. P. Strong, R. Koberle, W. Bialek, and R. R. de Ruyter van Steveninck. Synergy in a neural code. Neural Computation, 12:1531-1552, 2000. See also physics/9902067.
[15] F. E. Theunissen, K. Sen, and A. J. Doupe. Spectral-temporal receptive fields of nonlinear auditory neurons obtained using natural sounds. J. Neurosci., 20:2315-2331, 2000.
[16] F. E. Theunissen, S. V. David, N. C. Singh, A. Hsu, W. E. Vinje, and J. L. Gallant. Estimating spatio-temporal receptive fields of auditory and visual neurons from their responses to natural stimuli. Network, 3:289-316, 2001.
[17] K. Sen, F. E. Theunissen, and A. J. Doupe. Feature analysis of natural sounds in the songbird auditory forebrain. J. Neurophysiol., 86:1445-1458, 2001.
[18] D. Smyth, B. Willmore, G. E. Baker, I. D. Thompson, and D. J. Tolhurst. The receptive-field organization of simple cells in the primary visual cortex of ferrets under natural scene stimulation. J. Neurosci., 23:4746-4759, 2003.
[19] G. Felsen, J. Touryan, F. Han, and Y. Dan. Cortical sensitivity to visual features in natural scenes. PLoS Biol., 3:1819-1828, 2005.
[20] D. L. Ringach, M. J. Hawken, and R. Shapley. Receptive field structure of neurons in monkey visual cortex revealed by stimulation with natural image sequences. Journal of Vision, 2:12-24, 2002.
[21] N. C. Rust, O. Schwartz, J. A. Movshon, and E. P. Simoncelli. Spatiotemporal elements of macaque V1 receptive fields. Neuron, 46:945-956, 2005.
[22] O. Schwartz, J. W. Pillow, N. C. Rust, and E. P. Simoncelli. Spike-triggered neural characterization. Journal of Vision, 6:484-507, 2006.
[23] N. Tishby, F. C. Pereira, and W. Bialek. The information bottleneck method. In B. Hajek and R. S. Sreenivas, editors, Proceedings of the 37th Allerton Conference on Communication, Control and Computing, pp. 368-377. University of Illinois, 1999. See also physics/0004057.
