Limits to causal inference with state-space reconstruction for infectious disease

Sarah Cobey and Edward B. Baskerville
Ecology & Evolution, University of Chicago, Chicago, IL, USA
arXiv:1601.00716v3 [q-bio.QM] 29 Oct 2016

Abstract

Infectious diseases are notorious for their complex dynamics, which make it difficult to fit models to test hypotheses. Methods based on state-space reconstruction have been proposed to infer causal interactions in noisy, nonlinear dynamical systems. These “model-free” methods are collectively known as convergent cross-mapping (CCM). Although CCM has theoretical support, natural systems routinely violate its assumptions. To identify the practical limits of causal inference under CCM, we simulated the dynamics of two pathogen strains with varying interaction strengths. The original method of CCM is extremely sensitive to periodic fluctuations, inferring interactions between independent strains that oscillate with similar frequencies. This sensitivity vanishes with alternative criteria for inferring causality. However, CCM remains sensitive to high levels of process noise and changes to the deterministic attractor. This sensitivity is problematic because it remains challenging to gauge noise and dynamical changes in natural systems, including the quality of reconstructed attractors that underlie cross-mapping. We illustrate these challenges by analyzing time series of reportable childhood infections in New York City and Chicago during the pre-vaccine era. We comment on the statistical and conceptual challenges that currently limit the use of state-space reconstruction in causal inference.

Background

Identifying the forces driving change in natural systems is a major goal in ecology. Because experiments are often impractical and come at the cost of generalizability, a common approach is to fit mechanistic models to observations. Testing hypotheses through mechanistic models has a particularly strong tradition in infectious disease ecology [1–4].
Models that incorporate both rainfall and host immunity, for example, better explain patterns of malaria than models with only rainfall [5]; models with school terms fit the historic periodicity of measles in England and Wales [6,7]. The ability of fitted mechanistic models to predict observations outside the training data strongly suggests that biological insight can be gained. There is nonetheless a pervasive risk that predictive variables merely correlate with the true, hidden variables, or that the model’s functional relationships create spurious resemblances to the true dynamics. This structural uncertainty in the models themselves limits inference [8–12].

An alternative approach to inferring causality is to examine the time series of potentially interacting variables without invoking a model. These methods face a similar challenge: they must distinguish correlated independent variables sharing a mutual driver from correlations arising from direct or indirect interactions. Many of these methods, including Granger causality [13] and other related methods [14–16], infer interactions in terms of information flow in a probabilistic framework and cannot detect bidirectional causality. A recent suite of methods based on dynamical systems theory proposes to infer interactions, both unidirectional and bidirectional, in systems that are nonlinear, noisy, and potentially high-dimensional [17–19]. The basic idea is that if X drives Y, information about X is embedded in the time series of Y. Examining the relationships between delay-embeddings of the time series of X and Y can reveal whether X drives Y, Y drives X, both, or neither. These approaches, which we refer to collectively as convergent cross-mapping (CCM), have been offered as general tools to analyze causation in nonlinear dynamical systems [17–19].

The mathematical foundations of CCM, and therefore its assumptions, lie in deterministic nonlinear systems theory. After sufficient time, the states of a deterministic dynamical system reach an attractor, which may be a point equilibrium, a limit cycle, or a higher-dimensional chaotic attractor.
By Takens’ theorem, a one-dimensional time series $X(t)$ from the system can be mapped perfectly to the attractor in the full state space of the system by constructing a delay embedding, in which states of the full system are mapped to delay vectors, $x(t) = \{X(t), X(t - \tau_1), X(t - \tau_2), \ldots, X(t - \tau_{E-1})\}$, for delays $\tau_i$ and an embedding dimension $E$, which must be at least as large as the dimensionality of the attractor [20]. This mapping provides the basis for causal inference under CCM: if Y drives (causes) X, then a newly observed $x(t)$ can perfectly reconstruct the corresponding $\hat{Y}(t)$ from past observations of the mapping $x(t) \to Y(t)$ (Fig. 1A). As the number of observed delay vectors $x(t)$ increases, the reconstruction converges to small error, as observed points on the reconstructed attractor become close together [17].

With finite, noisy real data, the reconstruction is necessarily imperfect, and two operational criteria have been used to detect causality. The first criterion (Fig. 1B) is based simply on this improvement in reconstruction quality with the number of observations. This approach is known to produce false positives in the case of strongly driven variables, where the system becomes synchronized to the driver [17,21]. This failure is logically consistent with the theory: the theory implies that, with perfect data, causal drivers will produce good reconstructions, but not that non-causal drivers will not produce good reconstructions. The second criterion (Fig. 1C) tries to correct this problem by additionally considering the directionality of information flow in time [18]. If one variable drives another, the best predictions of current states of the driven variable should come from past, not current or future, states of the driver.

Many ecological systems undergo synchronized diurnal or annual fluctuations and thus raise doubts about the first criterion.
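
As an illustration of the delay embedding and cross-mapping described above, the following is a minimal Python sketch and not the authors' implementation. The function names, the leave-one-out library, the use of E + 1 nearest neighbors, and the exponential distance weighting are assumptions borrowed from common simplex-projection practice; the coupled logistic maps in the usage lines are a toy system with hypothetical parameters.

```python
import math


def delay_embed(series, delays):
    """Delay vectors [X(t), X(t - tau_1), ..., X(t - tau_{E-1})] for every usable t."""
    start = max(delays, default=0)
    times = list(range(start, len(series)))
    vectors = [[series[t]] + [series[t - d] for d in delays] for t in times]
    return times, vectors


def pearson(a, b):
    """Pearson correlation of two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((p - mean_a) * (q - mean_b) for p, q in zip(a, b))
    var_a = sum((p - mean_a) ** 2 for p in a)
    var_b = sum((q - mean_b) ** 2 for q in b)
    return cov / math.sqrt(var_a * var_b)


def cross_map_correlation(x, y, delays):
    """Reconstruct Y(t) from delay vectors of X; return corr(Y, Y-hat)."""
    times, vectors = delay_embed(x, delays)
    k = len(delays) + 2  # E + 1 neighbors with E = len(delays) + 1 (assumed convention)
    actual, predicted = [], []
    for i, v in enumerate(vectors):
        # Leave-one-out: all other delay vectors form the prediction library.
        nearest = sorted(
            (math.dist(v, w), times[j]) for j, w in enumerate(vectors) if j != i
        )[:k]
        d_min = max(nearest[0][0], 1e-12)  # guard against a zero distance
        weights = [math.exp(-d / d_min) for d, _ in nearest]
        y_hat = sum(wt * y[t] for wt, (_, t) in zip(weights, nearest)) / sum(weights)
        actual.append(y[times[i]])
        predicted.append(y_hat)
    return pearson(actual, predicted)


# Toy usage: two weakly coupled logistic maps (hypothetical parameters).
x, y = [0.4], [0.2]
for _ in range(299):
    x_t, y_t = x[-1], y[-1]
    x.append(x_t * (3.8 - 3.8 * x_t - 0.02 * y_t))
    y.append(y_t * (3.5 - 3.5 * y_t - 0.1 * x_t))
print("rho, X cross-mapping Y:", cross_map_correlation(x, y, delays=[1]))
print("rho, Y cross-mapping X:", cross_map_correlation(y, x, delays=[1]))
```

The sketch uses the whole series as one library; the bootstrap sampling of libraries and the cross-map lag from Fig. 1 are omitted to keep it short.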
Transient dynamics, demographic and environmental noise, and observation error, all ubiquitous in nature, raise general concerns, since they violate the theory’s assumption that variables are perfectly observed in a deterministic system. Variations of CCM have nonetheless been applied to such systems to test hypotheses about who interacts with whom [17–19,22,23].

[Figure 1. Panel A: time-series data; bootstrap delay-vector sampling into a library; nearest-neighbor prediction; cross-map correlation distribution. Panel B, "Criterion 1: Cross-Map Increase": density of cross-map ρ(Y, Ŷ) at L_min and L_max. Panel C, "Criterion 2: Negative Cross-Map Lag": cross-map ρ against cross-map lag ℓ for "X causes Y" and "Y causes X".]

Figure 1: Summary of criteria for detecting causality. (A) Schematic of cross-map algorithm for testing $Y \to X$. Delay vectors in X, mapped to values in Y with lag ℓ, are bootstrap-sampled to construct a prediction library. For each delay vector in X, reconstructed values $\hat{Y}$ are calculated from a distance-weighted sum of Y values from nearest neighbors in the library. Many sampled libraries yield a distribution of cross-map correlations between actual Y and reconstructed $\hat{Y}$. (B) Criterion 1 (cross-map increase). Bootstrap distributions of cross-map correlation are calculated at minimum and maximum library sizes with ℓ = 0; causality is inferred if the correlation at $L_{\max}$ is significantly greater than the correlation at $L_{\min}$. (C) Criterion 2 (negative cross-map lag). Cross-map correlations are calculated across different values of ℓ. Causality is inferred if the highest cross-map correlation for negative ℓ is positive and significantly greater than the highest value for nonnegative ℓ.

We investigated whether the frequently periodic, noisy, and transient dynamics of ecological systems are a current obstacle to causal inference based on state-space reconstruction. These factors have been addressed to varying degrees in different contexts [17–19] but not systematically. Specifically, we examined whether the two criteria for causal inference are robust to inevitable uncertainties about the dynamics underlying the data. With little prior knowledge of a system’s complexity, including the influences of transient dynamics and noise, can we reach statistically rigorous conclusions about who interacts with whom? Infectious diseases provide a useful test case because their dynamics have been extensively studied, long time series are available, and pathogens display diverse immune-mediated interactions [24]. Their dynamics are also influenced by seasonal variation in transmission rates, host population structure, and pathogen evolution. The ability to test directly for the presence of interactions would save considerable effort over fitting semi-mechanistic models that incorporate these complexities. We find that although CCM appears to work beautifully in some instances, it does not in others.
Noise and transient dynamics contribute to poor outcomes, as do statistical ambiguities in the methodology itself. We propose that except in extreme circumstances, the current method cannot reliably reveal causality in natural systems.

Results

To assess the reliability of CCM, we began by simulating the dynamics of two strains with stochastic, seasonally varying transmission rates (Methods). In large systems, many factors might influence these rates. In low-dimensional models, these factors are typically represented as process noise. We consequently varied the level of process noise in our simulations by changing its standard deviation, η. We also varied the strength of competition from strain 2 on strain 1 ($\sigma_{12}$); strain 1, in contrast, never affected strain 2 ($\sigma_{21} = 0$). For each level of competition and process noise, we simulated 100 replicates from random initial conditions to stochastic fluctuations around a deterministic attractor. One thousand years of error-free monthly incidence were output to give CCM the best chance to work. For each combination of parameters (competition strength $\sigma_{12}$ and process noise η), we examined whether strain interactions were correctly inferred. When $\sigma_{12} > 0$, strain 2 should be inferred to “drive” (influence) strain 1. Because $\sigma_{21} = 0$, strain 1 should never be inferred to drive strain 2.

To detect interactions, for each individual time series, we identified the delay-embeddings (Fig. 1A) and applied one of two causality criteria using the reconstructed attractors (Fig. 1B,C and Methods). Both criteria are based on the cross-map correlation ρ, which is the correlation between reconstructed values of $\hat{Y}$ and actual values of Y, given the reconstructed attractor of X. We use p < 0.05 to identify significant differences in these correlations because we are interested in situations in which the null hypothesis of no change in correlation, and thus no interaction, is rejected. Criterion 1 [17,19] measures whether the cross-map correlation increases as the number of observations of the putatively driven variable grows (Fig. 1B).
We refer to this as the cross-map increase criterion. Criterion 2 [18] infers a causal interaction if the maximum cross-map correlation of the putative driver is positive and occurs in the past (i.e., at a negative temporal lag; Fig. 1C). We refer to this as the negative cross-map lag criterion. For simplicity, we start with Criterion 1.

Sensitivity to periodicity

Criterion 1, which requires a significant increase in cross-map correlation ρ with observation library size L, frequently detected interactions that did not exist. Whenever strain 2 had no effect on strain 1, CCM incorrectly inferred an influence (Fig. 2A). Although strain 1 never influenced strain 2, it was often predicted to (Fig. 2A). Sample time series suggested a strong correlation between synchronous oscillations and the appearance of bidirectional interactions (Fig. 2B). In contrast, when strain 2 appeared to drive strain 1 but not vice-versa ($\sigma_{12} = 0$ and η = 0.05), strain 1 often oscillated with a period that was an integer multiple of the other strain’s (Fig. 2C). Thus, as expected, strongly synchronized dynamics prevented separation of the variables. Additionally, the resemblance of strain 2 to the seasonal driver led to false positives even when the strains were independent and strain 1 oscillated at a different frequency.

[Figure 2, panel A: heat maps. Rows: cross-immunity (σ12). Columns: s.d. of process noise (η).]

             C2 causes C1                          C1 causes C2
σ12     1e-06  0.001  0.01   0.05   0.1      1e-06  0.001  0.01   0.05   0.1
1       1.00   1.00   1.00   1.00   0.84     1.00   1.00   1.00   0.00   0.76
0.5     1.00   1.00   1.00   1.00   0.98     1.00   1.00   0.96   0.00   0.78
0.25    1.00   1.00   1.00   1.00   1.00     1.00   1.00   1.00   0.00   0.52
0       1.00   1.00   1.00   1.00   1.00     0.00   0.00   0.00   0.00   0.44

[Figure 2, panels B and C: monthly incidence of C1 and C2 against time (years), 0 to 25.]

Figure 2: Interactions detected as a function of process noise and the strength of interaction ($C_2 \to C_1$) and representative time series.
(A) Heat maps show the fraction of 100 replicates significant for each inferred interaction for different parameter combinations. A significant increase in cross-map correlation ρ with library length L indicated a causal interaction. The time series consisted of 1000 years of monthly data. (B) Representative 25-year sample of the time series for which mutual interactions were inferred ($\sigma_{12} = 0.25$, η = 0.01). (C) Representative sample of the time series for which $C_2$ is inferred to drive $C_1$ but not vice-versa ($\sigma_{12} = 0.25$, η = 0.05).

[Figure 3: scatter plot of p-values against maximum cross-spectral density (0.0 to 0.4). Legend for $\sigma_{12}$: 0, 0.25, 0.5, 1. Legend for η: 1e-06, 0.001, 0.01, 0.05, 0.1.]

Figure 3: Shared frequency spectra predict probability of inferred interaction. Points show the maximum cross-spectral densities of strains 1 and 2 plotted against the p-values for $C_1 \to C_2$ for 1000 years of annual data. In all replicates, $C_1$ never actually drives $C_2$. Point color indicates the strength of $C_2 \to C_1$ ($\sigma_{12}$), and point size indicates the standard deviation of the process noise (η) on transmission rates.

The sensitivity of the method to periodicity persisted despite transformations of the data and changes to the driver. One possible solution to reducing seasonal effects, sampling annual rather than monthly incidence, reduced the overall rate of false positives but also failed to detect some interactions (Fig. S1A). Furthermore, when the effects of strain 2 on 1 were strongest, the reverse interaction was more often inferred. Sampling the prevalence at annual intervals gave similar results (Fig. S1B), and first-differencing the data did not qualitatively change outcomes (Fig. S1C). The method yielded incorrect results even without seasonal forcing (ε = 0) because of noise-induced oscillations (Fig. S1D). In all of these cases, the presence of shared periods between the strains correlated strongly and significantly with the rate of detecting a false interaction (Fig. 3).
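
To make the idea of a shared period concrete, the sketch below computes a raw cross-periodogram for two series and reports the frequency at which it peaks. This is only an illustration under stated assumptions: the estimator and normalization behind the cross-spectral densities in Fig. 3 are not given in this section, the function names are hypothetical, and the two sinusoidal "strains" are synthetic.

```python
import cmath
import math


def dft(series):
    """Discrete Fourier transform of a mean-centered series (O(n^2), for clarity)."""
    n = len(series)
    mean = sum(series) / n
    centered = [v - mean for v in series]
    return [
        sum(c * cmath.exp(-2j * math.pi * k * t / n) for t, c in enumerate(centered))
        for k in range(n // 2 + 1)
    ]


def max_cross_spectrum(x, y):
    """Return (frequency index, magnitude) of the largest cross-periodogram value."""
    fx, fy = dft(x), dft(y)
    n = len(x)
    cross = [abs(a * b.conjugate()) / n for a, b in zip(fx, fy)]
    # Skip k = 0, which only carries the (removed) mean.
    k_peak = max(range(1, len(cross)), key=cross.__getitem__)
    return k_peak, cross[k_peak]


# Synthetic example: ten years of monthly values. Both series contain an
# annual cycle; the second also has a two-year cycle that the first lacks.
n = 120
strain1 = [math.sin(2 * math.pi * t / 12) for t in range(n)]
strain2 = [
    0.5 * math.sin(2 * math.pi * t / 12 + 1.0) + math.sin(2 * math.pi * t / 24)
    for t in range(n)
]
k_peak, magnitude = max_cross_spectrum(strain1, strain2)
print("period of strongest shared component (months):", n / k_peak)
```

The two-year cycle is the largest component of the second series, yet the peak falls on the annual cycle because only that period is present in both.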
Because cross-map skill should depend on the quality of the reconstructed attractor, we investigated performance under other methods of constructing the attractors of the two strains (Methods). Nonuniform embedding methods allow the time delays to occur at irregular intervals, $\tau_1, \tau_2, \ldots, \tau_{E-1}$, which may provide a more accurate reconstruction. Alternative reconstruction methods, including nonuniform embedding [25,26], random projection [22], and maximizing the cross-map (rather than univariate) correlation, failed to fix the problem (Fig. S2).

Criterion 2, which infers that Y drives X if there is a positive cross-map correlation that is maximized at a negative cross-map lag, performed relatively well (Fig. 4). Fewer false positives were detected, although the method missed some weak extant interactions ($\sigma_{12} = 0.25$) and interactions in noisy systems (η = 0.05, 0.1). Results for annual data were similar (Fig. S3A). Requiring that ρ be not only positive but also increasing barely affected performance (Fig. S3B).

[Figure 4: heat maps. Rows: cross-immunity (σ12). Columns: s.d. of process noise (η).]

             C2 causes C1                          C1 causes C2
σ12     1e-06  0.001  0.01   0.05   0.1      1e-06  0.001  0.01   0.05   0.1
1       1.00   1.00   0.98   0.22   0.31     0.00   0.00   0.00   0.09   0.08
0.5     0.95   0.90   0.97   0.05   0.25     0.01   0.00   0.00   0.07   0.11
0.25    0.12   0.10   0.31   0.15   0.08     0.00   0.06   0.05   0.11   0.08
0       0.18   0.11   0.14   0.11   0.14     0.00   0.00   0.00   0.09   0.04

Figure 4: Interactions detected as a function of process noise and the strength of interaction ($C_2 \to C_1$) and representative time series. Heat maps show the fraction of 100 replicates significant for each inferred interaction for different parameter combinations. A maximum, positive cross-map correlation ρ at a negative lag indicated a causal interaction. Each replicate used 100 years of monthly incidence.

Limits to identifiability

If two variables X and Y share the same driver but do not interact, and the driving is strong enough, X may resemble the driver so closely that X appears to drive Y.
In a similar vein, when the two strains in our system have identical transmission rates ($\beta_1 = \beta_2$) and one strongly drives the other ($\sigma_{12} = 1$), the direction of the interaction cannot be detected when the dynamics are nearly deterministic ($\eta = 10^{-6}$) (Fig. S3C). Causal inference in such cases becomes difficult.

To investigate the limits to distinguishing strains that are ecologically similar and do not interact, we varied the correlation of the strain-specific process noise while applying the more conservative of the two criteria for inferring causality (Criterion 2), that the cross-map correlation ρ be positive and peak at a negative lag [18]. Process noise can be thought of as a hidden environmental driver that affects both strains simultaneously, and thus the strength of correlation indicates the relative contribution of shared versus strain-specific noise. With two identical, independent strains, no seasonal forcing, and low process noise (η = 0.01), the false positive rate depended on correlation strength and the quantity of data. When using 100 years of monthly incidence, the false positive rate varied non-monotonically with correlation strength, with a minimum (5%-6%) at a correlation of 0.75 and its highest values, near 24%, at correlations of 0 and 1 (Fig. S4A). Using 1000 years of annual incidence reduced false positive rates to 5%-9% for imperfectly correlated noise (Fig. S4B). The best performance occurred with 100-year monthly data when cross-map correlation was required to increase with library length (Fig. S4C). Thus, the independence of two strains will generally be detected as long as they experience imperfectly correlated noise.

We next considered the problem of identifying two ecologically distinct strains ($\beta_1 \neq \beta_2$) when one strain strongly drives the other ($\sigma_{12} = 1$) and its dynamics resemble the seasonal driver. In this case, even with perfectly correlated process noise, correct interactions are consistently inferred (Fig. S5). Thus, we conclude that the presence of noise, even highly correlated noise, can help distinguish causality between coupled, synchronized variables [14].
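
The interpretation of correlated process noise as a mix of shared and strain-specific components can be sketched as follows. This is a hedged illustration only: how noise enters the simulation model is described in the Methods, which are not part of this section, so the Gaussian form, the mixing weights, and the function name here are assumptions.

```python
import math
import random


def correlated_noise(n, correlation, sd, rng):
    """Two noise series with standard deviation sd and the given correlation (0 to 1).

    Each value mixes one shared draw with one strain-specific draw, so the
    correlation equals the share of variance contributed by the shared driver.
    """
    shared_weight = math.sqrt(correlation)
    own_weight = math.sqrt(1.0 - correlation)
    noise1, noise2 = [], []
    for _ in range(n):
        shared = rng.gauss(0.0, 1.0)
        noise1.append(sd * (shared_weight * shared + own_weight * rng.gauss(0.0, 1.0)))
        noise2.append(sd * (shared_weight * shared + own_weight * rng.gauss(0.0, 1.0)))
    return noise1, noise2


# 100 years of monthly noise with s.d. 0.01 and correlation 0.75.
noise_strain1, noise_strain2 = correlated_noise(1200, 0.75, 0.01, random.Random(42))
```

With a correlation of 1 the two series are identical, and with a correlation of 0 they are independent, which matches the two extremes discussed above.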
It is more difficult to distinguish non-interacting, dynamically equivalent variables. In the latter case, noise has inconsistent effects on causal inference, although Criterion 2 may perform much better than Criterion 1. These results at least hold for “modest” noise (η = 0.01): as shown earlier, higher levels hurt performance (Fig. 4).

Transient dynamics

CCM is optimized for dynamics that have converged to a deterministic attractor. Directional parameter changes in time and large perturbations can prevent effective cross-mapping because the method requires a consistent mapping between system states as well as sufficient coverage of state space by the data. We evaluated the impact of both of these types of transient dynamics on causal inference, using a simple example of each as proof of principle.

In the first test, we identified two sets of parameter values where CCM was successful under Criterion 2 (intermediate interaction strength, $\sigma_{21} = 0.5$; seasonal forcing, ε = 0.1; process noise, η = 0.01; and transmission rates $\beta_1$ of 0.30 (Fig. 5A) and 0.32 (Fig. 5B)). We tested CCM on simulations with the parameter values fixed and then with the transmission rate $\beta_1$ varying linearly over time between the two values. All three tests used 100 years of monthly incidence. Of 100 replicates, with $\beta_1$ fixed at 0.30, CCM failed to detect an interaction 5 times, and never falsely detected an absent interaction. With $\beta_1$ fixed at 0.32, there were 12 false negatives and 1 false positive. When $\beta_1$ varied from 0.30 to 0.32, error rates increased: there were 29 false negatives and 44 false positives. Transient dynamics due to a linear change in a system parameter can thus lead to incorrect causal inference even when causal inference is successful before and after the change.

In the second test, we began simulations at random initial conditions far from equilibrium and applied CCM to the first 100 years of monthly incidence. When strain 2 weakly drives strain 1 ($\sigma_{12} = 0.5$), causal inference is compromised, even when process noise is low (η = 0.01; Fig. S6).
In 100 simulations of this scenario, the correct interaction (strain 2 driving strain 1) was always detected after transients had passed, but it was detected in only 19 of 100 simulations that included transients. Furthermore, a reverse interaction (strain 1 driving 2) was incorrectly detected in 21 of 100 simulations. The method thus performed worse than chance in identifying interactions that were present, and it also regularly predicted nonexistent interactions.

[Figure 5: panels A, B, and C, each showing monthly incidence of C1 and C2 against time (years), 0 to 100.]

Figure 5: Incorrect inference with changing transmission rate. Example time series for testing transient dynamics. Each time series contained 100 years of monthly incidence data. The transmission rate $\beta_1$ for the driven strain $C_1$ was fixed at $\beta_1 = 0.30$ (A) and $\beta_1 = 0.32$ (B), and varied linearly over time between the two values (C). The transient time series yields high false positive and false negative rates under CCM. Interaction strength was $\sigma_{21} = 0.5$, process noise was η = 0.01, and seasonal forcing was ε = 0.

Application to childhood infections

Given the apparent success of CCM under Criterion 2 (negative cross-map lag) with two strains and little noise near the attractor, we investigated whether the method might shed light on the historic dynamics of childhood infections in the pre-vaccine era. Time series analyses have suggested that historically common childhood pathogens may have competed with or facilitated one another [27,28]. We obtained the weekly incidence of six reportable infections in New York City from intermittent periods spanning 1906 to 1953 [29] (Fig. 6A). Six of 30 pairwise interactions were significant at the p < 0.05 level, not correcting for multiple tests (Fig. 6C).
Polio drove mumps and varicella, scarlet fever drove mumps and polio, and varicella and pertussis drove measles. Typical cross-map lags occurred at one to three years (Fig. S7). The inferred interactions were identical if we required that the cross-map correlation ρ be increasing and not merely positive.

Although we specifically chose infectious diseases not subject to major public health interventions in the sampling period, it is possible that the New York data contain noise and transient dynamics. To check the robustness of the conclusions, we analyzed analogous time series from Chicago from the same period (Fig. 6B). Completely different interactions appeared (Fig. 6C). Not correcting for multiple tests, pertussis drove scarlet fever and varicella; accepting marginally significant negative lags (p = 0.055), polio drove measles. In these cases, the maximum cross-map correlation ρ was not only positive but also increased at negative lag. Requiring that ρ only be positive at negative lag, polio also drove pertussis, measles drove mumps and varicella, and mumps drove scarlet fever. Except in one case, all negative lags occurred at more than one year (Fig. S8). Thus, no consistent interactions appeared in epidemiological time series of two major, and possibly dynamically coupled, cities.

To investigate the possibility that our method of attractor reconstruction might be unduly sensitive to noise and transient dynamics, we repeated the procedure with a method based on random projections [22]. Once again, no interactions were common to both cities (Fig. 6D). Furthermore, only one of the original eight interactions from the first reconstruction method reappeared with random projection (two of eight reappeared if disregarding the city), and two interactions changed direction (three if disregarding the city). Both reconstruction methods selected similar lags (Figs. S9, S10).

Discussion

CCM is, in theory, an efficient alternative to mechanistic modeling for causal inference in nonlinear systems.
By evaluating properties of reconstructed dynamics in state space, it sidesteps any need to formulate and fit what are often inaccurate mathematical models. In current practice, CCM appears an unstable basis for inference in natural systems. We simulated two interacting strains and found that the original CCM (Criterion 1) can lead to erroneous conclusions whenever strains fluctuated at similar frequencies. Applying a different criterion for causality that considers the temporal lag at which the cross-map correlation is maximized [18], rather than the change in the cross-map correlation with time series length L [17], avoids this problem. Inference with Criterion 2 is somewhat robust to process noise, which can improve performance in some cases. But the method has two problems, even with perfect and abundant observations. First, it remains susceptible to deviations from its core dynamical assumptions. “High” process noise and transient dynamics each diminish performance, leading to false positives and negatives. Although some observed systems may follow deterministic dynamics that do not themselves change in time, this assumption is often dubious in ecology. Second, even when the dynamical assumptions are upheld, seemingly equally justifiable methods of attractor reconstruction yield different results. If the aim is to test hypotheses statistically, these problems raise doubts about the suitability of methods based on state-space reconstruction in ecology.

Oscillations are common in nature, especially in infectious diseases, and suggest that the original criterion (Criterion 1) for causal inference could routinely mislead. Climatic and seasonal cycles, driven by such factors as school terms, El Niño, and absolute humidity, pervade the dynamics of many pathogens and influence the timing of epidemics [5,6,30–32]. Infectious diseases can also exhibit fluctuations in the
