IEEETRANSACTIONSONBIOMEDICALENGINEERING,VOL.?,NO.?,JANUARY2011 1 Band-phase-randomized Surrogates to assess nonlinearity in non-stationary time series Diego Guar´ın∗, Student Member, IEEE, Edilson Delgado, Member, IEEE, and A´lvaro Orozco. Abstract—Testingfornonlinearityisoneofthemostimportant from the surrogates. If the statistic value of the data deviates preprocessing steps in nonlinear time series analysis. Typically, from that of the surrogates, then the null hypothesis may be this is done by means of the linear surrogate data methods. 1 rejected. Otherwise, it may not. But it is a known fact that the validity of the results heavily 1 The linear methods for constrained realizations namely (i) depends on the stationarity of the time series. Since most 0 physiologicalsignalsarenon-stationary,itiseasytofalselydetect Random shuffle (RS); (ii) Random phase (RP); and, (iii) 2 nonlinearity using the linear surrogate data methods. In this Amplitude adjusted Fourier transform (AAFT) surrogates[1], n document, we propose a methodology to extend the procedure were developed to test the null hypothesis that the data came a forgeneratingconstrainedsurrogatetimeseriesinordertoassess from a (i) i.i.d gaussian random process, (ii) linear correlated J nonlinearity in non-stationary data. The method is based on stochastic process; and (iii) nonlinear static transformation of 1 theband-phase-randomizedsurrogates, whichconsists(contrary 3 to the linear surrogate data methods) in randomizing only a a linear stochastic process. Surrogates generated with the RS portion of the Fourier phases in the high frequency band. method are constrained to the amplitude distribution (AD) or ] Analysis of simulated time series showed that in comparison rankdistributionoftheoriginaldata,whiletheonesgenerated P to the linear surrogate data method, our method is able to with the RP algorithm preserve the autocorrelation (AC(τ)) A discriminate between linear stationarity, linear non-stationary and surrogates generated with the AAFT algorithm preserve . and nonlinear time series. When applying our methodology to t heart rate variability (HRV) time series that present spikes both the AD and AC(τ) of the original data. a t and other kinds of nonstationarities, we where able to obtain As the process that generates surrogate data is stationary [3], s surrogatetimeseriesthatlooklikethedataandpreserveslinear there could be some situations where surrogatesfail to match [ correlations,somethingthatisnotpossibletodowiththeexisting thedata,eventhoughtheAD andAC(τ) arethe sameforthe 1 surrogate data methods. data and surrogates, so the null hypothesis could be trivially v Index Terms—Computational methods in statistical physics rejected.Thisis particularytrue whendata are nonstationary. 3 andnonlineardynamics,hypothesistesting,surrogatedata,heart 6 Because of this, when the statistical properties of data are rate variability. 0 time dependent it is not feasible to use the linear surrogate 6 data methodsfor testing nonlinearity [4] (Timmer [5] showed 1. I. INTRODUCTION that for some non-stationary processes the test is able to 0 discriminate between linear and nonlineardata, but this is not The surrogate data method, initially introduced by J. 1 a general result). 1 Theiler et al. [1] is nowadays one of the most popular From the introduction of the linear surrogate data method, : tests used in nonlinear time series analysis to investigate v there has been a widespread interest in modifyingit to assess the existence of nonlinear dynamics underlying experimental i nonlinearity in non-stationary time series. The first attempt X data. The approach is to formulate a null hypothesis for a (as we can tell) to apply the method to non-stationary time r specific process class and compare the system output to this a series was done by T. Schreiber [6]. He proposed that to deal hypothesis. The surrogate data method can be undertaken withnon-stationaritydata,thenullhypothesisshouldincludeit in two different ways: Typical realizations are Monte Carlo explicitly.Becauseotherwise,therejectionofanullhypothesis generated surrogates from a linear model that provides a can be equally to nonlinearity or non-stationarity. e.g., given good fit to the data; constrained realizations are surrogates any process we can ask whether the data is compatible with generated from the time series to fulfill the null hypothesis the null hypothesis of a correlated linear stochastic process and to conform to certain properties of the data. The latter with time dependent local behavior. In order to answer this approachis suitable for hypothesistesting due to the fact that question in a statistical sense we have to create surrogate itdoesnotrequierepivotalstatistics [2].Inorderto testanull time series that show the same linear correlations and the hypothesis at a certain confidence level, one has to generate same time dependency of the local behavior as the data and a given number of surrogates. Then, one evokes whatever compare a nonlinear statistic between data and surrogates statistic is of interest and compares the value of this statistic [4]. To generate surrogates constrained to data AC(τ) and computed from the data to the distribution of values elicited time dependence of local behavior, T. Schreiber [6] used an iterative procedure called simulated annealing. Unfortunately, D.GuarinandA.OrozcoarewiththeDepartmentofElectricalEngineering, UniversidadTecnolo´gicadePereira-UTP,VeredalaJulita,Pereira,Colombia. this method requires a big amount of computationaltime and E. Delgado is with the Research Center of the Instituto Tecnolo´gico never became of popular usage. Metropolitano – ITM, Calle 73 No. 76A-354 V´ıa al volador, Medell´ın, In another study, A. Schmitz and T. Schreiber [7] proposed a Colombia. ∗Corresponding author. email: [email protected] different method to deal with non-stationarity. The proposed IEEETRANSACTIONSONBIOMEDICALENGINEERING,VOL.?,NO.?,JANUARY2011 2 method involved dividing the signal into stationary segments, to the parameter selection criteria in order to make the thenapplyingthelinearsurrogatedatamethodtoeachsegment method suitable for different types of nonstationarities (not andfinallyjoiningthesegmentstoformasurrogatetimeseries onlytrends).Totesttheproposedmethodologyweappliedthe of the same size as the originaldata. The major problemwith test to several simulated time series with different dynamical thisprocedureisthatthereisnotastraightforwardwaytofind properties. We also applied the methods to HRV signals of stationary segments in a non-stationary signal. healthy patients. Finally we conclude. Recently, T. Nakamura and M. Small [8] proposed a new methodologytoapplythesurrogatedatamethodtotimeseries II. BACKGROUND withtrends,calledSmallShuffleSurrogate(SSS)datamethod Prior to introducing the current technology in surrogate which is a modification of the RS algorithm. The main idea data methods, it is vital to make one observation: Hypothesis introduced in [8] is that in order to preserve the trend of the testing, such as the surrogate data method, cannot be used datainsurrogates,therandomizationshouldbeappliedonlyin to determine what the data is, only what the data is not a small scale, in this way all local correlations in the original [2]. That is; if after our comparison we cannot distinguish timeseriesaredestroyedinsurrogates;buttheglobalbehavior between data and surrogates, this may be simply because our (i.e., the trend) is preserved. selected statistic is inadequate. Conversely, if the data and Basedontheideaofpreservingtheslowbehaviorofthesignal surrogates are different, then we can sate, that, with a certain in surrogates,T. Nakamura et al. [9] presented a modification probability the data is not consistent with the corresponding of the RP algorithm which makes it suitable for data null hypothesis. with trends. They called it the Truncated Fourier Transform Surrogate (TFTS) data method. TFTSs are constrained to A. Surrogate data methods conformtotheAC(τ)andwiththecorrectparameterselection 1) Linear surrogate data methods: Linear surrogate data to the trend of data (the authors also apply the modification were introduced to preclude a linear filtered noise source to the iAAFT method, thus preserving the AD, AC(τ) and as the possible origin of experimental data. The algorithms, the trend of data in surrogates). So, nonstationarities (in as stated earlier, generate surrogate data that fulfill the this case caused by the presence of a trend) are included null hypothesis of IID noise; linearly filtered noise; and, a in the null hypothesis, as suggested by A. Schmitz and T. monotonic nonlinear transformation of linearly filtered noise. Schreiber [4], [6]. The idea of the method is to preserve Hence, these techniques produce flawless linear data. The the slow behavior or trends while destroying all possible algorithmstogeneratesuchsurrogatescanbestatedasfollows nonlinear correlationsin the irregular fluctuations. To achieve [1]. this goal, the authors proposed to randomize phases only in the higher-frequency domain and not alter the low-frequency RS A surrogate time series {st} is generated from the phases(theoriginalideaofband-phase-randomizedsurrogates scalar time series data {xt} by randomly shuffling was briefly proposed by J. Theiler et al. [10] but it was not {xt}. Thisprocessdestroysall temporalcorrelations implemented until the work of T. Nakamura et al. [9]). This (which are not expected in a IID process) but approach is in contrast to linear surrogate methods (RP and maintains the same AD. iAAFT), where all phases are randomized. RP The surrogate {st} is generated by taking the It is worth mentioning that other attempts have been made Fouriertransformofthedata,randomisingthephases in order to assess nonlinearity in non-stationay data. L. Faes (replacing it by the phases of a random IID process et al. [11] presented a method for calculating the parameters of the same length as {xt}), and taking the inverse of an non-stationary AR model. Based on this method, they Fouriertransform.The surrogatethereforemaintains generatedtypicalrealizationsofthenon-stationaryHeartRate the linear correlations of data but any nonlinear Variability (HRV) signals and tested for nonlinearity, but as structure is destroyed. the surrogates are typical realizations, one needs a pivotal AAFTOne first re-scales the data original time series so statistic. Recently, C.J. Keylock [12] presented a modification that it is Gaussian, then generates an Algorithm 1 of the iAAFT method based on the wavelet transform, with surrogate of the data {pt}, and finally re-orders the this method it is possible to generate surrogates constrained originaldataso thatithasthesamerankdistribution to preserve the AC(τ) and the local mean and variance of as {pt}. This re-ordered time series constitutes the the data, butaccordingto our personalexperiencethe method surrogate{st}.Thisprocessachievestwoaims:first, proposed by T. Nakamura et al. [9] is simpler to implement just as with the Algorithm 1, the power spectra (and and achieves similar results. In a recent publication [13], therefore linear correlations) of data is preserved in we presented a modification of the TFTS through which we surrogates; second, the re-ordering process means assessed nonlinearity in data with spikes, but this method is thattheADofdataandsurrogatesarealsoidentical. limited to data with gaussian AD. It should be noted that the AAFT algorithm does not deliver In this document we introduce the band-phase-randomized what it promises. The phase randomisation will preserve the surrogate methods in a rather organized way, we also present linear correlation, but re-scaling the output of the inverse the algorithmsto facilitate and promote the application of the Fourier transform {p } to have the same AD as the original t method. In regards to the method, we present a discussion data will alter the autocorrelation structure of the data. on the parameter selection and introduce some modifications Although the data and surrogate will have identical rank IEEETRANSACTIONSONBIOMEDICALENGINEERING,VOL.?,NO.?,JANUARY2011 3 distribution, the linear correlation will only be approximately Time Series the same. A solution to this problem has been proposed by FT T. Schreiber and A. Schimitz [14]. Essentially, the solution is to iterate the AAFT algorithm until convergence is achieved. Magnitude Phase However,there is no guarantee that this iteration will, in fact, converge. This algorithm is refereed to as improved AAFT (for a discussion on the convergenceof the iAAFT algorithm Randomize the see [15]). phases in a portion 2) Surrogate data methods for data with trends: As stated of the higher earlier,whendataarenon-stationary,thehypothesisaddressed frequency domain bythelinearsurrogatedatamethodsaretriviallyrejected.Two differentsurrogatedatamethodshavebeenproposedto tackle data with trends, the SSS and the TFTS data methods. The IFT hypothesis tested by SSS algorithm is that the data, while possibly exhibiting some trend, is otherwise just noise [8]; Truncated Fourier while the hypothesis tested by TFTS algorithm is that the Transform Surrogates data, while possibly exhibiting some trend, is generated by a stationary linear system [9]. These algorithms can be stated Fig.1. FlowchartoftheTruncatedFourierTransformSurrogatemethod. as follow [16]. SSS sLoetx{itit}=bextth).eOinbdtaexino{fi′t{}xt=} (t{hiatt+is,Aigtt}=wthaenrde lio.en.,gefrcp≈resNer/v2e)d, dinectrheeassinugrrofgcatuensti(li.teh.,eAdaCt(aτli=nea1r)ityofisdantoa {gt} are Gaussian random numbers, and A is an falls outside the distribution of surrogates) and then perform ainmtepglriteusd,ewh(neoreteast{hia′tt}{witil}l nwoitl)l. Rbeanka osredqeure{nicte} otof tahree ptersetsewrvitehdthinesluasrtrovgaaltuees.of fc for which linearities of data obtain {r }. The surrogates {s } are obtained from t t s = x . If A is an intermediate value (e.g. 1), t rt B. Significance and power of the test surrogates generated by this algorithm will preserve the slow trend in the data, but any inter-point Applying a statistical hypothesis test to observed data can dynamics will be destroyed by the local shuffling of result in two outcomes: either the null hypothesis is rejected, individual points. or it is not. In the former case there is a probability α that TFTS Thesurrogate{s }isgeneratedbytakingtheFourier the null hypothesis is rejected even though it is true (Type t transform of the data {X } . Then generating I Error), in the latter case there is a probability β (Type II ω ω random phases φ , such that φ ∼ U(0,2π) Error) that we will fail to reject the null when it is in fact ω ω if ω > f and 0 if ω ≤ f ( φ have to false. The probabilityα is known as the significance level, its c c ω be antisymmetric around φ ). Finally taking the complement (1−α) is the confidence level. For example, if 0 inverse Fourier transform of the complex series one generates 19 surrogates using some algorithm, and these {X eıφω} (Fig. 1). As in the RP surrogates, all yield a larger (or smaller) value of some statistic than the ω ω linear dependencies are preserved in surrogates. data,thentheprobabilitythatthisresultoccurredbychanceis But, since some phases are untouched, TFTS data α= 1 , and hence we concludeat the 0.05significance (0.95 20 may still have nonlinear correlations. However, it is confidence)levelfora one-sidedtest thattheselected statistic possibletodiscriminatebetweenlinearandnonlinear is different from the surrogates. Conversely, the power of a databecausethesuperpositionprincipleisonlyvalid test (1−β) is the probability the null hypothesis is correctly forlineardata,sowhendataarenonlinear,evenifthe rejected. power spectrum is preserved completely, the inverse Clearly,theprobabilityαisdeterminedbythenumberoftrials Fouriertransformdatausingrandomizedphaseswill and the numberof independenttest statistics. Computingα is exhibit a different dynamical behavior only a matter of computing probabilities. The problem is that thevalueofβ isnotclear.Theactualpowerβ willdependon TFTSsareinfluencedprimarilybythechoiceoffrequencyf . c the choice of test statistic. If the test statistic is independent Iff istoohigh,surrogatesarealmostidenticaltotheoriginal c of data and surrogates then the power is determined by the data. In thiscase, evenif there is nonlinearityin the data, one number of trials [16]. mayfailtodetectit.Conversely,iff istoolow,surrogatesare c almostthe same as the linearsurrogateand the localbehavior III. NONLINEARITY TESTFORNON-STATIONARYTIME is not preserved. In this case, even if there is no nonlinearity SERIES:PHYSIOLOGICAL DATAAPPROACH in the data, one may wrongly judge otherwise. A. Database In general, the correct value of f cannot be determined a c priori. To select an adequate value of f , T. Nakamura et 1) Simulatedtimeseries: Totesttheproposedmethodology c al. [9] proposed to start randomizing a portion of the higher we applied it to different simulated time series, two linear frequencydomain(e.g.a 1% of the higherfrequencydomain, (stationary and non-stationary) and two nonlinear (stationary IEEETRANSACTIONSONBIOMEDICALENGINEERING,VOL.?,NO.?,JANUARY2011 4 a) Time Series 1 0 −1 f = f b) c cmin 1 0 −1 c) Generate and 1 ensemble of 0 Band-phase-randomized −1 d) surrogates 1 0 −1 1 2048 Fig.2. a)Linearstationary(LS)signal,b)linearnon-stationary(LNS)signal, c)nonlinearstationary(NLS)signalandd)nonlinearnon-stationary (NLNS) Are linear signal. correlations of NO data present in and non-stationary).The linear time series were generated by surrogates? the following AR(2) process [5] x(n)=a (n)x(n−1)+a x(n−2)+η. (1) 1 2 Where YES a (n)=2cos(2π/T(n))e(−1/τ), a =e(−2/τ), Use the selected 1 2 T(n)=T +M sin(2πt/T ), (2) statistic to perform e T mod a nonlinear test and η ∼N(0,1). store the results. To generate a linear stationary signal we used T = 10, e T = 250, τ = 50 and M = 0, for the linear mod T non-stationary signal we used MT =6. Increase fc The nonlinear time series were generated by the following nonlinear process [17] x(n)=a (n)x(n−1)(1−x2(n−1))e(−x2(n−1))+a x(n−2). 1 2 (3) f ≤f For the nonlinear stationary signal we used a1(n) = 3.4 and YES c cmax a =0.8. For the nonlinear non-stationary signal we used 2 3.0 if 0<n≤N/2, a (n)= 1 NO (3.4 if N/2<n≤N. End An example of each of these signals is shown in Fig. 2 with N =2048. Fig.3. Proposedmethodologytoassessnonlinearity innon-stationary time 2) Physiological time series: The HRV time series of series. healthy subjects were extracted from the MIT-BIH Normal Sinus Rhythm Database in Physionet [18], [19] according to annotationsforonlynormalbeats.Samplerate was128Hz in 1) Band-Phase-Randomized Surrogates: 24-hr Holter recordings. Band-phase-randomized surrogate data method is, as mentioned, a modification of the RP algorithm in which not B. Proposed procedure all phases but a portion of the phases in the high-frequency It is widely accepted that most biomedical systems are band are randomized. dynamicand producenonstationarysignals [20]; the presence Unfortunately, as stated by [10] it is difficult to automate the of slow varying trends is only one type of nonstationarities procedure in order to make it applicable to all time series. present in physiological signals. So, the novelty of the The methodology proposed in [9] to find de correct value of present document is to propose a methodology based on f (i.e., the correct portion in the frequency band in which c the TFTS data method (which from now on will be called the phases are to be randomized) is only useful when data band-phase-randomized surrogate data method) that allows have a slow varyingtrend, because when this statementis not us to assess nonlinearity in data with different kinds of true, the stoping criterium is never met (i.e., AC(τ = 1) of nonstationarities(e.g.,spikes,abruptchangesinthedynamical data falls outside the distribution of surrogates ) and so one behavior). The proposed procedure is depicted in Fig 3. alwaysendsup using the iAAFT algorithmevenwhen data is IEEETRANSACTIONSONBIOMEDICALENGINEERING,VOL.?,NO.?,JANUARY2011 5 a) b) not-stationary. In [13], we propose that the stoping criterium 1 1 should be the similarity between data and surrogates, i.e., surrogatesshouldpreservethe localbehaviorofthe data.But, when the data is in fact nonlinear this criterium fails. Next, we present a new method for selecting the correct parameter 0 0 c) d) of the algorithm. 1 1 Itshouldbenotedthattheuseoftheend-phase-randomized surrogate data method will not improve the type II error because if the method fails to reject the null when all phases arerandomized(usingsomestatistic) thenitcertainlywillnot 0 0 0 250 500 750 1.000 0 250 500 750 1.000 be able to reject the null when just a portion of the phases fc fc are randomized. On the other hand, the type I error will be improved by means of this method. Fig.5. Normalizedrmsdifference betweenlocalmean(continuosline)and 2) Parameter selection: To overcome the parameter variance(dottedline)ofdata(a)LSsignal,b)LNSsignal,c)NLSsignaland selection problem we propose not to use just one value of d)NLNSsignal)andBand-Phase-Randomize surrogates asafunctionoffc. f but a set of values. The proposed methodology is as Thelocalmeanandvariancewascalculatedusingwindowsoflength64with c 50%overlap. follows: First, we select two values f ≈N/2 and f . cmax cmin Within this range, we select a set of values for f (e.g., 10 c values),thenwegenerateBand-Phase-RandomizedSurrogates because at the moment, there is no optimal method for using all those values and finally we performthe nonlinearity embedding such data [21]. test (one must ensure that linear correlations of the data are Therefore, as discriminant statistics we chose the Average preserved in surrogates for those values of f ). c Mutual Information (I(τ)) [21]. The I(τ) is a nonlinear There are several ways to determine the value f ; if the cmin version of the AC(τ). It can answer the following question: Fourier transform magnitude (S(n)) has a pronounced peak On average, how much does one learn about the future from then, f is selected abovethe peak (see Fig. 4 a)). If S(n) cmin thepast?So,weexpectthatifourdataisnotjustarealization does not have a pronounced peak (or has several) then f cmin ofa linear non-stationarynoisyprocessit wouldhavea larger should be selected as the lowest value for which the local I(τ) than that of the surrogates. mean of the data is preserved in the surrogates (see Fig. 5); when data have a pronounced peak, both criteria result in a similar value of f . D. Implementation cmin Prior to the application of the methodology, we normalize C. Selection of the discriminant statistic the data to zero mean and unit variance and find the largest sub-segmentthatminimizesthe end-pointmismatch (thisstep Dynamical measures are often used as discriminating is extremely important and can be done automatically as statistics, the correlation being dimension one of the most suggested in [4]); if the data have a trend then one can apply popular choices [16]. To estimate these, we first need to the preprocessing methodology proposed in [9]. reconstruct the underlying attractor. For this purpose, a In order to reject a null hypothesis we generate M = 99 time-delay embedding reconstruction is usually applied. But surrogates using an improved Amplitude Adjusted version of this method is not useful for data exhibiting nonstationarities the band-phase-randomized surrogate data method, because as the I(τ) depends on the data AD, we have to generate surrogateswithequalAD asthedatatoavoidfalserejections. 100 a) ThenwecomputetheI(τ =1)fortheensembleofsurrogates f= 290 c andfortheoriginaltimeseries(inapreviousstudyweshowed n) S ( that I(τ) is sensible to the type of dynamics only for small 10−4 lags [22]). If I(τ = 1) is greater than that of the surrogates we reject the null hypothesis at the 0.01 significance level; b) otherwise, we do not reject the null. 0 n) φ ( IV. RESULTS −10 A. Numerical results 1 f 970 c Prior to testing for nonlinearity we normalized each time series to zero mean and unit variance, and selected Fig. 4. a) FT magnitude (note the logarithmic scale) and b) FT phases as a function of n for LS signal with N = 1940 (continuos line) and one a subsegment of the signals that minimized the end-point Band-Phase-Randomize surrogatefc=291(dottedline).S(n)fordataand mismatch, we end up with N= 1940, 1954, 1996 and 2023 surrogates are equal for all n, but φ(n) is equal only for n ≤ fc. In this number of data points for each time series. case we are randomizing 70% of the higher frequency domain. In b) the difference between the FT phases of data and surrogates is displaced form Fig5showsthenormalizedrootmeansquare(rms)difference ceroforclarity. between data (a) LS signal, b) LNS signal, c) NLS signal IEEETRANSACTIONSONBIOMEDICALENGINEERING,VOL.?,NO.?,JANUARY2011 6 a) c) e) g) 0.808 0.634 0.239 −0.135 1) = τ C ( A 0.803 0.628 0.234 −0.139 b) d) f) h) 0.81 0.6 1.2 0,8 1) = τ I( 0.76 0.35 0 0 0 1000 0 1000 0 1000 0 1000 f f f f c c c c Fig. 6. (Color online) a), c), e) and g)AC(τ =1)of the original time series (a),b) LSsignal, c),d)LNSsignal, e),f)NLSsignal and g),h)NLNSsignal) (continuos vertical line) and AC(τ =1)ofan ensemble Band-Phase-Randomize surrogates (5th, 50th and 95th percentiles) as afunction offc (fc =0is theresultofusingtheiAAFTalgorithm). b),d),f)andh)thesameasabovebutusingtheI(τ =1). and d) NLNS signal) andBand-Phase-Randomizedsurrogates test (the same curve as fig. 6 d), is obtained when the value as a function of f (when f = 0 Band-Phase-Randomized of M in (2) is slightly modified, the range of values of f c c T c surrogates and the iAAFT surrogates are equivalent). It can for which the null is rejected vary with M ). T be noted that for linear data it is possible to obtain surrogates Two other important aspects can be noticed in Fig. 6, first, it with almost the same local behavior as the original time isremarkablethatwhenlocalmeanandvarianceofsurrogates serieswhilefornonlineardatathelocalvarianceofsurrogates are similar to data, AC(τ = 1) of ensemble of surrogates is is never similar to the data (except for f = N/2). This almost the same as data, this can be seen in Fig. 6 a) and c) c result is expected because the variance is a nonlinear statistic forf >300andf >500respectively(comparethiswiththe c c and surrogates are only constrained to sample mean, sample resultsshowninFig.5a)andb)),butthesameresultsarenot variance AD and AC(τ) of data. observed when local variance of surrogates is not similar to From Fig. 5 we notice that f ≈ 280, 400, 50 and data(althoughthelocalmeanofsurrogatesissimilartodata), cmin 50 for each time series. Anyhow, we use f = 0 and this can be seen in Fig. 6 e) and g) respectively (compare cmin f =N/2−10 for the following result. this with the results shown in fig. 5 c) and d)). Second, cmax Fig. 6 shows the AC(τ = 1) and I(τ = 1) from data besidesdifferentiatingbetweenlinearandnonlineartimeseries and Band-Phase-Randomized surrogates. It can be noted that (stationary or not), this test can be used to discriminate for linear stationary data (fig. 6 a) and b) ) the hypothesis between linear stationary and linear non-stationary data, in tested by the iAAFT algorithm cannot be rejected (f = 0) the former case the hypothesis of linearity will be accepted c and as expected, randomizing only a portion of the higher for all valuesof f , while in the latter this will occuronly for c frequency domain, does not affect this result. When data is certain values of f (as shown in Fig. 6 d)). c nonlinear(stationaryornot)thetestrejectsthenullhypothesis Totesttherobustnessofthemethodweperformedthesame of linearity for all values of f within the selected range of analysispresentedhereaddinga5dB whitenoisetoeachtime c values.Asshowninfig.6g),AC(τ =1)ofdataisnotsimilar series and found similar results. to that of surrogates for some values of f , this implies that c linear correlations of the data are not well preserved in the B. Application to HRV signals surrogatesandoneshouldnotperformthenonlinearitytestfor Despite the fact that nonlinear dynamics are involved in these values of fc. In spite of this, the hypothesis is rejected. the genesis of HRV as a result of the interactions among The most interesting case (at least for the purpose of the hemodynamic, electrophysiological, and humoral variables present document) is the linear non-stationary case; in this [23], there is no proof that the recorded HRV time series situation nonlinearity is detected using the iAAFT algorithm (usually derived from an ECG) reflects this nonlinearity, this (fig. 6 d), fc = 0), so a careless application of the linear must be proven in each case. In this section, we apply the surrogate data method would result in a false detection of proposed methodology to assess nonlinearity in HRV which nonlinearity (type I error). But, as shown in fig. 6 d), the areknowntohavespikesandnonstationaritiesduetovariation nonlinearity is detected only for certain values of fc, in this of the patient activity (see Fig. 7 a). case when fc >500 nonlinearityis no longer detected by the Fig. 7 a), shows a 1 hour record of the HRV of a healthy IEEETRANSACTIONSONBIOMEDICALENGINEERING,VOL.?,NO.?,JANUARY2011 7 a) 32 year old male, the starting time is about midnight and the 0,8250 patient is at rest. Fig 7 b), depicted one surrogate time series 1) gwehnieleratseudrrougsiantegsthperescelnatsesdicailnmFeigtho7d c()iA, AwFhTeresugrreongeartaetse)d, τAC ( = using the band-phase-randomizedsurrogate data method with 0,8235 b) f =360. 0.85 c The original time series has much of its energy concentrated in the high frequency components, and as in the iAAFT = 1) surrogates the high frequency energy of the original time τI ( series is blurred in all the frequency spectrum, one gets 0.65 surrogates that are not simular to the HRV signal, allowing a 0 500 1000 f 1500 2000 c trivialrejectionofthenullhypothesis.Band-phase-randomized surrogatesovercomethisproblembypreservingthephasesina Fig. 8. a) AC(τ = 1), b)I(τ = 1) for the HRV signal and portionofthefrequencyspectrum,inthisway,highfrequency Band-Phase-Randomize surrogates asafunction offc. and low frequency componentsof the original time series are preserved in surrogates, as can be seen in Fig. 7 a) and c). by a nonlinear monotonic static observation function) and Usingtheproposedmethodologyitwasfoundthatf = cmin nonlineartimeseries.Thismethodisdifferentfrompreviously 200 and f = 2300, with this information, Fig. 8 was cmax proposed nonlinearity tests because: i) we do not randomize generated. the phases in all the frequencydomain but in a portion of the As expected, the null tested by the iAAFT surrogates is frequencydomain, and ii) we do notselect a correctvalue of rejected (f = 0), but as seen in Fig. 6 d), this is not an c f but a correct range [f ,f ], and within this range, a c cmin cmax indicatorofnonlinearity,butofnonlinearityornonstationarity, set of values for the parameter f . c and as in this case it is acknowledge that the tested signal Applying this test to physiologicaltime series, we found that is nonstatioanary, this test is not giving any new information nonlinearcorrelationsare presentin HRV signalsof a healthy about this signal. But the proposed methodology is; it can male,thisconfirmsthatnonlineardynamicsareinvolvedinthe be noticed that when f is within the selected range, the null c genesis of HRV, but as mentioned, every times series should hypothesisisalwaysrejected(andthelinearcorrelationsofthe be tested because there no a priori method to determine if a originaltimeseriesarealwayspreservedinsurrogates),andas given signal represent the nonlinearity of the process. was already noticed(Fig. 6 f) andh)), this is a clear indicator It is worth mentioning that as pointed out by many authors ( of the presence of nonlinear correlations. By this means, we [9], [10]), the linear surrogate data methods are only suitable confirmthatthereisacomplexnonlinearphysiologicalprocess for stochastic like data, and as the present methodology is underlying the HRV. based on that, the same limitations apply. V. CONCLUSION ACKNOWLEDGMENT D. Guar´ın was supportedby the UTP and COLCIENCIAS, In this document, we presented a methodology based program Jo´venes Investigadores e innovadores 2010. E. on the TFTS data method and the iAAFT algorithm that Delgado is supported by the Condonable Credits program allows us assess nonlinearity in non-stationary time series. of COLCIENCIAS in Colombia. Additionally, he would like Based on some simulated data we demonstrate that our to thank to the Research Center of the ITM of Medell´ın – methodologyis able to differentiatebetween linear stationary, Colombia who supported him with the PM10201 grant. linearnon-stationary(evenwhenthelineardataistransformed REFERENCES a) 5 [1] J. Theiler, S. Eubank, A. Longtin, B. Galdrikian, and J. D. Farmer, ms) “Testingfornonlinearity intimeseries:Themethodofsurrogatedata,” V ( 0 PhysicaD,vol.58,p.77–94,1992. R H [2] J.TheilerandD.Prichard,“Constrained-realizationmonte-carlomethod −5 b) forhypothesistesting,”PhysicaD,vol.94,no.4,pp.221–235,1996. 5 ms) fc=0 [3] D. T. Kaplan, Frontiers of blood pressure and heart rate analysis. V ( 0 Amsterdam: IOS Press, 1997, vol. 35, ch. Nonlinearity and R Nonstationarity :Theuseofsurrogatedata ininterpreting fluctuations, H −5 pp.15–28. c) 5 [4] T. Schreiber and A. Schmitz, “Surrogate time series,” Physica D, vol. ms) fc=360 142,pp.346–382,2000. V ( 0 [5] J. Timmer, “Power of surrogate data testing with respect to HR nonstationarity,” Physical Review E, vol. 58, no. 4, pp. 5153 – 5156, −50 n 4797 1998. n [6] T.Schreiber, “Constrained randomization oftimeseriesdata,”Physical ReviewLetters, vol.80,no.10,pp.2015–2018,1998. [7] A. Schmitz and T. Schreiber, “Surrogate data for non-stationary Fig. 7. a) Segment of a HRV time series of a 32 year old healthy male, signals,” in Workshop on Chaos in Brain?, K. Lehnertz, J. Arnhold, b)surrogategeneratedusingtheiAAFTalgorithm,c)band-phase-randomized P. Grassberger, and C. E. Elger, Eds. Singapore: World Scientific, surrogates usingfc=360. 1999,p.222–225. IEEETRANSACTIONSONBIOMEDICALENGINEERING,VOL.?,NO.?,JANUARY2011 8 [8] T. Nakamura and M. Small, “Small-shuffle surrogate data: Testing for [16] M. Small, T. Nakamura, and X. Luo, Nonlinear Phenomena Research dynamics in fluctuating data with trends,” Physical Review E, vol. 72, Perspectives. Nova Science Publications, 2007, ch. Surrogate data pp.056216–1–056216–6,2005. methodsfordatathatisn’tlinearnoise,pp.55–81. [9] T. Nakamura, M. Small, and Y. Hirata, “Testing for nonlinearity in [17] L. Faes, H. Zhao, K. H. Chon, and G. Nollo, “A method for the irregularfluctuationswithlong-termtrends,”PhysicalReviewE,vol.74, time-varying nonlinear prediction ofcomplexnonstationary biomedical pp.026205–1–026205–8,2006. signals,” IEEE Trans. on Biomedical Engineering, vol. 56, no. 2, pp. [10] J.Theiler,P.S.Linsay,andD.M.Rubin,“Detectingnonlinearityindata 205–209,2009. withlongcoherencetimes,”inPredictingthefutureandunderstanding [18] L.Glass,“Introductiontocontroversialtopicsinnonlinearscience:Isthe thepast. Addison-Wesley, 1993,pp.429–455. normalheartratechaotic?”Chaos,vol.19,pp.028501–1–028501–4, [11] L.Faes, H. Zhao,K. H.Chon, and G.Nollo, “Time-varying surrogate 2009. data to assess nonlinearity in nonstationary time series: Application to [19] A. L. Goldberger, L. A. N. Amaral, J. M. H. L. Glass, P. C. Ivanov, heartratevariability,” IEEETrans.onBiomedicalEngineering,vol.56, R. G. Mark, J. E. Mietus, G. B. Moody, C.-K. Peng, and H. E. no.3,pp.685–695,2009. Stanley, “Physiobank, physiotoolkit, and physionet : Components of a [12] C.J.Keylock,“Awavelet-based methodforsurrogatedatageneration,” new research resource for complex physiologic signals,” Circulation, PhysicaD,vol.225,p.219–228, 2007. vol.101,pp.215–220,2000. [13] D. Guar´ın, A. Orozco, E. Delgado, and E. Guijarro, “On detecting [20] R. M. Rangayyan, Biomedica Signal Analysis, M. Akay, Ed. IEEE determinism and nonlinearity in microelectrode recording signals: Press,2002. Approach based on non-stationary surrogate data methods,” in 32nd [21] M. Small, Applied Nonlinear Time Series Analysis - Applications in Annual International Conference of the IEEE EMBS, 2010, pp. 4032 Physics,PhysiologyandFinance. WorldScientific, 2005. –4035. [22] D. Guar´ın and A. Orozoco, “Pruebas de no linealidad para series [14] T.SchreiberandA.Schmitz,“Improvedsurrogatedatafornonlinearity temporales,” Submited, 2010. tests,”PhysicalReviewLetters, vol.77,p.635–638,1996. [23] T. F. of The European Society of Cardiology, T. N. A. S. of Pacing, [15] V.Venema,F.Ament,andC.Simmer,“Astochasticiterativeamplitude andElectrophysiology,“Heartratevariability.standardsofmeasurement, adjustedfouriertransformalgorithmwithimprovedaccuracy,”Nonlinear physiologicalinterpretation, andclinicaluse,”EuropeanHeartJournal, ProcessesGeophysics, vol.13,p.321–328,2006. vol.17,pp.354–381,1996.