Ribeiro et al.: Simulation of nonGaussian Long-Range-Dependent Traffic

Simulation of nonGaussian Long-Range-Dependent Traffic using Wavelets

Vinay J. Ribeiro, Rudolf H. Riedi, Matthew S. Crouse, and Richard G. Baraniuk
Department of Electrical and Computer Engineering, Rice University
6100 South Main Street, Houston, TX 77005, USA
April 1999

Proceedings SigMetrics '99, Atlanta, GA

This work was supported by the National Science Foundation, grant no. MIP-9457438, by DARPA/AFOSR, grant no. F49620-97-1-0513, and by Texas Instruments. Email: {mcrouse, riedi, vinay, richb}@rice.edu. URL: www.dsp.rice.edu.

Abstract

In this paper, we develop a simple and powerful multiscale model for the synthesis of nonGaussian, long-range dependent (LRD) network traffic. Although wavelets effectively decorrelate LRD data, wavelet-based models have generally been restricted by a Gaussianity assumption that can be unrealistic for traffic. Using a multiplicative superstructure on top of the Haar wavelet transform, we exploit the decorrelating properties of wavelets while simultaneously capturing the positivity and "spikiness" of nonGaussian traffic. This leads to a swift O(N) algorithm for fitting and synthesizing N-point data sets. The resulting model belongs to the class of multifractal cascades, a set of processes with rich statistical properties. We elucidate our model's ability to capture the covariance structure of real data and then fit it to real traffic traces. Queueing experiments demonstrate the accuracy of the model for matching real data. Our results indicate that the nonGaussian nature of traffic has a significant effect on queuing.

1 Introduction

Traffic models play a significant role in the analysis and characterization of network traffic and network performance. Accurate models enhance our understanding of these complex signals and systems by allowing us to study the effect of various model parameters on network performance through simulation.

The presence of long-range dependence (LRD) in modern network traffic was demonstrated convincingly in the landmark paper by Leland et al. [1]. There, measurements of traffic load on an Ethernet were attributed to fractal behavior or self-similarity, i.e., to the fact that the data "looked statistically similar" ("bursty") on all time-scales. These features are inadequately described by classical traffic models, such as Markov or Poisson models. In particular, the LRD of data traffic can lead to higher packet losses than those predicted by classical queuing analysis [1,2].

These findings were immediately followed by the development of new fractal traffic models [3-5]. The fractional Brownian motion (fBm), the most broadly applied fractal model, is the unique Gaussian process with stationary increments and the following scaling property for all $a > 0$:

\[ B(at) \stackrel{fd}{=} a^{H} B(t), \tag{1} \]

with equality in (finite-dimensional) distribution. The increment process $G(k) := B(k+1) - B(k)$, called fractional Gaussian noise (fGn), has an autocorrelation of the form

\[ r_G[k] = \frac{\sigma^2}{2} \left( |k+1|^{2H} - 2|k|^{2H} + |k-1|^{2H} \right). \tag{2} \]

The parameter $H$, $0 < H < 1$, is known as the Hurst parameter. It simultaneously rules the large-scale behavior and the degree of local "spikiness." In particular, for all $t$,

\[ B(t+s) - B(t) \simeq s^{H} \tag{3} \]

(more precisely, $B(t+s) - B(t)$ is a zero-mean Gaussian process of variance $s^{2H}$), meaning that fBm has "infinite slope" everywhere. Processes approximating fBm/fGn can be synthesized almost effortlessly in the wavelet domain due to the strong decorrelating effect of the wavelet transform [6].

A strong argument for the fBm/fGn models in networks is that in many cases traffic can be viewed as the superposition of a large number of independent individual ON/OFF sources, with the ON durations heavy-tailed [7,8]. In this case, after subtracting the mean arrival rate and normalizing properly, the aggregated ON/OFF sources (cumulative arrivals) converge to Gaussian fBm by the central limit theorem (CLT) [1,3]. A "self-similar" traffic arrival model (of the increments process) is, thus, simply an "fGn+mean" model with given variance and $H$. The fBm/fGn models have found wide use in networking, since their Gaussianity and strong scaling (1) allow analysts to perform analytical studies of queueing behavior [9-13].

Unfortunately, the fBm/fGn models have severe limitations for network traffic applications. First, real-world traffic traces do not exhibit the strict self-similarity of (1) or (2) and are at best merely asymptotically self-similar. In other words, the single parameter $H$ is not sufficient to capture the complicated correlation structure of real network processes. Indeed, convincing evidence has been produced establishing the importance of short-term correlations for buffering [16-18], and so-called relevant time scales have been discovered [19]. The wavelet-domain independent Gaussian (WIG) model generalizes fBm/fGn by allowing a more flexible scaling relation than (1). By matching both long-term and short-term correlations, the WIG model more completely matches the correlation structure of a target data set [15].

[Figure 1 (plots omitted). Panels: (a) LBL-TCP-3 data, (b) WIG synthesis, (c) MWM synthesis. Axes: number of bytes versus time (1 unit = 6 ms).]
Figure 1: Bytes-per-time arrival process at different aggregation levels for (a) wide-area TCP traffic at the Lawrence Berkeley Laboratory (trace LBL-TCP-3) [14], (b) one realization of the state-of-the-art wavelet-domain independent Gaussian (WIG) model [15], and (c) one realization of the multifractal wavelet model (MWM) synthesis. The top, middle and bottom plots correspond to bytes arriving in intervals of 6 ms, 12 ms and 24 ms respectively. The top and middle plots correspond to the second half of the middle and bottom plots, respectively, as indicated by the vertical dotted lines. The MWM traces closely resemble the real data, while the WIG traces (with their large number of negative values) do not.

Second, the Gaussianity of fBm/fGn/WIG models can be unrealistic for certain types of traffic, for instance when the standard deviation of the data exceeds the mean. In this case, the fBm/fGn/WIG output signals take on a considerable number of negative values (see Figure 1).

Third, in many networking applications, we are nowhere near the Gaussian limit, in particular on small time scales. Indeed, various authors have observed marginals that differ substantially from Gaussian. Usually these distributions have been observed to be heavy tailed [20, p. 364], [21]. Consequently, methods aimed at fitting marginals have been developed [22,23]. Also, more versatile models such as fractional ARIMA [24] have been applied towards better matching the short-range and long-range correlation structure present in real traces.

In this paper, we propose a new non-linear model for network traffic data. The multifractal wavelet model (MWM) is based on a multiplicative cascade in the wavelet domain that by design guarantees a positive output. Since each sample of the MWM process is obtained as a product of several positive independent random variables, the MWM's marginal density is approximately lognormal, a heavier-tailed distribution than the Gaussian. The MWM is thus a more natural fit for positive arrival processes, especially those with a standard deviation much larger than the mean (as observed in the traces we have studied).

In its simplest form, the MWM is closely related to the wavelet-based construction of fBm/fGn, having as few parameters (mean, variance, $H$). However, like the WIG model, the MWM framework has the flexibility to additionally match the short-term correlations.

The MWM has a bursty demeanor that matches that of real traffic much more closely than fBm/fGn. The TCP traffic we have studied here exhibits local scaling similar to (3), but with an exponent $H_t$ that depends on $t$. This has been termed multifractal behavior and was reported for the first time in [25] and subsequently in [26-29]. The statistical properties of $H_t$ as a random variable in $t$ can be described compactly through a function $T(q)$ that controls the scaling behavior of the sample moments of order $q$. This powerful relation, called the multifractal formalism, ties burstiness, higher-order dependence structure, and moments of marginals together in one unified theory.

Fitting the MWM to real traffic traces results in an excellent match, far better than the WIG model, visually (see Figure 1) and, as we will see, in the multifractal partition function $T(q)$, the burstiness as measured by the multifractal spectrum, the marginals, and the queueing behavior. Since these properties all depend on the small time-scale behavior, it appears that the multiplicative MWM approach is more appropriate than an additive Gaussian one.

In this paper, we summarize the impact of LRD on networking in Section 2. After introducing the wavelet transform and describing the WIG model in Section 3, we derive the MWM in Section 4. Section 5 reports on the results of simulation experiments with real data traces. We give an intuitive introduction to multifractal cascades in Section 6 and close with conclusions in Section 7.

2 Long-range Dependence in Network Traffic

The discovery of LRD in data traffic [1,14] has incited a revolution in network design, control and modeling. Intuitively, the strong correlations present in an LRD process are responsible for its "bursty" nature. Thus, LRD traffic arrives in bursts that, upon entering a queue, cause excessive buffer overflows that are not predicted by traditional non-LRD traffic models such as Poisson and Markov models [2].

2.1 Long-range dependence (LRD)

Consider a discrete-time, wide-sense stationary random process $\{X_t, t \in \mathbb{Z}\}$ with auto-covariance function $r_X[k] = \mathrm{cov}(X_t, X_{t+k})$. A change in time scale can be represented by forming the aggregate process $X_t^{(m)}$, which is obtained by averaging $X_t$ over non-overlapping blocks of length $m$, replacing each block by its mean:

\[ X_t^{(m)} = \frac{X_{tm-m+1} + \cdots + X_{tm}}{m}. \tag{4} \]

Denote the auto-covariance of $X_t^{(m)}$ by $r_X^{(m)}[k]$. The process $X$ is said to exhibit LRD if its auto-covariance decays slowly enough to render $\sum_{k=-\infty}^{\infty} r_X[k]$ infinite [30]. Equivalently, the power spectrum $S_X(f)$ is singular near $f = 0$ and $m\, r_X^{(m)}[0] \to \infty$ as $m \to \infty$.

An important class of LRD processes are the asymptotically second-order self-similar processes, which may be defined by the property $r_X[k] \sim k^{2H-2}$ for some $H \in (1/2, 1)$, or equivalently by [30]

\[ \mathrm{var}(X^{(m)}) = r_X^{(m)}[0] \sim m^{2H-2} \tag{5} \]

as $m \to \infty$. In words, these processes "look similar" on all scales, at least from the point of view of second-order statistics. The fGn is such a process, where the Hurst parameter $H$ is the same as in (1).

To estimate $H$ by the variance-time plot method, we fit a straight line through the plot of an estimate of $\log \mathrm{var}(X^{(m)})$ against $\log(m)$; by (5), the slope of this line is $2H-2$. More reliable estimators have also been devised [24], in particular an unbiased one based on wavelets [31].

2.2 Impact of LRD on networking

The pre-eminent LRD model at present is the fGn. Its popularity stems from the fact that it is a second-order self-similar Gaussian process (2), and thus is analytically tractable. In addition, it is completely described by just two parameters: variance and $H$. When fGn is input to an infinite-length queue with constant service rate, the tail queue distributions decay asymptotically with a Weibullian law

\[ P[Q > x] \simeq \exp(-\delta x^{2-2H}), \tag{6} \]

with $\delta$ a positive constant that depends on the service rate at the queue [10,11]. The decay of the tail queue distribution for fGn with $H > 1/2$ is much slower than the exponential decay predicted by short-range dependent (SRD) classical models [2]. The exponential decay corresponds to the case $H = 1/2$.

Even though (6) shows that LRD processes have higher tail queue probabilities than SRD processes, there is still an ongoing discussion on the effect of LRD on queuing, with researchers arguing both for and against its importance [17-19, 32-34].

3 Wavelets and LRD Processes

3.1 Wavelet transform

The discrete wavelet transform provides a multiscale signal representation of a one-dimensional signal $c(t)$ in terms of shifted and dilated versions of a prototype bandpass wavelet function $\psi(t)$ and shifted versions of a lowpass scaling function $\phi(t)$ [35]. For special choices of the wavelet and scaling functions, the atoms

\[ \psi_{j,k}(t) := 2^{j/2}\, \psi(2^{j} t - k), \qquad \phi_{j,k}(t) := 2^{j/2}\, \phi(2^{j} t - k), \qquad j,k \in \mathbb{Z} \tag{7} \]

form an orthonormal basis, and we have the signal representation [35]

\[ c(t) = \sum_{k} u_{J_0,k}\, \phi_{J_0,k}(t) + \sum_{j=J_0}^{\infty} \sum_{k} w_{j,k}\, \psi_{j,k}(t), \tag{8} \]

with $w_{j,k} := \int c(t)\, \psi_{j,k}(t)\, dt$ and $u_{J_0,k} := \int c(t)\, \phi_{J_0,k}(t)\, dt$. Without loss of generality, we will assume $J_0 = 0$.

In this representation, $k$ indexes the spatial location of analysis and $j$ indexes the scale or resolution of analysis: larger $j$ corresponds to higher resolution, with $j = 0$ indicating the coarsest scale or lowest resolution of analysis. In practice, we work with a sampled or finite-resolution representation of $c(t)$, replacing the semi-infinite sum in (8) with a sum over a finite number of scales $0 \le j \le n-1$, $n \in \mathbb{Z}$. Using filter bank techniques, the wavelet transform and inverse wavelet transform can be computed in O(N) operations for a length-N signal. For more information on wavelet systems and their construction, see [35].

In the Haar wavelet transform (see Figure 2), the prototype scaling and wavelet functions are given by

\[ \phi(t) = \begin{cases} 1, & 0 \le t < 1 \\ 0, & \text{else} \end{cases} \qquad \text{and} \qquad \psi(t) = \begin{cases} 1, & 0 \le t < 1/2 \\ -1, & 1/2 \le t < 1 \\ 0, & \text{else.} \end{cases} \]

The Haar scaling and wavelet coefficients can be recursively computed via [35]

\[ u_{j-1,k} = 2^{-1/2} (u_{j,2k} + u_{j,2k+1}), \qquad w_{j-1,k} = 2^{-1/2} (u_{j,2k} - u_{j,2k+1}). \tag{9} \]

[Figure 2 (diagrams omitted).]
Figure 2: (a) The Haar scaling and wavelet functions $\phi_{j,k}(t)$ and $\psi_{j,k}(t)$. (b) Binary tree of scaling coefficients from coarse to fine scales. (c) Recursive scheme for calculating the Haar scaling coefficients $U_{j+1,2k}$ and $U_{j+1,2k+1}$ at scale $j+1$ as sums and differences of the scaling and wavelet coefficients $U_{j,k}$ and $W_{j,k}$ at scale $j$ (normalized by $1/\sqrt{2}$). For the WIG model, the $W_{j,k}$'s are mutually independent and identically distributed within scale according to $W_{j,k} \sim N(0, \sigma_j^2)$.

3.2 Modeling LRD data

Wavelets serve as an approximate Karhunen-Loève or decorrelating transform for fBm [6], fGn, and more general LRD signals [36]. Hence, modeling and processing of these signals in the wavelet domain is often more efficient and powerful than in the time domain.

The variance of the wavelet coefficients of continuous-time fBm decays with scale according to a power law in $H$ [6]. For fGn, an exact power law in $H$ also holds for the decay of the Haar wavelet coefficient variances [36]. This power-law decay, along with the decorrelation property of wavelets, has led to fast, robust algorithms for estimation [36,37].

Gaussian LRD processes can be approximately synthesized by generating wavelet coefficients as independent zero-mean Gaussian random variables, identically distributed within scale according to $W_{j,k} \sim N(0, \sigma_j^2)$, with $\sigma_j^2$ the wavelet-coefficient variance at scale $j$ [15]. (We use capital letters when we consider the underlying variables to be random.)

A power-law decay for the $\sigma_j^2$'s leads to approximate wavelet synthesis of fBm or fGn [6]. However, while network traffic may exhibit LRD consistent with fBm or fGn, it may have short-term correlations that vary considerably from pure fBm or fGn scaling. Such LRD processes can be modeled by setting $\sigma_j^2$ to match the measured or theoretical variances of the wavelet coefficients of the desired process [15]. We call the resulting model the wavelet-domain independent Gaussian (WIG) model [15] (see Figure 2(c)). For a length-N signal, the WIG is characterized by approximately $\log_2 N$ parameters.

The WIG model assumes Gaussianity even though network traffic signals (such as loads and interarrival times) can be highly nonGaussian. Not only are these signals strictly non-negative, but they can exhibit "spiky" behavior corresponding to a marginal distribution whose right-side tail decays much more slowly than that of a Gaussian. We seek a more accurate marginal characterization for these spiky, non-negative LRD processes, yet wish to retain the decorrelating properties of wavelets and the simplicity of the WIG model.

3.3 Modeling non-negative data with the Haar wavelet

In order to model non-negative signals using the wavelet transform, we must develop conditions on the scaling and wavelet coefficient values for $c(t)$ in (8) to be non-negative. While cumbersome for a general wavelet system, these conditions are simple for the Haar system (see Figure 2), on which we focus for the balance of this paper. (The conditions are straightforward also for certain biorthogonal wavelet systems.)

Since the scaling coefficients $u_{j,k}$ represent the local mean of the signal at different scales and shifts, they are non-negative if and only if the signal itself is non-negative; that is, $c(t) \ge 0 \Leftrightarrow u_{j,k} \ge 0, \ \forall j,k$. This condition leads us directly to constraints on the Haar wavelet coefficients. Solving (9) for $u_{j,2k}$ and $u_{j,2k+1}$, we find

\[ u_{j,2k} = 2^{-1/2} (u_{j-1,k} + w_{j-1,k}), \qquad u_{j,2k+1} = 2^{-1/2} (u_{j-1,k} - w_{j-1,k}), \tag{10} \]

which corresponds to moving down the tree in Figure 2(b) one scale level at a time.

Now, combining (10) with the constraint $u_{j,k} \ge 0$, we obtain the condition

\[ c(t) \ge 0 \ \Leftrightarrow \ |w_{j,k}| \le u_{j,k}, \quad \forall j,k. \tag{11} \]

4 Multifractal Wavelet Model

Let us summarize our basic wavelet-based approach for modeling nonGaussian LRD network traffic. As with the WIG, we will characterize the Haar wavelet variance decay as a function of scale to capture the short-range and long-range correlations. In contrast to the WIG, we will enforce the constraint (11) to ensure the non-negativity of the model output.

To keep things clear, we will introduce three different processes: the continuous-time model output $c(t)$, its integral $D(t)$, and a discrete-time approximation $C[k]$ to $c(t)$. These three signals are related by

\[ C[k] := \int_{k 2^{-n}}^{(k+1) 2^{-n}} c(t)\, dt = D((k+1) 2^{-n}) - D(k 2^{-n}). \tag{12} \]

Here, $c(t)$ and $D(t)$ play roles analogous to fGn and fBm, respectively.

For notational simplicity, we will assume that both $c(t)$ and $D(t)$ live on $[0,1]$ and that $C[k]$ is a length-$2^n$ discrete-time signal. Thus, there is only one scaling coefficient $U_{0,0}$ in (8). (A more general case is treated in [29].) We will primarily focus on $C[k]$.

For the Haar wavelet transform, $C[k]$ relates directly to the finest-scale scaling coefficients:

\[ C[k] = 2^{-n/2}\, U_{n,k}, \qquad k = 0, 1, \ldots, 2^{n}-1. \tag{13} \]

4.1 The model

The positivity constraints (11) on the Haar wavelet coefficients suggest a very simple multiscale, multiplicative signal model for positive processes. In the multifractal wavelet model (MWM) we compute the wavelet coefficients recursively by

\[ W_{j,k} = A_{j,k}\, U_{j,k}, \tag{14} \]

with $A_{j,k}$ a random variable supported on the interval $[-1, 1]$. Together with (10), we obtain (see Figure 3)

\[ U_{j,2k} = 2^{-1/2} (1 + A_{j-1,k})\, U_{j-1,k}, \qquad U_{j,2k+1} = 2^{-1/2} (1 - A_{j-1,k})\, U_{j-1,k}. \tag{15} \]

See [38] for a similar model used as an intensity prior for wavelet-based image estimation.

[Figure 3 (diagram omitted).]
Figure 3: MWM construction: At scale $j$, generate the multiplier $A_{j,k} \sim \beta(p_j, p_j)$, and then form the wavelet coefficient as the product $W_{j,k} = A_{j,k} U_{j,k}$. At scale $j+1$ of this tree, form the scaling coefficients in the same manner as the WIG model in Figure 2.

To generate a realization of an MWM process, we perform the following coarse-to-fine synthesis:

1. Set $j = 0$. Fix or compute the coarsest (root) scaling coefficient $U_{0,0}$, establishing the global mean of the signal.
2. At scale $j$, generate the random multipliers $A_{j,k}$ and calculate each $W_{j,k}$ via (14) for $k = 0, \ldots, 2^{j}-1$.
3. At scale $j$, use $U_{j,k}$ and $W_{j,k}$ in (10) to calculate $U_{j+1,2k}$ and $U_{j+1,2k+1}$, the scaling coefficients at scale $j+1$, for $k = 0, \ldots, 2^{j}-1$.
4. Iterate steps 2 and 3, replacing $j$ by $j+1$, until the finest scale $j = n$ is reached.

We can express the signal $C[k]$ directly as a product (or cascade) of the random multipliers $A_{j,k}$. Decomposing each shift $k$ into a binary expansion $k = \sum_{i=0}^{n-1} k'_i\, 2^{n-1-i}$, we can write

\[ C[k] = 2^{-n/2}\, U_{n,k} = 2^{-n}\, U_{0,0} \prod_{i=0}^{n-1} \left( 1 + (-1)^{k'_i} A_{i,k_i} \right), \tag{16} \]

with

\[ k_0 = 0 \quad \text{and} \quad k_i = \sum_{j=0}^{i-1} k'_j\, 2^{i-1-j}, \qquad i = 1, \ldots, n-1. \tag{17} \]

Here $k'_i$ is the $i$-th binary digit of $k$ (most significant first) and $k_i$ is the shift index of the ancestor of $U_{n,k}$ at scale $i$. This result can be derived by iteratively applying (15) [29].

Since the scaling coefficients are generated simultaneously with the wavelet coefficients, there is no need to invert the wavelet transform. The finest-scale scaling coefficients $U_{n,k}$ are in fact the MWM output process (13). The total cost for computing N MWM signal samples is O(N). In fact, synthesis of a trace of length $2^{18}$ data points takes just 8 seconds of workstation cpu time.

4.2 β multipliers

All that remains is to choose an appropriate distribution for the multipliers $A_{j,k}$. For simplicity of development, we will assume that the $A_{j,k}$'s are mutually independent and independent of $U_{j,k}$. We will also assume that the $A_{j,k}$'s are symmetric about 0 and identically distributed within scale; it is easily shown that these two conditions are necessary for the resulting process to be first-order stationary [29]. This leads us to the choice of the symmetric beta distribution, $\beta(p,p)$ (see Figure 4), for the $A_{j,k}$'s:

\[ A_{j,k} \sim \beta(p_j, p_j), \tag{18} \]

with $p_j$ the beta parameter at scale $j$. The beta distribution is compactly supported, easily shaped, and amenable to closed-form calculations. The variance of a random variable $A \sim \beta(p,p)$ is

\[ \mathrm{var}[A] = \frac{1}{2p+1}. \tag{19} \]

In the MWM, the $p_j$ play a role analogous to the $\sigma_j^2$ of the WIG model. With one beta parameter per wavelet scale, the MWM uses approximately $\log_2 N$ parameters for a trace of length N. Distributions with more parameters (e.g., discrete distributions or mixtures of betas) could be used to capture high-order data moments at a cost of increased model complexity [29]. See Table 1 for a comparison of the WIG and MWM properties.

Table 1: Comparison of the tree-based WIG and MWM models. For approximating a signal with a strict fGn covariance structure, both the WIG and MWM require only three parameters (mean, variance, and $H$).

WIG                          MWM
Additive                     Multiplicative
Gaussian                     Asymptotically Lognormal
LRD matched                  LRD matched
Monofractal                  Multifractal
$\log_2 N + 2$ parameters    $\log_2 N + 2$ parameters
O(N) synthesis               O(N) synthesis

4.3 Covariance matching

The $p_j$'s allow us to control the wavelet energy decay across scale, since

\[ \frac{\mathrm{var}(W_{j-1,k})}{\mathrm{var}(W_{j,k})} = \frac{2\, \mathrm{var}[A_{j-1,k}]}{\mathrm{var}[A_{j,k}]\, (1 + \mathrm{var}[A_{j-1,k}])} = \frac{2 p_j + 1}{p_{j-1} + 1}. \tag{20} \]

Thus, to model a given process with the MWM, we can select the $p_j$'s to match the signal's theoretical wavelet-domain energy decay. Or, given training data, we can select the parameters to match the sample variances of the wavelet coefficients as a function of scale.

The second real data set is one of the famous Ethernet data traces collected at the Bellcore Morristown Research and Engineering facility [1]. The trace (BC-pAug89) began at 11:25 on August 29, 1989, and ran for about 3142.82 seconds (until 1,000,000 packets had been captured). As in the case of the LBL-TCP-3 data set, we obtain a data trace by summing the bytes of packets that arrived in consecutive time intervals of 2.6 ms. We use the first $2^{20}$ data points of this trace in our experiments. This trace has mean 345.8

[Figure 4 (plot omitted). Curves for p = 0.2, 1, 2 and 10; horizontal axis: a, from -1 to 1; vertical axis: pdf of A.]
Figure 4: Probability density function of a $\beta(p,p)$ random variable
bytes(cid:23)(cid:4)unit time(cid:5) and standard deviation A(cid:14) For p (cid:13) (cid:4)(cid:7)(cid:8)(cid:12) A resembles a binomial random variable(cid:12) and for (cid:25)(cid:20)(cid:17)(cid:7)(cid:30)bytes(cid:23)(cid:4)unit time(cid:5)(cid:7) TheBC(cid:3)pAug(cid:26)(cid:27)traceisapproximately p(cid:13) (cid:3) it has a uniform density(cid:14) For p (cid:8)(cid:3) the density appears like a a second(cid:3)order self similar process with H (cid:21)(cid:20)(cid:5)(cid:25)(cid:27) (cid:14)(cid:17)(cid:25)(cid:15)(cid:7) truncatedGaussian density(cid:12) andas pincreases(cid:12) thedensityresembles a Gaussian densitymore and moreclosely(cid:14) (cid:9)(cid:2)(cid:0) Physical Interpretation The MWM multipliers have a simple interpretation as recur(cid:3) To complete the modeling(cid:2) we must choose the parameter p(cid:7) sively partitioning the arriving bytes into smaller and smaller of the model and characterize the distribution of the coarsest time intervals(cid:7) For instance(cid:2) the value U(cid:7)(cid:2)(cid:7) determines the total scaling coe(cid:6)cient U(cid:7)(cid:2)(cid:7)(cid:7) From (cid:4)(cid:0)(cid:30)(cid:5) and (cid:4)(cid:0)(cid:27)(cid:5) we obtain number of bytes in the entire trace(cid:7) The value A(cid:7)(cid:2)(cid:7) determines (cid:0) how many of these packets will be placed in the (cid:11)rst half of the (cid:4)(cid:16)p(cid:7)(cid:22)(cid:0)(cid:5)var(cid:4)W(cid:7)(cid:2)(cid:7)(cid:5)(cid:21)IE(cid:14)U(cid:7)(cid:2)(cid:7)(cid:15)(cid:3) (cid:4)(cid:16)(cid:0)(cid:5) trace(cid:7) The value A(cid:5)(cid:2)(cid:7) then determines how many of these bytes (cid:0) which allows us to obtain p(cid:7) from estimates of IE(cid:14)U(cid:7)(cid:2)(cid:7)(cid:15) and will be placed in the (cid:11)rst quarter of the trace(cid:2) and so on(cid:7) var(cid:4)W(cid:7)(cid:2)(cid:7)(cid:5)(cid:7) When trained on real network data(cid:2) the behavior of the mul(cid:3) To precisely model U(cid:7)(cid:2)(cid:7)(cid:2) we would have to use a 
strictly tipliers Aj(cid:2)k changes with scale(cid:2) with extremely low variance at non(cid:3)negative probability density function to ensure the non(cid:3) coarse scales and high variance at (cid:11)ne scales(cid:7) Amazingly(cid:2) this negativity of the MWM output(cid:7) However(cid:2) in practice a Gaus(cid:3) is consistent with both the small(cid:3)scale behavior of actual tra(cid:6)c sianmodelatthecoarsestscale(cid:4)requiringIE(cid:14)U(cid:7)(cid:2)(cid:7)(cid:15)andvar(cid:14)U(cid:7)(cid:2)(cid:7)(cid:15)(cid:5) and the large(cid:3)scale properties resulting from the superposition is usually su(cid:6)cient if enough scales are employed (cid:4)so that of a large number of souces (cid:14)(cid:25)(cid:2)(cid:26)(cid:15)(cid:7) At (cid:11)ne scales multiplicative IE(cid:14)U(cid:7)(cid:2)(cid:7)(cid:15)(cid:15) standard deviation of U(cid:7)(cid:2)(cid:7)(cid:5)(cid:7) schemes with large variances produce bursts like those in real data (cid:4)recall Figure (cid:0)(cid:5)(cid:7) At coarse scales(cid:2) the scaling coe(cid:6)cients (cid:4)which correspondto the arrivalof tra(cid:6)coverlargetime scales(cid:5) (cid:8) Experimental Results involve only a handful of low(cid:3)variance multipliers Aj(cid:2)k(cid:7) From (cid:4)(cid:0)(cid:19)(cid:5) we can write(cid:2) for example(cid:2) at the third(cid:3)coarsest scale(cid:29) In this section(cid:2) we perform experiments with real data traces to demonstratethe MWM(cid:12)s capacityto capture important proper(cid:3) fd U(cid:7)(cid:2)(cid:7) U(cid:0)(cid:2)(cid:7) (cid:21) (cid:4)(cid:0)(cid:22)A(cid:7)(cid:2)(cid:7)(cid:5)(cid:4)(cid:0)(cid:22)A(cid:5)(cid:2)(cid:7)(cid:5) ties of real data(cid:7) As expected(cid:2) the MWM does an excellent job (cid:16) in capturing the correlation structure of real data sets(cid:7) We also fd U(cid:7)(cid:2)(cid:7) observethattheMWMperformswellinmatchingthemarginals (cid:9) (cid:4)(cid:0)(cid:22)A(cid:7)(cid:2)(cid:7)(cid:22)A(cid:5)(cid:2)(cid:7)(cid:5) 
(cid:4)(cid:16)(cid:16)(cid:5) (cid:16) andhigher(cid:3)ordermomentsofrealdata(cid:7) RecallthattheGaussian WIGmodelisalsocapableofcapturingthecorrelationstructure Thus(cid:2) for a (cid:11)xed U(cid:7)(cid:2)(cid:7) at the coarsest scale(cid:2) to a (cid:11)rst(cid:3)order ap(cid:3) oftrainingdata(cid:7) Wethushavetwomodels(cid:2)bothofwhichcapture proximation(cid:2) the MWM is additive at the coarsescales provided the correlation structure of real data but with the MWM com(cid:3) therandomvariablesAj(cid:2)k aresmallin amplitude(cid:7) Moreover(cid:2)the ing closerto matchingthe marginalsand higher(cid:3)ordermoments(cid:7) Aj(cid:2)k are approximately Gaussian for these low(cid:3)variance (cid:4)high(cid:3) Equipped with these models(cid:2) we are in an excellent position to p(cid:5) symmetric (cid:11) multipliers (cid:14)(cid:17)(cid:27)(cid:15)(cid:7) Hence(cid:2) coarse(cid:3)resolutionMWM perform queuing experiments to study if the correlation struc(cid:3) outputs will exhibit an additive(cid:2) Gaussian(cid:3)like behavior consis(cid:3) tureis byitself su(cid:6)cienttocapturethe queuingbehaviorofreal tent with that of the previously justi(cid:11)ed ON(cid:23)OFF models and tra(cid:6)c(cid:7) notions of client behavior as a superposition of sources (cid:14)(cid:25)(cid:2)(cid:26)(cid:15)(cid:7) (cid:9)(cid:2)(cid:3) Real data (cid:9)(cid:2)(cid:7) Matching of Real Data Weuse twowell(cid:3)knownrealdatatracesinourexperiments(cid:7) The In order to study how well the MWM and WIG models can (cid:11)rst (cid:4)LBL(cid:3)TCP(cid:3)(cid:17)(cid:5) contains two hours(cid:12) worth of wide(cid:3)area TCP match real data(cid:2) we train them on the the real data traces(cid:7) To tra(cid:6)c between the Lawrence Berkeley Laboratory and the rest (cid:11)ttheWIGandMWMmodelstothedata(cid:2)weusetheprocedure of the world (cid:14)(cid:0)(cid:30)(cid:15)(cid:7) This data contains the following informa(cid:3) outlined in Sections (cid:17)(cid:7)(cid:16) and 
(cid:30)(cid:7)(cid:17)(cid:2) which involves taking a Haar tion about each packet(cid:29) the time(cid:3)stamp(cid:2) (cid:4)renumbered(cid:5) source wavelet transform of the real data and estimating the variances (cid:0) host(cid:2) (cid:4)renumbered(cid:5) destination host(cid:2) source TCP port(cid:2) destina(cid:3) (cid:4)j of the wavelet coe(cid:6)cients at each scale(cid:7) We estimate these tionTCPport(cid:2)andnumberofdatabytes(cid:7) Inourexperimentswe variances only at the (cid:0)(cid:19) (cid:11)nest scales(cid:2) because at coarser scales useonly the time(cid:3)stamp and databytes information(cid:7) We form a there are not a su(cid:6)cient number of coe(cid:6)cients to obtain good datatracebycountingthenumberofbytesofpacketsthatarrive variance estimates(cid:7) As a result(cid:2) we synthesize data traces of (cid:0)(cid:7) (cid:5)(cid:9) in consecutive time intervals of (cid:24) ms and use the (cid:11)rst (cid:16) data maximumlength(cid:16) datapoints(cid:7) ForboththeMWMandWIG(cid:2) points in our simulation experiments(cid:7) This trace has a sample wemodelthecoarsest(cid:3)scalescalingcoe(cid:6)cientU(cid:7)(cid:2)(cid:7) asaGaussian mean of (cid:16)(cid:19)(cid:25)(cid:7)(cid:19) bytes(cid:23)(cid:4)unit time(cid:5) and sample standard deviation random variable with mean and variance equal to the sample of (cid:19)(cid:24)(cid:16)(cid:7)(cid:24) bytes(cid:23)(cid:4)unit time(cid:5)(cid:7) mean and variance of the scaling coe(cid:6)cients of the real data Ribeiro et al(cid:2)(cid:2) Simulation of nonGaussian Long(cid:3)Range(cid:3)Dependent Traffic (cid:25) 2x 104 LBL(cid:3)TCP(cid:3)(cid:17) data 2x 104WIG synthesized data 2x 1M04 WM synthesized data 1.5 1.5 1.5 1 1 1 0.5 0.5 0.5 0 0 0 −2000 −1000 0 1000 2000 3000 4000 −2000 −1000 0 1000 2000 3000 4000 −2000 −1000 0 1000 2000 3000 4000 7000 7000 7000 6000 6000 6000 5000 5000 5000 4000 4000 4000 3000 3000 3000 2000 2000 2000 1000 1000 1000 0 0 0 −4000 −2000 0 2000 4000 6000 8000 −4000 −2000 0 2000 
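The training and synthesis steps described above can be illustrated in a few lines. The following is a minimal sketch, not the authors' implementation: the function names `mwm_synthesize` and `fit_beta_params` are hypothetical, and the fit uses a per-scale moment estimator, $\mathbb{E}[W_{j,k}^2] = \mathbb{E}[U_{j,k}^2]/(2p_j + 1)$, which follows from (14) and (19) under the independence assumption of Section 4.2 but is an assumption of this sketch rather than the exact procedure of Sections 3.2 and 4.3. The demo values (root coefficient 1000, $p_j = 5$, 10 scales) are arbitrary.

```python
import math
import random


def mwm_synthesize(u_root, p, rng=random):
    """Coarse-to-fine MWM synthesis (steps 1-4 of Section 4.1).

    u_root : coarsest scaling coefficient U_{0,0} (assumed positive)
    p      : beta parameters p_0, ..., p_{n-1}, one per scale
    Returns the 2**n samples C[k] = 2**(-n/2) * U_{n,k}.
    """
    u = [float(u_root)]
    for pj in p:
        finer = []
        for ujk in u:
            a = 2.0 * rng.betavariate(pj, pj) - 1.0   # symmetric beta on [-1, 1]
            w = a * ujk                                # W_{j,k} = A_{j,k} U_{j,k}
            finer.append((ujk + w) / math.sqrt(2.0))   # U_{j+1,2k}
            finer.append((ujk - w) / math.sqrt(2.0))   # U_{j+1,2k+1}
        u = finer
    scale = 2.0 ** (-len(p) / 2.0)
    return [scale * x for x in u]


def fit_beta_params(trace, n_scales):
    """Moment-based estimate of p_j at the n_scales finest scales.

    Returns the estimates ordered coarse to fine. len(trace) must be a
    power of two. Uses E[W^2] = E[U^2] / (2 p_j + 1) at each scale.
    """
    n = len(trace).bit_length() - 1
    u = [c * 2.0 ** (n / 2.0) for c in trace]          # U_{n,k} = 2**(n/2) C[k]
    params = []
    for _ in range(n_scales):
        half = len(u) // 2
        coarse = [(u[2 * k] + u[2 * k + 1]) / math.sqrt(2.0) for k in range(half)]
        detail = [(u[2 * k] - u[2 * k + 1]) / math.sqrt(2.0) for k in range(half)]
        mean_u2 = sum(x * x for x in coarse) / half
        mean_w2 = sum(x * x for x in detail) / half
        if mean_w2 == 0.0:
            params.append(1e6)                         # no variation at this scale
        else:
            params.append(max((mean_u2 / mean_w2 - 1.0) / 2.0, 1e-3))
        u = coarse
    params.reverse()
    return params


rng = random.Random(0)
trace = mwm_synthesize(1000.0, [5.0] * 10, rng)        # 2**10 positive samples
p_hat = fit_beta_params(trace, 4)                      # estimates for the 4 finest scales
```

Two properties of the construction are visible in the sketch: every sample is non-negative because $|A_{j,k}| \le 1$, and the samples sum to $U_{0,0}$ because the two children of each node share its mass in proportions $(1 \pm A_{j,k})/2$.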
Figure 5 (columns: (a) LBL-TCP-3 data, (b) WIG synthesized data, (c) MWM synthesized data): Histograms of the bytes-per-time process at different aggregation levels for (a) wide-area TCP traffic at the Lawrence Berkeley Laboratory (trace LBL-TCP-3) [14], (b) one realization of the WIG model, and (c) one realization of the MWM synthesis. The top, middle and bottom plots correspond to bytes arriving in intervals of 6 ms, 12 ms and 24 ms respectively. Note the large probability mass over negative values for the WIG model.

With trained models in hand, we now generate synthetic data traces.

Due to space constraints, we present fitting results only for the LBL-TCP-3 trace. Recall from Figure 1 that visually the synthetic MWM looks very similar to the real trace. We compare the marginals of MWM and WIG traces to those of the LBL-TCP-3 trace at three different aggregation levels. From Figure 5, observe that the MWM marginals are similar to those of the real data trace, while the Gaussian WIG marginals differ significantly. We also observe that the WIG traces have a considerable number of negative points, a result of the low mean and high standard deviation of the real data trace.

We next compare the correlation matching abilities of the two models. We compare the variance-time plots of the real data, the MWM traces, and the WIG traces in Figure 6(a). The variance-time plot estimates were obtained by averaging the empirical variance-time plots of 32 independent realizations of the models. We observe that, as expected, both the MWM and WIG models do a good job of matching the correlation structure of the real data.

Figure 6: (a) Variance-time plot ($\log \mathrm{Var}(X^{(m)})$ versus $\log_2 m$) of the LBL-TCP-3 data, the WIG data, and one realization of the MWM synthesis. (b) Multifractal spectra ($f(\alpha)$ versus $\alpha$) of the LBL-TCP-3 data and one realization of the MWM synthesis.

We plot the multifractal spectra (see Section 6) of the LBL-TCP-3 data and the synthetic MWM trace in Figure 6(b) (calculations for the negative moments of the WIG data become numerically unstable, and hence the spectrum for the WIG is not included). We observe that the spectra match extremely well except for large values of $\alpha$. This corresponds to a close match of the scaling of higher-order moments, but a somewhat less accurate match of the scaling of the negative moments.

5.4 Queuing results

Much effort has been exerted studying the effect of the correlation structure on queuing performance [2, 17–19]. Gaussian models that capture the correlation structure of traffic have been proposed [1, 15] and theoretical results for the tail queue probability have been obtained [9, 11, 40]. These models are in many cases appropriate for modeling data traffic.
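The tail queue probability discussed here can also be estimated from a trace by direct simulation. The following is a minimal sketch, assuming a discrete-time single-server queue with an infinite buffer and a constant service capacity per unit time, updated with the Lindley recursion $Q_{t+1} = \max(Q_t + X_t - c,\, 0)$. The function names and the five-sample demo trace are illustrative and not taken from the paper.

```python
def queue_occupancy(arrivals, capacity):
    """Queue content after each time slot for an infinite-buffer queue.

    arrivals : bytes arriving in each unit of time
    capacity : bytes served per unit of time
    """
    q, path = 0.0, []
    for x in arrivals:
        q = max(q + x - capacity, 0.0)   # Lindley recursion
        path.append(q)
    return path


def tail_probability(arrivals, capacity, x):
    """Fraction of time instants at which the queue content exceeds x."""
    path = queue_occupancy(arrivals, capacity)
    return sum(1 for q in path if q > x) / len(path)


demo = [1000.0, 1000.0, 0.0, 0.0, 0.0]
print(queue_occupancy(demo, 800.0))          # [200.0, 400.0, 0.0, 0.0, 0.0]
print(tail_probability(demo, 800.0, 100.0))  # 0.4
```

Because the recursion only clips the queue content at zero, it accepts traces with negative samples (as produced by a Gaussian model) without modification; a negative arrival simply drains the queue faster.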
For example, the WIG model has been shown to capture the queuing behavior of video traffic well [15]. For the WAN/LAN data traces that we consider here, however, this is not the case.

The multiplicative structure of the MWM captures both the correlations of the real data and the higher-order moments. With its inherent approximately lognormal marginal distribution, it also comes close to matching the marginals of the real data traces.

Intuitively, the more traffic characteristics a model matches, the better it will match the queuing behavior of real traffic. Hence, it is not surprising that a perfect fitting of second-order correlations and marginals, as done in [22], leads to a good match of queuing behavior.

Here, we take a different approach, comparing two simple and closely related models in their ability to capture the queuing behavior of the two real data sets. With this experiment we hope to shed some light on the impact of marginals and higher-order correlations on queuing behavior.

In all experiments, data traces are fed as input to an infinite-length single-server queue with link capacity 800 bytes/unit time. We estimate the tail queue probabilities of the various data traces as

$$\hat{P}_i[Q > x] = \frac{\text{number of time instants with } Q > x}{\text{total time duration of trace } i}. \tag{23}$$

We also provide confidence intervals with confidence level of 90% for the estimated queue distribution $(1/L) \sum_{i=1}^{L} \hat{P}_i[Q > x]$, where $L$ is the total number of traces, assuming that it is a Gaussian random variable [41].

With both real traces, we performed the same queuing experiment. We first trained the MWM and WIG models on the real traces as described in Section 5.3. We then synthesized 480 MWM and WIG traces of length $2^{1(cid:9)}$, fed them as input to our theoretical queue and obtained their queuing behavior. Recall that both the WIG and MWM capture the mean, variance and correlation structure of the real data.

Figure 7 (horizontal axis: $x$ = queue size in bytes; vertical axis: $\log_{10} P(Q > x)$; curves for WIG+, WIG, MWM and the real trace, with 90% confidence intervals for WIG+, WIG and MWM; panel (a) LBL-TCP-3, panel (b) BC-pAug89): Comparison of the queuing performance of real data traces with those of synthetic WIG and MWM traces. In (a), we observe that the MWM synthesis matches the queuing behavior of the LBL-TCP-3 data closely, while the WIG synthesis does not. Even when negative values of the WIG data are set to 0 (WIG+), the WIG traces do not come close to matching the correct queuing behavior. In (b), we observe a similar behavior with the BC-pAug89 data.

In Figure 7(a) we compare the average queuing behavior of the MWM and WIG traces to that of the real trace LBL-TCP-3. We observe that the MWM traces match the queuing behavior of the real data trace much better than the WIG traces. From Figure 5 we notice that the WIG data traces have a considerable number of negative data points. This is because the LBL-TCP-3 data set has a large ratio of standard deviation to mean, which when modeled by a Gaussian process leads to a large fraction of data points going negative. In order to test whether these negative values are the cause of the poor performance of the WIG model, we set negative values to zero in the WIG traces and obtained the queuing behavior of these new traces. We call the new data traces WIG+. We see from Figure 7(a) that the queuing performance of the WIG+ traces is not substantially better than that of the WIG traces.

Thus, we conclude that the Gaussian WIG traces do not give a good approximation to the queuing behavior of the real data set in spite of capturing the correlation structure of the real data trace. Furthermore, the ad hoc procedure of setting all negative values to zero does not improve matters. In fact, the ad hoc procedure used in creating the WIG+ data traces destroys the statistics of the traces. Other ad hoc procedures, like excluding all negative data points or setting all negative points to their absolute value, also destroy the statistics of the traces. This reveals some of the problems associated with Gaussian models for modeling traffic with marginals similar to those of the real data traces considered here.

The results for the real trace BC-pAug89 are shown in Figure 7(b) and are similar to those for the LBL-TCP-3 trace. Clearly, the MWM again performs far better than the WIG model in capturing the queuing behavior of the real data.

These queuing experiments indicate that the correlation structure of traffic is not the only factor that decides the queuing behavior of data traffic. Since the MWM outperforms the WIG model in matching queuing behavior, we conclude that the additional traffic characteristics of real data captured by the MWM, like marginals and higher-order moments, have a substantial effect on the queuing behavior of traffic with statistics similar to the real data sets that we considered here.

6 MWM is a Cascade

We now link the MWM with the theory of cascades. The technical details in this section are not necessary for understanding or applying the MWM and can be omitted on a first reading. Multiplicative cascades generalize the self-similarity of fractal models such as fGn and fBm by offering greater flexibility and richer scaling properties, including burstiness and scaling of higher-order moments [25, 29]. Identifying the MWM algorithm with a multiplicative cascade allows us to benefit from the accumulated theoretical and practical knowledge of the field of multifractals, including a precise understanding of the convergence of the MWM algorithm, properties of the marginal distributions, advantages over monofractal fGn models, and a range of possible refinements and extensions. For these reasons, we find it useful to examine the MWM within the context of cascades and multifractals.

The backbone of a cascade is a construction where one starts at a coarse scale and develops details of the process on finer scales iteratively in a multiplicative fashion. The MWM is one such cascade, as (15) and (16) reveal. In accordance with the notation for cascades, setting

$$M^0_0 = U_{0,0} \quad \text{and} \quad M^i_{k_i} = \frac{1 + (-1)^{k'_{i-1}} A_{i-1,k_{i-1}}}{2}, \qquad 0 < i \le n, \tag{24}$$

and substituting into (16) leads us to (see Figure 8(a))

$$C[k] = M^0_0 \prod_{i=1}^{n} M^i_{k_i}, \tag{25}$$

with the $k_i$ and $k'_i$ defined in the same way as for (16). Each factor $1 \pm A$ in (16) equals $2M$, so the prefactor $2^{-n}$ of (16) is absorbed into the $n$ multipliers.

Our aim in this section is both to introduce cascades and to give the reader an intuitive understanding of them. After studying the nature of the MWM's marginals, we compare cascades with Gaussian LRD processes such as the WIG. As already hinted in the introduction, cascades such as the MWM are ideal for modeling burstiness. We explain this here by developing the multifractal formalism (for further details, see [29]).

6.1 Lognormal marginals

Multiplicative structures, in particular the product representation (25), naturally lead to lognormal marginals. If the $M^i_{k_i}$ are all positive and identically distributed, then $C[k]$ will be approximately lognormal by the CLT (applied to the sum of the logarithms of the multipliers). Figure 5 shows that Gaussian modeling seems unfit in this network scenario; various other authors make a case for marginal distributions, including the lognormal, with tails that are much heavier than the Gaussian [20, p. 364], [21]. We do not claim that the lognormal is appropriate for all traffic at all scales, and for a limited number of scales a cascade signal can behave differently from a lognormal. However, this link between cascades and useful marginal models for traffic points to the viability of cascades for providing realistic traffic models.

6.2 Cascades vs. fGn

There is a fundamental difference between cascade modeling and modeling via self-similar processes such as fGn or the WIG, which treat traffic as a mean rate superimposed with fractal noise. Additive self-similar models "hover" around the mean with occasional outbursts in both positive and negative directions, while multiplicative cascades "sit" just above the zero line and emit occasional positive jumps or spikes. In mathematical terms this distinction is best captured by examining negative moments: for self-similar models, these are the negative moments of the fractal noise, hence they capture uninterestingly small variations around the mean; for cascades, on the other hand, these are the negative moments of the process itself, so they capture unnaturally small values and provide useful information.

6.3 Measuring burstiness

For ease of notation, let $k_n 2^{-n} \to t$ mean that $t \in [k_n 2^{-n}, (k_n + 1) 2^{-n})$ and $n \to \infty$. The strength of growth, also called the degree of Hölder continuity, at time $t$ of a process $Y(t)$ (which corresponds to $D(t)$ of the MWM) with positive increments can be characterized by

$$\alpha(t) = \lim_{k_n 2^{-n} \to t} \alpha^n_{k_n}, \quad \text{where} \quad \alpha^n_{k_n} := -\frac{1}{n} \log_2 \left| Y\bigl((k_n + 1) 2^{-n}\bigr) - Y\bigl(k_n 2^{-n}\bigr) \right|. \tag{26}$$

The smaller the $\alpha(t)$, the larger the increments of $Y$, and the "burstier" it is at time $t$. The frequency of occurrence of a given strength $\alpha$, as visible from an analysis on coarse scales, can be measured by the multifractal spectrum:

$$f(\alpha) := \lim_{\varepsilon \to 0} \lim_{n \to \infty} \frac{1}{n} \log_2 \#\left\{ k_n = 0, \ldots, 2^n - 1 : \alpha^n_{k_n} \in (\alpha - \varepsilon, \alpha + \varepsilon) \right\}. \tag{27}$$

By definition, $f$ takes values between 0 and 1 and is often shaped like a $\cap$ and concave, but not always. The smaller $f(\alpha)$ is, the "fewer" points $t$ will show $\alpha(t) \approx \alpha$. If $\bar{\alpha}$ denotes the value $\alpha(t)$ assumed by "most" points $t$, then $f(\bar{\alpha}) = 1$.

Note that this analysis via increments (26) is sufficient provided $Y(t)$ has no polynomial trends. If, on the other hand, polynomial terms are present, then the increment analysis will yield $f(\alpha) = 1$ for $\alpha = k \in \mathbb{N}$, where $k$ is the order of the first non-vanishing derivative of $Y$. Then one has to eliminate the polynomial influence, (a) via wavelets or (b) by subtracting the trend if known. The known trend for self-similar processes is none other than the "mean arrival rate".

It is, therefore, important to mention that our analysis of real traces in Section 5 shows no integer scaling exponent $\alpha(t)$, except for $\alpha(t) = 1$ for a small number of $t$, that is, $f(1) < 1$. Thus, we conclude that polynomial trends are not present in the real traffic traces studied here. Since we did not remove any trend from the real data prior to our analysis, this result suggests that the data is not well characterized by self-similar models.

6.4 Higher-order moments and the MF spectrum

Cascades such as the MWM possess rich multifractal spectra. Unlike cascades, the strong self-similarity of the fBm (1) forces it to have a trivial multifractal behavior. To be precise, for the fBm, $\alpha(t) = H$ for all $t$. To demonstrate this, we will use information about the scaling of higher-order moments of the two types of processes to obtain their multifractal spectra.

Let us define

$$T(q) := \lim_{n \to \infty} -\frac{1}{n} \log_2 \mathbb{E}[S_n(q)], \tag{28}$$

where

$$S_n(q) := \sum_{k_n = 0}^{2^n - 1} \left| Y\bigl((k_n + 1) 2^{-n}\bigr) - Y\bigl(k_n 2^{-n}\bigr) \right|^q = \sum_{k_n = 0}^{2^n - 1} 2^{-n q \alpha^n_{k_n}}.$$

Note that $T$ is always concave, since $\log_2 \mathbb{E}[S_n(q)]$ is convex in $q$ (by Hölder's inequality), so that $-\frac{1}{n} \log_2 \mathbb{E}[S_n(q)]$ is concave. For a typical plot of $T$ and $f$ see Figure 8 (b) and (c).
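The definition (28) can be turned into a simple numerical estimate. The following is a minimal sketch under stated assumptions: the expectation is replaced by the partition sum of a single realization, and the limit is replaced by a least-squares slope of $\log_2 S_n(q)$ against $n$ over the available dyadic scales. The function names `log2_partition_sums` and `estimate_T` are hypothetical. Two values follow directly from the definition and serve as sanity checks: $S_n(0) = 2^n$ gives $T(0) = -1$, and for positive increments $S_n(1)$ is the total increment at every scale, so $T(1) = 0$.

```python
import math
import random


def log2_partition_sums(increments, q):
    """Return [(n, log2 S_n(q))] for n = 1..N, where len(increments) = 2**N.

    S_n(q) sums |Y((k+1) 2**-n) - Y(k 2**-n)|**q over k = 0..2**n - 1.
    Zero increments are not handled for negative q.
    """
    levels = len(increments).bit_length() - 1
    incr = list(increments[: 2 ** levels])
    points = []
    for n in range(levels, 0, -1):
        points.append((n, math.log2(sum(abs(v) ** q for v in incr))))
        # increments over intervals of twice the length
        incr = [incr[2 * k] + incr[2 * k + 1] for k in range(len(incr) // 2)]
    return points[::-1]


def estimate_T(increments, q):
    """Negated least-squares slope of log2 S_n(q) against n."""
    pts = log2_partition_sums(increments, q)
    mean_n = sum(n for n, _ in pts) / len(pts)
    mean_s = sum(s for _, s in pts) / len(pts)
    num = sum((n - mean_n) * (s - mean_s) for n, s in pts)
    den = sum((n - mean_n) ** 2 for n, _ in pts)
    return -num / den


rng = random.Random(1)
positive = [rng.uniform(1.0, 2.0) for _ in range(2 ** 10)]
t0 = estimate_T(positive, 0.0)   # S_n(0) counts the 2**n intervals
t1 = estimate_T(positive, 1.0)   # S_n(1) is the same total at every scale
```

Evaluating `estimate_T` on a grid of $q$ values gives an empirical curve $T(q)$ for a positive-increment trace; in practice the range of scales used in the fit would have to be chosen with care, which this sketch does not attempt.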
[Figure 8, three panels. (a) Binary tree of multiplier products, from $M_0^0$ at the root down to products such as $M_3^2 M_1^1 M_0^0$ at level 2. (b) $T(q)$ versus $q$, with a tangent of slope $\alpha$ at $(q, T(q))$ and the marked points $(0, T(0)) = (0, -1)$ and $(1, T(1)) = (1, 0)$. (c) $T^*(\alpha)$ versus $\alpha$, with a tangent of slope $q$ at $(\alpha, T^*(\alpha))$.]

Figure 8: (a) The MWM translates immediately into a multiplicative cascade in the time domain (cf. the multiplier construction of the MWM). (b), (c) We demonstrate the Legendre transform $T \leftrightarrow T^*$ in the simple case of concave, differentiable functions such as the spectra of a typical MWM (see (33), for one particular choice of the parameters $p$ and $H$). Set $\alpha = T'(q)$; then $T^*(\alpha)$ is such that the tangent at $(q, T(q))$ passes through $(0, -T^*(\alpha))$. In other words, $T^*(\alpha) = q\alpha - T(q)$. By symmetry, the tangent at $(\alpha, T^*(\alpha))$ has slope $q$ and passes through $(0, -T(q))$. There are two notable special values of $q$. Trivially, $T(0) = -1$, whence the maximum of $T^*$ is 1. In addition, every positive increment process has $T(1) = 0$, whence $T^*$ touches the bisector.

The multifractal spectrum $f(\alpha)$ and $T(q)$ are closely related, as the following instructive hand-waving argument shows. Grouping the terms of the sum $S_n(q)$ in (28) according to $\alpha_{k_n}^n \approx \alpha$, and using the definition of $f_G$, we get

\[
S_n(q) = \sum_{\alpha} \sum_{\alpha_{k_n}^n \approx \alpha} \left(2^{-n\alpha}\right)^q
       \simeq \sum_{\alpha} 2^{n f_G(\alpha)}\, 2^{-nq\alpha}
       \simeq 2^{-n \inf_{\alpha} \left(q\alpha - f_G(\alpha)\right)}. \qquad (29)
\]

We conclude that we must "expect" $T(q)$ to equal $\inf_{\alpha} (q\alpha - f_G(\alpha))$. For the special case of an MWM process, i.e., $Y = D$, it can be shown (see [42]) that the dual relation holds. This relation is called the multifractal formalism and reads

\[
f(\alpha) = T^*(\alpha) := \inf_{q} \left(q\alpha - T(q)\right). \qquad (30)
\]

Simple calculus shows that $T^*(\alpha) = q\alpha - T(q)$ at $\alpha = T'(q)$, provided $T''(q) < 0$. This relation via the Legendre transform is typical of the theory of large deviations [43]. The goal there is to establish relations such as (30) under the most general assumptions. To use the correct terminology, $f$ is the rate function of a so-called large deviation principle (LDP): it measures how frequently or how likely the observed $\alpha_{k_n}^n$ deviates from the "expected value" $\alpha$.

In order to estimate $T(q)$ from data, it is customary to use the approximation $2^{-nT(q)} \simeq S_n(q)$. For the MWM this is equivalent to

\[
2^{-jT(q)} \simeq S^{(j)}(q) := \sum_{k=0}^{2^j-1} \left| 2^{-j/2}\, U_{j,k} \right|^q. \qquad (31)
\]

By (31), any linear fit of $\log_2 S^{(j)}(q)$ against $j$ has slope $-T(q)$, from which $T(q)$ is read off.

Let us calculate $T(q)$ for the MWM model, i.e., $Y = D$. Using independence of the multipliers $M^i_{k_i}$ and denoting by $\sum'$ the sum over all $k_n = 0, \ldots, 2^n - 1$, we find

\[
\begin{aligned}
\mathbb{E}[S_n(q)] &= {\sum}'\, \mathbb{E}\big[(M^n_{k_n})^q\big] \cdots \mathbb{E}\big[(M^1_{k_1})^q\big] \cdot \mathbb{E}\big[(M^0_0)^q\big] \\
                   &= {\sum}'\, \mathbb{E}\big[(M^{(n)})^q\big] \cdots \mathbb{E}\big[(M^{(1)})^q\big] \cdot \mathbb{E}\big[(M^0_0)^q\big] \\
                   &= \mathbb{E}\big[(M^0_0)^q\big] \cdot 2^n \cdot \prod_{i=1}^{n} \mathbb{E}\big[(M^{(i)})^q\big]. \qquad (32)
\end{aligned}
\]

In the second step we made use of the fact that the multipliers $M^i_{k_i}$ are identically distributed to $M^{(i)}$. To this we add the fact that the moments of the $M^{(i)}$ converge to the ones of the limiting random variable $M$ for the next equation, and end by assuming that $M = (1 + A)/2$, with $A$ being $\beta$-distributed as in (19), to obtain:

\[
\text{MWM:} \quad T(q) = -1 - \log_2 \mathbb{E}[M^q]
                       = -1 - \log_2 \frac{\Gamma(p+q)\,\Gamma(2p)}{\Gamma(2p+q)\,\Gamma(p)} \quad \text{if } q > -p, \qquad (33)
\]

and $T(q) = -\infty$ if $q \le -p$.

The function $T(q)$ is a simple statistical description of the process that captures marginal information, but which also governs the "burstiness" through the multifractal formalism. It must be emphasized here that the multifractal parameters $T(q)$ of the MWM process do not necessarily imply that the process cannot be modeled parsimoniously. For example, in the case of the MWM, the $\beta$-distributions for the multipliers are controlled by the parameters $p_j$ (20). If one replaced the right side of (20) by the power law for fGn, then all values $T(q)$ would be determined by $H$ [29].

Now let us compute $T(q)$ for the self-similar fBm. From (1) we find

\[
\mathbb{E} \sum_{k_n=0}^{2^n-1} \left| B\big((k_n+1)2^{-n}\big) - B\big(k_n 2^{-n}\big) \right|^q
 = 2^n\, \mathbb{E}\left| B(2^{-n}) \right|^q
 = 2^{n - nqH}\, \mathbb{E}\big[ |B(1)|^q \big], \qquad (34)
\]

which yields for fBm

\[
\text{fBm:} \quad T(q) =
\begin{cases}
qH - 1 & \text{for } q > -1, \\
-\infty & \text{for } q \le -1.
\end{cases} \qquad (35)
\]

This is probably the most compact way to express the monofractal character of fBm: taking the Legendre transform of $T$ shows that fBm possesses only one degree of "burstiness" ($\alpha(t) = H$), which is omnipresent (compare also (3)).

Multifractal scaling of moments and LRD

The multifractal scaling exponent $T(2)$ of a process $Y$ is closely related to the LRD parameter $H$, since both measure the power-law behavior of some second-order statistics. More precisely,