Extreme Value Statistics of the Total Energy in an Intermediate Complexity Model of the Mid-latitude Atmospheric Jet. Part I: Stationary case

Mara Felici,1,2,∗ Valerio Lucarini,1 Antonio Speranza,1 and Renato Vitolo1

1 PASEF – Physics and Applied Statistics of Earth Fluids, Dipartimento di Matematica ed Informatica, Università di Camerino, Via Madonna delle Carceri, 62032 Camerino (MC), Italy
2 Dipartimento di Matematica U. Dini, Università di Firenze, viale Morgagni 67/A – 50134 Firenze, Italy

Abstract

An intermediate complexity baroclinic model for the atmospheric jet at middle latitudes is used as a stochastic generator of earth-like time series: in the present case, the total energy of the system. Statistical inference of extreme values is applied to yearly maxima sequences of the time series, in the rigorous setting provided by extreme value theory. In particular, the Generalized Extreme Value (GEV) family of distributions is used here as a fundamental model for its simplicity and generality. Several physically realistic values of the parameter T_E, descriptive of the forced equator-to-pole temperature gradient and responsible for setting the average baroclinicity in the atmospheric model, are examined. Stationary time series of the total energy are generated and the estimates of the three GEV parameters – location, scale and shape – are inferred by maximum likelihood methods. Standard statistical diagnostics, such as return level and quantile-quantile plots, are systematically applied to assess goodness-of-fit. The location and scale GEV parameters are found to have a piecewise smooth, monotonically increasing dependence on T_E. This is in agreement with the similar dependence on T_E observed in the same system when other dynamically and physically relevant observables are considered.
The shape parameter also increases with T_E but is always negative, as a priori required by the boundedness of the total energy of the system. The sensitivity of the statistical inference process is studied with respect to the selection procedure of the maxima: the roles of both the length of maxima sequences and of the length of data blocks over which the maxima are computed are critically analyzed. Issues related to model sensitivity are also explored by varying the resolution of the system.

PACS numbers: 02.50.Tt, 02.70.-c, 47.11.-j, 92.60.Bh, 92.70.Gt

∗Electronic address: [email protected]fi.it; URL: www.unicam.it/matinf/pasef

Contents

I. Introduction
II. Data and Methods
  A. Total Energy of the Atmospheric Model
  B. Parameter Estimation and Model Assessment in GEV Inference
III. GEV Inferences for 1000 Annual Maxima
  A. Smoothness of GEV Inferences with Respect to System Parameters
IV. Sensitivity of the GEV Inferences
  A. Sensitivity with Respect to the Extreme Events Sample Size
  B. Sensitivity with Respect to the Extreme Events Selection Procedure: Soft Extremes
  C. Sensitivity with Respect to the Model Resolution
V. Summary and Conclusions
Acknowledgments
A. Classical Theory of Extreme Value Distributions
B. A Model for the Mid-Latitudes Atmospheric Circulation
References

I. INTRODUCTION

The study of climatic extreme events is of paramount importance for society, particularly in the fields of engineering and environmental and territorial planning. Indeed, temporal variations in the statistics of extreme events may have more acute and disruptive effects than changes in the mean climate [24]. In works of an economic nature (see e.g. Nordhaus [39]), the special role played by extreme events in terms of impacts is included via the hypothesis that the costs associated with climatic change can be represented as strongly nonlinear functions of the observed variations in surface temperature.
This constitutes a clear motivation for which, when the impacts of climatic change are analyzed, the interest in variations in the statistics of extreme events plays a strategic role [15, 34]. In the scientific literature, some recent papers in which the existence of trends in the frequency of extreme (precipitation) events was pushed forward in quantitative terms are those by Karl et al. [22, 23]. There the authors stated that the percentage of the U.S.A. with a much above normal proportion of total annual precipitation from extreme precipitation events (daily events at or above 2 inches) showed an increase from 9% in 1910-1920 to about 11% in the '90s. Despite severe scientific criticism of these papers by many other researchers in the field, the basic idea that the frequency of extreme events may change together with average surface temperature was discussed more and more and, eventually, it became one of the issues of analysis for the Intergovernmental Panel on Climate Change: a specific report on Changes in extreme weather and climate events was issued in 2002 [16]. Basic questions, when dealing with extremes of complex processes, are: what is the correct way of measuring extremes? Are we concentrating on local or global fluctuations of the system in question? How do we measure local extremes? Extremes of wind speeds, of rainfall amounts, of economic damage? Moreover, the enhancement in the extreme events might be quantified either in terms of the number of events, or in the size of the average extreme event, or a combination thereof. Several other ambiguities make it often difficult to follow the literature on the subject.
Overall, two important weaknesses of much work on the subject of extreme meteo-climatic events and of their trends are:

• the lack of interpretation of the dynamical mechanisms that should cause the hypothesized changes in the frequency of extremes of various nature; often such mechanisms are just alluded to instead of being explicitly formulated and analyzed;

• the lack of a common and theoretically founded definition of "extremes".

The deficit in the first point above may negatively affect both deterministic and statistical studies of the phenomena in question. One major example on global processes is that, despite the great attention attracted by the subject, very few researchers have investigated in detail the basic mechanisms that should associate an increased CO2 concentration to enhanced extreme weather events. The chain of mechanisms possibly linking CO2 concentration and weather extremes is too long even for an adequate qualitative discussion here, but we shall concentrate on the basic sequence: enhanced surface temperature −→ enhanced baroclinicity −→ changes in the upper tail of the probability distribution function of the baroclinic disturbances. But no robust analysis of this complex dynamical "chain" has been offered so far. As for the second point above, the lack of a common rigorous framework for the statistical analysis of extremes (with few exceptions such as e.g. [25, 49, 50]) constitutes a serious drawback for the interpretation and comparison of results from different studies. Moreover, this problem is not even justified, since mathematical theories of extreme events are well-developed [3, 5, 8, 9, 10, 11, 29] and the derived methods are quite successful in many applications [25, 40, 49, 50].
One basic ingredient of the theory relies on Gnedenko's theorem [11], which states that, under fairly mild assumptions, the distribution of the block-maxima of a sample of independent identically distributed variables converges to one of three distinct families of distributions, the so-called Generalized Extreme Value (GEV) distributions. See Appendix A for a brief description. Notice that one of the earliest applications of this theory in the natural sciences occurred specifically in a meteorologic-climatic setting [20]. Other statistical models for extreme events include the r-largest statistics, threshold exceedance models such as the generalized Pareto distribution, and point processes; see [5]. The reliability of parametric estimates for extreme value models strongly depends on the asymptotic nature of extreme value theory. In particular, at least the following issues should be checked or addressed [5]:

1. independence of the selected extreme values;
2. using a sufficiently large number of extremes;
3. using values that are genuinely extreme.

Despite the importance of the third requirement, many studies actually deal with so-called soft extremes [26], which are maxima of too short data blocks or with too small return periods for the basic assumptions of the theory to hold. This is often the consequence of the limited amount of available data: on the one hand, one has to restrict to maxima of data blocks, thereby discarding most available data; on the other hand, one would like to have a long sequence of extreme values. The net result is that the assumptions of the extreme value theorems often go unchecked and are sometimes plainly impossible to check, since the available climatic records cover at best the last century. Therefore, thinking in terms of annual maxima, in such cases we only have 100 extremes. The inevitable consequence of adapting the definition of extremes to the needs of the work is a serious reduction in the reliability of the resulting estimates.
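The block-maxima selection discussed above can be sketched in a few lines of Python (an illustrative sketch, not the authors' code; the block length of 1460 assumes 6-hourly sampling over a 365-day year, as in the simulations described later):

```python
import numpy as np

def block_maxima(series, block_length):
    """Split a 1-D series into consecutive blocks of equal length and
    return each block's maximum; trailing observations that do not
    fill a whole block are discarded."""
    n_blocks = len(series) // block_length
    trimmed = np.asarray(series[: n_blocks * block_length])
    return trimmed.reshape(n_blocks, block_length).max(axis=1)

# Example: 100 "years" of synthetic 6-hourly data (4 * 365 = 1460 per year)
rng = np.random.default_rng(0)
data = rng.gumbel(size=100 * 1460)
annual_maxima = block_maxima(data, 1460)
print(annual_maxima.shape)  # (100,)
```

Shortening the block length below a year would yield more maxima at the price of the "soft extremes" problem described in the text.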
The goal of this paper is to infer and critically quality-check the statistical description of extreme values in the GEV distributions framework on the "earth-like" time series produced by a dynamical system descriptive of the mid-latitude atmospheric circulation featuring a chaotic regime. Such a system has internally generated noise and can be effectively considered as a stochastic generator of data. Time series of the system's total energy E(t) are used, as this is a relevant global physical quantity. We analyze how the GEV distribution inferred from block maxima of E(t) depends on the value of the most important parameter of the system, namely the forced equator-to-pole temperature difference T_E, which controls the baroclinicity of the model. The reliability of the GEV fits is studied by considering both shorter sequences of extremes and soft extremes. Moreover, issues related to model error and sensitivity are briefly examined by analyzing the effects of variations in model resolution. The use of numerically generated data allows us to avoid all the difficulties related to the shortness of the available climatic records, such as missing observations and low-quality data. In particular, we do not need to worry about the wastage of data caused by the selection of annual maxima, which is a serious limitation when considering observational data. In this methodological sense, our approach is similar to that of [48] as far as statistical inference is concerned. However, an important difference is that the statistics of the time series E(t) generated by the atmospheric model cannot be directly chosen: there is no explicit formula relating the probability density function of the adopted observable and the parameter T_E. This problem we analyze elsewhere [35]. The structure of the paper is now outlined. In Sec.
II we first describe the set-up of the numerical experiments performed with the atmospheric model and then the methods of statistical analysis of extreme values adopted for the total energy time series. The results for the considered reference case of 1000 yearly maxima are presented in Sec. III. Assessment of the sensitivity of the inferences is carried out in Sec. IV, by varying the length of the yearly maxima sequences, the block length over which maxima are taken, and the model resolution. The dependence of the GEV parameters on T_E is also analyzed in this section. Sec. V summarizes the results and their relation with the above discussion. The theory and the methods of Extreme Value distributions, as far as needed in the present work, are briefly reviewed in Appendix A. The model of the baroclinic jet used as a stochastic generator is described in Appendix B, referring to [35] for a thorough discussion.

II. DATA AND METHODS

A. Total Energy of the Atmospheric Model

We consider a quasi-geostrophic intermediate complexity model [35, 36, 46] (also see Appendix B), providing a basic representation of the turbulent jet and of the baroclinic conversion and barotropic stabilization processes which characterize the physics of the mid-latitude atmospheric circulation. The model is relaxed towards a given equator-to-pole temperature profile which acts as baroclinic forcing. It features several degrees of freedom in the latitudinal direction and two layers in the vertical - the minimum for baroclinic conversion to take place [41, 44]. The system's statistical properties radically change when the parameter T_E, determining the forced equator-to-pole temperature gradient, is changed. In particular, as T_E increases, a transition occurs from a stationary to an earth-like chaotic regime with internally generated noise. By chaotic, we mean that the system possesses a strange attractor in phase space [7].
For a detailed description of the model physics and dynamics see [35]. In the present setting, the model is used as a stochastic generator of earth-like time series for testing the reliability of different statistical approaches [48] and studying the dynamics of extremes [35]. A uniformly spaced grid of 21 values of the parameter T_E is fixed in the range [10, 50], starting from 10 and increasing with step 2. The baroclinic model is run with T_E fixed at each of these values, producing 21 simulations of length 1000 years (preceded by an initial transient of five years) where the total energy E(t) is written every 6 hours. The formula for the total energy is given in Appendix B, equation (B15). We recall that, in the non-dimensionalization of the system, T_E = 1 corresponds to 3.5 K, 1 unit of total energy corresponds to roughly 5×10^17 J, and t = 0.864 is one day; see [35] for details. For each of the selected values of T_E, a chaotic attractor is numerically detected in the phase space of the model. This is illustrated by the autocorrelations of the time series of the total energy E(t) (Fig. 1), which decay to zero on a time scale comparable with that of the atmospheric system (roughly 10-15 days [33]). Since all parameters of the model are kept fixed in each simulation, by discarding the initial transient, the time series of E(t) may be considered a realization of a stationary stochastic process. The distribution of the total energy time series is visualized by means of the histograms and boxplots in Fig. 2, for three values of T_E. Notice that, as T_E increases,

• the upper tail of the distribution becomes heavier, whereas the lower tail shortens;

• both the average value and the variability of the total energy time series increase.

The latter point is clearly visualized in Fig. 3, where the time-averaged total energy is displayed for each of the 21 stationary time series, together with confidence intervals.
Throughout the paper, confidence intervals are computed as the average plus/minus the sample standard deviation multiplied by 1.96. In concluding this section, a theoretical remark is in order. All examined strange attractors are implicitly assumed to possess a unique Sinai-Ruelle-Bowen (SRB) ergodic invariant measure [7]. This is indeed a rather general and difficult problem in Dynamical Systems and Physics: on the one hand, the existence of a unique SRB measure is necessary to rigorously associate a stationary stochastic process with the dynamical evolution law. On the other hand, the existence of a unique SRB measure is a very strong regularity assumption for a dynamical system: in general it is even a question whether invariant measures exist at all and, if so, whether a finite or infinite number of invariant measures coexist for a given chaotic system. Moreover, even if an SRB measure exists and is unique, it is in general non-parametric: there is no explicit formula relating the statistical behavior to the system's equations and parameters.

B. Parameter Estimation and Model Assessment in GEV Inference

As discussed in the previous section, the time series we work with are characterized by fast decay of autocorrelations, which implies weak (short time-range) dependence of the observations; compare Fig. 1. Inference of threshold exceedance models [5, 8, 29] is in this case complicated by the choices of suitable threshold values and cluster size for declustering (see e.g. [5, Chap. 5]), which might be somewhat arbitrary in applications. On the other hand, since the dependence is short-range, maxima of the total energy time series, taken over sufficiently large data blocks, are with good approximation independent. This is why we have preferred the use of the GEV over threshold models.
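A minimal sketch of the two diagnostics just mentioned — the sample autocorrelation, whose fast decay justifies treating block maxima as approximately independent, and the average ± 1.96 × standard deviation intervals — using a synthetic AR(1) series as a stand-in for E(t), which we of course do not have here:

```python
import numpy as np

def autocorrelation(x, max_lag):
    """Sample autocorrelation of a 1-D series at lags 0..max_lag."""
    x = np.asarray(x, dtype=float) - np.mean(x)
    var = np.dot(x, x) / len(x)
    return np.array([np.dot(x[:len(x) - k], x[k:]) / (len(x) * var)
                     for k in range(max_lag + 1)])

def mean_with_interval(x):
    """Average plus/minus 1.96 times the sample standard deviation."""
    m, s = np.mean(x), np.std(x, ddof=1)
    return m, m - 1.96 * s, m + 1.96 * s

# AR(1) stand-in for E(t): decorrelation time of ~40 steps, i.e. ~10 days
# at 6-hourly sampling, mimicking the decay described for Fig. 1.
rng = np.random.default_rng(1)
phi = np.exp(-1.0 / 40.0)
e = np.empty(100_000)
e[0] = 0.0
for t in range(1, len(e)):
    e[t] = phi * e[t - 1] + rng.normal()

acf = autocorrelation(e, 80)     # acf[0] == 1; acf[40] near exp(-1)
m, lo, hi = mean_with_interval(e)
```

A decorrelation time of a few tens of steps, against blocks of 1460 six-hourly observations per year, is what makes the independence assumption for annual maxima plausible.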
Moreover, since we can generate time series of arbitrary length, for simplicity we refrained from using the r-largest statistics, which is often a valid alternative to the GEV, especially when data scarcity is an issue. In this section, therefore, we recall the methods of GEV inference as far as needed in the present work. The exposition is largely based on [5]. Also see [3, 5, 8, 9, 10, 11, 29] for the methodology and terminology of extreme value theory. Gnedenko's theorem [11], or the three types theorem (first presented in a slightly less general form by Fisher and Tippett [9]), states that, under fairly mild assumptions, the distribution of the block-maxima of a sample of independent identically distributed variables converges, in a suitable limit, to one of three types of extreme value distributions. The three types are in fact special cases of the GEV distribution (also called von Mises type), having the following expression:

G(x) = exp{ −[1 + ξ(x − µ)/σ]^(−1/ξ) }   (1)

for x in the set {x : 1 + ξ(x − µ)/σ > 0} and G(x) = 0 otherwise, with −∞ < µ < +∞, σ > 0, and −∞ < ξ < +∞. The quantities (µ, σ, ξ) are called the location, scale and shape parameter, respectively. In such a framework, statistical inference of extreme values amounts to estimating the GEV distributional parameters (µ, σ, ξ) for a given time series and assessing the quality of the fit. If ξ > 0 (ξ < 0) the distribution is usually referred to as the Fréchet (Weibull) distribution; if ξ = 0 we have the Gumbel distribution, which can be expressed as (A5). See Appendix A for further theoretical details and [3, 5, 8, 10, 29] for examples and discussion. In practical applications of extreme value theory the parent distribution function of the data is typically unknown. Therefore, both the type of limiting distribution and the parameter values must be inferred from the available data, and the quality of the resulting estimates should always be assessed.
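The GEV distribution function (1) and a maximum-likelihood fit of the kind described here can be sketched with scipy (an illustrative sketch, not the authors' code; note that scipy's genextreme uses the shape convention c = −ξ, so the Weibull case ξ < 0 corresponds to c > 0):

```python
import numpy as np
from scipy.stats import genextreme

def gev_cdf(x, mu, sigma, xi):
    """GEV distribution function G(x) from equation (1), for xi != 0."""
    z = 1.0 + xi * (x - mu) / sigma
    # G(x) = exp(-z^(-1/xi)) on {z > 0}, and 0 otherwise
    return np.where(z > 0, np.exp(-np.maximum(z, 1e-300) ** (-1.0 / xi)), 0.0)

mu, sigma, xi = 10.0, 2.0, -0.2   # Weibull type: bounded upper tail

# Cross-check equation (1) against scipy's parametrization (c = -xi)
x = np.linspace(5.0, 19.0, 50)
assert np.allclose(gev_cdf(x, mu, sigma, xi),
                   genextreme.cdf(x, c=-xi, loc=mu, scale=sigma))

# Maximum-likelihood estimates of (shape, location, scale) from
# synthetic block maxima drawn from the same GEV
maxima = genextreme.rvs(c=-xi, loc=mu, scale=sigma, size=1000,
                        random_state=np.random.default_rng(2))
c_hat, mu_hat, sigma_hat = genextreme.fit(maxima)
xi_hat = -c_hat                   # back to the convention of equation (1)
```

If scipy were unavailable, the GEV log-likelihood could equally be written out and minimized numerically; the fit above only illustrates the inference step described in the text.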
For GEV inference, a sequence of maxima is constructed by subdividing the available data {x_i} into blocks of equal length and extracting the maximum from each block. The block length is one of the choices playing the usual, critical role in the trade-off between bias and variance in the parametric estimates. Too short blocks increase the length of the maxima sequence but, at the same time, they increase the risk of failure of the limit (A4). If the blocks are too long, the resulting scarcity of maxima induces an enhanced uncertainty in the inferred values of the GEV parameters. In many situations a reasonable (and sometimes compulsory) choice is to consider the annual maxima (see [5]). Assume that the observations in the time series are equispaced in time and that none of them is missing (both conditions are often violated in concrete cases, see e.g. [40]). Let n be the number of observations in a year and denote by M_{n,1}, ..., M_{n,m} the sequence of the annual maxima, i.e., the maxima over data blocks of length n. Under the assumption of independence of the X_n, the variables M_{n,1}, ..., M_{n,m} are independent as well. In fact, approximate independence of the M_{n,i} holds also in the case of weakly dependent stationary sequences, see [5, 29] for definitions and examples. Among the numerous methods to infer the GEV parameters (graphical or moment-based techniques, see [3]), we adopt the maximum likelihood estimator for its great adaptability to changes of models. Denote by θ = (µ, σ, ξ) the parameter vector for the GEV density g(x; θ), the latter being the derivative of G(x) = G(x; θ) in (1). In the stationary context, the block
