Clim Dyn (2014) 43:1271–1283
DOI 10.1007/s00382-013-1939-x

Evaluation of CMIP5 dynamic sea surface height multi-model simulations against satellite observations

Felix W. Landerer • Peter J. Gleckler • Tong Lee

Received: 17 May 2013 / Accepted: 6 September 2013 / Published online: 6 October 2013
© Springer 2013

Abstract We evaluate the representation of dynamic sea surface height (SSH) fields of 33 global coupled models (GCMs) contributed to the fifth phase of the Coupled Model Intercomparison Project (CMIP5). We use observations from satellite altimetry and basic performance metrics to quantify the ability of the GCMs to replicate the observed time-mean, seasonal cycle, and interannual variability patterns of SSH. The time-mean SSH representation has markedly improved from CMIP3 to CMIP5, both in terms of an overall reduction in root-mean-square differences and in terms of reduced GCM ensemble spread. Biases of the time-mean SSH field in the Indian and Pacific Ocean equatorial regions are consistent with biases in the zonal surface wind stress fields identified with independent measurements. In the Southern Ocean, the latitude of the maximum meridional gradient of the zonal mean SSH in CMIP5 models is shifted equatorward, consistent with the GCMs' spatial biases in the maximum of the zonal mean westerly surface wind stress fields. However, while the Southern Ocean SSH gradients correlate well with the maximum Antarctic circumpolar current transports, there is no significant correlation with the maximum zonal mean wind stress amplitudes, consistent with recent findings that the eddy parameterisations in GCMs dominate over wind stress amplitudes in this region. There is considerable spread across the CMIP5 ensemble for the seasonal and interannual SSH variability patterns. Because of the short observational period, the interannual variability patterns depend on the time period over which they are derived, while no such dependency is found for the time-mean patterns. The model performance metrics for SSH presented here provide insight into GCM shortcomings due to inadequate model physics or processes. While the diagnostics of CMIP5 GCM performance relative to observations reveal that some models are clearly better than others, model performance is sensitive to the spatio-temporal scales chosen.

Keywords Sea surface height · CMIP5 · GCM skill · Model evaluation · AR5

Electronic supplementary material The online version of this article (doi:10.1007/s00382-013-1939-x) contains supplementary material, which is available to authorized users.

F. W. Landerer (✉) · T. Lee
Jet Propulsion Laboratory, California Institute of Technology, Pasadena, CA, USA
e-mail: [email protected]

P. J. Gleckler
Program for Climate Model Diagnosis and Intercomparison, Lawrence Livermore National Laboratory, Livermore, CA, USA

1 Introduction

Sea level changes reflect the ocean's integral response to a broad spectrum of processes that affect the ocean's currents and density structure, as well as the total ocean volume (Milne et al. 2009). Numerous studies have documented that sea level has risen globally throughout the twentieth century at a mean rate of about 1.8 mm/year (Meehl et al. 2007). The rate has increased to about 3.1 mm/year since 1993 (Church and White 2011), and is projected to increase further over the twenty-first century under global warming scenarios (Perrette et al. 2013). Both the observed and projected sea level changes are spatially highly non-uniform due to differential ocean warming, wind changes, the influence of ocean dynamics, and gravitational and solid earth responses (Milne et al. 2009; Yin 2012; Meyssignac et al. 2012; McGregor et al. 2012). The World Climate Research Program's (WCRP's) Coupled Model
Intercomparison Project (CMIP; Phase 3 and 5) includes an unprecedented set of climate simulations from coupled general circulation models (GCMs) for the recent past as well as for future climate (Meehl et al. 2007). This provides a unique opportunity to assess how well climate models simulate key characteristics of sea surface height (SSH) with high quality satellite data, as has been done for other observables (e.g. Kwok 2011; Li et al. 2012; Lee et al. 2013; Jiang et al. 2012).

The main goal of our paper is to evaluate how well the spatial and temporal features of SSH patterns simulated by the CMIP5 models compare to available observations. All GCMs show global steric sea level rise from heat uptake under twenty-first century warming scenarios, but the individual magnitudes vary (Yin 2012). In addition, regional sea level changes can deviate by up to ±100 % from the global mean changes (Yin 2012; Landerer et al. 2007; Yin et al. 2010a; Pardaens et al. 2010), but these regional patterns show only a few regions of agreement across the various models (Perrette et al. 2013; Yin 2012). The CMIP3 multi-model ensemble (MME) exhibited significant inter-model spread in the magnitude of projected SSH-change patterns, such that over large areas of the global oceans the inter-model differences towards the end of the twenty-first century were larger than the model-mean change (Meehl et al. 2007). Notable exceptions to this were areas of consistent SSH change in the Southern Ocean, and patches in the Arctic, South-East Pacific and West-Indian Ocean (see Fig. 10.32 in Meehl et al. 2007). In an effort to reduce the ensemble spread, Yin et al. (2010a) used the global mean root-mean-square difference (RMSD) between the GCMs' and observations' time-mean SSH fields to exclude 5 of 17 CMIP3 models from the multi-model ensemble, which slightly improved the inter-model agreement for projected SSH changes at the end of the twenty-first century. Here, we expand the model evaluation to annual and interannual time scales as described in Sect. 2. Our results highlight various improvements in the new CMIP5 ensemble over CMIP3, and also aspects of consistent model biases that provide new insights into deficiencies and shortcomings of the underlying model formulations and physics. The paper is organized as follows: in Sect. 2, we describe the methods, data sets, and observations; in Sect. 2.2, we summarize methods and analysis schemes; in Sect. 3, we present results of the various skill metrics and comparisons for the CMIP3 and CMIP5 ensembles; and in Sect. 4 we discuss the main conclusions of this paper.

2 Methods and data description

Our analysis focuses on dynamic SSH, which is defined as the local sea surface height deviation from the global mean. Therefore, the global mean of SSH is zero at every time step, and we do not consider global mean sea level changes here. The latter can be steric due to net ocean heat changes, or non-steric due to net ocean mass changes from melting land ice. For ocean warming, global mean sea level can be computed off-line even if the GCMs employ a volume-conserving Boussinesq approximation (Greatbatch 1994; Griffies and Greatbatch 2012). While the dynamic SSH patterns related to surface momentum and buoyancy fluxes are explicitly and adequately simulated (Yin et al. 2010a), the current generation CMIP3 and CMIP5 models do not account for net ocean mass changes from melting glaciers and ice sheets. If the associated sea level changes were uniform, the global mean could simply be subtracted from the observations. However, due to gravitational and loading effects on sea level (Farrell and Clark 1976), net ocean mass changes have a distinct non-uniform SSH pattern, with the largest deviations from a uniform rise occurring in the near field of the mass sources (Tamisiea et al. 2001). For the purposes of this paper, the issue is then whether land-ice fingerprint signals might be present in the 20-year altimetry record, and whether they could lead to a bias in the comparison to the CMIP models.

Several recent papers have explored detection thresholds for land-ice fingerprints, assuming specific levels of ice melt (Kopp et al. 2010; Hay et al. 2012). While these studies differ slightly in their assumptions of melt rates, background SSH variations and observing systems (i.e., relative sea level or sea surface height), a common result is that accelerating ice melt rates should be detectable via their associated sea-level fingerprints over the next decades. However, for the lower ice-melt rates over the last 20 years, the gravitational and loading effects are likely masked by dynamic SSH variability. This is consistent with results from a joint-inversion approach, using both altimetry and time-variable gravity observations, that found altimetry observations to currently have at best only marginal resolution capability of mass-related sea level changes (Rietbroek et al. 2012). Based on contemporary geocentric sea-level fingerprints (Riva, pers. communication, 2013), we also note that the CMIP biases discussed below are an order of magnitude larger (and vary in sign) near the ice sheets than the expected fingerprint amplitudes over the last 20 years. Therefore, we perform our following analysis under the assumption that the non-homogeneous sea-level fingerprints from ocean mass changes do not significantly impact the comparison to CMIP models. Dynamic SSH from the CMIP models can then be directly compared to SSH observations from satellite altimetry as long as the global mean of each data set at every time step is removed using a common land mask (see inset in Fig. 2).
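The global-mean removal described above can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions, not the processing code used for the paper: the function name `remove_global_mean`, the (time, lat, lon) array layout, and the cosine-of-latitude area weighting on a regular grid are all choices made here; the text itself only specifies that the global mean is removed at every time step over a common mask.

```python
import numpy as np

def remove_global_mean(ssh, lat, ocean_mask):
    """Return dynamic SSH (global mean removed per time step) and the removed mean.

    ssh        : float array, shape (time, lat, lon), NaN where no data
    lat        : latitudes in degrees, shape (lat,)
    ocean_mask : bool array, shape (lat, lon), True for the common ocean area
                 (hypothetical input; must be the same for models and observations)
    """
    # Assumed area weights for a regular latitude-longitude grid.
    weights = np.cos(np.deg2rad(lat))[:, None] * ocean_mask
    # Use only cells that are inside the common mask and hold valid data.
    valid = np.isfinite(ssh) & ocean_mask
    w = np.where(valid, weights, 0.0)
    filled = np.where(valid, ssh, 0.0)
    # Area-weighted global mean for every time step.
    global_mean = (filled * w).sum(axis=(1, 2)) / w.sum(axis=(1, 2))
    # Subtract it and blank out everything outside the common area.
    dynamic = np.where(valid, ssh - global_mean[:, None, None], np.nan)
    return dynamic, global_mean
```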
In this way we ensure that biases are not due to a global mean offset. As land-ice melt continues to accelerate and contribute to sea level changes in the future, source-specific SSH fingerprint patterns may need to be considered to avoid biases in model-to-observation comparisons.

Fig. 1 Taylor diagram of the CMIP3 (grey dots) and CMIP5 (colored symbols, see legend) mean dynamic topographies (MDTs) compared against the observed MDT from Maximenko et al. (2009). The CMIP3 and CMIP5 ensemble averages (shown as stars) show the best agreement with observations (see map inset in Fig. 2 for ocean areas used). Note that CMIP3 RMSD values are consistent with those shown in Fig. 2 of Yin et al. (2010b). [Axes: standard deviation, correlation, RMSD; the legend lists the observations, the CMIP3 and CMIP5 means, and the 33 CMIP5 models.]

Fig. 2 Global mean RMSD between 33 CMIP5 models, the CMIP5 model mean, and the observations of the MDT from Maximenko et al. (2009). The grey bars represent the model/observations RMSD value for the time period 1993–2002, and the whiskers the distribution of RMSDs between the observed MDT and 10-year sliding windows for each CMIP5 GCM for the 'historical' twentieth century runs (central red mark is the median, box edges are 25th and 75th percentiles). Note the different vertical scale for the INMCM4 model on the right. CanCM4 is for years 1961–2005 only, HadGEM2-CC for 1959–2005 only, and MIROC4h for 1950–2005 only. The map inset shows the ocean area over which the model statistics were computed for all CMIP5 GCMs; marginal seas have been excluded. [Vertical axis: global mean MDT RMSD in m; horizontal axis: ensemble mean and the 33 CMIP5 models.]

2.1 Data sets

We use sea surface height above geoid fields from 33 GCMs from the new CMIP5 multi-model ensemble, and also 18 CMIP3 GCMs to assess improvements from the old to the new ensemble. We use the first available GCM realizations of the twentieth century experiments (identified as '20c3m' in CMIP3 and 'historical' in CMIP5). Several GCMs in the CMIP5 ensemble differ only in their non-ocean model components (e.g., NorESM1-ME and NorESM1-M), but we include all GCM variants in our analysis. For CMIP3, the twentieth century experiments start between 1860 and 1900, and end in 2000; in CMIP5, the historical runs start between 1850 and 1960, and end in 2005. As the SSH observations extend into the twenty-first century (see below), we appended the CMIP3 '20c3m' runs with the appropriate model fields from an A1B scenario run (Nakicenovic and Swart 2000) to ensure consistent time overlap, but the choice of scenario is not critical as the scenarios are similar over this short extension period. For CMIP5, we use results from the 'historical' simulations only. Tests with a CMIP5 model subset revealed no significant sensitivities to the particular model years (e.g., 1993–2012 vs. 1993–2005 only) for the time-mean and annual climatology comparisons performed here. We exclude some marginal ocean areas or enclosed seas from our analysis where the models exhibit SSH biases that are clearly unrealistic and would unduly distort squared-difference metrics (e.g., MIROC-ESM CMIP5 models have SSH values of +15 m over Hudson Bay, and values of −15 m over the Mediterranean). We suspect that these biases have more to do with model resolution or how the diagnostic SSH calculation is performed than with a fundamental deficiency in the simulation. A complete overview of CMIP3 and CMIP5 ocean and atmosphere GCMs and their resolutions can be found in Tables 3 and 4 of Jourdain et al. (2013). We note that the spatial resolution of the ocean components in the CMIP3 and CMIP5 archives varies, but even the newer CMIP5 runs generally do not resolve oceanic eddies, which are thus parameterized. For the two CMIP5 GCMs GISS-E2-R and MIROC5, we added the equivalent water thickness of their respective sea ice fields to obtain an effective sea surface height (Griffies et al. 2009).

Because of insufficient spin-up, the control runs of CMIP-type GCMs often show residual drift, which can be removed from the forced runs (Yin 2012; Yin et al. 2010a). In the following analysis we do not correct the historical runs for control run drift because (1) most of the drift maps into the global mean sea level, which we remove, and (2) the drift should have little impact on the mean annual cycle and interannual variations (see discussion below). Significant drift can hypothetically impact the time-mean dynamic SSH topography, but our analysis indicates that this is not the case here (see Sect. 3.1). However, for detection-attribution analyses and projections, a drift correction is crucial for estimating trends and variability (Yin 2012; Gleckler et al. 2012).

We compare the CMIP SSH fields against observations of the time-mean dynamic topography (MDT) from Maximenko et al. (2009), which combines observations from satellite altimetry, near-surface drifters, NCEP wind and observations from the Gravity Recovery and Climate Experiment (GRACE) over the time period 1992–2002. For time-variable SSH signals, we use satellite altimetry observations from 1993–2012 provided by AVISO (Ducet et al. 2000), which are now available as part of the 'obs4MIP' project. We re-grid all CMIP and observational data sets onto a common 1° × 1° latitude-longitude grid using bi-linear interpolation. We note that the time-mean model assessment is not dependent on the use of the Maximenko MDT or the time-mean AVISO data. In fact, the MDT and time-mean AVISO fields have a pattern correlation of R = 0.998, and either one yields essentially identical results in the following analysis (i.e., in Fig. 1). Since dynamic SSH is significantly influenced by surface momentum fluxes from wind over many regions, we also use the Scatterometer Climatology of Ocean Winds (SCOW) based on QuikSCAT satellite measurements (Risien and Chelton 2008) to assess if common biases between SSH and wind stress exist. This will enable us to evaluate features that we expect to be similar between SSH and wind stress using independent measurements. A detailed evaluation and comparison between CMIP3 and CMIP5 simulated surface wind stress and QuikSCAT observations can be found in Lee et al. (2013).

2.2 Metrics

The choice of a metric to evaluate model performance is somewhat subjective, but we use several basic statistical measures that are routinely used in meteorological and climate analysis, namely: the global mean statistics of root-mean-square differences (centered RMSD), spatio-temporal correlation, and standard deviation. These basic measures can be examined collectively, e.g., in Taylor diagrams (Taylor 2001), and are useful first steps for quantifying how well the models agree with observations as well as with each other. We also examine spatial fields of absolute biases between models and observations. While it is reasonable to examine model-to-observation agreement for the time-mean and annual cycle fields, interannual and longer variations are expected to differ, as CMIP simulations are forced externally but free to evolve in terms of internal variability. This, and the relatively short time span of available global observations, makes it challenging to examine how the CMIP models agree with observations on interannual time scales. Therefore, we limit our CMIP-to-observations comparison to patterns and amplitudes of interannual RMS variability after removing a seasonal climatology and band-pass filtering between 1 and 10 years, and we assess the impact of choosing different time periods (see Sect. 3.3 for details).

3 Results

3.1 Time-mean SSH field

We first analyze the representation of the time-mean dynamic SSH fields, which are closely related to the time-mean ocean circulation that governs the large-scale transports of heat, freshwater, and nutrients. A Taylor diagram for the 33 CMIP5 MDT fields averaged over 1992–2002 (Fig. 1) reveals pattern correlations varying between 0.95 and 0.99, and standard deviations slightly larger than what is observed.
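The three statistics summarized in a Taylor diagram (standard deviation, pattern correlation, and centered RMSD) can be computed as in the following sketch. It is an illustration only: the function name `taylor_stats` and the use of generic area weights are assumptions made here, and the inputs are assumed to be NaN-free fields on a common grid and mask. The statistics are linked by the identity E'^2 = sigma_m^2 + sigma_o^2 - 2 sigma_m sigma_o R (Taylor 2001), which is what allows them to be shown in a single diagram.

```python
import numpy as np

def taylor_stats(model, obs, weights):
    """Weighted Taylor statistics of two fields on the same grid.

    model, obs : float arrays of identical shape, no NaNs
    weights    : non-negative area weights of the same shape (assumed input)
    Returns (std_model, std_obs, correlation, centered_rmsd).
    """
    w = weights / weights.sum()
    # Remove the weighted mean of each field ("centered" statistics).
    m_anom = model - (w * model).sum()
    o_anom = obs - (w * obs).sum()
    std_m = np.sqrt((w * m_anom**2).sum())
    std_o = np.sqrt((w * o_anom**2).sum())
    # Pattern correlation of the two anomaly fields.
    corr = (w * m_anom * o_anom).sum() / (std_m * std_o)
    # Centered root-mean-square difference.
    crmsd = np.sqrt((w * (m_anom - o_anom) ** 2).sum())
    return std_m, std_o, corr, crmsd
```

For the monthly climatology comparison, the same function could be applied to arrays with an additional leading month dimension, with the weights repeated along that dimension; that usage is likewise an assumption rather than a description of the original analysis.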
The RMS differences between observations and CMIP5 models cluster between 12 cm (CMCC-CM) and 19 cm (CanCM4); the INMCM4 model (RMSD = 36 cm) appears to be an outlier in this analysis. Similar to previous findings for many other model variables (Gleckler et al. 2008), the CMIP5 ensemble mean SSH-MDT yields the highest correlation and the lowest RMSD at about 9 cm. Based on the MDT Taylor statistics, the CMIP5 simulations have improved markedly over the CMIP3 runs (grey dots in Fig. 1) in several aspects. Firstly, the absolute RMSDs are generally reduced in CMIP5. Secondly, the spread of the global RMSDs among the different CMIP5 models has noticeably decreased (with the exception of INMCM4) from the CMIP3 spread, which featured RMSDs that varied by a factor of up to ≈3 across the ensemble (Yin et al. 2010a). Thirdly, the multi-model mean RMSD in CMIP5 is about 25 % smaller than in CMIP3 (note, however, that the CMIP3 ensemble consists of 18 models, whereas our CMIP5 ensemble consists of 33).

Oceanic variability time scales can be longer (several decades and longer) than what the MDT observations used here cover (11 years). Since unforced internal climate variability [e.g., El Niño-Southern Oscillation (ENSO), North Atlantic Oscillation (NAO), etc.] in the GCMs is not constrained to be synchronized to real-world occurrence, we wondered if the RMSDs between observed and CMIP MDTs might be due to long-term variability and changes over time. To examine this, we again computed the global RMSD over a sliding window of all possible 11-year mean SSH fields for each CMIP5 model over the twentieth century (from 1870 to 2005) against the single realization of the observed MDT. While the individual RMS differences of overlapping windows are not independent from each other, this approach demonstrates that differences in a model's RMSD due to an evolving 11-year climatology are relatively small compared to inter-model differences. Alternatively, one could look at all the realizations available, as is commonly done, but the ensemble size available for many models is very small or limited to a single realization. With our approach, the use of a sliding window in a single realization, each model is evaluated in a consistent manner. The results of the sliding window analysis, shown as percentile boxes in Fig. 2, indicate that the RMSDs for each individual model are relatively stable over the twentieth century, with individual model RMSDs generally varying by less than 10 %. We interpret these results to mean that interannual or decadal-scale variability is unlikely to be the main source of the differences between observed and simulated time-mean SSH fields.

Fig. 3 Top: mean dynamic topography observations from Maximenko et al. (2009). Bottom: bias between the ensemble mean of the 33 CMIP5 models and the observations. The bias was computed over the time-mean SSH for model years 1992–2002. For individual model biases, see Suppl. Material. Units: meters

The RMSD variations among the CMIP3 models are generally larger than the twentieth-century RMSD variations for each individual CMIP3 model (not shown), indicating that the biases are systematic, and different, for most of the models. In CMIP5, model-to-model differences are more similar in amplitude to an individual model's twentieth century variations. We also examined if the CMIP5-observations MDT RMSDs are consistently at a minimum (relative to the entire twentieth century) over the contemporaneous time period (1992–2002, grey bars in Fig. 2). If that were the case, it could indicate that externally forced SSH changes (i.e., from greenhouse gases) have significantly and consistently affected the MDT in the CMIP simulations. The results in Fig. 2 are somewhat inconclusive in this regard: while for some models the MDT RMSD appears to be at a minimum during the period overlapping with observations, other models have larger RMSDs in the overlapping time period. Not finding a consistent signature in this metric is perhaps not surprising given the global scale of the analysis, as well as the intrinsic variability (see Sect. 3.3). The spatial patterns of the MDT biases vary across the GCMs, but some consistent features emerge (Fig. 3, see also Suppl. Material Fig. 1SM). Most GCMs have prominent biases over the Southern Ocean and the Indian Ocean, and we thus examine these regions in more detail in the following sections. While much of the Pacific Ocean away from the equator is relatively well simulated in most GCMs, we also examine the equatorial Pacific biases due to the importance of this region for interannual climate modes.

3.1.1 Equatorial regions

Ocean dynamics in the equatorial regions are strongly influenced by surface momentum fluxes (wind stress). We therefore investigated if the CMIP5 MDT biases are consistent with biases of mean zonal surface wind stress in the equatorial regions between 2°S and 2°N. Because we are looking at individual ocean basin biases, the equatorial zonal mean SSH between 2°S and 2°N for each ocean basin (Pacific, Atlantic, and Indian Ocean) has been subtracted. In the Indian Ocean, the time-mean strength of the westerlies in the CMIP5 models is generally too weak relative to the QuikSCAT observations (Fig. 4). This in turn implies an Indian Ocean equatorial SSH zonal gradient that is too weak. Indeed, the mean equatorial upward tilt towards the east is consistently too weak in the CMIP5 models (Fig. 4). Similarly, the CMIP5 mean SSH biases in the tropical Pacific are consistent with the zonal wind stress biases: in the West, the models have an easterly wind stress component that is too strong and leads to a steeper SSH gradient than observed; in the Central Pacific, the models' zonal wind stress is too weak and leads to a weaker SSH gradient than observed; in the East, models and observations agree well for both zonal wind stress and SSH. Over the Atlantic basin, the easterly wind stress in CMIP5 is generally too weak, which contributes to weaker than observed CMIP5 SSH gradients. Forcings other than wind stress influence SSH variability in this region (McGregor et al. 2012), though it is not clear if this can explain the time-mean bias as well. Qualitatively, very similar results hold for the CMIP3 models (not shown here; see Lee et al. 2013 for details).

Fig. 4 Top: absolute MDT bias between individual CMIP5 models (grey), the CMIP5 model mean (black), and the MDT observations (red). The biases were computed between 2°S–2°N over the time-mean SSH for model years 1992–2002, matching the observations. Bottom: equatorial zonal wind stress (2°S–2°N) of the CMIP5 ensemble (black) and QuikSCAT observations (red); see Lee et al. (2013) for a detailed analysis of individual CMIP5 GCMs. [Vertical axes: SSH MDT in m and zonal surface wind stress in N/m²; horizontal axis: longitude East across the Indian, Pacific, and Atlantic basins.]

A detailed assessment of surface wind stress in CMIP3 and CMIP5 simulations is given in Lee et al. (2013). In the equatorial regions, the MDT structure is closely related to the time-mean vertical pycnocline structure, and hence any biases in MDT would be mirrored in the pycnocline depth. The SSH gradients that are too weak in the Indian and Atlantic Oceans correspond to a pycnocline that is too flat, which in turn influences the models' capabilities to properly generate tropical climate modes such as the Indian Ocean Zonal/Dipole Mode. A recent analysis has found that the unrealistic Indian Ocean pycnocline structure (which mostly depends on temperature) in the CMIP models leads to a thermocline-SST feedback that is too strong, and hence an overestimate of Indian Ocean dipole amplitude (Cai and Cowan 2013).
−44 −46 3.1.2 Southern Ocean −48 Thetime-meanSSHfieldintheSouthernOcean(primarily e d u−50 a down-sloping meridional gradient towards the South) is atit closelyrelated tothe strengthofthe Antarctic circumpolar L −52 current (ACC). The circumpolar flow is affected by the zonal momentumbalance, surface buoyancy fluxes of heat −54 and freshwater, and by the Southern Ocean overturning. Thezonalcurrentvelocitiesarerelatedtothemeridionally −56 tiltedisopycnalsurfacessetupbytheEkmantransportand −58 overturning circulation in the Southern Ocean (Gill 1982). However, significant poleward eddy-induced transports max(SSH_dy) max(tau_x) tend to reduce the meridional isopycnal gradients, which Fig.5 Comparison of the zonal mean latitudinal position of the would tend to reduce the ACC transport. Due to their rel- maxima of the meridional SSH gradients andwesterly surface wind ativelycoarsehorizontalresolution,theoceanmodelsused stressmaximainthe33CMIP5models(box-plots;medianisshown in coupled CMIP simulations require a parameterisations in red, box edges are at the 25th and 75th percentiles), and for the for these eddy-induced transports, for example through a observations(blackdots).See(Leeetal.2013)fordetailsonthewind stressanalysis quasi-Stokes diffusivity constant based on (Gent and McWilliams 1990) (also Griffies 1998, and references therein). Here,weevaluatetwoquestionsregardingtheSouthern v] 0.2 Ocean mean dynamic topography biases seen in CMIP5 S 200 models: (1) Are biases in the latitudinal location of the port [ −2] maximummeanzonalwesterlywindstressfieldconsistent ans Nm with biases in the latitudinal location of the maximum C tr ax [ meridional sea surface height gradient ðSSHdx(cid:2)yÞmax; and (2) AC 0.15mτx are there significant correlations between the maximum ACC transports, westerly wind stress maxima, and maxi- 100 R=0.89 R=0.34 mum meridional sea surface height gradients? 
Several 00..0066 00..0088 00..11 00..1122 00..1144 00..1166 00..1188 recent papers have documented coherent equatorward dSSH/dymax biasesofthemaximumwesterlywindstress(andhencethe westerlyjetsthatdrivetheACC)inCMIP5models(Swart Fig.6 Scatter plot of the maxima of westerly surface wind stress and Fyfe 2012; Meijers et al. 2012). Consistent with this (graycircles),andmaximumACCtransportsthroughdrakepassage bias, we find that the maxima of the zonal mean MDT (blue triangles) in CMIP5 models as a function of the zonal mean maximaoftheSouthernOceanmeridionalSSHgradients.Zonalwind meridional gradients also show a clear tendency for being stressandACCvaluesaretakenfromTable3of(Meijersetal.2012), biased northward compared to observations, although the andplottedagainstthecorrespondingSSHvalues spread is larger than for the zonal wind stress maxima (Fig. 5). As far as correlations between the maximum amplitude, whereas the models’ meridional density differ- amplitudes of ðSSHx(cid:2)Þ ; ACC transport and westerly ence Dq across the ACC was. Our results, using dy max y wind stress are concerned, we found no significant rela- ðSSHx(cid:2)Þ as a first-order proxy for ACC transport (the dy max tionship between the maxima of westerly wind stress and correlation is R = 0.89, Fig. 6), are in line with both of ðSSHdx(cid:2)yÞmax (grey dots in Fig. 6). Similar conclusions were these studies. As discussed in detail by (Kuhlbrodt et al. presented recently by (Meijers et al. 2012), who showed 2012) based on CMIP3 simulations, the quasi-Stokes dif- that the maximum ACC transport is not significantly cor- fusivity parameterisations j of eddy-induced transports, relatedwiththemaximumwindstressintheCMIP5-GCM used in many CMIP models, can explain the large across- ensemble.Furthermore, (Kuhlbrodt et al.2012) alsofound modelvarianceofthesimulatedACCmaximumtransports. 
that the ACC transport in CMIP3 models is not signifi- Therefore, wind stress is not the dominant factor that cantly correlated to the maximum westerly wind stress determinesACCtransportsandthecorrespondingSouthern 123 1278 Landereretal. Fig.7 Taylordiagram Observations summarizingthenormalized CMIP5 mean statisticsofthemeanmonthly 0.1 0.2 CMIP3 mean 0.3 ACCESS1−0 climatologySSHfieldsofthe 0.4 ACCESS1−3 CMIP5models(seelegend), BCC−CSM1−1 andCMIP3models(greydots). 0.5 CCSM4 Foreachmodel,12monthly C CESM1−BGC mapsareusedtocomputethe 0.6 or CESM1−WACCM r Taylorstatistics(seeMethods el CMCC−CM fordetails) 1.4 0.7 ati CCNSIRRMO−−CMMk35−6−0 n o CanCM4 o n viati1.2 1.4 0.8 CGaFnDELS−MCM22p1 e GFDL−CM3 d d GFDL−ESM2G ar 1 GFDL−ESM2M d GISS−E2−R n Sta0.8 1 0.9 HHaaddGGEEMM22−−CESC INMCM4 IPSL−CM5A−LR 0.6 0.6 0.95 IIPPSSLL−−CCMM55AB−−MLRR MIROC−ESM−CHEM MIROC−ESM 0.4 MIROC4h MIROC5 0.99 0.2 MMPPII−−EESSMM−−LMRR MPI−ESM−P MRI−CGCM3 000000 NorESM1−ME 000000 0.4 0.6 0.8 1 1.2 1.4 NorESM1−M Ocean SSHpatterns.Many CMIP5 modelsemploysimilar CMIP5 historical runs, and from the model years 1993 to parameterisations as used in CMIP3, and hence the 2010 for CMIP3 (twentieth century runs extended by the dependency of ACC transports and sea surface height in correspondingA1Bscenariorun);wetestedandconfirmed the Southern Ocean would be strongly dependent on the that the individual seasonal model statistics are not sensi- oceanmodels’valuesofj, ratherthanonzonalwindstress tive to the particular choice of the time periods. Seasonal magnitudes. variations of SSH in the tropical regions are dominated by The detailed processes, in particular from unresolved surface wind field changes, whereas in the higher latitudes eddies, that affect the relationship between mean zonal buoyancy fluxes and heat content changes tend to be more wind stress biases and SSH biases as shown here are not important. As we did for the time-mean fields, we com- easily discernible from the zonal mean analysis (Figs. 
5 putedTaylorstatisticsofeachCMIP5model,butthistime and6).Severalstudieshaveemphasizedtheimportantrole for the 12 monthly climatology anomaly SSH fields for of eddies in the Southern Ocean momentum balance each model (Gleckler et al. 2008). (Boning et al. 2008), such that one may not expect linear FromFig.7,itisapparentthatthespreadoftheCMIP5 relationshipsbetweenwindstress,ACCtransportandSSH ensembleissomewhatlowerthanthespreadoftheCMIP3 changes. Also, surface buoyancy fluxes (heat and fresh- ensemble,though the level of improvement is modest, and water) have a significant influence on ocean dynamics at the CMIP3 and CMIP5 ensemble mean annual Taylor mid to high latitudes, and therefore any biases in these statistics are almost identical. However, no single model fields likely contribute to the SSH biases described above achieves a correlation score above 0.8, in stark contrast to (Meijers et al. 2012; Russell et al. 2006; Carman and other variables that have a strong annual cycle such as McClean 2011). surfacetemperature (Gleckleret al.2008).Thereasonsfor these comparatively low correlations may be related to 3.2 Monthly climatology SSH variations generally smaller ratios of annual versus interannual vari- ability inthe oceans when compared tothe atmosphere. In In this section, we briefly discuss the representation of the addition, SSH is influenced by surface heat fluxes as well monthly climatology anomalies of SSH in CMIP5 com- as wind stress curl, making the seasonal SSH dynamics paredtotheclimatologyoftheAVSIOaltimetricdata.The more complex than surface temperatures. At least part of monthlyclimatologies,relativetothetime-meanfields,are the higher than observed seasonal standard variations in derived from the model years 1993 through 2005 for the manyCMIP5modelsiscontributedbytoolargeaseasonal 123 Seasurfaceheight:CMIP5versusobservations 1279 SSH amplitude in the equatorial and tropical regions, in particular over the Indian and Pacific Ocean (see Suppl. 
Material Fig. 2). Compared to observations from the QuikSCAT instrument, many CMIP5 models tend to overestimate the overall magnitude of the seasonal wind stress anomalies, especially in spring and autumn (Lee et al. 2013). Consistent with this, the median of the global mean seasonal SSH amplitudes across the individual CMIP5 GCMs is about 13 % larger than the observed AVISO signal. However, while most CMIP5 models overestimate the observed spatio-temporal standard deviation, the model mean standard deviation is about 10 % lower than observed, indicating that individual model biases tend to average out in the ensemble mean (Fig. 7 and SM-Fig. 2).

3.3 Interannual SSH variations

Comparing variability on interannual to decadal scales between models and observations is not straightforward as the GCMs' internal variability is not constrained to be in phase with observations. In this sense, RMS differences or temporal correlations are not appropriate metrics to assess model performance for longer than seasonal time scales; instead, we focus on comparing the patterns and amplitudes of interannual variability, similar to Gleckler et al. (2008). To extract these interannual patterns, we remove for each model and the observations the mean annual cycle, detrend the SSH data, and apply a band-pass filter with corner periods of 1 and 10 years; interannual variability is then the standard deviation at each gridpoint of the filtered fields. The choice of 10 years as the maximum period is motivated by the limited length of the observational record.

Fig. 8 Spatial patterns of interannual SSH variability of the AVISO observations (top) and the mean of the interannual variability patterns of the CMIP5 models (bottom).
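The processing chain described in Sect. 3.3 (remove the mean annual cycle, detrend, band-pass between 1 and 10 years, take the standard deviation) can be sketched for a single grid point as follows. This is a minimal illustration and not the authors' code: the text does not state the filter design, so a simple Fourier-domain cut-off is assumed here, and the function name `interannual_std` and its parameters are hypothetical.

```python
import numpy as np


def interannual_std(ssh, min_period=12, max_period=120):
    """Interannual standard deviation of a monthly SSH series at one grid point.

    ssh        : 1-D sequence of monthly means covering whole years
    min_period : shortest period kept, in months (assumed: 1 year)
    max_period : longest period kept, in months (assumed: 10 years)
    """
    ssh = np.asarray(ssh, dtype=float)
    n = ssh.size
    if n % 12 != 0:
        raise ValueError("series must cover a whole number of years")

    # 1) Remove the mean annual cycle (monthly climatology).
    climatology = ssh.reshape(-1, 12).mean(axis=0)
    anomalies = ssh - np.tile(climatology, n // 12)

    # 2) Remove the least-squares linear trend.
    t = np.arange(n)
    slope, intercept = np.polyfit(t, anomalies, 1)
    anomalies = anomalies - (slope * t + intercept)

    # 3) Band-pass (assumed brick-wall filter): keep only Fourier harmonics k
    #    whose period n / k, in months, lies within [min_period, max_period].
    #    Integer comparisons avoid floating-point ambiguity at the band edges.
    spectrum = np.fft.rfft(anomalies)
    k = np.arange(spectrum.size)
    keep = (k * min_period <= n) & (k * max_period >= n)
    filtered = np.fft.irfft(np.where(keep, spectrum, 0.0), n=n)

    # 4) Interannual variability is the standard deviation of the filtered series.
    return float(filtered.std())
```

For a gridded field, the same function would be evaluated at every grid point to obtain a variability map. A sharp spectral cut-off is only one possible choice; a tapered filter may be preferable where ringing is a concern.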
For Fig. 8, the monthly mean climatology has been subtracted, the data have been detrended, and a band-pass filter (1–10 years) has been applied.

The SSH observations from altimetry cover 18 years, and we use the same record length (i.e., 1988–2005) to generate the interannual variability maps for the CMIP5 models (see SM-Fig. 3). We note that detrending and band-pass filtering the de-seasonalized SSH observations (1993–2010) reduces the global mean interannual variability by about 50 %; detrending only reduces the interannual variability by 12 %.

In the equatorial-tropical latitude band between 20°S and 20°N, the spatial pattern of observed interannual SSH variability as defined here is dominated by tropical Pacific variability related to ENSO and variability in the Indian Ocean. In the subtropical to subpolar latitudes, variability peaks appear in the meandering parts of the Kuroshio extension and Gulf Stream/North Atlantic Current, and along the ACC boundary (Fig. 8, top). The average interannual variability pattern of the CMIP5 models broadly replicates these features, albeit at lower amplitudes (e.g., in the tropical Pacific Ocean; Fig. 8). The spatial amplitude pattern of the 1–10 year interannual variability varies among the CMIP5 models (see SM-Fig. 3). The main observed features (high variability in the equatorial Central to East Pacific, western Pacific warm pool, tropical Indian Ocean, North Atlantic Gulf Stream, Pacific Kuroshio, and Southern Ocean ACC front) are reproduced in the models, albeit to varying degrees. For example, the NorESM1 models appear to capture the patterns and amplitudes in the Indian and Pacific Oceans quite well, whereas the two MIROC-ESMs underestimate amplitudes in these regions (see SM-Fig. 3). As discussed above, biases in the mean fields can have an effect on a model's capability to generate climate variability modes, in particular in the tropical regions where the pycnocline depth plays an important role in generating the Indian Ocean Dipole (Cai and Cowan 2013).

Fig. 9 Taylor diagram summarizing the normalized statistics of interannual SSH variability patterns of the CMIP5 models over the tropical Pacific Ocean area (between 20°S and 20°N). The monthly mean climatology has been subtracted, the data have been detrended, and a band-pass filter (1–10 years) has been applied. Note that MRI-CGCM3 and INMCM4 have negative correlations (see also Fig. 10) and are thus not displayed here

Fig. 10 Spread of pattern correlation between interannual SSH variability of the CMIP5 models and observations over the tropical Pacific Ocean area (between 20°S and 20°N). For each model, the data points are derived by sliding an 18-year window over the historical run, and computing the pattern correlation to the observed interannual variability (the latter remains the same). Data processing and filtering is as in Fig. 9. Model index: 1 ACCESS1-0; 2 ACCESS1-3; 3 CCSM4; 4 CESM1-BGC-esm; 5 CESM1-WACCM; 6 CMCC-CM; 7 CNRM-CM5; 8 CSIRO-Mk3-6-0; 9 CanCM4; 10 CanESM2; 11 EC-EARTH; 12 GFDL-CM2p1; 13 GFDL-CM3; 14 GFDL-ESM2G; 15 GFDL-ESM2M; 16 GISS-E2-R; 17 HadGEM2-CC; 18 HadGEM2-ES; 19 IPSL-CM5A-LR; 20 IPSL-CM5A-MR; 21 IPSL-CM5B-LR; 22 MIROC-ESM-CHEM; 23 MIROC-ESM; 24 MIROC4h; 25 MIROC5; 26 MPI-ESM-LR; 27 MPI-ESM-MR; 28 MPI-ESM-P; 29 MRI-CGCM3; 30 NorESM1-ME; 31 NorESM1-M; 32 bcc-csm1-1; 33 inmcm4

Moving to a more quantitative comparison between observed and CMIP5-simulated interannual variability, we calculated Taylor diagrams from the interannual amplitude patterns. Other authors have analyzed interannual variability by using Taylor diagrams, and pattern correlations using the first EOF-moments of SSH variability or trends (Meyssignac et al. 2012; McGregor et al. 2012). We focus the following analysis of interannual variability on the Pacific between 20°S and 20°N because interannual variability in this region is comparatively large (Fig. 9; see Suppl. Material for a global analysis). The most striking feature of Fig. 9 is the large ensemble spread, and the reduced performance of the models compared to time-invariant metrics. The highest pattern correlation is about R = 0.8, whereas all CMIP5-GCMs reach at least