Does a new paper really reconcile instrumental and model-based climate sensitivity estimates?

by Nic Lewis
A new paper in Science Advances by Cristian Proistosescu and Peter Huybers (hereafter PH17) claims that accounting for the decline in feedback strength over time that occurs in most CMIP5 coupled global climate models (GCMs) brings observationally-based climate sensitivity estimates from historical records into line with model-derived estimates.

A longer version of this post is at ClimateAudit, with additional technical details.
PH17 is not the first paper to attempt to bring observationally-based climate sensitivity estimates from historical records into line with model-derived estimates, but it makes a rather bold claim and, partly because Science Advances seeks press coverage for its articles, has been attracting considerable attention.
Some of the methodology the paper uses is complicated, with its references to eigenmode decomposition and full Bayesian inference. However, the underlying point it makes is simple. The paper addresses equilibrium climate sensitivity (ECS)[i] of GCMs as estimated from information corresponding to that available during the industrial period. PH17 terms such an estimate ICS; it is usually called effective climate sensitivity. Specifically, PH17 estimates ICS for GCMs by emulating their global surface temperature (GST) and top-of-atmosphere radiative flux imbalance responses under a 1750–2011 radiative forcing history matching the IPCC AR5 best estimates.
In a nutshell, PH17 claims that for the current generation (CMIP5) GCMs, the median ICS estimate is only 2.5°C, well short of their 3.4°C median ECS and centred on the range of observationally-based climate sensitivity estimates, which they take as 1.6–3.0°C. My analysis shows that their methodology and conclusion are incorrect, for several reasons, as I shall explain. My analysis of their data shows that the median ICS estimate for GCMs is 3.0°C, compared with a median for sound observationally-based climate sensitivity estimates in the 1.6–2.0°C range. To justify my conclusion, I need first to explain how ECS and ICS are estimated in GCMs, and what PH17 did.
For most GCMs, ICS is smaller than ECS, where ECS is estimated from ‘abrupt4xCO2’ simulation data,[ii] on the basis that their behaviour in the later part of the simulation will continue until equilibrium. When CO2 concentration – and hence forcing, denoted by F – is increased abruptly, most GCMs display a decreasing-over-time response slope of TOA flux (denoted by H in the paper, but normally by N) to changes in GST (denoted by T). That is, the GCM climate feedback parameter λ decreases with time after forcing is applied.[iii] Over any finite time period, ICS will fall short of ECS in the GCM simulation. Most but not all CMIP5 coupled GCMs behave like this, for reasons that are not completely understood. However, there is to date relatively little evidence that the real climate system does so.
Figure 1, an annotated reproduction of Fig. 1 of PH17, illustrates the point. The red dots show annual mean T (x-coordinate) and H (y-coordinate) values during the 150-year long abrupt4xCO2 simulation by the NorESM1-M GCM.[iv] The curved red line shows a parameterised ‘eigenmode decomposition’ fit to the annual data. The ECS estimate for NorESM1-M based thereon is 3.2°C, the x-axis intercept of the red line. The estimated forcing in the GCM for a doubling of CO2 concentration (F2×) is 4.0 Wm−2, the y-axis intercept of the red line. The ICS estimate used, per the paper’s methods section, is represented by the x-axis intercept of the straight blue line, being ~2.3°C. That line starts from the estimated F2× value and crosses the red line at a point corresponding approximately to the same ratio of TOA flux to F2× as currently exists in the real climate system. If λ were constant, then the red dots would all fall on a straight line with slope −λ and ICS would equal ECS; if ECS (and ICS) were 2.3°C the red dots would all fall on the blue line, and if ECS were 3.2°C they would all fall on the dashed black line. The standard method of estimating ECS for a GCM from its abrupt4xCO2 simulation data, as used in IPCC AR5, has been to regress H on T over all 150 years of the simulation and take the x-axis intercept. For NorESM1-M, this gives an ECS estimate of 2.8°C, below the 3.2°C estimate based on the eigenmode decomposition fit. Regressing over years 21–150, a more recent and arguably more appropriate approach, also gives an ECS estimate of 3.2°C.
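The regression approach described above can be sketched in a few lines of Python. This is only an illustration, not the code used in PH17 or AR5: the function name `gregory_fit` and the synthetic data are assumptions, and the data are built with a constant λ so that the fitted x-axis intercept simply recovers the value put in.

```python
import numpy as np

def gregory_fit(T, H):
    """Ordinary least-squares fit of H = F0 - lam * T.

    Returns (F0, lam, ecs): the y-axis intercept (the forcing estimate),
    the feedback parameter (negative of the fitted slope), and the
    x-axis intercept F0 / lam (the sensitivity estimate).
    """
    slope, intercept = np.polyfit(T, H, 1)
    lam = -slope
    return intercept, lam, intercept / lam

# Synthetic, noise-free data with a constant feedback parameter.
# These numbers are illustrative only and do not come from any GCM.
F2x_true, lam_true = 4.0, 1.25            # W m^-2 and W m^-2 K^-1
T = np.linspace(0.5, 3.0, 150)            # K, one value per simulated year
H = F2x_true - lam_true * T               # W m^-2

# Regression over all 150 years.
F0, lam, ecs = gregory_fit(T, H)

# Regression over years 21-150 only: the same fit on a slice.
F0_late, lam_late, ecs_late = gregory_fit(T[20:], H[20:])

print(round(ecs, 2), round(ecs_late, 2))
```

With a constant λ the two regression windows agree. In a GCM where λ declines over time, the points curve as in Figure 1 and the later window gives the higher x-axis intercept.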
[i] ECS is defined as the increase in global surface temperature (GST) resulting from a doubling of atmospheric CO2 concentration once the ocean has fully equilibrated.
[ii] The abrupt4xCO2 simulations involve abruptly quadrupling CO2 concentration from an equilibrated preindustrial climate state; most such CMIP5 simulations were run for 150 years, but a few for up to 300 years. The use of abrupt4xCO2 simulation data to estimate the ECS of GCMs, most often by regression of TOA flux against GST change, is standard. Most GCMs have not been run to equilibrium with doubled CO2 concentration. Even where they have, any change in their energy leakage over time or with climate state would bias the resulting ECS value.
[iii] The authors define λ as −ΔH(t)/ΔT(t), corresponding to the negative of the slope for the overall changes in H and T at time t after a forcing is imposed, rather than as −dH/dT|t, the negative of the instantaneous slope at time t.
[iv] Values are changes from those in the equilibrated control simulation from which the abrupt4xCO2 simulation was branched, adjusted for drift and halved to restate for doubled CO2 concentration, making the assumption that CO2 forcing is exactly proportional to log(concentration).

Fig. 1. Reproduction of Fig. 1 of PH17, with added brown and blue lines illustrating ICS estimates
Observationally-based climate sensitivity estimates derived from instrumental data are determined as ICS, since the climate system is currently in disequilibrium, with a positive TOA flux imbalance.
The most robust observational estimates of climate sensitivity based on instrumental data use an “energy budget” approach, described in IPCC AR5. That is, they estimate the ratio of the change in GST to that in total forcing net of TOA flux imbalance, and scale the resulting estimate by F2× to convert it to ICS, as an approximation to ECS. To minimise the impact of measurement errors and internal climate system variability, these changes are usually taken between decadal or longer base and final intervals early and late in the instrumental period. The intervals chosen should be well matched in terms of volcanic activity (which has different effects from other forcing agents) and multidecadal Atlantic variability. Both Otto et al 2013 (estimate based on 2000s data) and Lewis and Curry 2015 satisfied these requirements. Otto et al used a GCM-derived forcing time series adjusted to match the overall change per IPCC AR5; Lewis & Curry used forcing time series from AR5 itself. Their observationally-based ICS median estimates were respectively 2.0°C and 1.6°C.
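The energy budget calculation just described amounts to a single ratio. The sketch below is a minimal illustration: the function name and the input values are hypothetical, chosen only to show the arithmetic, and are not taken from Otto et al, Lewis & Curry or any other study.

```python
def energy_budget_ics(dT, dF, dN, f2x=3.71):
    """Energy budget sensitivity estimate.

    dT  : change in GST between base and final periods (K)
    dF  : change in total forcing between the same periods (W m^-2)
    dN  : change in TOA flux imbalance between the same periods (W m^-2)
    f2x : forcing for a doubling of CO2 concentration (W m^-2)

    Returns f2x * dT / (dF - dN), in K.
    """
    net = dF - dN
    if net <= 0:
        raise ValueError("forcing net of TOA imbalance must be positive")
    return f2x * dT / net

# Hypothetical inputs, for illustration only.
ics_example = energy_budget_ics(dT=0.75, dF=2.0, dN=0.5)
print(round(ics_example, 3))
```

The estimate is sensitive to the denominator: a smaller forcing change or a larger TOA imbalance both push the result upwards, which is why the choice of forcing time series matters.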
PH17’s statement: “A recent review of observationally based estimates of ICS shows a median of 2°C and an 80% range of 1.6° to 3°C” is based on a sample of 8 studies that included outdated and/or unsound ones. A number of other sound observationally-based ICS estimates not included in the sample used by PH17 fall within the 1.6–2.0°C range spanned by the Otto et al and Lewis & Curry estimates (Ring et al 2012 1.8°C; Aldrin et al 2012 1.76°C; Lewis 2013 1.64°C; Skeie et al 2014 1.67°C; Lewis 2016 1.67°C). I consider 1.6–2.0°C more representative than 1.6–3.0°C of the range of median ICS observationally based estimates from high quality recent studies.
PH17 uses an energy budget method to estimate ICS. If the energy-budget method is applied, based on the evolution of forcing over the historical period, to a GCM in which λ decreases with time, as in Figure 1, the resulting ICS estimate will obviously be lower than the GCM’s estimated ECS. However, contrary to what PH17 claims, if ICS is estimated using sound methods then the underestimation relative to ECS is typically modest, and the median CMIP5 model ICS estimate is still well above ICS for the real climate system as estimated by the best quality instrumental studies.
[See the technical post at Climate Audit for description of the eigenmode decomposition fitting method used in PH17].
ICS calculation
In PH17, ICS was inferred by applying total historical forcing F (per AR5 median estimate time series) over 1750–2011 to the estimated eigenmode fits for each GCM, thus deriving emulated time series of its H and T values. This was done 5,000 times for each GCM, sampling from the derived posterior probability distribution for the eigenmode fit parameter values. The 2.5°C estimate for GCM-derived ICS is the median across the 24 GCMs of all the sample ICS estimates – 120,000 in all.[i] This approach seems very reasonable in principle, but the devil is in its detailed application.
PH17 states that ICS is obtained as F2×/λ(t), where λ(t) = (F − H)/T, with F, H and T being departures in 2011 from preindustrial conditions. Each of F, H and T is taken to have zero value in preindustrial conditions; total 1750 forcing was zero in the AR5 time series and the initial simulated values of H and T are zero.
PH17 also states that as values of F2× associated with each posterior draw could vary from the 3.7 Wm−2 assumed in the AR5 estimate of historical forcing, they multiplied F by F2×/3.7 for each draw before obtaining the values of H and T. While doing so is logical, it actually has no effect on the derived value of λ, since the multiplier scales equally both the numerator and denominator of the fraction representing λ. What is, however, critical to correct estimation for ICS for a GCM is that the F2× value into which the estimated λ is divided is, as implied by PH17, the estimated F2× for that particular GCM (which will vary between samples), and not some other value, such as the 3.7 Wm−2 used in AR5. Per PH17 Table S1 the median estimated GCM F2× values range from 2.9 to 5.8 Wm−2.
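The point that scaling F leaves λ unchanged holds for any linear emulator, and can be checked with a toy one. The sketch below is not PH17's eigenmode model: the two-mode structure, the parameter values and the ramp forcing are all hypothetical, chosen only so that the emulator is linear and reaches equilibrium with zero TOA imbalance.

```python
import math
import numpy as np

def emulate(F, s, lam, tau):
    """Step a linear multi-mode emulator forward one year at a time.

    Each mode k relaxes towards s[k] * F with time constant tau[k] (years);
    TOA imbalance is H = F - sum_k lam[k] * T_k. Returns (T, H) arrays.
    """
    F = np.asarray(F, dtype=float)
    Tk = np.zeros(len(s))
    T = np.empty_like(F)
    H = np.empty_like(F)
    for i, f in enumerate(F):
        Tk = Tk + (s * f - Tk) / tau       # explicit Euler step, dt = 1 year
        T[i] = Tk.sum()
        H[i] = f - (lam * Tk).sum()
    return T, H

# Hypothetical mode parameters, chosen only so that sum(lam * s) == 1,
# i.e. H tends to zero at equilibrium under constant forcing.
s = np.array([0.4, 0.5])        # K per W m^-2
lam = np.array([1.5, 0.8])      # W m^-2 K^-1
tau = np.array([4.0, 200.0])    # years

# Hypothetical ramp standing in for a 262-year forcing history.
F = np.linspace(0.0, 2.3, 262)

def effective_lambda(F):
    """(F - H) / T evaluated in the final year."""
    T, H = emulate(F, s, lam, tau)
    return (F[-1] - H[-1]) / T[-1]

lam_unscaled = effective_lambda(F)
lam_scaled = effective_lambda(F * 4.0 / 3.7)   # rescale forcing by F2x / 3.7

# The two values should agree to rounding error.
print(math.isclose(lam_unscaled, lam_scaled, rel_tol=1e-9))
```

Because T and F − H both scale with the multiplier, it cancels in the ratio. The F2× value into which λ is then divided does not cancel, which is why it must be the GCM's own.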
Error in ICS calculation
Cristian Proistosescu has very helpfully provided me with a copy of his data and Matlab code, so I have been able to check how the PH17 ICS values were actually calculated. Unfortunately, it turns out that the calculation in PH17 is wrong. Although for each GCM and each set of its sample eigenmode parameters, PH17’s code scales the AR5 forcing time series by the F2× value corresponding to its sampled eigenmode parameters (and thus also scales the related simulated H and T time series), it then divides the resulting λ estimate into 3.7 Wm–2 rather than into the F2× value applicable to that sample. Essentially, what PH17 did was to correctly estimate the slope of the blue line but, instead of estimating ICS directly from its x-axis intercept, they shifted the blue line down so that its y-axis intercept was 3.7 Wm–2. In the case shown in Figure 1, doing so reduces the ICS estimate from 2.3°C to 2.1°C.
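The arithmetic for the Figure 1 case can be reproduced from the numbers quoted above (F2× of 4.0 Wm–2 and a blue-line intercept of ~2.3°C for NorESM1-M). This is a back-of-envelope check written in Python, not an extract from the PH17 Matlab code, and the variable names are my own.

```python
F2X_MODEL = 4.0   # W m^-2, estimated F2x for NorESM1-M (y-axis intercept)
F2X_AR5 = 3.7     # W m^-2, value assumed in the AR5 forcing series
ICS_BLUE = 2.3    # K, x-axis intercept of the blue line (approximate)

# Slope of the blue line: the feedback parameter it implies.
lam = F2X_MODEL / ICS_BLUE

# Dividing lambda into the GCM's own F2x recovers the blue-line intercept.
ics_correct = F2X_MODEL / lam

# Dividing lambda into 3.7 instead shifts the line down and lowers the result.
ics_as_coded = F2X_AR5 / lam

print(f"{lam:.2f} {ics_correct:.1f} {ics_as_coded:.1f}")
# prints: 1.74 2.3 2.1
```

The size of the bias is simply the ratio 3.7/F2×, so it is largest for GCMs whose estimated F2× is furthest above 3.7 Wm–2.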
I have rerun the PH17 code with the ICS calculation corrected, applying the F2× value applicable to each sample to compute the ICS estimate for that sample. The resulting overall median ICS estimate increases from 2.5°C to 2.8°C. The 2.5°C value found by PH17 is quite clearly incorrect.
Volcanic Forcing
The corrected median ICS estimate for GCMs of 2.8°C, based on changes over the entire 1750-2011 period, is still a little below the value I would have expected from previous work of mine using rather similar methods. The reason for this is the incorrect treatment of volcanic forcing in PH17. The points involved are quite subtle.
The problem is that PH17 did not adjust the AR5 forcing time series to make average volcanic forcing zero. If one does not do so, that implies preindustrial (natural only) forcing was on average negative relative to that in 1750 (when all forcings, including volcanic forcing, are set at zero in the AR5 time series), meaning that in 1750 the climate system (which is assumed to be in equilibrium with pre-1750 average forcing) would not be in equilibrium with 1750 forcing (which is higher by the negative of average pre-1750 natural forcing). That would invalidate the PH17 derivation of (F − H)/T and hence of ICS. Although average pre-1750 natural forcing values are not given in AR5, it is reasonable to estimate them from the average over 1750–2011. That average is negligible for solar forcing, but material for volcanic forcing, at −0.40 Wm–2.
The need to account for preindustrial volcanic forcing when computing subsequent warming is known,[ii] although it appears to have been overlooked by many GCM modellers. A simple solution is to adjust the AR5 volcanic forcing time series so that it has a zero mean over 1750–2011. This is essentially the same approach as was used when the RCP scenario forcing time series were produced. The volcanic forcing in 1750 then becomes +0.4 Wm–2, reflecting unusually low volcanism in that year.
When I adjusted the AR5 forcing time series by subtracting the average volcanic forcing over 1750–2011, the ICS median estimate over 1750-2011 rose to 2.92°C.
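The adjustment itself is a one-line re-centring. The sketch below uses a toy volcanic series with made-up spike sizes, not the AR5 data, purely to show the effect on the 1750 value; the function name is an assumption.

```python
import numpy as np

def recentre_volcanic(volcanic):
    """Subtract the period mean so that the series averages to zero."""
    volcanic = np.asarray(volcanic, dtype=float)
    return volcanic - volcanic.mean()

# Toy series: zero in most years, with a few negative spikes.
# Spike positions and sizes are hypothetical and are not AR5 values.
years = np.arange(1750, 2012)
volc = np.zeros(years.size)
volc[[65, 133, 241]] = [-5.0, -3.0, -2.0]   # W m^-2

adj = recentre_volcanic(volc)

# The first year, originally zero, becomes positive by minus the mean.
print(round(volc.mean(), 4), round(adj[0], 4))
```

With the actual AR5 series, whose 1750–2011 mean is −0.40 Wm–2, the same operation turns the zero value in 1750 into +0.40 Wm–2. The adjusted volcanic series is then added back to the other forcing components before the emulation is run.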
IRF versus ERF
There is a third reason why the PH17 estimate of ICS for GCMs is too low.
When CO2 concentration is abruptly doubled, it initially produces what is termed instantaneous radiative forcing (IRF). However, for estimating the response of the climate system it is best to use effective radiative forcing (ERF), which is forcing after the atmosphere has adjusted and surface adjustments that do not involve any change in GST have taken place; see IPCC AR5 Box 8.1. Such adjustments take up to a year, perhaps more, to complete. The IPCC AR5 forcing series are for ERF, and adopt an F2× value of 3.71 W m–2. ERF for CO2 is believed to be some way below IRF.
However, in PH17, F2× is estimated by projecting back to time zero using, primarily, mean values for the first and second years of the abrupt 4xCO2 simulations. Since during year one the atmosphere and surface are adjusting (independently of GST change) to the quadrupling in CO2 concentration, doing so produces an F2× value that is in excess of ERF. Thus, PH17 derives a median GCM F2× of ~4 Wm–2 (the median values for λ and ‘Contribution to inferred equilibrium warming’ given in Table 1 imply, in conjunction with the median GCM ECS given in Table S1, an F2× value of 4.0 Wm–2).
It is difficult to estimate ERF F2× for CO2 very accurately from abrupt 4xCO2 simulation data. A reasonable method is to use regression over years 1 to 20 of the abrupt 4xCO2 simulation,[iii] which is consistent with the recommendation in Hansen et al (2005)[iv] of regressing over the first 10 to 30 years. The ensemble median F2× obtained by doing so is the best part of 10% lower than per PH17, although the ratio for individual GCM medians varies between 0.72 and 1.20. To obtain an apples-to-apples comparison, the F2× values implicit in the fitted model eigenvalue parameters must be for ERF, as for observationally-based estimates, not for something between ERF and IRF. The brown line in Figure 1 illustrates the issue. The intersection of the blue and brown lines corresponds to where we are now, in terms of how long the climate system has had on average to adjust to forcing increments during the historical period (scaled to a doubling of CO2 concentration). The brown line corresponds to estimating ICS using the same data relating to the current climate system state as for the blue line, but with the F2× estimate reduced from PH17’s 4.0 W m–2 to 3.6 Wm–2. The result is to increase the ICS estimate by approaching 0.2°C – the difference between the x-intercept of the brown and the blue lines. I cannot accurately estimate the depressing effect on ICS estimation of using F2× estimates that exceed those corresponding to ERF, as doing so would require refitting the statistical model and obtaining fresh sets of 5,000 sample eigenmode fits for each GCM.[v] However, based on my previous work I estimate the effect to be ~ 0.1°C. When this is added to the 2.94°C median ICS estimate, after correcting the two problems previously dealt with, for time periods used in instrumental-observation studies the median GCM based ICS estimate would slightly exceed 3.0°C.
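The geometry of the brown and blue lines can be written down directly: each is a straight line from an assumed F2× on the y-axis through the same (T, H) state point, and ICS is its x-axis intercept. In the sketch below the state point is hypothetical (I have simply placed it on the blue line at T = 1.5°C), so the size of the shift it gives is illustrative and is not meant to reproduce Figure 1.

```python
def ics_through_point(f2x, T, H):
    """x-axis intercept of the line from (0, f2x) through the point (T, H)."""
    return f2x * T / (f2x - H)

F2X_PH17 = 4.0   # W m^-2, blue line y-axis intercept
F2X_ERF = 3.6    # W m^-2, brown line y-axis intercept
ICS_BLUE = 2.3   # K, blue line x-axis intercept

# Hypothetical state point, placed on the blue line at T = 1.5 K.
T_now = 1.5
H_now = F2X_PH17 * (1.0 - T_now / ICS_BLUE)

ics_blue = ics_through_point(F2X_PH17, T_now, H_now)
ics_brown = ics_through_point(F2X_ERF, T_now, H_now)

print(round(ics_blue, 2), round(ics_brown, 2))
```

For a fixed state point with a positive TOA imbalance, lowering the assumed F2× always raises the intercept, and the effect grows with the ratio of H to F2×.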
Other issues
There are a few other points relevant to appraisal of PH17.
The PH17 calculations of T for CMIP5 GCMs using AR5 forcing time series reveal that, for the median fitted eigenmode parameters, simulated warming between 1860–79 and 2000–09 was 1.10°C.[vi] That exceeds recorded warming (using a globally-complete GST dataset)[vii] of 0.84°C by almost a third, supporting the conclusion that the median GCM is substantially too sensitive.
It is also worth noting that, although of considerable interest in relation to understanding climate system behaviour, any difference between ICS and ECS is of relatively little importance when estimating warming over the next few centuries on scenarios involving continuing growth of emissions and CO2 concentrations, as the slow mode will contribute only a small part of the total warming.
Conclusions
When correctly calculated, the median ICS estimate for CMIP5 GCMs, based on the evolution of forcing over the historical period, is 3.0°C, not 2.5°C as claimed in PH17. Although 3.0°C is below the median ECS estimate for the GCMs of 3.4°C, it is well above a median estimate in the 1.6–2.0°C range for good quality observationally-based climate sensitivity estimates. PH17’s headline claim that it reconciles historical and model-based estimates of climate sensitivity is wrong.
End Notes
[i] The total sample size is slightly lower, since for eight of the GCMs the simulation method fails in a number of cases, due to use of an approximation that breaks down for samples with a very small fitted short time constant.
[ii] Gregory et al 2013 doi:10.1002/grl.50339; Meinshausen et al 2011 DOI 10.1007/s10584-011-0156-z Appendix 2
[iii] As in Andrews et al (2015, DOI: 10.1175/JCLI-D-14-00545.1)
[iv] Efficacy of climate forcings, doi:10.1029/2005JD005776
[v] Probably requiring setting λ1 = λ2; it is impossible to estimate a separate λ1 if one seeks to estimate ERF, as the relevant time constant, τ1, is too short – typically less than a year.
[vi] With volcanic forcing adjusted to zero mean over 1750-2011
[vii] Cowtan and Way v2 kriged HadCRUT4v5: http://www-users.york.ac.uk/%7Ekdc3/papers/coverage2013/series.html

Moderation note:  As with all guest posts, please keep your comments civil and relevant.
