Reply to Patrick Brown’s response to comments on his Nature article

by Nic Lewis
My reply to Patrick Brown’s response to my comments on his Nature article.

Introduction
I thank Patrick Brown for his detailed response (also here) to statistical issues that I raised in my critique “Brown and Caldeira: A closer look shows global warming will not be greater than we thought” of his and Ken Caldeira’s recent paper (BC17).[1] The provision of more detailed information than was given in BC17, and in particular the results of testing using synthetic data, is welcome. I would reply as follows.
Brown comments that I suggested that rather than focusing on the simultaneous use of all predictor fields, BC17 should have focused on the results associated with the single predictor field that showed the most skill: the magnitude of the seasonal cycle in OLR. He goes on to say: “Thus, Lewis is arguing that we actually undersold the strength of the constraints that we reported, not that we oversold their strength.”
To clarify, I argued that BC17 undersold the statistical strength of the relationships involved, in the RCP8.5 2090 case focussed on in their Abstract, for which the signal-to-noise ratio is highest. But I went on to say that I did not think the stronger relationships would really provide a guide to how much global warming there would actually be late this century on the RCP8.5 scenario, or any other scenario. That is because, as I stated, I disagree with BC17’s fundamental assumption that the relationship of future warming to certain aspects of the recent climate that holds in climate models necessarily also applies in the real climate system. I will return to that point later. But first I will discuss the statistical issues.
Statistical issues
When there are many more predictor variables than observations, the dimensionality of the predictor information has to be reduced in some way to avoid over-fitting. There are a number of statistical approaches to achieving this using a linear model, of which the partial least squares (PLS) regression method used in BC17 is arguably one of the best, at least when its assumptions are satisfied. All methods estimate a statistical model fit that provides a set of coefficients, one for each predictor variable.[2] The general idea is to preserve as much of the explanatory power of the predictors as possible without over-fitting, thus maximizing the fit’s predictive power when applied to new observations.
If the PLS method is functioning as intended, adding new predictors should not worsen the predictive skill of the resulting fitted statistical model. That is because, if those additional predictors contain useful information about the predictand(s), that information should be incorporated appropriately, while if the additional predictors do not contain any such information they should be given zero coefficients in the model fit. Yet in the RCP8.5 2090 case, which has the highest signal-to-noise ratio and is the one focussed on both in BC17 and in my article, the prediction skill obtained using just the OLR seasonal cycle predictor field is very significantly reduced by adding the remaining eight predictor fields. That indicates that something is amiss.
Brown says that studies are often criticized for highlighting the single statistical relationship that appears to be the strongest while ignoring or downplaying weaker relationships that could have been discussed. However, the logic with PLS is to progressively include weaker relationships but to stop at the point where they are so weak that doing so worsens predictive accuracy. Some relationships are sufficiently weak that including them adds too much noise relative to information useful for prediction. My proposal of just using the OLR seasonal cycle to predict RCP8.5 2090 temperature was accordingly in line with the logic underlying PLS – it was not a case of just ignoring weaker relationships.
Indeed, the first reference for the PLS method that BC17 give (de Jong, 1993), justified PLS by referring to a paper [3] that specifically proposed carrying out the analysis in steps, selecting one variable/component at a time and not adding an additional one if it worsened the statistical model fit’s predictive accuracy. At the predictor field level, that strongly suggests that, in the RCP8.5 2090 case, when starting with the OLR seasonal cycle field, one would not go on to add any of the other predictor fields, as in all cases doing so worsens the fit’s predictive accuracy. And there would not be any question of using all predictor fields simultaneously, since doing so also worsens predictive accuracy compared to using just the OLR seasonal cycle field.
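The add-one-at-a-time principle can be illustrated with a minimal Python sketch. It is not the procedure used in BC17 or in the paper cited above: it applies ordinary least squares to scalar predictors rather than PLS to gridded fields, the data are synthetic, and the names `solve`, `ols_fit`, `loo_rmse` and `forward_select` are hypothetical. A predictor is added only if doing so lowers the RMS prediction error under leave-one-out cross-validation.

```python
import random

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x

def ols_fit(rows, y):
    """Least-squares coefficients (intercept first) via the normal equations."""
    Z = [[1.0] + r for r in rows]
    p = len(Z[0])
    A = [[sum(z[i] * z[j] for z in Z) for j in range(p)] for i in range(p)]
    b = [sum(z[i] * t for z, t in zip(Z, y)) for i in range(p)]
    return solve(A, b)

def loo_rmse(X, y, cols):
    """RMS prediction error under leave-one-out cross-validation."""
    sq = 0.0
    for k in range(len(y)):
        rows = [[X[i][c] for c in cols] for i in range(len(y)) if i != k]
        targ = [y[i] for i in range(len(y)) if i != k]
        beta = ols_fit(rows, targ)
        pred = beta[0] + sum(b * X[k][c] for b, c in zip(beta[1:], cols))
        sq += (y[k] - pred) ** 2
    return (sq / len(y)) ** 0.5

def forward_select(X, y):
    """Add predictors one at a time; stop when the LOO error no longer falls."""
    chosen = []
    best = loo_rmse(X, y, [])          # baseline: always predict the mean
    remaining = set(range(len(X[0])))
    while remaining:
        err, c = min((loo_rmse(X, y, chosen + [c]), c) for c in sorted(remaining))
        if err >= best:                # the extra predictor worsens the fit
            break
        chosen.append(c)
        best = err
        remaining.remove(c)
    return chosen, best

# Synthetic data: only column 0 carries information about y.
rng = random.Random(0)
X = [[rng.gauss(0, 1) for _ in range(4)] for _ in range(30)]
y = [2.0 * r[0] + 0.1 * rng.gauss(0, 1) for r in X]

baseline = loo_rmse(X, y, [])
chosen, best = forward_select(X, y)
print("selected columns:", chosen)
print("LOO RMS error: baseline %.3f, selected fit %.3f" % (baseline, best))
```

The ratio `best / baseline` plays a role loosely similar to a Spread ratio, in that a value near one indicates no skill beyond always predicting the mean, although BC17’s exact definition may differ.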
In principle, even when given all the predictor fields simultaneously PLS should have been able to optimally weight the predictor variables to build composite components in order of decreasing predictive power, to which the add-one-at-a-time principle could be applied. However, it evidently was unable to do so in the RCP8.5 2090 case or other cases. I can think of two reasons for this. One is that the measure of prediction accuracy used – RMS prediction error when applying leave-one-out cross-validation – is imperfect. But I think that the underlying problem is the non-satisfaction of a key assumption of the PLS method: that the predictor variables are free of uncertainty. Here, although the CMIP5-model-derived predictor variables are accurately measured, they are affected by the GCMs’ internal variability. This uncertainty-in-predictor-values problem was made worse by the decision in BC17 to take their values from a single simulation run by each CMIP5 model rather than averaging across all its available runs.
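The effect of noise in predictor values, and the benefit of averaging across runs, can be illustrated with a minimal Python sketch. It uses simple least-squares regression on synthetic data as a stand-in, not PLS and not BC17’s data; the sample size, number of runs, noise level and names such as `ols_slope` are all hypothetical. With noise of the same variance as the signal, the fitted slope is attenuated towards zero, and averaging several runs recovers part of the loss.

```python
import random

def ols_slope(x, y):
    """Slope of the least-squares line of y on x."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx

rng = random.Random(1)
n_samples, n_runs, true_slope = 2000, 5, 2.0   # all hypothetical

# Noise-free predictor values and the target they determine.
signal = [rng.gauss(0, 1) for _ in range(n_samples)]
target = [true_slope * s for s in signal]

# Each "run" observes the signal plus independent internal variability.
runs = [[s + rng.gauss(0, 1) for _ in range(n_runs)] for s in signal]

single_run = [r[0] for r in runs]              # one run per sample
run_mean = [sum(r) / n_runs for r in runs]     # average of all runs

slope_single = ols_slope(single_run, target)
slope_mean = ols_slope(run_mean, target)

print("true slope        %.2f" % true_slope)
print("single-run slope  %.2f" % slope_single)
print("run-mean slope    %.2f" % slope_mean)
```

In this setup the expected attenuation factor is var(signal) / (var(signal) + var(noise)), so about 1/2 for a single run and about 1/1.2 for the five-run mean.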
Brown claims (a) that each model’s own value is included in the multi-model average which gives the multi-model average an inherent advantage over the cross-validated PLSR estimate and (b) that this means that PLSR is able to provide meaningful Prediction Ratios even when the Spread Ratio is near or slightly above 1. Point (a) is true but the effect is very minor. Based on the RCP8.5 2090 predictions, it would normally cause a 1.4% upwards bias in the Spread Ratio. Since Brown did not adjust for the difference of one in the degrees of freedom involved, the bias is twice that level – still under 3%. Brown’s claim (b), that PLS regression is able to provide meaningful Prediction Ratios even when the Spread Ratio is at or virtually at the level indicating a skill no higher than when always predicting warming equal to the mean value for the models used to estimate the fit, is self-evidently without merit.
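The size of the effect in point (a) can be checked with a short Python sketch. It rests on an algebraic identity: a value’s deviation from the mean of the other n − 1 values is exactly n/(n − 1) times its deviation from the mean of all n values. The ensemble size of 36 and the synthetic warming values are hypothetical, not taken from BC17, and the sketch does not reproduce how the percentages quoted above were calculated.

```python
import random

def rms(values):
    """Root-mean-square of a list of numbers."""
    return (sum(v * v for v in values) / len(values)) ** 0.5

rng = random.Random(0)
n = 36                                              # hypothetical ensemble size
warming = [rng.gauss(4.0, 0.7) for _ in range(n)]   # synthetic values

total = sum(warming)
# Deviation of each model from the mean of ALL models (own value included).
dev_full = [w - total / n for w in warming]
# Deviation of each model from the mean of the OTHER models (own value excluded).
dev_loo = [w - (total - w) / (n - 1) for w in warming]

ratio = rms(dev_loo) / rms(dev_full)
print("RMS ratio, leave-one-out mean vs full mean: %.4f" % ratio)
print("n / (n - 1):                                %.4f" % (n / (n - 1)))
```

For an ensemble of a few dozen models the two RMS values therefore differ by only a few percent.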
As Brown indicates, adding random noise affects correlations, and can produce spurious correlations between unrelated variables. His test results using synthetic data are interesting, although they only show Spread ratios. They show that one of the nine synthetic predictor fields produced a reduction in the Spread ratio below one that was very marginally – 5% – greater than that when using all nine fields simultaneously. But the difference I highlighted, in the highest signal RCP8.5 2090 case, between the reduction in Spread ratio using just the OLR seasonal cycle ratio and that using all predictors simultaneously was an order of magnitude larger – 40%. It seems very unlikely that the superior performance of the OLR seasonal cycle on its own arose by chance.
Moreover, the large variation in Spread ratios and Prediction ratios between different cases and different (sets of) predictors calls into question the reliability of estimation using PLS. In view of the non-satisfaction of the PLS assumption of no errors in the predictor variables, a statistical method that does take account of errors in them would arguably be more appropriate. One such method is the RegEM (regularized expectation maximization) algorithm, which was developed for use in climate science.[4] The main version of RegEM uses ridge regression with the ridge coefficient (the inverse of which is analogous to the number of retained components in PLS) being chosen by generalized cross-validation. Ridge regression RegEM, unlike the TTLS variant used by Michael Mann, produces very stable estimation. I have applied RegEM to BC17’s data in the RCP8.5 2090 case, using all predictors simultaneously.[5] The resulting Prediction ratio was 1.08 (8% greater warming), well below the comparative 1.12 value Brown arrives at (for grid-level standardization). And using just the OLR seasonal cycle, the excess of the Prediction ratio over one was only half that for the comparative PLS estimate.
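How a ridge coefficient can be chosen by generalized cross-validation (GCV) is sketched below in Python. This is plain ridge regression on synthetic data, not the RegEM algorithm, which additionally iterates an expectation-maximization step over estimated means and covariances. The helper names (`solve`, `centre`, `ridge_gcv`), the grid of ridge coefficients and the data are all hypothetical.

```python
import random

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x

def centre(col):
    """Subtract the mean from a list of numbers."""
    m = sum(col) / len(col)
    return [v - m for v in col]

def ridge_gcv(X, y, lam):
    """Return (coefficients, effective dof, GCV score) for centred X and y."""
    n, p = len(X), len(X[0])
    G = [[sum(r[i] * r[j] for r in X) for j in range(p)] for i in range(p)]  # X'X
    A = [[G[i][j] + (lam if i == j else 0.0) for j in range(p)] for i in range(p)]
    Xty = [sum(r[i] * t for r, t in zip(X, y)) for i in range(p)]
    beta = solve(A, Xty)
    # Effective degrees of freedom: trace of (X'X + lam I)^-1 X'X.
    dof = sum(solve(A, [G[i][j] for i in range(p)])[j] for j in range(p))
    rss = sum((t - sum(b * v for b, v in zip(beta, r))) ** 2 for r, t in zip(X, y))
    # The extra 1 counts the mean removed by centring.
    gcv = n * rss / (n - 1 - dof) ** 2
    return beta, dof, gcv

# Synthetic data: only the first of three predictors is informative.
rng = random.Random(2)
n, p = 40, 3
cols = [centre([rng.gauss(0, 1) for _ in range(n)]) for _ in range(p)]
X = [[cols[j][i] for j in range(p)] for i in range(n)]
y = centre([X[i][0] + 0.5 * rng.gauss(0, 1) for i in range(n)])

lambdas = [0.01, 0.1, 1.0, 10.0, 100.0]       # hypothetical grid
results = {lam: ridge_gcv(X, y, lam) for lam in lambdas}
best_lam = min(lambdas, key=lambda lam: results[lam][2])

for lam in lambdas:
    beta, dof, gcv = results[lam]
    print("lambda %7.2f  effective dof %.3f  GCV %.4f" % (lam, dof, gcv))
print("ridge coefficient chosen by GCV:", best_lam)
```

The effective degrees of freedom fall as the ridge coefficient rises, which is the sense in which its inverse is analogous to the number of retained components in PLS.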
Issues with the predictor variables and the emergent constraints approach
I return now to BC17’s fundamental assumption that the relationship of future warming to certain aspects of the recent climate that holds in climate models also applies in the real climate system. They advance various physical arguments for why this might be the case in relation to their choice of predictor variables. They focus on the climatology and seasonal cycle magnitude predictors as they find, compared with the monthly variability predictor, these have more similar PLS loading patterns to those when targeting shortwave cloud feedback, the prime source of intermodel variation in ECS.
There are major problems in using climatological values (mean values in recent years) for OLR, OSR and the TOA radiative imbalance N. Most modelling groups target agreement of simulated climatological values of these variables with observed values (very likely spatially as well as in the global mean) when tuning their GCMs, although some do not do so. Seasonal cycle magnitudes may also be considered when tuning GCMs. Accordingly, how close values simulated by each model are to observed values may very well reflect whether and how closely the model has been tuned to match observations, and not be indicative of how good the GCM is at representing the real climate system, let alone how realistic its strength of multidecadal warming in response to forcing is.
There are further serious problems with use of climatological values of TOA radiation variables. First, in some CMIP5 GCMs substantial energy leakages occur, for example at the interface between their atmospheric and ocean grids.[6] Such models are not necessarily any worse in simulating future warming than other models, but they need (to be tuned) to have TOA radiation fluxes significantly different from observed values in order for their ocean surface temperature change to date, and in future, to be realistic.
Secondly, at least two of the CMIP5 models used in BC17 (NorESM1-M and NorESM1-ME) have TOA fluxes and a flux imbalance that differ substantially from CERES observed values, but it appears that this merely reflects differences between derived TOA values and actual top-of-model values. There is very little flux imbalance within the GCM itself.[7] Therefore, it is unfair to treat these models as having lower fidelity – as BC17’s method does for climatology variables – on account of their TOA radiation variables differing, in the mean, from observed values.
Thirdly, most CMIP5 GCMs simulate too cold an Earth: their GMST is below the actual value, by up to several degrees. It is claimed, for instance in IPCC AR5, that this does not affect their GMST response to forcing. However, it does affect their radiative fluxes. A colder model that simulates TOA fluxes in agreement with observations should not be treated as having good fidelity. With a colder surface its OLR should be significantly lower than observed, so if it is in line then either the model has compensating errors or its OLR has been tuned to compensate, either of which indicates its fidelity is poorer than it appears to be. Moreover, complicating the picture, there is an intriguing, non-trivial correlation between preindustrial absolute GMST and ECS in CMIP5 models.
Perhaps the most serious shortcoming of the predictor variables is that none of them are directly related to feedbacks operating over a multidecadal scale, which (along with ocean heat uptake) is what most affects projected GMST rise to 2055 and 2090. Predictor variables that are related to how much GMST has increased in the model since its preindustrial control run, relative to the increase in forcing – which varies substantially between CMIP5 models – would seem much more relevant. Unfortunately, however, historical forcing changes have not been measured for most CMIP5 models. Although one would expect some relationship between seasonal cycle magnitude of TOA variables and intra-annual feedback strengths, feedbacks operating over the seasonal cycle may well be substantially different from feedbacks acting on a multidecadal timescale in response to greenhouse gas forcing.
Finally, a recent paper by scientists at GFDL laid bare the extent of the problem with the whole emergent constraints approach. They found that, by a simple alteration of the convective parameterization scheme, they could engineer the climate sensitivity of the GCM they were developing, varying it over a wide range, without them being able to say that one model version showed a greater fidelity in representing recent climate system characteristics than another version with a very different ECS.[8] The conclusion from their Abstract is worth quoting: “Given current uncertainties in representing convective precipitation microphysics and the current inability to find a clear observational constraint that favors one version of the authors’ model over the others, the implications of this ability to engineer climate sensitivity need to be considered when estimating the uncertainty in climate projections.” This strongly suggests that at present emergent constraints cannot offer a reliable insight into the magnitude of future warming. And that is before taking account of the possibility that there may be shortcomings common to all or almost all GCMs that lead them to misestimate the climate system response to increased forcing.
[1] Patrick T. Brown & Ken Caldeira, 2017. Greater future global warming inferred from Earth’s recent energy budget, doi:10.1038/nature24672.
[2] The predicted value of the predictand is the sum of the predictor variables each weighted by its coefficient, plus an intercept term.
[3] A Hoskuldsson, 1992. The H-principle in modelling with applications to chemometrics. Chemometrics and Intelligent Laboratory Systems, 14, 139-153.
[4] Schneider, T., 2001: Analysis of incomplete climate data: Estimation of mean values and covariance matrices and imputation of missing values. J. Climate, 14, 853–871.
[5] Due to memory limitations I had to reduce the longitudinal resolution by a factor of three when using all predictor fields simultaneously. Note that RegEM standardizes all predictor variables to unit variance.
[6] Hobbs et al, 2016. An Energy Conservation Analysis of Ocean Drift in the CMIP5 Global Coupled Models. DOI: 10.1175/JCLI-D-15-0477.1.
[7] See discussion following this blog comment.
[8] Ming Zhao et al, 2016. Uncertainty in model climate sensitivity traced to representations of cumulus precipitation microphysics. J. Climate, 29, 543-560.

Link

https://judithcurry.com/2017/12/24/reply-to-patrick-browns-response-to-comments…
