Expert judgement and uncertainty quantification for climate change

by Judith Curry
When it comes to climate change, the procedure by which experts assess the accuracy of models projecting potentially ruinous outcomes for the planet and society is surprisingly informal. – Michael Oppenheimer

My concerns about the consensus-seeking process used by the IPCC have been articulated in many previous posts [link]. I have argued that the biases introduced into the science and policy process by the politicized UNFCCC and IPCC consensus-seeking approach are promoting mutually assured delusion.
Nature Climate Change has published what I regard to be a very important paper:  Expert judgement and uncertainty quantification for climate change, by Michael Oppenheimer, Christopher Little, Roger M. Cooke [link to abstract].
Abstract. Expert judgement is an unavoidable element of the process-based numerical models used for climate change projections, and the statistical approaches used to characterize uncertainty across model ensembles. Here, we highlight the need for formalized approaches to unifying numerical modelling with expert judgement in order to facilitate characterization of uncertainty in a reproducible, consistent and transparent fashion. As an example, we use probabilistic inversion, a well-established technique used in many other applications outside of climate change, to fuse two recent analyses of twenty-first century Antarctic ice loss. Probabilistic inversion is but one of many possible approaches to formalizing the role of expert judgement, and the Antarctic ice sheet is only one possible climate-related application. We recommend indicators or signposts that characterize successful science-based uncertainty quantification.
With regard to the main technical aspects — structured expert judgment and probabilistic inversion:

  • I wrote a previous post on structured expert judgement [link]
  • The paper’s Supplementary Information is available online [link], which describes the method of probabilistic inversion.
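To give a concrete feel for what probabilistic inversion does, here is a minimal, one-dimensional sketch in Python. It is not the paper's implementation (which fuses two full analyses of Antarctic ice loss, as described in the Supplementary Information); it simply re-weights an illustrative model ensemble so that the weighted distribution reproduces a set of hypothetical expert quantiles, which is the core idea. All numbers below are invented for illustration.

```python
import numpy as np

# Hypothetical model ensemble: 21st-century Antarctic sea-level contribution (cm).
# Values are illustrative only, not taken from the paper.
rng = np.random.default_rng(0)
samples = rng.lognormal(mean=2.0, sigma=0.6, size=10_000)

# Hypothetical expert-elicited quantiles (cm) for the same quantity.
expert_q = {0.05: 2.0, 0.50: 10.0, 0.95: 40.0}

# Inter-quantile bins implied by the expert quantiles, and their target masses.
edges = [-np.inf] + sorted(expert_q.values()) + [np.inf]
target_mass = np.diff([0.0] + sorted(expert_q.keys()) + [1.0])  # [0.05, 0.45, 0.45, 0.05]

# Re-weight the ensemble so each bin carries the expert-prescribed probability mass,
# while the model's internal structure within each bin is retained.
bin_index = np.digitize(samples, edges[1:-1])
weights = np.ones_like(samples)
for b, mass in enumerate(target_mass):
    in_bin = bin_index == b
    if in_bin.any():
        weights[in_bin] = mass / in_bin.sum()
weights /= weights.sum()

# The weighted ensemble now reproduces the expert quantiles.
order = np.argsort(samples)
cum_w = np.cumsum(weights[order])
median = np.interp(0.5, cum_w, samples[order])
print(f"re-weighted median ≈ {median:.1f} cm (expert median was 10.0 cm)")
```

In realistic applications with several constrained quantities at once, the re-weighting is done iteratively (iterative proportional fitting), but the principle is the same as in this one-variable toy.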

From the Princeton press release:
Science can flourish when experts disagree, but in the governmental realm uncertainty can lead to inadequate policy and preparedness. When it comes to climate change, it can be OK for computational models to differ on what future sea levels will be. The same flexibility does not exist for determining the height of a seawall needed to protect people from devastating floods.
For the first time in the climate field, a Princeton University researcher and collaborators have combined two techniques long used in fields where uncertainty is coupled with a crucial need for accurate risk-assessment — such as nuclear energy — in order to bridge the gap between projections of Earth’s future climate and the need to prepare for it. Reported in the journal Nature Climate Change, the resulting method consolidates climate models and the range of opinions that leading scientists have about them into a single, consistent set of probabilities for future sea-level rise.
Giving statistically accurate and informative assessments of a model’s uncertainty is a daunting task, and an expert’s scientific training for such an estimation may not always be adequate.
Oppenheimer and his co-authors use a technique known as “structured expert judgment” to put an actual value on the uncertainty that scientists studying climate change have about a particular model’s prediction of future events such as sea-level rise. Experts are each “weighted” for their ability to quantify uncertainty regarding the situation at hand by gauging their knowledge of their respective fields. More consideration is given to experts with higher statistical accuracy and informativeness. Another technique, called probabilistic inversion, would adjust a climate model’s projections to reflect those experts’ judgment of its probability.
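To illustrate what "weighting experts by statistical accuracy and informativeness" can look like in practice, here is a small Python sketch in the spirit of Cooke's classical model: each expert answers calibration ("seed") questions with known answers; a calibration score is computed from how often the realizations fall in the expert's inter-quantile bins; an information score is computed from how tight the expert's quantiles are relative to a background range; and the product of the two gives the weight. The experts, quantiles and realizations below are invented, and the scoring is simplified relative to the full classical model.

```python
import numpy as np
from scipy.stats import chi2

# Expected probability mass in the four inter-quantile bins when an expert
# gives 5%, 50% and 95% quantiles.
P_EXPECTED = np.array([0.05, 0.45, 0.45, 0.05])

def calibration_score(quantile_sets, realizations):
    """Calibration: how consistent the realized seed values are with the expert's
    stated quantiles, scored as the p-value of a relative-entropy test."""
    counts = np.zeros(4)
    for (q05, q50, q95), x in zip(quantile_sets, realizations):
        counts[np.searchsorted([q05, q50, q95], x)] += 1
    n = counts.sum()
    s = counts / n
    mask = s > 0
    rel_entropy = np.sum(s[mask] * np.log(s[mask] / P_EXPECTED[mask]))
    return 1.0 - chi2.cdf(2.0 * n * rel_entropy, df=3)

def information_score(quantile_sets, lo, hi):
    """Average relative entropy of the expert's distribution against a uniform
    background on [lo, hi]; tighter quantiles give higher information."""
    scores = []
    for q05, q50, q95 in quantile_sets:
        widths = np.array([q05 - lo, q50 - q05, q95 - q50, hi - q95]) / (hi - lo)
        scores.append(np.sum(P_EXPECTED * np.log(P_EXPECTED / widths)))
    return float(np.mean(scores))

# Hypothetical seed questions: each expert gives (5%, 50%, 95%) quantiles for
# two quantities whose true values are later observed.
experts = {
    "expert A": [(1.0, 5.0, 12.0), (0.5, 2.0, 6.0)],
    "expert B": [(3.0, 4.0, 5.0), (1.8, 2.0, 2.2)],  # narrower: more informative, but riskier
}
realizations = [4.0, 2.5]   # observed values of the two seed quantities
lo, hi = 0.0, 20.0          # background range for the information score

raw = {name: calibration_score(q, realizations) * information_score(q, lo, hi)
       for name, q in experts.items()}
total = sum(raw.values())
print({name: round(w / total, 3) for name, w in raw.items()})
```

The toy example shows the trade-off the press release alludes to: very narrow quantiles score high on information but are penalized on calibration when the realizations fall outside them.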
Structured expert judgment has been used for decades in fields where scenarios have high degrees of uncertainty, most notably nuclear-energy generation, Oppenheimer explained. Similar to climate change, nuclear energy presents serious risks, the likelihood and consequences of which — short of just waiting for them to occur — need to be accurately assessed.
When it comes to climate change, however, the procedure by which experts assess the accuracy of models projecting potentially ruinous outcomes for the planet and society is surprisingly informal, Oppenheimer said.
When the Intergovernmental Panel on Climate Change (IPCC) — an organization under the auspices of the United Nations that periodically evaluates the effects of climate change — tried to determine the ice loss from Antarctica for its Fourth Assessment Report released in 2007, discussion by the authors largely occurred behind closed doors, said Oppenheimer, who has been long involved with the IPCC and served as an author of its Assessment Reports.
In the end, the panel decided there was too much uncertainty in the Antarctic models to say how much ice the continent would lose over this century. But there was no actual traceable and consistent procedure that led to that conclusion, Oppenheimer said. As models improved, the Fifth Assessment Report, released in 2013, was able to provide numerical estimates of future ice loss but still based on the informal judgment of a limited number of participants.
Claudia Tebaldi, a project scientist at the National Center for Atmospheric Research, said that the researchers propose a much more robust method for evaluating the increasing volume of climate-change data coming out than experts coming up with “a ballpark estimate based on their own judgments.”
“Almost every problem out there would benefit from some approach like this, especially when you get to the point of producing something like the IPCC report where you’re looking at a number of studies and you have to reconcile them,” said Tebaldi, who is familiar with the research but had no role in it. “It would be more satisfying to do it in a more formal way like this article proposes.”
The implementation of the researchers’ technique, however, might be complicated, she said. Large bodies such as the IPCC and even individual groups authoring papers would need a collaborator with the skills to carry it out. But, she said, if individual research groups adopt the method and demonstrate its value, it could eventually rise up to the IPCC Assessment Reports.
For policymakers and the public, a more transparent and consistent measurement of how scientists perceive the accuracy of climate models could help instill more confidence in climate projections as a whole, said Sander van der Linden. With no insight into how climate projections are judged, the public could take away from situations such as the IPCC’s uncertain conclusion about Antarctica in 2007 that the problems of climate change are inconsequential or that scientists do not know enough to justify the effort (and possible expense) of a public-policy response, he said.
“Systematic uncertainties are actually forms of knowledge in themselves, yet most people outside of science don’t think about uncertainty this way,” said van der Linden. “We as scientists need to do a better job at promoting public understanding of uncertainty. Thus, in my opinion, greater transparency about uncertainty in climate models needs to be paired with a concerted effort to improve the way we communicate with the public about uncertainty and risk.”
Some excerpts from the closing section of the paper, A Path Forward, which I think absolutely nails it:
Stepping back from probabilistic inversion to the general problem of uncertainty quantification, we end by suggesting a few signposts pointing towards an informative approach.

  • First, uncertainty quantification should have a component that is model independent. All models are idealizations and so all models are wrong. An uncertainty quantification that is conditional on the truth of a model or model form is insufficient.
  • Second, the method should be widely applicable in a transparent and consistent manner. As already discussed, several approaches to uncertainty quantification have been proposed in the climate context but fall short in their generalizability or clarity.
  • Third, the outcomes should be falsifiable. Scientific theories can never be strictly verified, but to be scientific they must be falsifiable. Whether theories succumb to crucial experiments or expire under a ‘degenerating problem shift’, the principle of falsifiability remains a point of departure.

With regard to uncertainty quantification, falsification must be understood probabilistically. The point of predicting the future is that we should not be too surprised when it arrives. Comparing new observations with the probability assigned to them by our uncertainty quantification gauges that degree of surprise. With this in mind, outcomes should also be subject to arduous tests. Being falsifiable is necessary but not sufficient. As a scientific claim, uncertainty quantification must withstand serious attempts at falsification. Surviving arduous tests is sometimes called confirmation or validation, not to be confused with verification. Updating a prior distribution does not constitute validation. Bayesian updating is the correct way to learn on a likelihood and prior distribution, but it does not mean that the result of the learning is valid. Validation ensues when posterior ‘prediction intervals’ are shown to capture out-of-sample (for example, future) observations with requisite relative frequencies. 
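The coverage test described in that last sentence is easy to state operationally: issue prediction intervals in advance, wait for the out-of-sample observations, and check whether the intervals capture them at roughly the nominal rate. A minimal sketch, with invented numbers:

```python
import numpy as np

def empirical_coverage(intervals, observations):
    """Fraction of out-of-sample observations falling inside their
    stated prediction intervals."""
    hits = [lo <= y <= hi for (lo, hi), y in zip(intervals, observations)]
    return np.mean(hits)

# Hypothetical 90% prediction intervals issued in advance, and the
# observations that later arrived (illustrative numbers only).
intervals_90 = [(1.0, 4.0), (2.5, 6.0), (0.5, 3.0), (3.0, 7.5), (1.5, 5.0)]
observed     = [ 2.2,        5.1,        3.4,        6.8,        2.0 ]

coverage = empirical_coverage(intervals_90, observed)
print(f"empirical coverage: {coverage:.0%} (nominal 90%)")
# A validated uncertainty quantification should show coverage close to the
# nominal level; persistent over- or under-coverage is evidence against it.
```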
JC reflections
I have been following Roger Cooke’s research closely for the last year or so, and we have begun an email dialogue. I am very pleased to see that his ideas are being seriously applied to issues of relevance to the IPCC and climate change.
Michael Oppenheimer, although regarded as an activist and sometimes an alarmist about climate change, has made important contributions in assessing and criticizing the IPCC process [link], particularly with regard to the consensus process and the treatment and communication of uncertainty.
I find this statement made by Oppenheimer in the press release to be particularly stunning:
When it comes to climate change, however, the procedure by which experts assess the accuracy of models projecting potentially ruinous outcomes for the planet and society is surprisingly informal.
It is well-nigh time for the IPCC and other assessments to raise the level of their game and add some rational structure to their assessment and uncertainty analysis. The proposal by Oppenheimer et al. provides an excellent framework for such a structure.
However, there is one missing element here, which was addressed in my paper Reasoning About Climate Uncertainty. This relates to the actual point in the logical hierarchy where expert judgment is brought in. Excerpt:

Identifying the most important uncertainties and introducing a more objective assessment of confidence levels requires introducing a more disciplined logic into the climate change assessment process. A useful approach would be the development of hierarchical logical hypothesis models that provide a structure for assembling the evidence and arguments in support of the main hypotheses or propositions. A logical hypothesis hierarchy (or tree) links the root hypothesis to lower level evidence and hypotheses. While developing a logical hypothesis tree is somewhat subjective and involves expert judgments, the evidential judgments are made at a lower level in the logical hierarchy. Essential judgments and opinions relating to the evidence and the arguments linking the evidence are thus made explicit, lending structure and transparency to the assessment. To the extent that the logical hypothesis hierarchy decomposes arguments and evidence to the most elementary propositions, the sources of disputes are easily illuminated and potentially minimized.
Bayesian Network Analysis using weighted binary tree logic is one possible choice for such an analysis. However, a weakness of Bayesian Networks is their two-valued logic and inability to deal with ignorance, whereby evidence is either for or against the hypothesis. An influence diagram is a generalization of a Bayesian Network that represents the relationships and interactions between a series of propositions or evidence. Three-valued logic has an explicit role for uncertainties that recognizes that evidence may be incomplete or inconsistent, of uncertain quality or meaning. Combination of evidence proceeds generally as for a Bayesian combination, but is modified by the factors of sufficiency, dependence and necessity.

Hall et al. conclude that influence diagrams can help to synthesize complex and contentious arguments of relevance to climate change. Breaking down and formalizing expert reasoning can facilitate dialogue between experts, policy makers, and other decision stakeholders. The procedure used by Hall et al. supports transparency and clarifies uncertainties in disputes, in a way that expert judgment about high level root hypotheses fails to do.
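To make the idea of a three-valued hypothesis tree a bit more tangible, here is a small Python sketch. Each node carries support for, support against, and uncommitted belief; child nodes roll up to their parent through sufficiency weights. The combination rule below is a deliberately simple weighted average, not the full Hall et al. scheme (which also treats dependence and necessity), and the tree and numbers are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class Node:
    """A node in a logical hypothesis tree carrying three-valued support:
    evidence for, evidence against, and the remainder held as uncertainty."""
    name: str
    support_for: float = 0.0
    support_against: float = 0.0
    sufficiency: float = 1.0          # how strongly this node bears on its parent
    children: List["Node"] = field(default_factory=list)

    @property
    def uncertainty(self) -> float:
        return 1.0 - self.support_for - self.support_against

    def propagate(self) -> None:
        """Roll child support up to this node as a sufficiency-weighted average
        (a simplified rule chosen for illustration)."""
        if not self.children:
            return
        for c in self.children:
            c.propagate()
        total_w = sum(c.sufficiency for c in self.children)
        self.support_for = sum(c.sufficiency * c.support_for for c in self.children) / total_w
        self.support_against = sum(c.sufficiency * c.support_against for c in self.children) / total_w

# Hypothetical tree: a root hypothesis supported by two lower-level lines of evidence.
root = Node("root hypothesis", children=[
    Node("evidence line 1", support_for=0.7, support_against=0.1, sufficiency=0.8),
    Node("evidence line 2", support_for=0.4, support_against=0.2, sufficiency=0.5),
])
root.propagate()
print(f"for={root.support_for:.2f} against={root.support_against:.2f} "
      f"uncommitted={root.uncertainty:.2f}")
```

The point of the structure is not the particular combination rule but that the judgments are made at the level of individual evidence lines, where they can be inspected and disputed, rather than directly on the root hypothesis.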
I regard it as a top priority for the IPCC to implement formal, objective procedures for assessing consensus and uncertainty. Continued failure to do so will be regarded as laziness and/or as political protection for an inadequate status quo.

Filed under: Uncertainty
