by Judith Curry
Any attempt to impose agreement will “promote confusion between consensus and certainty”. The goal should be to quantify uncertainty, not to remove it from the decision process. – Willy Aspinall
Context
The framing of climate change as a problem caused by humans, combined with the UNFCCC’s reliance on the Precautionary Principle, has led to a framework in which scientists work in a politicized environment of ‘speaking consensus to power’ – a process in which consensus is manufactured to identify carbon emissions stabilization targets.
Many times, I have voiced my concerns about the consensus-seeking approach used by the IPCC:
- Groups and herds: implications for the IPCC
- The IPCC’s inconvenient truth
- Manufacturing consensus: clinical guidelines
- Do scientific assessments need to be consensual to be authoritative?
- Climate change: no consensus on consensus
- IPCC: functional stupidity?
- Mutually assured delusion
The biases that the UNFCCC and IPCC have introduced into the science and policy process are arguably promoting mutually assured delusion. There has to be a better way to assess the science. Structured expert judgment offers a much better strategy for building a rational consensus (as opposed to a political consensus), one that realistically accounts for uncertainties and for the diversity of opinions and perspectives.
Rational consensus under uncertainty
The challenge is laid out in a document by Cooke et al.: Rational consensus under uncertainty. Excerpts (JC bold):
Governmental bodies are confronted with the problem of achieving rational consensus in the face of substantial uncertainties. The area of accident consequence management for nuclear power plants affords a good example. Decisions with regard to evacuation, decontamination, and food bans must be taken on the basis of predictions of environmental transport of radioactive material, contamination through the food chain, cancer induction, and the like. These predictions use mathematical models containing scores of uncertain parameters. Decision makers want to take, and want to be perceived to take, these decisions in a rational manner. The question is, how can this be accomplished in the face of large uncertainties? Indeed, the very presence of uncertainty poses a threat to rational consensus. Decision makers will necessarily base their actions on the judgments of experts. The experts, however, will not agree among themselves, as otherwise we would not speak of large uncertainties. Any given expert’s viewpoint will be favorable to the interests of some stakeholders, and hostile to the interests of others. If a decision maker bases his/her actions on the views of one single expert, then (s)he is invariably open to charges of partiality toward the interests favored by this viewpoint.
An appeal to ‘impartial’ or ‘disinterested’ experts will fail for two reasons. First, experts have interests; they have jobs, mortgages and professional reputations. Second, even if expert interests could somehow be quarantined, even then the experts would disagree. Expert disagreement is not explained by diverging interests, and consensus cannot be reached by shielding the decision process from expert interests. If rational consensus requires expert agreement, then rational consensus is simply not possible in the face of uncertainty. If rational consensus under uncertainty is to be achieved, then evidently the views of a diverse set of experts must be taken into account. The question is how? Simply choosing a maximally feasible pool of experts and combining their views by some method of equal representation might achieve a form of political consensus among the experts involved, but will not achieve rational consensus. If expert viewpoints are related to the institutions at which the experts are employed, then numerical representation of viewpoints in the pool may be, and/or may be perceived to be influenced by the size of the interests funding the institutes.
We collect a number of conclusions regarding the use of structured expert judgment.
1. Experts’ subjective uncertainties may be used to advance rational consensus in the face of large uncertainties, in so far as the necessary conditions for rational consensus are satisfied.
2. Empirical control of experts’ subjective uncertainties is possible.
3. Experts’ performance as subjective probability assessors is not uniform; there are significant differences in performance.
4. Experts as a group may show poor performance.
5. A structured combination of expert judgment may show satisfactory performance, even though the experts individually perform poorly.
6. The performance based combination generally outperforms the equal weight combination.
7. The combination of experts’ subjective probabilities, according to the schemes discussed here, generally has wider 90% central confidence intervals than the experts individually; particularly in the case of the equal weight combination.
We note that poor performance as a subjective probability assessor does not indicate a lack of substantive expert knowledge. Rather, it indicates unfamiliarity with quantifying subjective uncertainty in terms of subjective probability distributions.
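To make conclusions 5–7 concrete, the following is a minimal Python sketch of combining experts’ quantile assessments into an equal-weight pool versus a performance-weighted pool. It is an illustration only, not Cooke’s classical model: the expert quantiles and calibration scores are invented, and each expert’s distribution is crudely approximated by a normal fitted to the stated median and 90% interval.

```python
import numpy as np
from scipy import stats

# Each expert gives 5%, 50% and 95% quantiles for an uncertain quantity
# (a hypothetical time-to-failure, in years); all numbers are invented.
experts = {
    "A": (2.0, 5.0, 9.0),
    "B": (1.0, 3.0, 20.0),
    "C": (4.0, 8.0, 30.0),
}

def fit_expert(q05, q50, q95):
    """Crudely approximate an expert's belief with a normal distribution matched
    to the median and the width of the 90% central interval (a stand-in for the
    piecewise densities of Cooke's classical model, to keep the sketch short)."""
    sigma = (q95 - q05) / (stats.norm.ppf(0.95) - stats.norm.ppf(0.05))
    return stats.norm(loc=q50, scale=sigma)

dists = {name: fit_expert(*q) for name, q in experts.items()}

# Equal-weight pool: one expert, one vote.
equal_w = {name: 1.0 / len(experts) for name in experts}

# Performance-based pool: weights proportional to calibration scores derived
# from seed questions (scores invented here; a calibration sketch appears below).
scores = {"A": 0.05, "B": 0.60, "C": 0.35}
perf_w = {name: s / sum(scores.values()) for name, s in scores.items()}

def pool_quantiles(weights, probs=(0.05, 0.5, 0.95)):
    """Quantiles of the weighted linear opinion pool (mixture of expert densities)."""
    grid = np.linspace(-50.0, 150.0, 40001)
    cdf = sum(w * dists[name].cdf(grid) for name, w in weights.items())
    return [float(np.interp(p, cdf, grid)) for p in probs]

for label, w in [("equal weights", equal_w), ("performance weights", perf_w)]:
    q05, q50, q95 = pool_quantiles(w)
    print(f"{label}: 5% {q05:.1f}, median {q50:.1f}, 95% {q95:.1f}")
```

The only difference between the two pools is where the weights come from; the Aspinall excerpt below describes how calibration scores are obtained from seed questions in practice.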
Some examples of the implementation of Cooke’s structured expert judgment strategy are described in the Procedures Guide for Structured Expert Judgment.
There is also a relevant .ppt presentation by Tim Bedford (Management Science at the University of Strathclyde) entitled Use and Abuse of Expert Judgement in Risk Studies, with a good discussion of biases and framing problems.
Geophysical applications
A 2010 article published in Nature [link] provides some interesting geophysical applications.
A route to more tractable expert advice
Willy Aspinall
Abstract. There are mathematically advanced ways to weigh and pool scientific advice. They should be used more to quantify uncertainty and improve decision-making, says Willy Aspinall.
When a volcano became restless on the small, populated island of Montserrat, West Indies, in 1995, there was debate among scientists: did the bursts of steam and ash presage an explosive and deadly eruption, or would the outcome be more benign? Authorities on the island, a British overseas territory, needed advice to determine warning levels, and whether travel restrictions and evacuations were needed. The British government asked me, as an independent volcanologist, to help reconcile differing views within the group.
As it happened, I had experience not only with the region’s volcanoes, but also with a unique way of compiling scientific advice in the face of uncertainty: the Cooke method of ‘expert elicitation’. This method weighs the opinion of each expert on the basis of his or her knowledge and ability to judge relevant uncertainties. The approach isn’t perfect. But it can produce a ‘rational consensus’ for many hard-to-assess risks, from earthquake hazards to the probable lethal dose of a poison or the acceptable limits of an air pollutant. For broader questions with many unknowns — such as what the climate will be like in 50 years — the method can tackle small specifics of the larger question, and identify gaps in knowledge or expose significant disparities of opinion.
As a group, myself and ten other volcanologists decided to trial this methodology. We were able to provide useful guidance to the authorities, such as the percentage chance of a violent explosion, as quickly as within an hour or two. More than 14 years on, volcano management in Montserrat stands as the longest-running application of the Cooke method.
Faced with uncertainty, decision-makers invariably seek agreement or unambiguous consensus from experts. But it is not reasonable to expect total consensus when tackling difficult-to-predict problems such as volcanic eruptions. The method’s originator Roger Cooke says that when scientists disagree, any attempt to impose agreement will “promote confusion between consensus and certainty”. The goal should be to quantify uncertainty, not to remove it from the decision process.
Of the many ways of gathering advice from experts, the Cooke method is, in my view, the most effective when data are sparse, unreliable or unobtainable. There are several methods of such expert elicitation, each with flaws. The traditional committee still rules in many areas — a slow, deliberative process that gathers a wide range of opinions. But committees traditionally give all experts equal weight (one person, one vote). This assumes that experts are equally informed, equally proficient and free of bias. These assumptions are generally not justified.
Another kind of elicitation — the Delphi method — was developed in the 1950s and 1960s. This involves getting ‘position statements’ from individual experts, circulating these, and allowing the experts to adjust their own opinions over multiple rounds. What often happens is that participants revise their views in the direction of the supposed ‘leading’ experts, rather than in the direction of the strongest arguments.
Cooke’s method instead produces a ‘rational consensus’. To see how this works, take as an example an elicitation I conducted in 2003, to estimate the strength of the thousands of small, old earth dams in the United Kingdom. Acting as facilitator, I first organized a discussion between a group of selected experts about how water can leak into the cores of such ageing dams, leading to failure. The experts were then asked individually to give their own opinion of the time-to-failure in a specific type of dam, once such leakage starts. They answered with both a best estimate and a ‘credible interval’, for which they thought there was only a 10% chance that the true answer was higher or lower.
I also asked each expert a set of eleven ‘seed questions’, for which answers are known, so that their proficiency could be calibrated. As is often the case, several experts were very sure of their judgement and provided very narrow uncertainty ranges. But the more cautious experts with longer time estimates and wider uncertainty ranges did better on the seed questions, so their answers were weighted more heavily. Their views would probably have been poorly represented if the decision had rested on a group discussion in which charismatic, confident personalities might carry the day. Self-confidence is not a good predictor of expert performance, and, interestingly, neither is scientific prestige and reputation.
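A minimal Python sketch of this calibration-and-weighting step, using eleven invented seed answers and two hypothetical experts (not data from the dam study), is given below; the hit rate on 90% credible intervals stands in for Cooke’s more elaborate chi-squared calibration score and information score.

```python
# Known answers to eleven seed questions, plus each expert's stated 90% credible
# interval (low, high) for each one. Every number below is invented for illustration.
seed_answers = [12.0, 3.5, 40.0, 7.2, 150.0, 0.9, 22.0, 5.5, 63.0, 18.0, 2.4]

intervals = {
    "overconfident expert": [
        (10, 11), (3, 4), (40, 45), (5, 6), (120, 130), (0.5, 0.7),
        (20, 21), (5.0, 5.4), (60, 61), (15, 16), (2.0, 2.2)],
    "cautious expert": [
        (8, 15), (2, 5), (30, 55), (5, 9), (100, 200), (0.5, 1.5),
        (15, 30), (4, 8), (50, 80), (12, 25), (1.5, 2.2)],
}

def hit_rate(name):
    """Fraction of seed questions whose known answer falls inside the expert's interval."""
    hits = sum(lo <= ans <= hi for ans, (lo, hi) in zip(seed_answers, intervals[name]))
    return hits / len(seed_answers)

# A well-calibrated expert should capture roughly 90% of the known answers with
# 90% intervals. The hit rate serves as a crude calibration score, normalized
# into pooling weights.
scores = {name: hit_rate(name) for name in intervals}
weights = {name: s / sum(scores.values()) for name, s in scores.items()}

for name in intervals:
    print(f"{name}: hit rate {scores[name]:.2f}, weight {weights[name]:.2f}")
# overconfident expert: hit rate 0.18, weight 0.17
# cautious expert: hit rate 0.91, weight 0.83
```

With these invented numbers, the narrow-interval expert captures few of the known answers and is down-weighted, which is exactly the pattern Aspinall describes.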
Sometimes an elicitation reveals two camps of opinion. This usually arises from ambiguity in framing the problem or because the two sub-groups have subtly different backgrounds. In this case, a few of the experts with longer time-to-failure estimates were working engineers with practical experience, rather than academics. Highlighting such clear differences is useful; sometimes it reveals errors or misunderstandings in one of the groups. In this case it triggered further investigation.
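A rough illustration of how such camps can be surfaced from the elicited numbers, using invented estimates and a simple largest-gap split (not a method taken from the article):

```python
import numpy as np

# Hypothetical elicited best estimates of time-to-failure (arbitrary time units),
# with each expert's professional background noted. All values are invented.
estimates = np.array([2.0, 3.0, 2.5, 20.0, 35.0, 4.0, 30.0])
backgrounds = np.array(["academic", "academic", "academic", "engineer",
                        "engineer", "academic", "engineer"])

# Sort the estimates and split at the largest gap on a log scale: a crude but
# transparent way to reveal two camps of opinion, if they exist.
order = np.argsort(estimates)
split = np.argmax(np.diff(np.log(estimates[order]))) + 1
low_camp, high_camp = order[:split], order[split:]

for label, idx in [("shorter-estimate camp", low_camp), ("longer-estimate camp", high_camp)]:
    members = [f"{estimates[i]:g} ({backgrounds[i]})" for i in idx]
    print(f"{label}: " + ", ".join(members))
```

Cross-tabulating the camps against expert backgrounds, as in the dam example, then becomes a one-line check rather than an impression formed in discussion.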
Uncertainty in science is unavoidable and throws up many challenges. On one side, you may have scientists reluctant to offer their opinions on important societal topics. On the other, you may have decision-makers who ignore uncertainty because they fear undermining public confidence or opening regulations to legal challenges. The Cooke approach can bridge these two positions. It is most useful when there is no other sensible way to make risk-based decisions — apart from resorting to the precautionary principle, or, even less helpfully, evading an answer by agreeing to disagree.
JC reflections
I think Cooke’s ideas, and Aspinall’s case studies of geophysically relevant applications, show great promise for assessing the state of knowledge across a range of climate change topics.
I can particularly imagine applying the method of structured expert judgment in context of applying influence diagrams and 3-valued logic (Italian flag) to represent uncertainty in complex climate change problems [link to post].
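For readers unfamiliar with the Italian flag representation, a minimal sketch of the three-valued idea in code is shown below; the green/white/red split and the example numbers are hypothetical, not taken from the linked post.

```python
from dataclasses import dataclass

@dataclass
class ItalianFlag:
    """Three-valued ('Italian flag') state of evidence for a proposition:
    green = evidence for, red = evidence against, white = uncommitted/unknown."""
    green: float  # support for the proposition
    red: float    # support against the proposition

    def __post_init__(self):
        if self.green < 0 or self.red < 0 or self.green + self.red > 1:
            raise ValueError("green and red must be non-negative and sum to at most 1")

    @property
    def white(self) -> float:
        """Remaining, uncommitted belief."""
        return 1.0 - self.green - self.red

# Hypothetical example: a panel judgment on an attribution-related proposition,
# with a large white band reflecting acknowledged uncertainty.
p = ItalianFlag(green=0.5, red=0.2)
print(f"for {p.green:.0%}, against {p.red:.0%}, uncommitted {p.white:.0%}")
```

The attraction is that the white band makes the ‘we don’t know’ portion explicit, which is exactly what a structured elicitation is designed to quantify.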
In applying this to something like the 20th century climate attribution problem, the challenge would be twofold:
- assembling a sufficiently diverse group of scientists within the highly politicized IPCC process
- identifying appropriate ‘seed questions’ and weighting the experts
I was particularly struck by this statement:
We note that poor performance as a subjective probability assessor does not indicate a lack of substantive expert knowledge. Rather, it indicates unfamiliarity with quantifying subjective uncertainty in terms of subjective probability distributions.
Hence, a premium on scientists who understand uncertainty and can quantify subjective uncertainty. I really like the idea that better training of scientists on how to assess and think about uncertainty is key to better expert judgment.
Uncertain T. Monster is pleased.