by Judith Curry
Group failures often have disastrous consequences—not merely for businesses, nonprofits, and governments, but for all those affected by them. – Cass Sunstein and Reid Hastie
Context
The social psychology of groups conducting scientific assessments (e.g. the IPCC) is a topic that in my opinion does not receive sufficient attention. For background, here are some previous CE posts:
- Importance of intellectual and political diversity in science
- No consensus on consensus
- Do scientific assessments need to be consensual to be authoritative?
- Are climate scientists being forced to toe the line?
- We are all confident idiots
- Cognitive bias – how petroleum scientists deal with it
This past week, two articles have appeared on this topic that provide important insights of relevance to the IPCC assessment process.
Groups
Sunstein and Hastie have a lengthy article in the Harvard Business Review entitled Making Dumb Groups Smarter. Excerpts:
The advantage of a group, wrote one early advocate of collective intelligence—Aristotle—is that “when there are many who contribute to the process of deliberation, each can bring his share of goodness and moral prudence…some appreciate one part, some another, and all together appreciate all.” Unfortunately, groups all too often fail to live up to this potential.
Groups err for two main reasons. The first involves informational signals. Naturally enough, people learn from one another; the problem is that groups often go wrong when some members receive incorrect signals from other members. The second involves reputational pressures, which lead people to silence themselves or change their views in order to avoid some penalty—often, merely the disapproval of others. But if those others have special authority or wield power, their disapproval can produce serious personal consequences.
As a result of informational signals and reputational pressures, groups run into four separate though interrelated problems. When they make poor or self-destructive decisions, one or more of these problems are usually to blame:
- Groups do not merely fail to correct the errors of their members; they amplify them.
- They fall victim to cascade effects, as group members follow the statements and actions of those who spoke or acted first.
- They become polarized, taking up positions more extreme than those they held before deliberations.
- They focus on what everybody knows already—and thus don’t take into account critical information that only one or a few people have.
If most members of a group tend to make certain errors, then most people will see others making the same errors. What they see serves as “proof” of erroneous beliefs. Reputational pressures play a complementary role: If most members of the group make errors, others may make them simply to avoid seeming disagreeable or foolish.
If a project, a product, a business, a politician, or a cause gets a lot of support early on, it can win over a group even if it would have failed otherwise. Many groups end up thinking that their ultimate convergence on a shared view was inevitable. Beware of that thought. The convergence may well be an artifact of who was the first to speak—and hence of what we might call the architecture of the group’s discussions.
Two kinds of cascades—informational and reputational—correspond to our two main sources of group error. In informational cascades, people silence themselves out of deference to the information conveyed by others. In reputational cascades, they silence themselves to avoid the opprobrium of others.
Group members think they know what is right, but they nonetheless go along with the group in order to maintain the good opinion of others.
“Political correctness,” a term much used by the political right in the 1990s, is hardly limited to left-leaning academic institutions. In both business and government there is often a clear sense that a certain point of view is the proper one and that those who question or reject it, even for purposes of discussion, do so at their peril. They are viewed as “difficult,” “not part of the team,” or, in extreme cases, as misfits.
In the actual world of group decision making, of course, people may not know whether other members’ statements arise from independent information, an informational cascade, reputational pressures, or the availability heuristic. They often overestimate the extent to which the views of others are based on independent information. Confident (but wrong) group decisions are a result.
Suppose a group has a great deal of information—enough to produce the unambiguously right outcome if that information is elicited and properly aggregated. Even so, the group will not perform well if its members emphasize broadly shared information while neglecting information that is held by one or a few. The finding? Common information had a disproportionately large impact on discussions and conclusions.
Making wiser groups:
- Silence the leader
- ‘Prime’ critical thinking
- Appoint a devil’s advocate
- Establish contrarian teams
JC comments: Lots of implications here for the IPCC assessment process, particularly
- reputational pressures (I am the poster child for the ostracism faced by someone who disagrees)
- cascade effects from previous assessment reports
- consensus-seeking approach that marginalizes or ignores dissenting perspectives
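JC note: To make the cascade mechanism concrete, here is a toy simulation of my own (a sketch, not Sunstein and Hastie’s model; all numbers are hypothetical). Each group member privately gets a signal that is correct 70% of the time, but announces an opinion only after hearing everyone who spoke earlier, and defers to any clear running majority.

```python
import random

random.seed(1)

def group_verdict(n_members=12, signal_accuracy=0.7, cascade=True):
    """Toy model: the true answer is 'A'. Each member's private signal is 'A'
    with probability signal_accuracy. With cascade=True, members announce in
    turn and defer to a clear running majority; otherwise they vote independently."""
    announcements = []
    for _ in range(n_members):
        private = "A" if random.random() < signal_accuracy else "B"
        if cascade:
            lead = announcements.count("A") - announcements.count("B")
            if lead > 1:
                choice = "A"      # earlier speakers outweigh the private signal
            elif lead < -1:
                choice = "B"
            else:
                choice = private  # no clear majority yet; use own signal
        else:
            choice = private
        announcements.append(choice)
    return "A" if announcements.count("A") >= announcements.count("B") else "B"

trials = 20000
for cascade, label in [(False, "independent votes"), (True, "sequential cascade")]:
    wrong = sum(group_verdict(cascade=cascade) == "B" for _ in range(trials))
    print("%s: group wrong in %.1f%% of trials" % (label, 100 * wrong / trials))
# Each member alone is wrong 30% of the time. Aggregating independent votes makes
# the group wrong only ~4% of the time, but the cascading group locks onto the
# wrong answer ~15% of the time: whoever speaks first sets the trajectory.
```

The particular numbers matter less than the direction: letting early speakers anchor the discussion throws away the group’s independent information, which is exactly the “architecture of the group’s discussions” problem Sunstein and Hastie describe.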
Herding
Nate Silver has an article entitled Here’s proof some pollsters are putting a thumb on the scale. Excerpts:
It’s time to stop worrying about outliers and start worrying about inliers. Earlier this year, my colleague Harry Enten documented evidence of pollster “herding” — the tendency of polling firms to produce results that closely match one another, especially toward the end of a campaign. What’s wrong with the polls agreeing with one another? The problem is that it’s sometimes a case of the blind leading the blind.
It’s not the inaccuracy of the polling average that should bother you so much as the consensus around such a wrong result. This consensus very likely reflects herding. In this case, pollsters herded toward the wrong number.
The impolite way to put it is that this was CYA (cover-your-ass) time for pollsters. Some that had produced “outlier” results before suddenly fell in line with the consensus.
The other giveaway is the one we discovered before in Iowa. By the end of the campaign, new polls diverged from the polling averages by less than they plausibly could if they were taking random samples and not tinkering with them.
To be clear, I’m not accusing any pollsters of faking results. But some of them were probably “putting their thumbs on the scale,” manipulating assumptions in their polls such that they more closely matched the consensus.
In some cases, the pollsters’ intentions may have been earnest enough. Perhaps they ran a poll in Iowa and it came back Ernst +7. That can’t be right, they’d say to themselves. No one else has the race like that. So they’d dig into their crosstabs and find something “wrong.” Ahh — that’s the problem, not enough responses from Ames and Iowa City. Let’s apply some geographic weights. That comes out to … Ernst +3? We can live with that.
Even when the pollsters mean well, this attitude runs counter to the objective, scientific nature of polling. As a general principle, you should not change the methodology in the middle of an experiment.
The problem is simple enough to diagnose: When pollsters herd, if the first couple of polls happen to get the outcome wrong, subsequent ones will replicate the mistake.
The occasional or even not-so-occasional result that deviates from the consensus is sometimes a sign the pollster is doing good, honest work and trusting its data. It’s the inliers — the polls that always stay implausibly close to the consensus and always conform to the conventional wisdom about a race — that deserve more scrutiny instead.
JC comment: The idea of herding in opinion polling seems relevant to climate model intercomparison projects, the production of multiple data records of the same climate variable, and model-data intercomparison efforts, particularly when combined with the issues raised by Sunstein and Hastie.
Disagreement between data sets, between models, between models and data, or between models or data and what you ‘expect’ can motivate a subjective focus on ‘fixing’ the disagreement. These ‘fixes’ can introduce bias.
Truly objective methods for building climate data records and calibrating climate models seem rare – the Berkeley Earth surface temperature effort seems objective, as does the calibration method used for the Hadley climate model.
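JC note: To see why Silver can treat implausibly tight clustering as a red flag, here is a minimal simulation sketch (my own toy numbers, not Silver’s data). Twenty hypothetical polls of 800 respondents each are drawn for a race where the true share is 52%; one set reports its raw random samples, the other nudges each result toward the running average of earlier polls, as a herding pollster might.

```python
import random

random.seed(0)

TRUE_SHARE = 0.52    # hypothetical true vote share for the leading candidate
N_RESPONDENTS = 800  # typical sample size for a state poll
N_POLLS = 20         # hypothetical number of late-campaign polls

def honest_poll():
    """One honest poll: a simple random sample of N_RESPONDENTS voters."""
    hits = sum(random.random() < TRUE_SHARE for _ in range(N_RESPONDENTS))
    return hits / N_RESPONDENTS

# Independent polls, each trusting its own random sample.
independent = [honest_poll() for _ in range(N_POLLS)]

# "Herded" polls: each raw result is pulled most of the way toward the
# running average of the polls published before it.
herded = []
for _ in range(N_POLLS):
    raw = honest_poll()
    if herded:
        consensus = sum(herded) / len(herded)
        raw = 0.25 * raw + 0.75 * consensus
    herded.append(raw)

def spread(polls):
    """Mean absolute deviation of the polls from their own average, in points."""
    avg = sum(polls) / len(polls)
    return 100 * sum(abs(p - avg) for p in polls) / len(polls)

print("independent polls: mean |poll - average| = %.2f points" % spread(independent))
print("herded polls: mean |poll - average| = %.2f points" % spread(herded))
# With simple random samples of 800 people, honest polls should scatter around
# their average by roughly 1.5 points; the herded polls cluster several times
# more tightly, which is Silver's "inlier" red flag.
```

Honest random sampling puts a floor on how closely polls should agree with their own average; results that consistently sit below that floor suggest the samples are being adjusted toward the consensus.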
JC reflections
Once again we see convincing arguments for devil’s advocates and contrarian teams in group assessments.
The pitfalls of manufacturing a consensus (i.e., ending up being wrong), with ever-increasing levels of confidence, seem unavoidable without the formal inclusion of contrarian teams.
I was particularly struck by Sunstein and Hastie’s focus on reputational pressures, which are HUGE in the climate debate – independent thinkers get labeled as deniers and are ostracized by those ‘with special authority or who wield power’ – i.e., climate scientists with access to the media, editors of journals, officers in professional societies, etc.
To my mind, this statement perfectly reflects the problem with interpreting the IPCC’s consensus:
In the actual world of group decision making, of course, people may not know whether other members’ statements arise from independent information, an informational cascade, reputational pressures, or the availability heuristic.
Related issues surrounding the IPCC process seem overripe for this kind of investigation and analysis.