Data Torture in Gergis2K

Reflecting on then-current scandals in psychology arising from non-replicable research, E. Wagenmakers, a prominent social psychologist, blamed many of the problems on “data torture”. Wagenmakers attributed many data torture problems to ex post selection of methods. In today’s post, I’ll show an extraordinary example of data torture in the PAGES2K Australasian reconstruction.

Wagenmakers on Data Torture


Two accessible Wagenmakers articles on data torture are An Agenda for Purely Confirmatory Research (pdf) and A Year of Horrors (pdf).

In the first article, Wagenmakers observed that psychologists did not define their statistical methods before examining the data, creating a temptation to tune the results to obtain a “desired result”:

we discuss an uncomfortable fact that threatens the core of psychology’s academic enterprise: almost without exception, psychologists do not commit themselves to a method of data analysis before they see the actual data. It then becomes tempting to fine tune the analysis to the data in order to obtain a desired result—a procedure that invalidates the interpretation of the common statistical tests. The extent of the fine tuning varies widely across experiments and experimenters but is almost impossible for reviewers and readers to gauge.

Wagenmakers added:

Some researchers succumb to this temptation more easily than others, and from presented work it is often completely unclear to what degree the data were tortured to obtain the reported confession.

It is obvious that Wagenmakers’ concerns are relevant to paleoclimate, where ad hoc and post hoc methods abound and where some results are more attractive to researchers.

Gergis et al 2012

As is well known to CA readers, Gergis et al did ex post screening of their network by correlation against their target Australasian region summer temperature. Screening reduced the network from 62 series to 27. For a long time, climate blogs have criticized ex post screening as a bias-inducing procedure, a bias that is obvious but which has been neglected in academic literature. For the most part, the issue has been either ignored or denied by specialists.

Gergis et al 2012, very unusually for the field, stated that they intended to avoid screening bias by screening on detrended data, describing their screening process as follows:

For predictor selection, both proxy climate and instrumental data were linearly detrended over the 1921-1990 period to avoid inflating the correlation coefficient due to the presence of the global warming signal present in the observed temperature record. Only records that were significantly (p<0.05) correlated with the detrended instrumental target over the 1921-1990 period were selected for analysis. This process identified 27 temperature-sensitive predictors for the SONDJF warm season.

Unfortunately for Gergis and coauthors, that’s not what they actually did. Their screening was done on undetrended data. When screening was done in the described way, only 8 or so proxies survived.  Jean S discovered this a few weeks after publication of the Gergis et al article on May 17, 2012.  Two hours after Jean S’ comment at CA, coauthor Neukom notified Gergis and Karoly of the problem.
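To illustrate why the distinction between detrended and undetrended screening matters, here is a minimal sketch on synthetic data (not the Gergis proxies). Two series share nothing but a linear trend over a 70-year window standing in for 1921-1990. The trend slope, noise level and random seed are assumptions chosen purely for illustration.

```python
import math
import random

def detrend(y):
    """Remove the ordinary-least-squares linear trend from y."""
    n = len(y)
    x_mean = (n - 1) / 2
    y_mean = sum(y) / n
    sxx = sum((i - x_mean) ** 2 for i in range(n))
    slope = sum((i - x_mean) * (v - y_mean) for i, v in enumerate(y)) / sxx
    return [v - y_mean - slope * (i - x_mean) for i, v in enumerate(y)]

def pearson(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    sab = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    saa = sum((x - ma) ** 2 for x in a)
    sbb = sum((y - mb) ** 2 for y in b)
    return sab / math.sqrt(saa * sbb)

rng = random.Random(0)            # fixed seed so the sketch is repeatable
n_years = 70                      # 1921-1990 inclusive
trend = [0.03 * i for i in range(n_years)]      # assumed common trend
proxy = [t + rng.gauss(0, 0.3) for t in trend]  # trend + independent noise
temp = [t + rng.gauss(0, 0.3) for t in trend]   # trend + independent noise

r_raw = pearson(proxy, temp)
r_detrended = pearson(detrend(proxy), detrend(temp))
print(f"raw r = {r_raw:.2f}, detrended r = {r_detrended:.2f}")
```

In this toy setup the undetrended correlation is driven almost entirely by the shared trend, while the detrended correlation reflects only the independent year-to-year noise. A proxy can therefore pass an undetrended screen easily and still fail a detrended one.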

Gergis and coauthors, encouraged by Gavin Schmidt and Michael Mann, attempted to persuade the Journal of Climate editors that they should be allowed to change the description of their methodology to what they had actually done. However, the editors did not agree, challenging the Gergis coauthors to show the robustness of their results. The article was not retracted. The University of Melbourne press statement continues to say that it was published on May 17, 2012, but has been submitted for re-review (and it has apparently been under review for over two years now).


The PAGES2K Australasian network is the product of the same authors. Its methodological description is taken almost verbatim from Gergis et al 2012.  Its network is substantially identical to the Gergis 2012 network: 20 of 27 Gergis proxies carry forward to the P2K network. Several of the absent series are from Antarctica, covered separately in P2K.  The new P2K network has 28 series, now including 8 series that had been previously screened out.  The effort to maintain continuity even extended to keeping proxies in the same order in the listing, even inserting new series in the precise empty spaces left by vacating series.

Once again, the authors claimed to have done their analysis using detrended data:

All data were linearly detrended over the 1921-1990 period and AR(1) autocorrelation was taken into account for the calculation of the degrees of freedom [55].

This raises an obvious question:  in the previous test using detrended data, only a fraction passed.  So how did they pass the detrended test this time?

Read their description of P2K screening and watch the pea:

The proxy data were correlated against the grid cells of the target (HadCRUT3v SONDJF average). To account for proxies with different seasonal definitions than our target SONDJF season (for example calendar year averages) we calculate the correlations after lagging the proxies for -1, 0 and 1 years. Records with significant (p < 0.05) correlations with at least one grid-cell within a search radius of 500 km from the proxy site were included in the reconstruction. All data were linearly detrended over the 1921-1990 period and AR(1) autocorrelation was taken into account for the calculation of the degrees of freedom [55]. For coral record with multiple proxies (Sr/Ca and δ18O) with significant correlations, only the proxy record with the higher absolute correlation was selected to ensure independence of the proxy records.

Gergis et al 2012 had calculated one correlation for each proxy, but the above paragraph describes ~27 correlations: three lag periods (+1, 0, -1) by nine gridcells (not just the host gridcell, but the W, NW, N, NE, E, SE, S and SW gridcells, all of which would be within 500 km according to my reading of the above text). The other important change is the switch from testing against a regional average to testing against individual gridcells, which, in some cases, are not even in the target region.
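The arithmetic behind the ~27 correlations can be sketched as follows. The gridcell labels are just names for the host cell and its eight neighbours under the reading of the 500 km radius given above. Treating the tests as independent is an assumption, and an overstatement, since neighbouring gridcells are spatially correlated.

```python
from itertools import product

lags = (-1, 0, 1)
# Host gridcell plus its eight neighbours (labels are illustrative only).
gridcells = ("host", "N", "NE", "E", "SE", "S", "SW", "W", "NW")

# Every (lag, gridcell) pair is a separate chance to find "significance".
tests = list(product(lags, gridcells))
n_tests = len(tests)

alpha = 0.05
# Chance that a pure-noise proxy passes at least one test,
# ASSUMING the tests are independent (an upper-end illustration).
p_at_least_one = 1 - (1 - alpha) ** n_tests

print(n_tests, round(p_at_least_one, 2))  # 27 0.75
```

Under that independence assumption, a proxy with no temperature signal at all would still have roughly a three-in-four chance of being "significant" somewhere among the 27 comparisons.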


Gergis’  test against multiple gridcells takes the peculiar Mann et al 2008 pick-two methodology to even more baroque lengths.  Thinking back to Wagenmakers’ prescription of ex ante methods, it is hard to imagine Gergis and coauthors ex ante proposing that they test each proxy against nine different gridcells for “statistical significance”. Nor does it seem plausible that much “significance” can be placed on higher correlations from a contiguous gridcell, as compared to the actual gridcell.  It seems evident that Gergis and coauthors were doing whatever they could to salvage as much of their network as they could and that this elaborate multiple screening procedure was simply a method of accomplishing that end.  Nor does it seem reasonable to data mine after the fact for “significant” correlations between three different lag periods, including one in which the proxy leads temperature.

Had the PAGES2K coauthors fully discussed the background and development of this procedure from its origin in Gergis et al 2012, it seems hard to believe that a competent reviewer would not have challenged them on this peculiar screening procedure. Even if such data torture were acquiesced in (which is dubious), it should have been mitigated by requiring adjustment of the t-statistic standard to account for the repeated tests: with 27 draws, the odds of a value that is “95% significant” obviously change dramatically. When the draws are independent, there are well-known procedures for doing so. Using the Bonferroni correction with 27 “independent” tests, the t-statistic for each individual test would have to be qt(1-0.05/27, df) rather than qt(1-0.05, df). For typical detrended autocorrelations, the df is ~55. This changes the benchmark t-statistic from ~1.7 to ~3.0. The effective number of independent tests would be less than 27 because of spatial correlation, but even if the effective number of independent tests were as few as 10, the benchmark t-statistic would increase to ~2.7. All this is without accounting for their initial consideration of 62 proxies, something else that ought to be accounted for in the t-test.
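The benchmark figures above can be checked without R. The following is a rough sketch using only the Python standard library. The helper `t_quantile` is a hypothetical stand-in for R's `qt`, built from numerical integration and bisection, and df = 55 is the approximate value quoted above, not a value computed from the data.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_norm = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))
    return math.exp(log_norm - (df + 1) / 2 * math.log1p(x * x / df))

def t_cdf(x, df, steps=2000):
    """CDF via Simpson's rule on [0, |x|]; adequate precision for a sketch."""
    a = abs(x)
    h = a / steps
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    half = total * h / 3
    return 0.5 + half if x >= 0 else 0.5 - half

def t_quantile(p, df):
    """Invert t_cdf by bisection (hypothetical stand-in for R's qt)."""
    lo, hi = -50.0, 50.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

alpha = 0.05
df = 55  # approximate degrees of freedom quoted in the text

# One-sided benchmark for 1, 10 and 27 Bonferroni-adjusted tests.
thresholds = {m: t_quantile(1 - alpha / m, df) for m in (1, 10, 27)}
for m, t in thresholds.items():
    print(f"{m:2d} tests: t > {t:.2f}")
```

The three thresholds should come out at roughly 1.7, 2.7 and 3.0, matching the figures quoted above. Bonferroni is conservative when tests are correlated, which is why the 10-test case is shown as a lower bound on the adjustment.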

While all of these are real problems, the largest problem with the Neukom-Gergis network is grounded in the data: the long ice core and tree ring series don’t have a hockey stick (HS) shape. However, there is a very strong trend in coral d18O data after the Little Ice Age and especially in the 20th century. Splicing the two dissimilar proxy datasets results in hockey sticks even without screening. Such splicing of unlike data in the guise of “multiproxy” has been endemic in paleoclimate since Jones et al 1998 and is underdiscussed. It’s a topic that I plan to discuss.

There are other peculiarities in the Gergis dataset. Between Gergis et al 2012, PAGES2K and Neukom et al 2014, numerous proxies are assigned to inconsistent calendar years. If a proxy is assigned to a calendar year that is inconsistent with the calendar year of its corresponding temperature series, the calculated correlation will understate the true correlation. Some of the low detrended correlations of Gergis et al 2012 appear to have arisen from errors in proxy year assignment. I noticed this with Oroko, which I analysed in detail: it ought to pass a detrended correlation test given the splicing of instrumental data, and therefore its failure of a detrended correlation test requires close examination.
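The effect of a one-year misassignment is easy to see in a toy example. The sketch below uses synthetic white-noise "temperature" and a synthetic proxy that responds to same-year temperature. The noise levels and seed are assumptions for illustration, and real temperature series are autocorrelated, so the drop would usually be less extreme than here.

```python
import math
import random

def pearson(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    sab = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    saa = sum((x - ma) ** 2 for x in a)
    sbb = sum((y - mb) ** 2 for y in b)
    return sab / math.sqrt(saa * sbb)

rng = random.Random(1)  # fixed seed so the sketch is repeatable
n_years = 70

# Synthetic temperature (white noise) and a proxy tracking it with noise.
temperature = [rng.gauss(0, 1) for _ in range(n_years + 1)]
proxy = [t + rng.gauss(0, 0.5) for t in temperature]

# Correct dating: proxy year i compared with temperature year i.
r_aligned = pearson(proxy[1:], temperature[1:])
# Misdated by one year: proxy year i compared with temperature year i+1.
r_misdated = pearson(proxy[:-1], temperature[1:])

print(f"aligned r = {r_aligned:.2f}, misdated r = {r_misdated:.2f}")
```

A proxy with a strong same-year relationship can look nearly uncorrelated once its dates slip by a single year, which is the kind of artifact that a year-assignment error would produce in a screening test.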

Climate Audit
