Neukom and Gergis Serve Cold Screened Spaghetti

Neukom, Gergis and Karoly, accompanied by a phalanx of protective specialists, have served up a plate of cold screened spaghetti in today’s Nature (announced by Gergis here).

Gergis et al 2012 (presently in a sort of zombie withdrawal) had foundered on ex post screening. Neukom, Gergis and Karoly 2014 take ex post screening to a new and shall-we-say unprecedented level. This will be the topic of today’s post.

Data Availability
As a preamble, the spaghetti is cold in the sense that the network of proxies is almost identical to the proxy network of Neukom and Gergis 2012, which was not archived at the time and which Neukom refused to provide (see CA here). I had hoped that Nature would require Neukom to archive the data this time, but disappointingly Neukom once again did not archive the data. I’m reasonably optimistic that Nature will eventually require Neukom to archive the data, but the unavailability of the data when the article is released restricts commentary significantly. I’ve written to Nature asking them to require Neukom and Gergis to archive the data. (April 1 – an archive has been placed online at NOAA).

Wagenmakers’ Anti-Torture Protocol
In the wake of several social psychology scandals, there has been renewed statistical interest in the problem of “data torture”, for example by Wagenmakers (for example, here and here).

Wagenmakers observes that “data torture” can occur in many ways. He is particularly critical of the ad hoc and ex post techniques that authors commonly use to extract “statistically significant” results from unwieldy data. Ex post screening is an example of data torture. Wagenmakers urges that, for “confirmatory analysis”, authors be required to set out a statistical plan in advance and stick to it. He acknowledges that some results may emerge during analysis, but finds that such results can only be described as “exploratory”.

Wagenmakers’ anti-torture protocol not only condemns ex post statistical manipulations (including ex post screening), but also excludes data used in the formulation of a hypothesis from confirmatory testing of that hypothesis. In other words, Wagenmakers’ anti-torture protocol would exclude proxies used to develop previous Hockey Sticks and restrict confirmation studies to the consideration of new proxies. This would prevent the same data being used over and over again in supposedly “independent” studies – a paleoclimate practice long criticized at CA.

In my own examination of new multiproxy reconstructions, I tend to be most interested in “new” proxies. It would be a worthwhile exercise in each new reconstruction to clearly show and discuss the “new” proxies – which are the only ones that pass Wagenmakers’ criteria.

Ex post (after the fact) screening is a form of data torture long criticized at climate blogs (CA, Jeff Id, Lucia, Lubos – though not previously under the term “data torture”), but widely accepted by IPCC scientists. It was an issue with Gergis et al 2012 and again with Neukom, Gergis and Karoly 2014.
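The bias from ex post screening can be demonstrated with a toy simulation: screen pure-noise “proxies” against a trending calibration target, and the average of the survivors inherits the trend. This is a sketch of the general problem, not of the paper’s actual method; the proxy count, AR(1) coefficient and correlation cutoff below are arbitrary choices of mine.

```python
import numpy as np

rng = np.random.default_rng(0)
n_years, n_proxies = 80, 200              # stand-in for a 1911-1990 calibration window
target = np.linspace(0.0, 1.0, n_years)   # pure trend; the proxies contain no signal

# Proxies are AR(1) red noise with no climatic content at all.
proxies = np.empty((n_proxies, n_years))
for i in range(n_proxies):
    x = np.zeros(n_years)
    for t in range(1, n_years):
        x[t] = 0.5 * x[t - 1] + rng.normal()
    proxies[i] = x

def corr(a, b):
    return np.corrcoef(a, b)[0, 1]

# Ex post screening: keep proxies whose correlation with the trending
# target clears an (arbitrary) cutoff, oriented to match the target.
passing = []
for p in proxies:
    c = corr(p, target)
    if abs(c) > 0.2:
        passing.append(p if c > 0 else -p)

# The "reconstruction" built from screened noise tracks the calibration trend.
recon = np.mean(passing, axis=0)
print(len(passing), corr(recon, target))
```

A substantial fraction of pure-noise proxies pass, and their screened average correlates strongly with the target trend despite containing no climate signal.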

Gergis et al 2012 had stated that they had mitigated post hoc screening by de-trending the data before correlation. However, as Jean S observed at Climate Audit at the time, they actually calculated correlations on non-detrended data. Jean S observed that almost no proxies passed screening using the protocol reported in the article itself.

Gergis, encouraged by Mann and Schmidt, tried to persuade the journal that they should be allowed to change the description of the methodology to match their actual calculations. However, the journal did not agree. They required Gergis and Neukom to re-do their calculations using the stated methodology and to show that any difference in protocol “didn’t matter.” Unfortunately for Gergis and Neukom, it did matter. They subsequently re-submitted, but two years later, nothing has appeared.

In their new article, Neukom and Gergis are once again back in the post-hoc screening business but have taken post hoc screening to shall-we-say unprecedented levels.

They stated that their network consisted of 325 “records”:

The palaeoclimate data network consists of 48 marine (46 coral and 2 sediment time series) and 277 terrestrial (206 tree-ring sites, 42 ice core, 19 documentary, 8 lake sediment and 2 speleothem) records [totalling 325 sites] (details in Supplementary Section 1)…

Some of the 206 tree ring sites are combined into “composites” of nearby sites: their list of proxies in Supplementary Table 1 contains 204 records and it is these 204 records that are screened.

Once again, they claimed that their screening was based on local correlations with detrended data, reducing the network to 111 proxies (54% of the 204 screened records). From the Methodology section of the article:

Proxies are screened with local grid-cell temperatures yielding 111 temperature predictors (Fig. 1) for the nested multivariate principal component regression procedure.

and in the SI:

The predictors for the reconstructions are selected based on their local correlations with the target grid…

Later in the SI, they state that detrended data was used for the local correlation:

Both the proxy and instrumental data are linearly detrended over the 1911-1990 overlap period prior to the correlation analyses. Correlations of each proxy record with all grid cells are then calculated for the period 1911-1990.
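The detrend-then-correlate step described in the SI can be sketched as follows. The window length, trend slope and noise levels below are placeholders (the actual screening used the 1911-1990 overlap against gridded temperatures); the point is that a proxy sharing only the target’s trend shows a high raw correlation but little detrended correlation.

```python
import numpy as np

def detrended_correlation(proxy, temp):
    """Correlate proxy and temperature after removing the least-squares
    linear trend from each series over the overlap period."""
    t = np.arange(len(proxy))
    proxy_d = proxy - np.polyval(np.polyfit(t, proxy, 1), t)
    temp_d = temp - np.polyval(np.polyfit(t, temp, 1), t)
    return np.corrcoef(proxy_d, temp_d)[0, 1]

rng = np.random.default_rng(1)
t = np.arange(80)                            # stand-in for 1911-1990
temp = 0.02 * t + rng.normal(0, 0.3, 80)
proxy = 0.02 * t + rng.normal(0, 0.3, 80)    # shares the trend, independent noise

raw = np.corrcoef(proxy, temp)[0, 1]
det = detrended_correlation(proxy, temp)
print(round(raw, 2), round(det, 2))
```

This is why detrended screening is a much stiffer test than raw correlation, and why the choice between the two mattered so much in the Gergis et al 2012 episode.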

Jean S determined that only a few proxies in the network of Gergis et al 2012 (which contributes to the present network) passed a screening test using detrended data.

So how did Neukom and Gergis 2014 get a yield of over 54%?

Watch how they calculated “local” correlation. Later in the SI, they say (for all non-Antarctic cells):

We consider the “local” correlation of each record as the highest absolute correlation of a proxy with all grid cells within a radius of 1000 km and for all the three lags (0, 1 or -1 years). A proxy record is included in the predictor set if this local correlation is significant (p<0.05). … Significance levels (5% threshold) are calculated taking AR1 autocorrelation into account (Bretherton et al., 1999).

Mann et al 2008 had improved their screening yield with a “pick two” methodology. Neukom and Gergis go far beyond that, comparing each proxy to all grid cells within 1000 km at three lags. As I understand it, they picked the “best” correlation from several dozen comparisons. One wonders how they calculated “significance” in such a calculation (not elaborated in the article itself). Unless their benchmarks allowed for the enormous number of comparisons, their “significance” calculations would be incorrect.
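A quick Monte Carlo illustrates the multiple-comparisons point. Treating the grid cells as independent white noise is an assumption of mine (neighbouring cells are correlated in reality, so the effective number of comparisons is smaller), but even a rough version shows how taking the best of ~90 correlations inflates the pass rate far above a nominal 5% benchmark:

```python
import numpy as np

rng = np.random.default_rng(2)
n_years, n_trials = 80, 500
n_cells, lags = 30, (-1, 0, 1)   # "several dozen" cells within 1000 km, three lags

# Approximate two-sided 5% critical |r| for n = 80 white-noise series
# (a single, one-shot test; no allowance for multiple comparisons).
r_crit = 0.22

hits = 0
for _ in range(n_trials):
    proxy = rng.normal(size=n_years)
    best = 0.0
    for _ in range(n_cells):
        cell = rng.normal(size=n_years)   # assumed-independent "local" temperatures
        for lag in lags:
            # np.roll wraps the end of the series; harmless for noise illustration
            a = proxy if lag == 0 else np.roll(proxy, lag)
            best = max(best, abs(np.corrcoef(a, cell)[0, 1]))
    hits += best > r_crit
print(hits / n_trials)   # far above the nominal 0.05
```

Under these assumptions, nearly every pure-noise proxy “passes” a one-shot 5% test when the best of 90 comparisons is taken, which is why benchmarks that ignore the search make the reported “significance” meaningless.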

The above procedure is used for non-Antarctic proxies. For Antarctic proxies, they say:

Proxies from Antarctica, which are outside the domain used for proxy screening, are included, if they correlate significantly with at least 10% of the grid-area used for screening (latitude weighted).

At present, I am unable to interpret this test in operational terms.

With a radius of 1000 km, 54% of the proxies passed their test (111). With a reduced radius of 500 km, the yield fell to 42% (85 proxies). The acceptance rate for corals was about 80% and for other proxies was about 50% (slightly lower for ice cores).

Among the “long” proxies (ones that start earlier than 1025, thus covering most of the MWP), 9 of 12 ice core proxies were rejected, including isotope records from Siple Dome, Berkner Island, EDML Dronning Maud. The only “new” passing ice core record was a still unpublished Law Dome Na series (while Na series from Siple Dome and EDML did not “pass”).

Of the 5 long tree ring series, only Mt Read (Tasmania) and Oroko Swamp NZ “passed”. These are not new series: Mt Read (Tasmania) has been used since Mann et al 1999 and Jones et al 1998, while Oroko was considered in Mann and Jones 2003. Both were illustrated in AR4. Mann and Jones 2003 had rejected Oroko as not passing local correlation, but it “passes” Neukom and Gergis with flying colors. (The Oroko version needs to be parsed, because at least one version spliced instrumental data due to recent logging disturbance.) Not passing were three South American series, among them Rio Alerce, a series used in Mann et al 1998-99.

None of the “documentary” series cover the medieval period, but calibration of these series is idiosyncratic, to say the least. Nearly all of these series are direct measures of precipitation. SI Table 4 shows that these series end in the late 20th century, but a footnote to the table says that the 20th century portion of nine of the 19 series is projected, citing earlier publications of the same authors for the projection method.

The documentary record ends in the 19th or early 20th century and was extended to present using “pseudo documentaries” (see Neukom et al. 2009 and Neukom et al. 2013)

They “explain” their extrapolation as follows:

Some documentary records did not originally cover the 20th century (Supplementary Table 4). In order to be able to calibrate them, we extend them to present using the “pseudo documentary” approach described by Neukom et al. (2009; 2013). In this approach, the representative instrumental data for each record are degraded with white noise and then classified into the index categories of the documentary record in order to realistically mimic its statistical properties and not overweight the record in the multiproxy calibration process. The amount of noise to be added is determined based on the overlap correlations with the instrumental data. In order to avoid potential biases by using only one iteration of noise degrading, we create 1,000 “pseudo documentaries” for each record and randomly sample one realization for each ensemble member (see below, Section 2.2). All documentary records are listed in Supplementary Table 4.
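As a rough sketch of what such an extension might look like: degrade the instrumental series with white noise, then bin it into ordinal index categories. The category count, signal-to-noise ratio and quantile binning here are my assumptions; the actual procedure is described in Neukom et al. (2009; 2013).

```python
import numpy as np

def pseudo_documentary(instrumental, n_categories=5, snr=1.0, rng=None):
    """Sketch of a 'pseudo documentary': instrumental data degraded with
    white noise, then classified into documentary-style index categories.
    n_categories, snr and the quantile-based binning are assumptions."""
    rng = rng or np.random.default_rng()
    noise_sd = np.std(instrumental) / snr
    noisy = instrumental + rng.normal(0, noise_sd, len(instrumental))
    # Classify into ordinal index categories (here: quintiles of the noisy series).
    edges = np.quantile(noisy, np.linspace(0, 1, n_categories + 1)[1:-1])
    return np.digitize(noisy, edges)

rng = np.random.default_rng(3)
instr = rng.normal(size=50)    # stand-in for a 20th-century instrumental series
# 1,000 realizations; one is randomly sampled per ensemble member.
ensemble = [pseudo_documentary(instr, rng=rng) for _ in range(1000)]
one = ensemble[rng.integers(1000)]
```

The noisy, coarsely binned extension is then what gets calibrated, which is why the reported “local” correlations for these extended records deserve scrutiny.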

For Tucuman precipitation, one of the documentary records so extended, they report a “local” correlation (under the idiosyncratic methods of the article) of 0.43 and a correlation to SH temperature of 0.37. This was a higher correlation than all but one of the documentary indices achieved with actual 20th century data.

Comparison to PAGES2K
The present dataset is closely related to data used for the South American and Australasian regional PAGES2K reconstructions used in IPCC AR5. I previously discussed the PAGES2K South American reconstruction here, pointing out that it had used the Quelccaya O18 and accumulation data upside-down to the orientation employed by specialists and upside-down to Thompson’s own reports. I also discussed Neukom’s South American network in the context of the AR5 First Draft here.

Neukom et al 2014 is non-compliant with Wagenmakers’ anti-torture protocol on several important counts, including its unprecedented ex post screening and its reliance on the same proxies that have been used in multiple previous studies.

I have some work on SH proxies in inventory, some of which touches on both Neukom and Gergis 2014 and PAGES2K, and will try to write up some posts on the topic from time to time.

Climate Audit
