PAGES2K and Nature’s Policy against Self-Plagiarism

Nature’s policies on plagiarism state:

Duplicate publication, sometimes called self-plagiarism, occurs when an author reuses substantial parts of his or her own published work without providing the appropriate references.

The description of the Australasian network of PAGES2K (coauthors Gergis, Neukom, Phipps and Lorrey) is almost entirely lifted in verbatim or near-verbatim chunks from Gergis et al, 2012 (withdrawn and under re-review), in apparent violation of Nature’s policy against self-plagiarism.

The Copying of Text
In this section, I compare paragraphs from PAGES2K (abbreviated P2K below) with the corresponding paragraphs from Gergis et al 2012 (abbreviated G12 below). The authors of the PAGES2K Australasian section are listed as follows:

Australasia: J.G.[Gergis], A.M.L.[Andrew Lorrey], S.J.P. [Steven Phipps] & R.N.[Neukom] coordinated the study. R.N. & J.G. collated, managed and analysed the proxy data; R.N. & J.G. developed the reconstruction with input from S.J.P.

Gergis, Neukom and Phipps were all coauthors of Gergis et al 2012.

PAGES2K:

Australasia is herein defined as the land and ocean areas of the Indo-Pacific and Southern Oceans bounded by 110°E-180°E, 0°-50°S. Our instrumental target was calculated as the September-February (SONDJF) spatial mean of the HadCRUT3v 5°x5° monthly combined land and ocean temperature grid 9,54 for the Australasian domain over the 1900-2009 period.

G12:

Australasia is defined as the land and ocean areas of the Indo-Pacific and Southern Oceans bounded by 110°E-180°E, 0°-50°S. Our instrumental target was calculated as the September-February (SONDJF) spatial mean of the HadCRUT3v 5° x 5° monthly combined land and ocean temperature grid (Brohan et al., 2006; Rayner et al., 2006) for the Australasian domain over the 1900-2009 period.

PAGES2K:

Our temperature proxy network (Fig. S13) was drawn from a broader Australasian domain: 90°E-140°W, 10°N-80°S (details provided in Neukom and Gergis 46). This proxy network showed optimal response to Australasian temperatures over the SONDJF period, and contains the austral tree-ring growing season during the spring-summer months.

G12:

Our temperature proxy network was drawn from a broader Australasian domain (90°E-140°W, 10°N-80°S) containing 62 monthly-annually resolved climate proxies from approximately 50 sites (see details provided in Neukom and Gergis, 2011). This proxy network showed optimal response to Australasian temperatures over the SONDJF period, and contains the austral tree ring growing season during the spring-summer months.

P2K:

All data were linearly detrended over the 1921-1990 period and AR(1) autocorrelation was taken into account for the calculation of the degrees of freedom 55.

G12:

For predictor selection, both proxy climate and instrumental data were linearly detrended over the 1921-1990 period to avoid inflating the correlation coefficient due to the presence of the global warming signal present in the observed temperature record. Only records that were significantly (p<0.05) correlated with the detrended instrumental target over the 1921-1990 period were selected for analysis.

P2K:

We performed an ensemble ordinary least squares regression principal component reconstruction (PCR) analysis 53,57 using the 1921-1990 period for calibration and verification. Further description of the PCR method is provided by Luterbacher et al.58, and details of the extension of the ensemble approach are described below.

G12:

We performed an ensemble ordinary least squares regression Principal Component Reconstruction (PCR) analysis (Neukom et al., 2010; Gallant and Gergis, 2011; Gergis et al., 2012) using the 1921-1990 period for calibration and verification. Further description of the PCR method is provided by Luterbacher et al. (2002), and details of the extension of the ensemble approach are described below.

P2K:

To assess reconstruction uncertainty associated with proxy selection and calibration, a 3000-member ensemble of reconstructions was calculated creating a varying reconstruction setting for each realization by randomly:

  • Removing five predictors from the full predictor matrix. In the early part of the reconstruction (1000-1456 CE) where five or fewer proxies are available, the number of predictors used for each ensemble member varies between one and five.
  • Varying the percentage of total variance of the predictor matrix explained by the retained PCs between 60% and 90% by varying the number of PCs used
  • Selecting a calibration period of 35-50 (non successive) years between 1921-1990 and using the remaining 20-35 years for verification.
  • Scaling the weight of each proxy record in the PC analysis with a factor of 0.67 to 1.5.

G12:

To assess reconstruction uncertainty associated with proxy selection and calibration, a 3000-member ensemble of reconstructions was calculated creating varying reconstruction setting for each realisation by randomly:

  • Removing five predictors from the full predictor matrix. In the early part of the reconstruction (1000-1456) where five or fewer proxies are available, the number of predictors used for each ensemble member varies between one and five. The effect of varying the number of proxies to be removed is illustrated in Figures S2.4 and S2.5.
  • Varying the percentage of total variance of the predictor matrix explained by the retained PCs between 60% and 90% by varying the number of PCs used.
  • Selecting a calibration period of 35-50 (non successive) years between 1921-1990 and using the remaining 20-35 years for verification.
  • Scaling the weight of each proxy record in the PC analysis with a factor of 0.67 to 1.5. The effect of varying the weighting factor is illustrated in Figures S2.6 and S2.7.

P2K:

To avoid variance biases due to the decreasing number of predictors back in time, the reconstructions of each model were scaled to the variance of the instrumental target over the 1921-1990 period. The mean of the 3000-member ensemble was considered our “best estimate” temperature reconstruction. To assess low frequency changes in Australasian temperatures, the ensemble mean was smoothed using a 30-year loess filter, which effectively removes variations with periods shorter than 15 years.

G12:

To avoid variance biases due to the decreasing number of predictors back in time, the reconstructions of each model were scaled to the variance of the instrumental target over the 1921-1990 period. The mean of the 3,000-member ensemble was considered our “best estimate” temperature reconstruction. To assess low frequency changes in Australasian temperatures, the ensemble mean was smoothed using a 30-year loess filter (Figure 3), which effectively removes variations with periods shorter than 15 years.

P2K:

The ensemble PCR method allows us to quantify not only the traditional regression residual-based uncertainties referred to as “calibration error”59, but also the spread of the ensemble members generated from the random selection of the reconstruction parameters, described as the “ensemble error”. The reconstruction confidence interval was defined as the combined calibration and ensemble standard error (SE), calculated as SE=sqrt(sigma_res^2 +sigma_ens^2) with sigma_res denoting the standard deviation of the regression residuals and sigma_ens the standard deviation of the ensemble members. Uncertainties of the filtered curves were calculated the same way using the residuals of the filtered data and standard deviation between the filtered ensemble members.

G12:

The ensemble PCR method allows us to quantify not only the traditional regression residual-based uncertainties referred to as “calibration error” (e.g. Cook and Kairiukstis, 1990), but also the spread of the ensemble members generated from the random selection of the reconstruction parameters, described as the “ensemble error”. The reconstruction confidence interval was defined as the combined calibration and ensemble standard error (SE), calculated as SE=sqrt(sigma_res^2 +sigma_ens^2) with sigma_res denoting the standard deviation of the regression residuals and sigma_ens the standard deviation of the ensemble members. Uncertainties of the filtered curves were calculated the same way using the residuals of the filtered data and standard deviation between the filtered ensemble members.
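For readers unfamiliar with the formula that appears identically in both passages, the following is a minimal sketch of what it computes: the two error sources are combined in quadrature. The function name and the numbers are my own hypothetical illustration and are not taken from either paper.

```python
import math

def combined_se(sigma_res, sigma_ens):
    """Combine calibration and ensemble errors in quadrature:
    SE = sqrt(sigma_res^2 + sigma_ens^2)."""
    return math.sqrt(sigma_res ** 2 + sigma_ens ** 2)

# Hypothetical values (degrees C), chosen only for illustration.
sigma_res = 0.3  # standard deviation of the regression residuals
sigma_ens = 0.4  # standard deviation across ensemble members

se = combined_se(sigma_res, sigma_ens)
print(f"combined SE = {se:.2f}")
```

The combined SE is always at least as large as the larger of the two components, so adding the ensemble spread can only widen the confidence interval.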

P2K:

In addition to the 3000 verification tests incorporated into the 1921-1990 overlap period calculations, the ensemble mean was also further independently verified using withheld, early 1901-1920 data (“early verification”).

G12:

In addition to the 3000 verification tests incorporated into the 1921-1990 overlap period calculations, the ensemble mean was also further independently verified using withheld, early 1901-1920 data (‘early verification’).
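The degree of overlap in the paired passages above can be checked mechanically. The following is a rough sketch using Python's standard difflib module, comparing the passages word by word after stripping case and punctuation. The helper names are my own, and the similarity ratio is only one of several possible measures; it is not a formal plagiarism test.

```python
import difflib
import re

def normalize(text):
    """Lowercase the text and split it into word tokens, dropping punctuation."""
    return re.findall(r"[a-z0-9]+", text.lower())

def word_similarity(a, b):
    """Return difflib's similarity ratio (0 to 1) over the word sequences."""
    matcher = difflib.SequenceMatcher(None, normalize(a), normalize(b), autojunk=False)
    return matcher.ratio()

# The final pair of passages quoted above (quotation marks are ignored
# by the normalization step, so their style does not matter).
p2k = ('In addition to the 3000 verification tests incorporated into the '
       '1921-1990 overlap period calculations, the ensemble mean was also '
       'further independently verified using withheld, early 1901-1920 data '
       '("early verification").')
g12 = ("In addition to the 3000 verification tests incorporated into the "
       "1921-1990 overlap period calculations, the ensemble mean was also "
       "further independently verified using withheld, early 1901-1920 data "
       "('early verification').")

# Opening clauses of the first pair, which differ by the word "herein".
p2k_open = "Australasia is herein defined as the land and ocean areas"
g12_open = "Australasia is defined as the land and ocean areas"

print(word_similarity(p2k, g12))            # identical word sequences give 1.0
print(word_similarity(p2k_open, g12_open))  # slightly below 1.0
```

The same function can be applied to any of the other pairs by pasting in the full paragraphs.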

 
Status of Gergis et al 2012
Gergis et al 2012 was accepted by the Journal of Climate in May 2012. Its acceptance was announced by the University of Melbourne and the University of New South Wales (see the link to http://www.science.unsw.edu.au/news/1-000-years-of-climate-data-confirms-australia-s-warming/ here; UNSW has since removed its announcement).
The article was given prominent press coverage at the time, e.g. here:

SCIENTISTS have used natural records including tree rings and ice cores to reconstruct the first picture of Australia’s climate over the past 1000 years – and the news is we’re getting warmer. The study, published in the Journal of Climate, found there were no warmer periods than in the years after 1950 and temperatures have been increasingly decade on decade ever since.

and here

The study published recently in the Journal of Climate will form the Australasian region’s contribution to the 5th IPCC climate change assessment report chapter on past climate.

CA readers are undoubtedly aware of the contemporary criticism of the article at Climate Audit (see also the tag here). According to the journal, the article was subsequently “withdrawn” by the authors and a “new version of this manuscript has been submitted and is under review.”
In the University of Melbourne’s undated explanation, they stated that the article had been “published” on May 17, 2012, that an “issue was identified”, but did not report or concede that the article had been withdrawn, saying instead that it had been “re-submitted and reviewed again” and, somewhat inconsistently, that the revised manuscript was still “under review”.

An issue was identified in the manuscript “Evidence of unusual late 20th century warming from an Australasian temperature reconstruction spanning the last millennium” by Joelle Gergis, Raphael Neukom, Ailie Gallant and David Karoly, published in the Journal of Climate on 17 May 2012. The manuscript was re-submitted to the Journal of Climate and reviewed again. The authors have repeated the original analysis using three additional methods to assess the influence of different statistical techniques on the original results. The revised manuscript is still under review by the Journal of Climate.

A blog article at The Conversation in October 2012 also stated that the article had been “re-submitted to the Journal of Climate and is being reviewed again”.
Joelle Gergis’ website list of publications (see here) has throughout the past two years maintained the position that the article was “under review” at the Journal of Climate and still lists the article (now dated to 2014) as “under review”.

Discussion
Nature’s policy against self-plagiarism is very clear. It is also unequivocal that PAGES2K’s description of its Australasian network and methodology was lifted more or less verbatim from Gergis et al 2012.
While one can ponder the interesting issues potentially arising from self-plagiarism of retracted or withdrawn papers, I think that the most salient issue in the present case concerns the self-plagiarism obligations that apply when the earlier paper is “under review”, a status consistently asserted for Gergis et al. Authors regularly cite papers that are “under review”, and Nature’s policies (see here and here) specifically permit the citation of papers that have been submitted. The only way for Gergis, Neukom and Phipps to avoid self-plagiarism under Nature’s policies would have been to cite “Gergis et al, withdrawn and under re-review” or perhaps “Gergis et al, under review” (listing the coauthors in the bibliography).
Perhaps they were reluctant to do so because of concerns that the notoriety of the earlier article might delay the review of the PAGES2K article, already at the IPCC witching hour. Perhaps there was some other reason. But whatever their motive, the end result is surely the same: their wholesale lifting of verbatim and near-verbatim text from “Gergis et al, withdrawn and under re-review” violated the policy against self-plagiarism set out in Nature’s plagiarism policy.
