Rahmstorf’s Second Trick

The Rahmstorf et al reconstruction commences in AD900 even though the Mann et al 2009 reconstruction goes back to AD500.  Once again, this raises the obvious question: why didn’t Rahmstorf show values before AD900?  Are these results adverse to his claims? Once the question is posed, you can guess the answer. 
First, here is a slightly annotated version of the Rahmstorf 2015 pseudo-AMOC index, supposedly calculated as the difference between the Mann et al 2008 NH reconstruction and the Mann et al 2009 gyre reconstruction. I did my own calculation of this series and overplotted it onto Rahmstorf 2015 Figure 3b, but it plotted ~0.15 deg C too high. For the overplot below, I therefore subtracted 0.15 deg C from my calculation, after which it matched more or less exactly up to ~1850, the start of the instrumental period. I’m not sure why the differences arise after 1850. I’ll discuss one theory at the end of the post. As in yesterday’s post, post-1995 values using M09 gridded data go up (strong blue below), whereas Rahmstorf shows a decline.

Figure 1. Annotated version of Rahmstorf et al 2015 Figure 3b showing pseudo-AMOC index. Overplotted is calculated difference between the Mann et al 2008 NH reconstruction and the Mann et al 2009 gyre reconstruction. 1850 and 1995 are marked by vertical lines.
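
The difference calculation described above can be sketched in a few lines. This is only an illustration: the function name, the dictionary inputs and the toy values are hypothetical, and the sign convention (gyre minus NH) is an assumption, since the text only says "difference". The 0.15 deg C offset is the adjustment mentioned above.

```python
# Minimal sketch of a pseudo-AMOC index as a difference of two series.
# Inputs are hypothetical dicts mapping year -> temperature anomaly (deg C).
# Sign convention (gyre minus NH) is an assumption, not taken from the post.

def pseudo_amoc_index(gyre, nh, offset=0.0):
    """Return {year: gyre - nh - offset} for years present in both series."""
    common_years = sorted(set(gyre) & set(nh))
    return {yr: gyre[yr] - nh[yr] - offset for yr in common_years}


# Toy placeholder values, NOT archived reconstruction data.
gyre_toy = {1000: 0.10, 1001: 0.05, 1002: -0.20}
nh_toy = {1000: -0.10, 1001: -0.15, 1002: -0.05, 1003: 0.00}

index = pseudo_amoc_index(gyre_toy, nh_toy, offset=0.15)
for year, value in index.items():
    print(year, round(value, 2))
```

Restricting the result to years common to both series matters here because the two archived reconstructions need not cover the same period.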
In the next figure, I’ve plotted both the pseudo-AMOC index using Mann et al 2009 gridded data (thick blue), shown for the 500-2006 period included in the archive, and the corresponding pseudo-AMOC index using the Mann et al 2008 NH reconstruction as in the previous figure (shown as thin blue). Either way, there is a dramatic step change in the pseudo-AMOC reconstruction at AD900, with values prior to AD900 being comparable to the supposedly alarming late 20th century values. Perhaps Rahmstorf’s decision not to show values prior to AD900 has an innocent and unrelated explanation, but unfortunately Rahmstorf and coauthors did not provide it.

Figure 2. Annotated variation of Rahmstorf et al 2015 Figure 3b showing pseudo-AMOC index. Overplotted is the calculated difference between the Mann et al 2009 NH reconstruction and the Mann et al 2009 gyre reconstruction. In my plot of the pseudo-AMOC index using Mann et al 2009 gridded data, I removed the 0.15 deg C bodge discussed in the previous figure. The standard deviation increases dramatically after AD1600 (dotted red line): Mann et al 2009 stated that only two climate “fields” (principal components) are used in the gridded reconstruction prior to AD1600, and this presumably has something to do with it.
AD900 in Rahmstorf et al 2015
Rahmstorf et al didn’t show this dramatic change in behavior at AD900. This raises several questions: (1) what accounts for the large step change? (2) why wasn’t it shown or discussed by Rahmstorf et al? (3) Rahmstorf et al were obviously aware of the pre-AD900 behaviour and one presumes that they would have some fine print rationalizing their failure to show the data: what was it?
I’ll first look at the Rahmstorf text that touches on the AD900 start and how they purported to rationalize not showing pre-AD900 results.
First, Rahmstorf et al stated that Mann et al 2008 provided a “skilful” NH EIV reconstruction “back to AD900 and beyond”.  This is a little coy, since Mann et al 2008 claimed a skilful EIV NH reconstruction back to AD300 – see its Figure 3 – and obviously doesn’t explain the “late” AD900 start.

For the Northern Hemisphere mean, Mann et al. [12 – 2008] produced reconstructions using two different methods, composite-plus-scale (CPS) and errors in variables (EIV). Here we use the land-and-ocean reconstruction with the EIV method using all the available proxies, which is the reconstruction for which the best validation results were achieved (see Supplementary Methods of Mann et al. [12 – 2008]). Based on standard validation scores (Reduction of Error and Coefficient of Efficiency), this series provides a skilful reconstruction back to AD 900 and beyond (95% significance compared to a red-noise null).

Next, Rahmstorf et al said that the gyre reconstruction was “skilful” back to AD900, without saying anything about earlier reconstructions:

The subpolar gyre falls within the region where the individual grid-box reconstructions are assessed to be skilful compared to a red-noise null [13 – Mann et al 2009]. In addition, we performed validation testing of the subpolar-gyre mean series, which indicates a skilful reconstruction back to AD 900 (95% significance compared to a red-noise null; see Supplementary Information for details).

The Supplementary Information then stated that they carried out tests on the steps from AD900 onward, i.e. the networks used in the “composite reconstruction”. Again, it doesn’t say anything about steps prior to that.

To validate the proxy reconstructions of temperature we use standard techniques developed during the past two decades in the paleoclimate community. Validation of the subpolar gyre temperature reconstruction was performed on each proxy network used in the composite reconstruction (900AD, 1400AD, 1500AD, 1600AD, 1700AD, 1800AD). (See Mann et al. 2009 for details on the selection of proxy networks used in the composite reconstruction.)

These are the only references to the AD900 issue that I located in the article. Obviously none of them reports the AD900 step or provides a definitive explanation for not showing pre-AD900 values.
RegEM Step Changes
The AD900 step change arises from a fundamental instability in RegEM methodology that has not been reported by any publicly funded academic in peer reviewed literature, but has been discussed from time to time at CA.
Jean S and UC were the first to notice the pathology, providing the example shown below in March 2009 here, observing:

Some 0.6 C change due to one added proxy. Weight of the curtis-proxy increases quite a lot, and there are many sign changes.

One sees the above example in Mann et al 2008 Figure S6 (shown below): its EIV NH reconstruction using screened proxies (shown in magenta) has a similar step change at AD600. In the diagram below, the magenta reconstruction begins in AD400. This means that the reconstruction prior to the step change passes Mannian verification just as the reconstruction after the step change does, so one cannot assume that only one of two such reconstruction variations can pass Mannian verification.

Analysis of Mannian RegEM methodology at CA pinpointed the dramatic step changes to changes in the sign (and weights) of proxies that sometimes arise from the addition of a single nondescript proxy. The effect is bizarre and undermines one’s willingness to credit RegEM, which may be one of the reasons why the results were not shown by Rahmstorf.
According to weird Mannian rules, sometimes the addition of a single nondescript series can change the number of retained regularization parameters. In a change from one to two or two to three regularization parameters, the weights assigned to individual proxies in the RegEM calculation can change dramatically and, even more importantly, change sign. This is nowhere discussed in peer reviewed literature by publicly funded academics, but is the case nonetheless.
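
The sign-change effect can be illustrated with a toy calculation. The sketch below is NOT RegEM and not Mann's code: it swaps in ordinary truncated principal-component regression on two correlated, standardized proxies as a stand-in for a truncation-type regularization, and every number in it is invented for illustration. The function name `truncated_weights` is hypothetical.

```python
import math

# Toy stand-in for a truncation-type regularization: regress a target on two
# standardized proxies, keeping only the k leading eigen-directions of the
# proxy correlation matrix. All values are invented for illustration.

def truncated_weights(r, c, k):
    """Proxy weights when k eigen-directions are retained.

    r : correlation between the two proxies (0 < r < 1)
    c : (c1, c2), correlation of each proxy with the target
    k : number of retained directions (1 or 2)
    """
    s = 1.0 / math.sqrt(2.0)
    # Eigenpairs of [[1, r], [r, 1]], in decreasing eigenvalue order for r > 0.
    modes = [(1.0 + r, (s, s)), (1.0 - r, (s, -s))]
    weights = [0.0, 0.0]
    for eigenvalue, vec in modes[:k]:
        projection = vec[0] * c[0] + vec[1] * c[1]
        weights[0] += vec[0] * projection / eigenvalue
        weights[1] += vec[1] * projection / eigenvalue
    return tuple(weights)


w_one = truncated_weights(0.9, (0.5, 0.3), k=1)
w_two = truncated_weights(0.9, (0.5, 0.3), k=2)
print("one retained direction: ", [round(w, 3) for w in w_one])
print("two retained directions:", [round(w, 3) for w in w_two])
```

With one retained direction both proxies receive the same positive weight; retaining the second direction changes the magnitudes sharply and makes the second proxy's weight negative. The mechanism in an actual RegEM calculation is more involved, but the toy case shows how a change in the number of retained parameters alone can flip a proxy's sign.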
I haven’t parsed the particular AD900 step change in Rahmstorf et al 2015, but the pathology is instantly recognizable for people with mathematical understanding of the method, a group that does not appear to include any of the coauthors of Rahmstorf et al. It is possible that reconstructions using AD800 and earlier networks do not pass Mannian validation, but this is not a given: in the Mann et al 2008 example shown above, the reconstructions both before and after the step change pass Mannian verification criteria, or they would not have been shown. One therefore cannot assume that the pre-AD900 gyre reconstruction would fail those criteria.
ARMA(1,1) Modeling
Rahmstorf et al describe their statistical test setup as follows:

The annually resolved AMOC reconstruction from 900 to 1850 formed the basis for an ARMA(1,1) model which closely resembles the statistical properties of the data.

Right away, one can see a couple of obvious defects of this procedure. The number of temperature principal components (“climate fields”) used to represent the gridded data changes dramatically over time. Mann et al 2009 stated that only two principal components are used prior to AD1600.

Before 1600 C. E., the low-frequency component of the surface temperature reconstructions is described as a linear combination of just two leading patterns of temporal variation, so that regional features in the temperature field are represented by a spatiotemporally filtered approximation.

Since the subpolar gyre is a relatively fine detail of global climate, there is no possibility of it being distinguished in the two-PC reconstruction prior to AD1600, making comparisons before and after AD1600 rather pointless. The AD1600 breakpoint can be clearly seen in the standard deviation of the gyre reconstruction and the pseudo-AMOC reconstruction, as the post-AD1600 series have much larger standard deviations. In addition, the underlying Mannian proxy data has been so heavily smoothed that it’s hard to say what an annual ARMA(1,1) model really means.
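
The standard deviation comparison across the AD1600 breakpoint can be sketched as follows. This is an illustration on synthetic data only: the ARMA(1,1) parameters, the factor-of-two variance break and the function names are all assumptions chosen for the example, not values estimated from the Mann et al 2009 archive or taken from Rahmstorf et al.

```python
import random
import statistics

# Synthetic illustration only: parameters and the variance break are invented.

def simulate_arma11(n, phi, theta, sigma, seed=0):
    """Simulate x[t] = phi*x[t-1] + e[t] + theta*e[t-1] with Gaussian noise e."""
    rng = random.Random(seed)
    series, x_prev, e_prev = [], 0.0, 0.0
    for _ in range(n):
        e = rng.gauss(0.0, sigma)
        x_t = phi * x_prev + e + theta * e_prev
        series.append(x_t)
        x_prev, e_prev = x_t, e
    return series


def sd_ratio(series, years, breakpoint):
    """Sample standard deviation after the breakpoint divided by that before."""
    before = [v for y, v in zip(years, series) if y < breakpoint]
    after = [v for y, v in zip(years, series) if y >= breakpoint]
    return statistics.stdev(after) / statistics.stdev(before)


years = list(range(900, 1851))
homogeneous = simulate_arma11(len(years), phi=0.5, theta=0.2, sigma=0.1)
# Same series with an imposed break: values from 1600 on are scaled by 2.
with_break = [v * (2.0 if y >= 1600 else 1.0) for y, v in zip(years, homogeneous)]

print("homogeneous ARMA(1,1):", round(sd_ratio(homogeneous, years, 1600), 2))
print("with variance break:  ", round(sd_ratio(with_break, years, 1600), 2))
```

A single stationary ARMA(1,1) process has one variance throughout, so its before/after ratio stays near 1. A series whose ratio departs substantially from 1 at a known methodological breakpoint is poorly described by one such model fitted across the whole 900-1850 span.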
Given that there’s no way that a reconstruction using contaminated Finnish sediments, stripbark bristlecone chronologies, truncated MXD series and nondescript tree ring chronologies can rise above phrenology and have actual significance, this implies that Mann and Rahmstorf have erroneously calculated the benchmarks for their calculation of statistical significance. While Rahmstorf claimed that their validation methods were “standard”, Ross and I sharply criticized the related MBH98 approach to verification – a criticism that has not been rebutted in the “literature”. The only commentary thus far was the Texas sharpshooting of Wahl and Ammann, which fell far short of being a rebuttal.
