Decomposing Paico

In today’s post, Jean S and I are going to show that the paico reconstruction, as implemented in the present algorithm, is very closely approximated by a weighted average of the proxies, in which the weights are proportional to the number of measurements.  Paico is a methodology introduced in Hanhijarvi et al 2013 (pdf here) and applied in PAGES2K (2013). It was discussed in several previous CA posts.
We are able to show this because we noticed that the contribution of each proxy to the final reconstruction can be closely estimated by half the difference between the reconstruction and a reconstruction in which that series is flipped over, one series at a time. This sounds trivial, but it isn’t: the decomposition has some surprising properties. In particular, the standard deviations of the contributions from each proxy vary by an order of magnitude, but in a way that has an interesting explanation. The method would not work for algorithms which ignore knowledge of the orientation of the proxy, i.e. ones where it supposedly doesn’t “matter” whether the proxy is used upside down or not. We presume that this decomposition technique is very familiar in other fields. The following post is the result of this joint work.
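The flip-and-difference idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `toy_recon` is a hypothetical fixed-weight linear stand-in, not paico, and "flipping" is assumed to mean reversing the sign of an already centered series. The names `flip_column`, `flip_components` and `TOY_WEIGHTS` are ours. In the linear case the decomposition is exact; for paico it is only a close approximation.

```python
from typing import Callable, List

Matrix = List[List[float]]  # rows are years, columns are proxies


def flip_column(proxies: Matrix, j: int) -> Matrix:
    """Return a copy of `proxies` with proxy j turned upside down (sign reversed)."""
    return [[-v if k == j else v for k, v in enumerate(row)] for row in proxies]


def flip_components(recon: Callable[[Matrix], List[float]], proxies: Matrix) -> Matrix:
    """Estimate each proxy's contribution as half the difference between the
    baseline reconstruction and the reconstruction with that proxy flipped."""
    base = recon(proxies)
    components = []
    for j in range(len(proxies[0])):
        flipped = recon(flip_column(proxies, j))
        components.append([0.5 * (b - f) for b, f in zip(base, flipped)])
    return components  # indexed as components[proxy][year]


# Hypothetical stand-in for the reconstruction: a fixed-weight linear combination.
# This is NOT paico; it only shows that the decomposition is exact when the
# reconstruction is linear in the proxies.
TOY_WEIGHTS = [0.5, 0.3, 0.2]


def toy_recon(proxies: Matrix) -> List[float]:
    return [sum(w * v for w, v in zip(TOY_WEIGHTS, row)) for row in proxies]


# Made-up data: 3 years by 3 proxies.
toy_proxies = [[1.0, -2.0, 0.5], [0.0, 1.0, -1.0], [2.0, 0.5, 1.5]]
components = flip_components(toy_recon, toy_proxies)

# Summing the components over proxies re-assembles the reconstruction.
reassembled = [sum(c[t] for c in components) for t in range(len(toy_proxies))]
```

For a nonlinear method, the gap between `reassembled` and the baseline reconstruction is a rough measure of how far the method departs from a simple weighted sum.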
The Re-assembled Reconstruction
Before showing the decomposition, here is a comparison of the H13 pre-calibration reconstruction and the simple sum of 27 components, each calculated as half the difference between the H13 reconstruction and the reconstruction with one proxy inverted, taking each proxy in turn. They are not exactly the same (the correlation is 0.998) and the standard deviations differ by about 5%.

Figure 1. Pre-calibration Hanhijarvi reconstruction and as re-assembled from components.
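The two summary statistics quoted above (correlation and relative standard deviation) are easy to compute once both series are in hand. The sketch below uses made-up numbers, not the H13 output; `pearson` and `compare_series` are illustrative names of our own.

```python
import math
from statistics import mean, pstdev
from typing import Dict, Sequence


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient between two equal-length series."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def compare_series(recon: Sequence[float], reassembled: Sequence[float]) -> Dict[str, float]:
    """Correlation between the two series and the ratio of their standard deviations."""
    return {
        "correlation": pearson(recon, reassembled),
        "sd_ratio": pstdev(reassembled) / pstdev(recon),
    }


# Made-up example: the re-assembled series is the reconstruction inflated by 5%.
toy_recon = [0.1, 0.4, -0.2, 0.3, -0.1]
toy_reassembled = [1.05 * v for v in toy_recon]
summary = compare_series(toy_recon, toy_reassembled)
```

A correlation near 1 with an `sd_ratio` away from 1 indicates that the two series share their shape but differ in scale, which is the pattern reported for Figure 1.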

Standard Deviation of Components
Somewhat unexpectedly (and this appears to be a peculiarity of how Hanhijarvi et al implemented their algorithm), the standard deviation of the components varied by more than an order of magnitude. (These standard deviations are pretty much equivalent to weights, since each component is related to the underlying proxy through a transformation that is “mostly” monotonic.) The graph below shows a barplot of the standard deviations of each component in the decomposition.

Figure 2. Standard deviation of individual proxy components in decomposition.
The series on the left in Figure 2 above are the series with the most measurements and the series on the right are the ones with the least. The next figure shows the component standard deviation (weight) against the simple number of measurements for that proxy. The very low weights assigned to low-frequency proxies appear to be a result of how Hanhijarvi et al implemented paico, rather than an intrinsic property of paico. Having said that, having looked at an absurd number of individual proxies over the last number of years, I am disinclined to overweight low-frequency proxies. For present purposes, we are merely diagnosing the properties of an opaque methodology, not recommending a solution. The relationship between weight and count has a slight quadratic shape, but, for practical purposes, the method is closely approximated even by weights calculated as a linear function of the number of measurements.

Figure 3. Scatter plot of component standard deviation against number of measurements for each proxy.
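The weight-versus-count relationship in Figure 3 can be checked with an ordinary least-squares line. The sketch below assumes that "number of measurements" means the count of non-missing values in a series (missing years are represented here as `None`). The counts and standard deviations are made up for illustration and are not the values behind the figure.

```python
from statistics import mean
from typing import List, Optional, Sequence, Tuple


def count_measurements(series: Sequence[Optional[float]]) -> int:
    """Number of non-missing values in a proxy series (None marks a missing year)."""
    return sum(v is not None for v in series)


def fit_line(x: Sequence[float], y: Sequence[float]) -> Tuple[float, float]:
    """Ordinary least-squares fit y = slope * x + intercept."""
    mx, my = mean(x), mean(y)
    slope = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sum((a - mx) ** 2 for a in x)
    return slope, my - slope * mx


# Made-up values: measurement counts and component standard deviations
# for four hypothetical proxies, from annual resolution to very coarse.
toy_counts: List[float] = [2000.0, 1000.0, 200.0, 20.0]
toy_component_sd: List[float] = [0.20, 0.10, 0.02, 0.002]

slope, intercept = fit_line(toy_counts, toy_component_sd)

# Residuals from the line; systematic curvature here would correspond to the
# slight quadratic shape mentioned in the text.
residuals = [sd - (slope * n + intercept) for n, sd in zip(toy_counts, toy_component_sd)]
```

With real components, a residual pattern that is positive at both ends and negative in the middle (or the reverse) would point to the quadratic term mattering more than assumed here.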
Components
The next figure shows an example comparison for a single proxy between the underlying proxy (top panel) and the component calculated as half the difference between the H13 reconstruction and the reconstruction with the inverted proxy.  The connection between the component  and the underlying series is evident.  But just when you’re about to conclude that it is a sort-of damped version of the underlying proxy, it throws off a uniquely low value in 1997.  Two other series (Grudd2008,  Helama2010) have offsetting high values, so it might be something to do with that.
Figure 4. Comparison for a single proxy between the underlying proxy (top panel) and the component calculated as half the difference between the H13 reconstruction and the reconstruction with the inverted proxy.
The next figure plots the component contribution against the original proxy value for this example.  Mostly the component contribution is derived sort-of sigmoidally from the original measurement, but, as noted above, there are some puzzling oddballs.
Figure 5. Component contribution against the original proxy value for this example.
Approximation by Weighted Average
In this last figure, I’ve calculated a weighted average of the underlying proxies, weighted by the number of measurements, and compared the resulting reconstruction to the precalibrated paico reconstruction.   The correspondence is extremely high, other than at the very end, where the population of proxies drops sharply, apparently leading to odd effects.
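A count-weighted average of this kind might look like the following sketch. Several things here are assumptions rather than a description of the calculation behind the figure: the proxies are taken to be already standardized and correctly oriented, missing years are marked with `None`, and in each year the weights are renormalized over the proxies that have a value. The function name and the toy data are ours.

```python
from typing import Dict, List, Optional

Series = List[Optional[float]]


def count_weighted_average(proxies: Dict[str, Series]) -> List[Optional[float]]:
    """Average the proxies year by year, weighting each proxy by its total
    number of non-missing measurements. Years with no data return None."""
    counts = {name: sum(v is not None for v in s) for name, s in proxies.items()}
    n_years = len(next(iter(proxies.values())))
    result: List[Optional[float]] = []
    for t in range(n_years):
        num = 0.0
        den = 0.0
        for name, series in proxies.items():
            value = series[t]
            if value is not None:
                num += counts[name] * value
                den += counts[name]
        result.append(num / den if den > 0 else None)
    return result


# Made-up example: one annually resolved proxy and one coarse proxy
# with a single measurement.
toy = {
    "annual": [1.0, 2.0, 3.0, 4.0],
    "coarse": [None, 4.0, None, None],
}
weighted = count_weighted_average(toy)
```

In the toy case the coarse proxy carries one fifth of the weight in the only year where it has a value, which mirrors the point that low-resolution series contribute little. When few proxies remain in a year, the renormalized weights can shift sharply, which may be related to the odd behaviour at the end of the record.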

Discussion
One of the many curiosities of the 1000-year paleoclimate field is its practitioners’ fascination with complicated and relatively opaque multivariate methods and their willingness to rely on relatively unproven methods for important empirical results. The paico methodology, whatever its merits, was about 1 minute old when adopted into the PAGES2K report and thence into IPCC AR5. I doubt that the PAGES2K authors intended to weight the proxies in their reconstruction by the number of measurements, but that’s what they did.
As a result, the low-resolution proxies, even the Igaliku series that I’ve severely criticized, actually do not matter very much to the final reconstruction. The main contributions to the final reconstruction come from Kaufman’s problematic varve series, tree rings (especially series from Briffa and Jacoby-D’Arrigo) and ice cores.
It therefore looks to me like the primary contributions to the large change from Arctic2K-2013 to Arctic2K-2014 come from inverting Hvitarvatn. Secondary contributions come from extending Lomonosovfonna, truncating Blue Lake and switching to a University of East Anglia version of Tornetrask. The weights of several tree ring series will also be reduced because they removed early portions with fewer than 10 cores in the 2014 version: this reduced the number of measurements and thus the weight of these series. Odd, but that’s the effect of this analysis.
While we’ve only applied this decomposition to the paico method, it strikes me that it would be useful for decomposing and analysing other opaque recent methodologies (LNA etc.). As noted above, the method won’t work with methods that ignore the sign of the proxy (as with some Mannian variations), but it will be interesting to try it elsewhere. The method seems pretty obvious and both Jean S and I presume that it is amply documented in other fields. However, we presume that the PAGES2K group did not consider this form of analysis or else the close relationship of the weights of the proxies to proxy counts would have been reported by now.
Finally, whenever I look at proxy reconstructions, I think that it is extremely important to examine the weights of each proxy. One of the uncredited accomplishments of McIntyre and McKitrick 2003 was its observation that weights of individual proxies could be extracted since MBH methods were all linear, subsequently discussed in several CA posts.   (At the time, academic literature on the topic had taken the position that weights could not be extracted.)   Whenever one of these opaque new methods is proposed, I think that the authors should consider the relationship between their method and proxy weights – at least to an approximation.
