t-Statistics and the “Hockey Stick Index”

In MM05, we quantified the “hockeystick-ness” of a series as the difference between the 1902-1980 mean (the “short centering” period of Mannian principal components) and the overall mean (1400-1980), divided by the standard deviation – a measure that we termed its “Hockey Stick Index (HSI)”. The histograms of its distribution for 10,000 simulated networks (shown in MM05 Figure 2) were the primary diagnostic in MM05 for the bias in Mannian principal components. In our opinion, these histograms established the defectiveness of Mannian principal components beyond any cavil. Our attention therefore turned to the impact of that bias, where we observed that Mannian principal components misled Mann into thinking that the Graybill stripbark chronologies were the “dominant pattern of variance”, when they were actually a quirky and controversial set of proxies.
Nick Stokes recently challenged this measure as merely an “MM05 creation” as follows:

The HS index isn’t a natural law. It’s a M&M creation, and if I did re-orient, it would then fall to me to explain the index and what I was doing.

While we would be more than happy to be credited for the simple concept of dividing the difference of means by a standard deviation, such techniques have been used in the calculation of t-statistics for many years, as, for example, in the calculation of the t-statistic for the difference of means.   As soon as I wrote down this rebuttal, I realized that there was a blindingly obvious re-statement of what we were measuring through the MM05 “Hockey Stick Index” as the t-statistic for the difference in mean between the blade and the shaft.  It turned out that there was a monotonic relationship between the Hockey Stick Index and the t-statistic and that MM05 histogram results could be re-stated in terms of the t-statistic for the difference in means.
In particular, we could show that Mannian principal components produced series which had a “statistically significant” difference between the blade (1902-1980) and the shaft (1400-1901) “nearly always” (97% in 10% tails and 85% in 5% tails).  Perhaps I ought to have thought of this interpretation earlier, but, in my defence, many experienced and competent people have examined this material without thinking of the point either. So the time spent on ClimateBallers has not been totally wasted.
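The effect of short centering on a PC1 can be illustrated with a small simulation. The following is a rough sketch only: the AR(1) noise with coefficient 0.9, the 50-series networks and the 20 replicates are assumptions chosen for brevity and are not the MM05 noise model, the other MBH98 rescaling steps are omitted, and the function names are hypothetical.

```r
# Sketch under stated assumptions: AR(1) red noise (phi = 0.9), 50 series per network.
# Only the centering choice differs between the two PC1 calculations.
set.seed(42)
years <- 1400:1980
blade <- years >= 1902            # the 1902-1980 "short centering" period

# Hockey Stick Index: blade mean minus overall mean, divided by the overall sd
hsi <- function(x) (mean(x[blade]) - mean(x)) / sd(x)

pc1 <- function(X, short = FALSE) {
  # subtract either the 1902-1980 column means (short centering)
  # or the full-period column means (conventional centering)
  mu <- if (short) colMeans(X[blade, , drop = FALSE]) else colMeans(X)
  svd(sweep(X, 2, mu))$u[, 1]     # first left singular vector = PC1 time series
}

one_network <- function(n_series = 50, phi = 0.9) {
  X <- replicate(n_series,
                 as.numeric(arima.sim(list(ar = phi), n = length(years))))
  # the sign of a PC is arbitrary, so compare absolute values
  c(centered = abs(hsi(pc1(X))), short = abs(hsi(pc1(X, short = TRUE))))
}

res <- replicate(20, one_network())
rowMeans(res)                     # mean |HSI| of PC1 under each centering choice
```

Absolute values are used because the orientation of a principal component is arbitrary.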
 
t-Statistic for the Difference of Means 
The t-statistic for the difference in means between the blade (1902-1980) and the shaft (1400-1901) is likewise calculated as the difference in means divided by a standard error. A common formula pools the two subperiods: it takes the square root of the weighted average of the subperiod variances, each weighted by its degrees of freedom (length minus one). An expression tailored for the specific case is shown below:

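# x: annual time series (ts) covering 1400-1980; 78 = 79 - 1 and 501 = 502 - 1 are the subperiod degrees of freedom
se = sqrt( (78*sd(window(x, start=1902))^2 + 501*sd(window(x, end=1901))^2) / (581 - 2) )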
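For readers who want to reproduce the calculation, here is a minimal sketch that wraps the expression above in a function. The function name and the toy series (white noise with a one-unit step in the blade) are hypothetical. Note that, as written, the expression is the pooled standard deviation of the two subperiods; a textbook two-sample t-statistic would further scale it by sqrt(1/n1 + 1/n2). The sketch follows the expression as written.

```r
# Hypothetical helper; x is assumed to be an annual ts object covering 1400-1980.
blade_shaft_t <- function(x) {
  blade <- window(x, start = 1902)   # 1902-1980, 79 values
  shaft <- window(x, end = 1901)     # 1400-1901, 502 values
  n1 <- length(blade)
  n2 <- length(shaft)
  # pooled expression from the text: variances weighted by n - 1, divided by n1 + n2 - 2
  se <- sqrt(((n1 - 1) * sd(blade)^2 + (n2 - 1) * sd(shaft)^2) / (n1 + n2 - 2))
  (mean(blade) - mean(shaft)) / se
}

# Toy input (an assumption for illustration): white noise plus a one-unit step in the blade
set.seed(1)
x <- ts(rnorm(581) + c(rep(0, 502), rep(1, 79)), start = 1400)
blade_shaft_t(x)
```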

For the purposes of today’s analysis, I haven’t allowed for autocorrelation in the calculation of the t-statistic. (Allowing for autocorrelation will reduce the effective degrees of freedom and accentuate results, rather than mitigate them.)
Figure 1 below shows t-statistic histograms corresponding to the MM05 Figure 2 HSI histograms, but in a somewhat modified graphical style: I’ve overlaid the two histograms, showing centered PC1s in light grey and Mannian PC1s in medium grey. (Note that I’ve provided a larger version for easier reading – interested readers can click on the figure to embiggen.) The histograms are from a 1000-member subset of the MM05 networks and are therefore a little more ragged. I’ve also plotted a curve showing the t-distribution for df=180, a value which was calculated from one of the realizations. This curve is very insensitive to changes in degrees of freedom in this range and I therefore haven’t experimented further.
The separation of the distributions for Mannian and centered PC1s is equivalent to the separation shown in MM05 Figure 2, but re-statement using t-statistics permits more precise conclusions.

Figure 1. Histograms of t-statistic for the difference between the 1902-1980 and 1400-1901 means, showing centered PC1s (light grey) and Mannian PC1s (medium grey). The curve is a t-distribution (df=180). The red lines at ±1.65 and ±1.96 correspond to 90% and 95% two-sided t-tests.
The distribution of the simulated t-statistic for centered PC1s is similar to a high-df t-distribution, though it appears to be somewhat  overweighted to values near zero and underweighted on the tails: there are approximately half the values in the 5% and 10% tails that one would expect from the t-distribution.  At present, I haven’t thought through potential implications.
The distribution of the simulated t-statistic for Mannian PC1s bears no relationship to the expected t-distribution. Values are concentrated in the tails: 85% of t-statistics for Mannian PC1s are in the 5% tails (and nearly 97% in the 10% tails). This is what was shown in MM05 and it’s hard to understand why ClimateBallers contest this.
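The tail percentages quoted above are simple counts of statistics beyond the two-sided cutoffs. A minimal sketch of that tabulation follows; the function name is hypothetical, and the input is a stand-in drawn from a t-distribution rather than the simulated PC1 statistics.

```r
# Fraction of statistics lying in the two-sided tails beyond each cutoff.
tail_fraction <- function(tstats, cutoffs = c("10%" = 1.65, "5%" = 1.96)) {
  sapply(cutoffs, function(k) mean(abs(tstats) > k))
}

# Stand-in input (an assumption): draws from a t-distribution with df = 180,
# used here in place of the t-statistics of simulated PC1s.
set.seed(2)
demo <- rt(10000, df = 180)
tail_fraction(demo)                  # observed tail fractions for the stand-in data
2 * pt(-c(1.65, 1.96), df = 180)     # nominal tail mass under the reference curve
```

Applied to a vector of t-statistics from simulated PC1s, the same function would return the fractions to compare against the nominal 10% and 5%.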
What This Means
The result is that Mannian PC1s “nearly always” (97% in 10% tails and 85% in 5% tails) produce series which have a “statistically significant” difference between the blade (1902-1980) and the shaft (1400-1901).   If you are trying to do a meaningful analysis of whether there actually is a statistically meaningful difference between the 20th century and prior periods, it is impossible to contemplate a worse method and you have to go about it a different way.   Fabrications by ClimateBallers, such as false claims that MM05 Figure 2 histograms were calculated from only 100 cherrypicked series, do not change this fact.
The comparison of the Mannian PC histogram to a conventional t-distribution curve also reinforces the degree to which the Mannian PCs are in the extreme tails of the t-distribution. As noted above (and see Appendix), the t-stat is monotonically related to the HSI: rather than discussing the median HSI of 1.62, we can observe that the median t-stat for Mannian PC1s is 2.44, a value which is at the 99.2nd percentile of the t-distribution. Even median Mannian PC1s are far into the right tail. The top-percentile Mannian PC1s illustrated in Wegman’s Figure 4.4 correspond to a t-statistic of approximately 3.49, which is at the 99.97th percentile of the t-distribution. While there is some difference in visual HS-ness, contrary to Stokes, both median and top-percentile Mannian PC1s have very strong HS appearance.
Stokes is presently attempting to argue that the bias in the Mannian PC1 is mitigated in the overall representation of the network by accommodation in lower order PCs. However, Stokes has a poor grasp on the method as a whole and almost zero grasp of the properties of the proxies. When the biased PC method is combined with regression against 20th century trends, the spurious Mannian PC1s will be highly weighted. In our 2005 simulations of RE statistics (MM05-GRL, amended in MM05 (Reply to Huybers), the Reply containing new material), we showed that Mannian PC1s combined with networks of white noise yielded RE distributions that were completely different from those used in MBH98 and WA benchmarking. (WA acknowledged the problem, but shut their eyes.)
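The percentiles quoted above can be checked against the reference curve. A minimal sketch in R, assuming the same df = 180 used for the curve in Figure 1:

```r
df <- 180                   # degrees of freedom of the reference t-distribution
p <- pt(c(2.44, 3.49), df)  # cumulative probabilities for the median and top-percentile t-stats
round(p, 4)                 # roughly 0.992 and 0.9997
```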
Nor, as I’ve repeatedly stated, did we argue that the MBH hockeystick arose from red noise: we observed that the powerful HS-data mining algorithm (Mannian principal components) placed the Graybill stripbark chronologies into the PC1 and misled Mann into thinking that they were the “dominant pattern of variance”.  If they are not the “dominant pattern of variance” and merely a problematic lower order PC, then the premise of MBH98 no longer holds.
 
Appendix

Figure 2 below compares the t-statistic for the difference between the means of the blade (1902-1980) and the shaft (1400-1901) against the HSI as defined in MM05-GRL. The relationship is monotonic and non-linear, with the value of the t-statistic being closely approximated by a simple quadratic expression in HSI. The diagonal lines show where both values are equal. The HSI and t-statistic are approximately equal for HSI with absolute values less than ~0.7. Values in this range are very common for centered PC1s but non-existent for Mannian PC1s, a point made in MM05.
The vertical red lines show HSI values of 1 and 1.5 (both signs); the horizontal dotted lines show t-values of 1.65 and 1.96, both common benchmarks in statistical testing (1.65 is the 95th percentile one-sided; 1.96 is the 95% two-sided threshold, i.e. the 97.5th percentile one-sided). HSI values exceeding 1.5 have t-values well in excess of 2.
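The monotonic relationship can also be written down directly. The following is a sketch, assuming the pooled expression for se given in the main text and sample standard deviations with n-1 denominators. The symbols are introduced here for illustration: n_1 = 79 and n_2 = 502 are the blade and shaft lengths, N = 581, h is the HSI, s is the full-period standard deviation, s_1 and s_2 are the subperiod standard deviations, and d is the blade mean minus the shaft mean.

```latex
\begin{aligned}
\bar{x}_1 - \bar{x} &= \frac{n_2}{N}\,(\bar{x}_1 - \bar{x}_2)
  \quad\Longrightarrow\quad d = \bar{x}_1 - \bar{x}_2 = \frac{N}{n_2}\, h\, s, \\
(N-1)\,s^2 &= (n_1-1)\,s_1^2 + (n_2-1)\,s_2^2 + \frac{n_1 n_2}{N}\, d^2, \\
t = \frac{d}{se} &= \frac{N}{n_2}\;
  \frac{h\,\sqrt{N-2}}{\sqrt{(N-1) - \dfrac{n_1 N}{n_2}\, h^2}} .
\end{aligned}
```

Since the numerator grows and the denominator shrinks as |h| increases, t is monotonic in HSI. With h = 1.62 the expression gives t of about 2.45, close to the median of 2.44 quoted in the main text.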
 

Figure 2.  Plot of t-statistic for the difference in means of the blade (1902-1980) and the shaft (1400-1901) against the HSI as defined in MM05-GRL for centered PC1s (left) and Mannian PC1s (right). It shows a monotonic, non-linear relationship.  The two curves have exactly the same trajectories when overplotted, though values for the centered PCs are typically (absolute value) less than about 0.7 HSI, whereas values for Mannian PCs are bounded away from zero.
 
