Deflategate: Controversy is due to Scientist Error

I’ve submitted an article entitled “New Light on Deflategate: Critical Technical Errors” (pdf) to the Journal of Sports Analytics. It identifies and analyzes a previously unnoticed scientific error in the technical analysis included in the Wells Report on Deflategate. The article shows precisely how the “unexplained” deflation occurred prior to Anderson’s measurement and disproves the possibility of post-measurement tampering. At present, there is insufficient information to determine whether the scientific error arose because the law firm responsible for the investigation (Paul, Weiss) omitted essential information in their instructions to their technical consultants (Exponent) or whether the technical consultants failed to incorporate all relevant information in their analysis. In either event, the error was missed by the NFL consultant Daniel Marlow of the Princeton University Department of Physics, by the authors of the Wells Report and by the NFL.
 
 

Background
Much public commentary about the Deflategate controversy has been about the Ideal Gas Law. Contrary to many misconceptions, Exponent fully accounted for deflation according to the Ideal Gas Law. However, they observed that there was “additional” loss of pressure in Patriot balls for which they could identify “no set of credible environmental or physical factors”. The Wells Report said that this “tends to support a finding that human intervention may account for the additional loss of pressure”:

This absence of a credible scientific explanation for the Patriots halftime measurements tends to support a finding that human intervention may account for the additional loss of pressure exhibited by the Patriots balls.

While Exponent did not expressly quantify the additional pressure loss (a very peculiar omission), it was approximately 0.35 psi, as compared to an observed pressure loss of approximately 1.4 psi due to changes in temperature and the balls becoming wet, offset by slight warming during intermission. Drawing on information in the Exponent Report (see article for details), the figure below compares, for Patriot balls (left two columns) and Colt balls (right two columns), estimates of the impacts of cooling (Ideal Gas Law) and wet footballs to the deflation observed at the intermission plus an allowance for warming during intermission. Information for Colt balls reconciles almost exactly, but there is a discrepancy of about 0.38 psi for Patriot balls. This discrepancy is almost exactly equal to the bias of referee Anderson’s Logo Gauge (orange), a coincidence that should alarm any analyst of this data (including Exponent, Marlow and Wells).

Figure 1. Reconciliation of Patriot and Colt pressure drops. In the right column of each pair, estimated warming through the intermission is added to the observed pressure drop to estimate the pressure drop at the start of the half-time intermission. In the left column of each pair are shown the pressure drop for dry balls (limegreen), an estimated average additional drop for wet balls and, for Patriot balls, the additional deflation arising from re-setting pressure after gloving.
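The cooling component of this reconciliation can be sketched with the Ideal Gas Law at constant volume. This is a minimal illustration, not Exponent's calculation: the atmospheric pressure of 14.7 psi and the field temperature of 48 deg F are assumed values chosen for illustration, and the function names are hypothetical. Only dry-ball cooling is modeled; wetting and warming during intermission are not.

```python
# Sketch: gauge-pressure drop of a dry football cooled at constant volume.
ATM_PSI = 14.7  # assumed atmospheric pressure in psi (illustrative, not from the article)


def fahrenheit_to_rankine(temp_f):
    """Convert degrees Fahrenheit to the absolute Rankine scale."""
    return temp_f + 459.67


def cooled_gauge_pressure(start_psig, start_f, end_f, atm_psi=ATM_PSI):
    """Gauge pressure after the ball moves from start_f to end_f.

    Absolute pressure scales with absolute temperature when volume and
    the amount of air are fixed (an idealization of a real football).
    """
    start_abs = start_psig + atm_psi
    end_abs = start_abs * fahrenheit_to_rankine(end_f) / fahrenheit_to_rankine(start_f)
    return end_abs - atm_psi


if __name__ == "__main__":
    # 70 deg F room temperature; 48 deg F field temperature is an assumed value.
    for team, set_psig in (("Patriots", 12.5), ("Colts", 13.0)):
        final = cooled_gauge_pressure(set_psig, start_f=70.0, end_f=48.0)
        print(f"{team}: {set_psig:.2f} -> {final:.2f} psig "
              f"(drop {set_psig - final:.2f} psi)")
```

Under these assumed temperatures the dry-ball drop comes out a little over 1.1 psi for both teams, which is why the Ideal Gas Law alone cannot distinguish between them.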
Simulation of Patriot Ball Preparation
The newly identified error pertains to Exponent’s simulation of Patriot ball preparation for the AFC Championship Game – an issue originally pointed to by Patriot coach Bill Belichick at his press conference of January 24, 2015.
In their simulation of Patriot ball preparation, Exponent set football pressures to 12.5 psi, then vigorously rubbed the balls (“gloving”) for 7 to 15 minutes before stopping.  They observed that pressures increased about 0.7 psi, but that the effect wore off after 15-20 minutes – effects shown in their Figure 16 (shown below). From this analysis, they excluded Patriot ball preparation as a potential contributor to the additional pressure loss.

Figure 2. Exponent Figure 16. Original Caption: The pressure as a function of time while a football is being vigorously rubbed.
However, Exponent’s simulation neglected an essential element of Patriot ball preparation. According to the Wells Report (WR, 50), Patriot equipment manager Jastremski set football pressures to 12.6 psi after the vigorous rubbing.

Jastremski told us that he set the pressure level to 12.6 psi after each ball was gloved and then placed the ball on a trunk in the equipment room for Brady to review. [my bold]

The detail that Jastremski set pressure after gloving is not mentioned in the Exponent Report and appears only in passing in the Wells Report. Exponent’s failure to mention this detail makes one wonder whether Paul, Weiss might have failed to transmit it to Exponent. The detail is critical: footballs so processed will have pressures of 12.1-12.2 psi at room temperature, about 0.3-0.4 psi below the NFL minimum of 12.5 psi. This is illustrated in the re-statement of Exponent Figure 16 shown below, which shows the setting of pressure to 12.6 psi of footballs warmed by gloving, with the subsequent loss of pressure to 12.1-12.2 psi as the balls return to room temperature. The resulting amount of under-inflation is almost exactly equal to the amount of “unexplained” Patriot loss in pressure.
Figure 3. Re-statement of Exponent Figure 16. Red: Exponent transient shows effect of rubbing to increase pressure, together with a decline after rubbing stopped. Black: shows reduction in pressure from re-setting to 12.6 psi, followed by transients as ball temperature returns to room temperature. Twenty seconds allowed for setting gauge in above transients. Dotted vertical lines show 7-15 minutes from start of rubbing reported by Jastremski. Logo gauge values of 12.5 and 12.6 psi are shown on right axis, deducting the bias of ~0.38 psi.
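The arithmetic behind the re-setting effect can be sketched as follows. This is a rough constant-volume Ideal Gas Law illustration, not the calculation used for the figure: the 14.7 psi atmospheric pressure is assumed, the function name is hypothetical, and the excess pressure still present at the moment the gauge is applied is treated as a free input, since it depends on how far the rubbing transient has decayed by then.

```python
# Sketch: pressure of a ball that is re-set while still warm from rubbing,
# once it has cooled back to room temperature.
ATM_PSI = 14.7  # assumed atmospheric pressure in psi (illustrative)


def settled_psig(start_psig, warm_excess_psi, reset_psig, atm_psi=ATM_PSI):
    """Room-temperature gauge pressure after a warm re-set.

    start_psig      -- gauge pressure before rubbing, at room temperature
    warm_excess_psi -- rise above start_psig still present when re-setting
    reset_psig      -- gauge pressure the ball is set to while warm
    """
    # Ratio of warm to room absolute temperature, inferred from the
    # pressure rise of the untouched ball at constant volume.
    temp_ratio = (start_psig + warm_excess_psi + atm_psi) / (start_psig + atm_psi)
    # Cooling back to room temperature shrinks absolute pressure by that ratio.
    return (reset_psig + atm_psi) / temp_ratio - atm_psi


if __name__ == "__main__":
    for excess in (0.4, 0.5, 0.7):
        print(f"excess {excess:.1f} psi at re-set -> "
              f"{settled_psig(12.5, excess, 12.6):.2f} psig at room temperature")
```

With these inputs, a residual excess of 0.4 to 0.5 psi at the moment of re-setting yields roughly 12.1 to 12.2 psig, while the full 0.7 psi rise would yield about 11.9 psig, so the result is sensitive to how long after rubbing the pressure is set.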
 
“Logical Inferences” on Gauges
The battleground issue in scientific analysis of Deflategate has concerned which gauges were used by referee Anderson for pre-game measurement.  Exponent argued that, “despite the remaining uncertainty, logical inferences can be made according to the data collected to establish the likelihood of which gauge was used [by referee Anderson]”, a conclusion with which I agree, though the above analysis changes the conclusions.
Referee Anderson had two gauges, one of which (the Logo Gauge) measured about 0.38 psi too high, while the other gauge (the Non-Logo Gauge) was accurate.  It was observed almost immediately (MacKinnon, 2015; Hassett et al 2015) that the additional deflation of Patriot balls could be explained if Anderson had used the Logo Gauge for pre-game measurement of Patriot balls, a hypothesis that was consistent with Anderson’s own recollection that he had used the Logo Gauge.  However, Exponent argued that other information led to the “logical inference” that Anderson had used the Non-Logo Gauge for measuring both Patriot and Colt balls. They stated:

Walt Anderson recalled that according to the gauge he used (which is either the Logo or Non-Logo Gauge), all of the Patriots and Colts footballs measured at or near 12.5 psig and 13.0 psig, respectively, when he first tested them (with two Patriots balls slightly below 12.5 psig). This means that the gauges used by the Patriots and the Colts each read similarly to the gauge used by Walt Anderson during his pregame inspection.

Exponent had obtained dozens of gauges (all new gauges similar to the Non-Logo Gauge), none of which had a bias similar to the Logo Gauge. From this information, they argued that it was “very unlikely” that both the Patriots and Colts could have had gauges that were “out of whack” (the term used by Wells) similarly to the Logo Gauge and therefore concluded that Anderson had used the Non-Logo Gauge for pre-game measurements. This conclusion was endorsed in the Wells Report, with Wells being particularly vehement about the conclusion in the Appeal Hearing, comparing the possibility to a “lightning strike”, a term that he liked and used twice.
Wells was particularly emphatic that use of the Non-Logo Gauge was a “scientific” finding (rather than a conclusion from circumstantial evidence). Wells told Goodell:

The scientists, the Exponent people say they believe based on their scientific tests that the non-logo gauge was used.

Wells invoked “science” to explain away Anderson’s recollection of using the Logo Gauge as follows:

Look, this is no different than a case where somebody has a recollection of X happening and then you play a tape and the tape says Y happened. Now, the person could keep saying, well, darn it, I remember it was X. But the people are going to go with the tape. I went with the science and the logic that I had three data points. And that’s what I based my decision on.

Goodell was swayed by Wells’ vehemence and his decision expressly included the following finding on gauges:

There was argument at the hearing about which of two pressure gauges Mr. Anderson used to measure the pressure in the game balls prior to the game. The NFLPA contended, and Dean Snyder opined, that Mr. Anderson had used the so-called logo gauge. On this issue, I find unassailable the logic of the Wells Report and Mr. Wells’ testimony that the non-logo gauge was used because otherwise neither the Colts’ balls nor the Patriots’ balls, when tested by Mr. Anderson prior to the game, would have measured consistently with the pressures at which each team had set their footballs prior to delivery to the game officials, 13 and 12.5 psi, respectively. Mr Wells’s testimony was confirmed by that of Dr. Caligiuri and Professor Marlow. As Professor Marlow testified, “There’s ample evidence that the non-logo gauge was used”.

This reasoning is valid if the pressures were set without rubbing (the Colt balls) but leads to exactly opposite conclusions for Patriot balls.
Because the Patriot rubbing protocol resulted in the balls being under-inflated by approximately 0.35 psi at room temperature, the only way in which Anderson could have measured them at or near 12.5 psi was if he used the Logo Gauge. This is based on exactly the same form of logical inference used by Exponent, but without their erroneous interpretation of Patriot ball preparation.
The corollary is that Anderson inattentively switched gauges between measuring Patriot and Colt balls.  While this seems peculiar, NFL officials did exactly the same thing at half-time – switching gauges between measuring Patriot and Colt balls – despite heightened scrutiny.  If Anderson put the gauge in his pocket after measuring one set of footballs,  it would be entirely random whether he used the same gauge for the other set of footballs.
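The inference can be written out as a small comparison. This is a hedged sketch rather than a reconstruction of anyone's analysis: the true room-temperature pressures below are illustrative assumptions (about 12.15 psig for re-set Patriot balls, and 13.0 psig for Colt balls on the assumption that the Colts' own gauge was accurate), and the function and dictionary names are hypothetical.

```python
# Sketch: which of Anderson's gauges best matches his recalled readings?
LOGO_BIAS_PSI = 0.38  # approximate over-read of the Logo Gauge, per the article

# Reading = true pressure + bias; the Non-Logo Gauge is treated as accurate.
GAUGE_BIAS = {"Logo": LOGO_BIAS_PSI, "Non-Logo": 0.0}


def best_matching_gauge(true_psig, recalled_psig):
    """Return the gauge whose reading of true_psig is closest to recalled_psig."""
    return min(
        GAUGE_BIAS,
        key=lambda gauge: abs(true_psig + GAUGE_BIAS[gauge] - recalled_psig),
    )


if __name__ == "__main__":
    # (assumed true pressure, pressure recalled by Anderson), both in psig
    scenarios = {
        "Patriots": (12.15, 12.5),  # assumed under-inflation after warm re-set
        "Colts": (13.0, 13.0),      # assumed set without rubbing, accurate gauge
    }
    for team, (true_psig, recalled_psig) in scenarios.items():
        gauge = best_matching_gauge(true_psig, recalled_psig)
        print(f"{team}: recalled {recalled_psig:.1f} psig is best matched "
              f"by the {gauge} Gauge")
```

Under these assumptions the Patriot readings are matched by the Logo Gauge and the Colt readings by the Non-Logo Gauge, which is the gauge switch described above.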
Although Anderson inattentively switching gauges for pre-game measurements was an important possibility (suggested, for example, by Hassett et al, 2015), Exponent made the remarkable statement at the appeal hearing (Hearing, 369:11) that they had been “told” not to consider it, and it was not raised or analysed in the Exponent Report. Surprisingly, this admission wasn’t pursued by Brady’s counsel at the Appeal Hearing and it is therefore unknown who gave these instructions or why.
Transients
An essential element of Exponent’s report was its comparison of observed pressures at half-time to modeled transients of pressure changes through the half-time intermission as footballs warmed up, with simulations illustrated in a series of figures (Figures 24-30). Remarkably, these simulations contained another error. The Exponent Report stated that the Logo Gauge had been used to set pressure of footballs to 12.5 psi in Figure 27 and the right panel of Figure 28:

The Logo Gauge was used to set the pressure of two balls to 12.50 psig (representative of the Patriots) and two balls to 13.00 psig (representative of the Colts).

However, Exponent actually used a different gauge (the unbiased Master Gauge) to set pressures to 12.5 psi, resulting in transients that were approximately 0.38 psi higher than under the stated procedure. In the figure below, I’ve re-stated results from their simulations to show transients based on Colt pressures being set with the Non-Logo Gauge and Patriot pressures with the Logo Gauge.  In each case, there is plenty of time during which there is an overlap between observations and modeled transients, contradicting Exponent:

Figure 4.  Re-statement of transients from Figures 25 (Non-Logo) and 27 (Logo), basis 70 deg F, for Colt balls set to 13 psig using Non-Logo Gauge and for Patriot balls set to 12.5 psig using the Logo Gauge. 
 
Conclusion
The “unexplained” additional loss of pressure can be unequivocally seen to occur as a result of Jastremski setting pressure after gloving, rather than before.  This is a complete explanation, which precludes tampering after referee measurement.
Previous scientific critiques of the Wells Report had observed that Patriot deflation could be explained if Anderson used the Logo Gauge, but had been unable to overcome Exponent’s argument about the improbability of the Patriot Gauge being “out of whack” similarly to the Logo Gauge.  That weakness is overcome in the present analysis.  Correct modeling of Patriot ball preparation yields the “logical inferences” that the Patriot gauge was relatively accurate and that Anderson used the Logo Gauge.
Indeed, it is the Wells Report itself that requires an implausible “lightning strike”. Wells’ analysis requires that, out of all possible target deflations, the amount of Patriot deflation was almost exactly equal to the bias of Anderson’s Logo Gauge.  Any self-respecting analyst should have examined and cross-examined his data when asked to arrive at that conclusion.
Exponent expressly stated that their procedures were based on information provided by Paul, Weiss (Ted Wells’ legal firm).  While descriptions in the Exponent Report generally track descriptions in the Wells Report, the information that Jastremski set pressure after gloving appears only in the Wells Report and is conspicuously absent from the Exponent Report.  With present documents, it is impossible to tell whether Exponent was in possession of this information and neglected to include it in their simulation or whether Paul, Weiss neglected to transmit this information to Exponent. Either way, it is an error that has no place in a professional report.
Without these errors, Exponent could not have stated that there were “no set of credible environmental or physical factors” explaining the additional pressure loss.
Appeal courts are poorly suited to resolve such errors. There is another way to resolve the controversy. The scientific community takes considerable pride in the concept of science being “self-correcting”.  When a scientist has inadvertently made an error, the most honorable and effective method of correcting the scientific record is to issue a corrected report, and, if such is not possible, retraction.  The Deflategate controversy originated in scientific and technical errors and the responsible scientists and investigators should take responsibility. Even at this late stage, Paul, Weiss and/or Exponent and/or Marlow should man up, acknowledge the errors and either re-issue corrected reports or retract. If any of them do so, it is hard to envisage the Deflategate case continuing much further.
Advocates and even policy-makers often like to say that their conclusions are supported by “science”, but Wells’ enthusiastic use of the terms “science” and “scientific tests” as rhetoric to validate incorrect analysis should serve as a caution.
The complete paper is online (pdf).
