Goodell and Deflategate Science

Yesterday, Roger Goodell released his decision on the Brady appeal.
Most of the early discussion has been about Brady’s destruction of his cell phone. Brady has contested the NFL’s characterization of this incident here (see cover here), saying that he had replaced a broken phone; that his side had already told the NFL that he was not going to turn over his cell phone and had no obligation to do so under the labour agreement; that they provided the NFL with records from the carrier of all calls and texts; that he had “never written, texted, emailed to anybody at anytime, anything related to football air pressure before this issue was raised at the AFC Championship game in January”; and that Wells already had Jastremski and McNally’s phones (on which there were no communications from Brady until after the AFC Championship game). More on this below.
My specific interest in the decision was how the scientific issues were dealt with, given that there were serious statistical and scientific defects in the Exponent report. There isn’t very much in the Goodell decision about the science and statistics. Goodell adopted the Exponent report in its entirety. It also looks to me like Brady’s side did a totally ineffective job of confronting the Exponent report.

Goodell accepted Exponent’s finding that the full extent of the decline could not “be explained” and that a “substantial part of the decline” was due to tampering. Goodell says that the Brady side submitted “alternative scientific analyses (including the study presented by economists from the American Enterprise Institute)” and, as an expert witness, produced Dean Edward Snyder of the Yale School of Management, described as an “economist who specializes in industrial organization”.    Against them, the “Management Council” produced two Exponent scientists (Caligiuri and Steffey) and the Princeton professor who had originally reviewed the Exponent study.

The salient section is as follows (with two footnotes):

I find that the full extent of the decline in pressure cannot be explained by environmental, physical or other factors. Instead, at least a substantial part of the decline was the result of tampering….
I took into account Dean Snyder’s opinion that the Exponent analysis had ignored timing… Dr Caligiuri and Dr Steffey both explained how timing was, in fact, taken into account in both their experimental and statistical analysis. They concluded based on physical experiments that timing of the measurements did have an effect on the pressure but that the timing in and of itself could not account for the full extent of the pressure declines that the Patriot balls experienced. Dean Snyder, in contrast, performed no independent analysis or experiments, nor did he take issue with the results of the Exponent experimental work that incorporated considerations of timing and were addressed in detail in the testimony of Caligiuri and Steffey.
I also considered Dean Snyder’s other two “key findings”, as well as the arguments summarized in the NFLPA’s post-hearing brief, including criticism of the steps taken in the Officials Locker Room at halftime to measure and record the pressure of game balls[1]. I was more persuaded by the testimony of Caligiuri, Steffey and Marlow and the fact that the conclusions of their statistical analysis were confirmed by the simulations and other experiments conducted by Exponent. Those simulations and other experiments were described by Prof Marlow as a “first-class piece of work”.[2]
[1] There was argument at the hearing about which of the two pressure gauges Anderson used to measure the pressure in the game balls prior to the game. The NFLPA and Snyder opined that Mr Anderson had used the so-called logo gauge. On this issue, I find unassailable the logic of the Wells Report that the non-logo gauge was used, because otherwise neither the Colts’ ball nor the Patriots’ balls when tested by Anderson would have measured consistently with the pressures at which each team had set their footballs prior to delivery to the game officials, 13 and 12.5 psi respectively. Mr Wells’ testimony was confirmed by that of Caligiuri and Marlow. As Marlow testified, “There’s ample evidence that the non-logo gauge was used”.
[2] For similar reasons, I reject the arguments advanced in the AEI Report. The testimony provided by the Exponent witnesses and Professor Marlow demonstrated that none of the arguments presented in that report diminish or undermine the reliability of Exponent’s conclusions.

If Snyder’s testimony was as represented, he was a singularly poor choice of expert witness. There are major errors, defects and adverse assumptions throughout the Exponent report and Snyder should have taken issue with them. Why the Brady side wouldn’t have challenged the 67 deg F assumption of the simulations or the apparent gross error in Figures 26 and 30 (at CA here; also at ^ here) is beyond me.
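To illustrate why the assumed starting temperature matters, here is a minimal sketch of the constant-volume ideal gas calculation (pressure proportional to absolute temperature). It is not Exponent's simulation: it ignores transients, ball wetting and volume change. The 12.5 psi and 67 deg F figures come from the discussion above; the 14.7 psi atmospheric pressure, the 48 deg F field temperature and the 71 deg F alternative starting temperature are placeholders assumed purely for illustration.

```python
# Constant-volume ideal gas sketch: absolute pressure scales with absolute temperature.
# ATM_PSI, the field temperature and the alternative start temperature are
# illustrative assumptions, not values taken from the post.

ATM_PSI = 14.7  # assumed standard atmosphere, psi


def f_to_rankine(temp_f):
    """Convert degrees Fahrenheit to absolute temperature in degrees Rankine."""
    return temp_f + 459.67


def gauge_pressure_after_cooling(p_gauge_psi, t_start_f, t_end_f, atm_psi=ATM_PSI):
    """Gauge pressure after the ball equilibrates from t_start_f to t_end_f."""
    p_abs_start = p_gauge_psi + atm_psi  # gauges read pressure above atmospheric
    p_abs_end = p_abs_start * f_to_rankine(t_end_f) / f_to_rankine(t_start_f)
    return p_abs_end - atm_psi


if __name__ == "__main__":
    field_f = 48.0  # hypothetical field temperature
    for start_f in (67.0, 71.0):  # 67 is Exponent's assumption; 71 is hypothetical
        p_end = gauge_pressure_after_cooling(12.5, start_f, field_f)
        print(f"start {start_f:.0f} F -> {p_end:.2f} psi at {field_f:.0f} F")
```

Under these assumptions, a warmer starting temperature yields a lower equilibrium pressure on the field, so the choice of starting temperature feeds directly into how much of the observed decline is left "unexplained".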
It’s also hard to understand why the Brady side would have produced an expert witness who hadn’t gone to the trouble of doing his own independent analysis.   As represented by Goodell, Snyder focused on the single issue of “timing”, claiming that Exponent had “ignored” timing.  While there are issues with how Exponent handled timing, it is ludicrous to say that they “ignored” timing issues.  Yes, their “statistical analysis” in Appendix A, from which ludicrous claims of “statistical analysis” are derived, ignored timing issues, but timing issues were front and center in the simulations and claims that Exponent “ignored” timing – if Snyder made such claims – are easily refuted.
On the other hand, Goodell’s and Exponent’s characterizations are also shadow boxing with reality. Goodell said: “Dr Caligiuri and Dr Steffey both explained how timing was, in fact, taken into account in both their experimental and statistical analysis.” This isn’t true either. Timing was taken into account in their experimental analysis, but not in the statistical analysis (in Appendix A). (By the way, I haven’t written except in passing about the statistical analysis in Appendix A as the simulations seemed to me to be the core of the prosecution case, while the statistical analysis was so stupidly irrelevant and pointless as to be worthless, but the fact that it is referred to here as a factor in the decision may cause me to revisit this.)
One of the most important, if not the most important, arguments in trying to make sense of events was the scenario in which referee Anderson used the Logo gauge for measuring Patriot balls and the Non-Logo gauge for measuring Colt balls, inattentively changing gauges between measurements – as NFL officials also did during half-time, despite the heightened scrutiny. This scenario neatly reconciles a lot of otherwise discordant information, as discussed in previous posts. This scenario was raised in the AEI article as well and in an early response by the Patriots. If Goodell has correctly characterized evidence from Snyder and the NFLPA, they botched this issue as well. According to Goodell, they argued that Anderson had used the Logo gauge for measuring both Patriot and Colt balls, raising the problem of the approximate pregame match of Colt pressures and Anderson’s measurements. This problem is moot if, as seems entirely possible, Anderson inattentively changed gauges. Then the issue is how Anderson’s pregame measurement of Patriot balls (if done with the Logo gauge) could be reconciled with Patriot pregame measurements. On this narrower issue, there are a couple of possibilities: (1) The Patriot (Jastremski) gauge might have had a similar bias to Anderson’s Logo gauge. Exponent’s analysis of gauge variation is wildly irrelevant to the problem as they limited their analysis to other examples of new Non-Logo gauges. Also, the NFL appears to have been in possession of the Jastremski gauge at half-time and could have tested its calibration, but it didn’t do so, apparently not keeping track of the gauge. (2) While Exponent has plausibly shown that the additional pressure arising from Patriot gloving protocols would have worn off by the time of Anderson’s measurements, it also appears possible that Patriot pregame measurements were done while the balls were still affected by gloving.
The AEI report had raised the issue of switching gauges, but did not carry out the more detailed analysis of the implications of that scenario on the transients and simulations.  The Brady side needed more than provided in the AEI report, but the switching scenario cannot be trivially dismissed either. Goodell stated: “The testimony provided by the Exponent witnesses and Professor Marlow demonstrated that none of the arguments presented in that report diminish or undermine the reliability of Exponent’s conclusions.”  I don’t see how anyone can responsibly assert that the switching scenario does not “diminish or undermine the reliability of Exponent’s conclusions.” It’s an important possibility that really does call into question the validity of Exponent’s claim that the decline in pressure cannot be accounted for by environmental and physical factors.
I noticed that Goodell’s decision added the word “substantial” in saying that “substantial part of the decline” was due to tampering. This word is new in the Goodell decision and is not actually stated in the Wells Report, which said instead that the decline “cannot be explained completely” by environmental and physical factors. As I reported previously, Exponent said that pressures in Exponent’s Game Day simulations were “noticeably higher” than observed Patriot pressures, but did not use the word “substantial” – undoubtedly because the difference was only 0.1-0.24 psi (see Figure 30 and Exponent page 62).
It seems to me that the use of the word “substantial” changes the hurdle. Would the difference of 0.1-0.24 psi, described as “noticeable” in the Exponent Report, also be fairly described as “substantial”? I don’t think so. Read carefully, I do not believe that the Exponent Report, even on its own terms, supports the term “substantial” (as opposed to, say, “detectable”).
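For a sense of scale, the following rough sketch converts a pressure difference into the temperature change that would produce it in a ball at constant volume. It uses only the ideal gas proportionality and is an illustration, not a reproduction of Exponent's work. The 12.5 psi and 67 deg F reference values appear in the material quoted above; the 14.7 psi atmospheric pressure is an assumed standard value.

```python
# Sensitivity of ball pressure to temperature at constant volume:
# dP/dT = P_abs / T_abs. ATM_PSI is an assumed standard atmosphere.

ATM_PSI = 14.7  # assumed atmospheric pressure, psi


def psi_per_degree_f(p_gauge_psi, temp_f, atm_psi=ATM_PSI):
    """Pressure change (psi) per degree F of temperature change, constant volume."""
    return (p_gauge_psi + atm_psi) / (temp_f + 459.67)


def equivalent_temperature_shift_f(delta_psi, p_gauge_psi=12.5, temp_f=67.0):
    """Temperature change (deg F) that would move the pressure by delta_psi."""
    return delta_psi / psi_per_degree_f(p_gauge_psi, temp_f)


if __name__ == "__main__":
    for delta in (0.1, 0.24):  # the range cited from Exponent's Figure 30
        shift = equivalent_temperature_shift_f(delta)
        print(f"{delta:.2f} psi is equivalent to about {shift:.1f} deg F")
```

On these assumptions, the 0.1-0.24 psi range corresponds to a temperature change of only a few degrees F, which gives some context for asking whether such a difference is "substantial".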
