Response to critiques: Climate scientists versus climate data

by Judith Curry
Not surprisingly, John Bates’ blog post and David Rose’s article in the Mail on Sunday have been receiving some substantial attention.

Most journalists and people outside of the community of establishment climate scientists ‘get it’ that this is about the process of establishing credibility for climate data sets and how NOAA NCDC/NCEI have failed to follow NOAA’s own policies and guidelines, not to mention the guidelines established by Science for publications.
In this post, I go through the critiques of Rose/Bates made by NOAA scientists and other scientists working with surface temperature data. They are basically arguing that the NOAA surface temperature data sets are ‘close enough’ to other (sort of) independent analyses of surface temperatures. Well, this is sort of beside the main point that is being made by Bates and Rose, but let’s look at these defenses anyway. I focus here more on critiques of what John Bates has to say, rather than the verbiage used by David Rose or the context that he provided.
The Data: Zeke Hausfather and Victor Venema
You may recall a recent CE post where I discussed a recent paper by Zeke Hausfather: Uncertainties in sea surface temperature. Zeke’s paper states that it has independently verified the Huang/Karl sea surface temperatures.
Zeke has written a Factcheck on David Rose’s article. His arguments are that:

  1. NOAA’s sea surface temperatures have been independently verified (by his paper)
  2. NOAA’s land surface temperatures are similar to other data sets
  3. NOAA did make the data available at the time of publication of K15

With regards to #1: In a tweet on Sunday, Zeke states
Zeke Hausfather @hausfath: @KK_Nidhogg @ClimateWeave @curryja and v5 is ~10% lower than v4. Both are way above v3, which is rather the point.
What Zeke is referring to is a new paper by Huang et al. that was submitted to J. Climate last November, describing ERSSTv5. That is, a new version that fixes a lot of the problems in ERSSTv4, including using ships to adjust the buoys. I managed to download a copy of the new paper before it was taken off the internet. Zeke states that the v5 trend is ~10% lower than v4 for the period 2000-2015. The exact number from information in the paper is 12.7% lower. The bottom line is that sea surface temperature data sets are a moving target. Yes, it is good to see the data sets being improved with time. The key issue that I have is reflected in this important paper, “A call for new approaches to quantifying biases in observations of sea surface temperature”, which was discussed in this previous CE post.
Regarding #2. Roger Pielke Sr. makes the point that ALL of the other data sets use NOAA’s GHCN data set. Zeke makes the point that CRUT and Berkeley Earth do not use the homogenized GHCN data. However, as pointed out by John Bates, there are serious problems with the GHCN beyond the homogenization.
Regarding #3. John Bates’ blog post states: “NOTE: placing a non-machine readable copy of a dataset on an FTP site does not constitute archiving a dataset”
Victor Venema has a blog post, “David Rose’s alternative reality”. The blog post starts out with a very unprofessional smear on the Daily Mail. He provides some plots and cites recent articles by Zeke and Peter Thorne. Nothing worth responding to, but I include it for completeness. The key issues of concern are in John Bates’ blog post (not what David Rose wrote).
The fundamental issue is this: data standards that are ‘ok’ for curiosity driven research are NOT ‘ok’ for high impact research of relevance to a regulatory environment.
Peter Thorne and Thomas Peterson
Thomas Peterson, recently retired from NOAA NCDC/NCEI, is a coauthor on K15.  He tweeted:
Thomas Peterson @tomcarlpeterson: Buoys read 0.12C cooler than ships. Add 0.12C to buoys or subtract 0.12C from ships and you’ll get exactly the same trend.
Response: Well, in the new Huang et al. paper on ERSSTv5, it turns out that adjusting the ships to buoys results in a trend that is lower by 0.07°C. NOT exactly the same – in the climate trend game, a few hundredths of a degree actually matter.
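For readers who want to see the arithmetic behind Peterson's tweet, here is a minimal sketch on purely synthetic data. Everything except the 0.12C offset is an assumption made up for illustration: the underlying trend, the noise, and the rising share of buoy observations are all hypothetical, and the blend is a simple weighted average, not the ERSST method. Under those idealized conditions (a constant offset, identical weighting in both cases), shifting buoys up or ships down gives series that differ only by a constant, so the fitted trends coincide, while leaving the offset uncorrected changes the trend. Any difference seen in a real product would have to come from departures from these assumptions.

```python
import random


def ols_slope(xs, ys):
    """Ordinary least-squares slope of ys against xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx


OFFSET = 0.12       # ship-minus-buoy difference quoted in the tweet (deg C)
TRUE_TREND = 0.01   # hypothetical underlying trend (deg C per year)
years = list(range(1990, 2016))
rng = random.Random(0)

# Hypothetical observing mix: buoy share rises linearly from 10% to 90%.
buoy_frac = [0.1 + 0.8 * i / (len(years) - 1) for i in range(len(years))]

# Synthetic "true" temperature; buoys measure it, ships read OFFSET warmer.
truth = [TRUE_TREND * (y - years[0]) + rng.gauss(0, 0.02) for y in years]
buoy = list(truth)
ship = [t + OFFSET for t in truth]


def blend(ship_vals, buoy_vals):
    """Weighted average of ship and buoy values using the yearly buoy share."""
    return [(1 - f) * s + f * b
            for f, s, b in zip(buoy_frac, ship_vals, buoy_vals)]


unadjusted = blend(ship, buoy)
buoys_up = blend(ship, [b + OFFSET for b in buoy])
ships_down = blend([s - OFFSET for s in ship], buoy)

trend_unadjusted = ols_slope(years, unadjusted)
trend_buoys_up = ols_slope(years, buoys_up)
trend_ships_down = ols_slope(years, ships_down)

print("unadjusted trend :", trend_unadjusted)
print("buoys adjusted up:", trend_buoys_up)
print("ships adjusted dn:", trend_ships_down)
```

In this toy setup the uncorrected blend picks up a spurious cooling as the cooler-reading buoys take over the record, which is the reason for adjusting at all. The sketch does not address which direction of adjustment is better justified, nor the uncertainty that the adjustment carries.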
In the comments on John Bates’ post, Peterson writes:
As long as John mentioned me I thought, perhaps, I would explain my concern. This is essentially a question of when does software engineering take precedence over scientific advancement. For example, we had a time when John’s CDR processing was producing a version of the UAH MSU data but the UAH group had corrected a problem they identified and were now recommending people use their new version. John argued that the old version with the known error was better because the software was more rigorously assessed. I argued that the version that had corrected a known error should be the one that people used – particularly when the UAH authors of both versions of the data set recommended people use the new one. Which version would you have used?
John Bates email response:
First, Peterson is talking about a real-time, or what we dubbed an interim CDR, not a long term CDR. So he is using incorrect terminology. In fact the UAH interim CDR was ingested and made available by another part of NCDC, not the CDR program. Of course, I never said, use the old one with the known error. But, what I actually did is check with the CDR program for what they did. And since, yes, this was fully documented I can go back and find exactly what happened as the trail is in the CDR document repository. The CDR has a change request and other processes for updating versions etc. This is done all the time in the software engineering world. As I recall, there was a very well documented process on what happened and it may actually be an example to use in the future.
So, Peterson presents a false choice. The CDR program guidelines are for non-real time data. We did set up guidelines for an interim CDR, which is what Peterson is referring to. So, customers were provided the opportunity to get the interim CDR with appropriate cautions, and the updated, documented CDR when it became available later. 
Peter Thorne is coauthor on both Huang et al. ERSST articles. From 2010 to 2013, he was employed by North Carolina State University in the NOAA Cooperative Institute for Climate and Satellites (CICS). He has never been directly employed by NOAA.
Thorne published a blog post, “On the Mail on Sunday article on Karl et al.” Excerpts:
The ‘whistle blower’ is John Bates who was not involved in any aspect of the work. NOAA’s process is very stove-piped such that beyond seminars there is little dissemination of information across groups. John Bates never participated in any of the numerous technical meetings on the land or marine data I have participated in at NOAA NCEI either in person or remotely. This shows in his reputed (I am taking the journalist at their word that these are directly attributable quotes) misrepresentation of the processes that actually occurred. In some cases these misrepresentations are publically verifiable.
Apparently Peter Thorne does not know much about what goes on in NOAA NCDC/NCEI, particularly in recent years.
Response from John Bates:
Peter Thorne was hired as an employee of the Cooperative Institute for Climate and Satellites NC in 2010 and resigned in May or June 2013. As such, Thorne was an employee of NC State University and not a government employee. He could not participate in government only meetings and certainly never attended any federal manager meetings where end-to-end processing was continuously discussed. As I discussed in the blog, my Division was responsible for running the ERSST code and the global temperature blend code from 2007-2011. We had begun more fully documenting that code including data flow diagrams and software engineering studies. In addition, my Division ingested and worked with all the GHCN data and the ICOADS data. I developed extensive insight into how all the code ran, since I was responsible for it. Running of the ERSST and global temperature blend code was transferred to the other science Division in late 2010, prior to the arrival of Thorne at NCDC. Since I remained part of the management team the remainder of my time at NCDC/NCEI, I had deep insight into how it ran.
The key issue is this. John Bates is not a coauthor on any of these studies, and hence doesn’t have any personal vested interest in these papers. However, he is extremely knowledgeable about the subject matter, being the supervisor for the team running ERSSTv3 and the GHCN. He has followed this research closely and has had extensive conversations about this with many of the scientists involved in this research. Most significantly, he has collected a lot of documentation about this (including emails).
So this is not a ‘he said’ / ‘the other he said’ situation. Here we have a scientist who spent three years (2010-2013) at NOAA (but wasn’t employed by NOAA) and is coauthor of two of the papers in question, versus a supervisory meteorologist employed by NOAA NCDC for nearly two decades, who was formerly in charge of the Division handling the surface temperature data and was the architect of NOAA’s data policies.
Regarding Thorne’s specific points:

  1. ‘Insisting on decisions and scientific choices that maximised warming and minimised documentation’

Dr. Tom Karl was not personally involved at any stage of ERSSTv4 development, the ISTI databank development or the work on GHCN algorithm during my time at NOAA NCEI. At no point was any pressure brought to bear to make any scientific or technical choices. It was insisted that best practices be followed throughout. The GHCN homogenisation algorithm is fully available to the public and bug fixes documented. The ISTI databank has been led by NOAA NCEI but involved the work of many international scientists. The databank involves full provenance of all data and all processes and code are fully documented.
Response: Thorne has not been on site (NOAA NCDC) for years and not during the final few years when this took place. 

  2. ‘The land temperature dataset used by the study was afflicted by devastating bugs in its software that rendered its findings unstable’ (also returned to later in the piece to which same response applies)

The land data homogenisation software is publically available (although I understand a refactored and more user friendly version shall appear with GHCNv4) and all known bugs have been identified and their impacts documented. There is a degree of flutter in daily updates. But this does not arise from software issues (running the software multiple times on a static data source on the same computer yields bit repeatability). Rather it reflects the impacts of data additions as the algorithm homogenises all stations to look like the most recent segment. The PHA algorithm has been used by several other groups outside NOAA who did not find any devastating bugs. Any bugs reported during my time at NOAA were investigated, fixed and their impacts reported.
Response:  Thorne left NOAA/CICS in 2013. As outlined in the original blog post, the concern about GHCN was raised by a CMMI investigation that was conducted in 2014 and the specific concerns being discussed were raised in 2015. I cannot imagine how or why Peter Thorne would know anything about this.

  3. ‘The paper relied on a preliminary alpha version of the data which was never approved or verified’

The land data of Karl et al., 2015 relied upon the published and internally process verified ISTI databank holdings and the published, and publically assessable homogenisation algorithm application thereto. This provenance satisfied both Science and the reviewers of Karl et al. It applied a known method (used operationally) to a known set of improved data holdings (published and approved).
Response from John Bates:
Versioning of GHCNmv4 alpha – So, after I sent my formal complaint on K15 to the NCEI Science Council in Jan 2016, I pressed to have my concern heard but sessions were booked. I pressed on at the end of one of the meetings and specifically brought up the issue of versioning in addition to not archiving. Russ Vose Chairs the Science Council and Jay Lawrimore, who runs the GHCN code, were in attendance. I made my argument that the version in K15 was in fact V4 alpha and should have been disclosed as such with the disclaimer required for a non-operational research product. I said that the main reason for changing the version number from 3 to 4 was the use of ISTI data per what Jay Lawrimore had briefed. Moreover, plots of raw, uncorrected ISTI vs GHCN 3 on the ISTI page (will find link after I send these thoughts) show that there are 4 decades in the late 1800s and early 1900s where there is a systematic difference of 0.1C between the two. The reason for this has to be explained before the data are run through the pairwise, and so since there is no GHCNv4 peer article doing this, provenance is lacking. The notion that the ISTI peer article does this is wrong. ISTI web site specifically says it is not run through pairwise and that is a later step.
I concluded, thus K15 uses GHCN v4 alpha consistent with the file name. Russ Vose then said, ‘no it’s version 3’. However, then Jay Lawrimore said, ‘John’s right, it’s version 4’. There was some awkward silence and, since the meeting was already over time, folks just left. So, contrary to Thorne I do meet with these folks and I discussed this very issue with them AND Jay Lawrimore who actually runs the GHCN data said I was right. Thus, Thorne is wrong.

  4. [the SST increase] ‘was achieved by dubious means’

The fact that SST measurements from ships and buoys disagree with buoys cooler on average is well established in the literature. See IPCC AR5 WG1 Chapter 2 SST section for a selection of references by a range of groups all confirming this finding. ERSSTv4 is an anomaly product. What matters for an anomaly product is relative homogeneity of sources and not absolute precision. Whether the ships are matched to buoys or buoys matched to ships will not affect the trend. What will affect the trend is doing so (v4) or not (v3b). It would be perverse to know of a data issue and not correct for it in constructing a long-term climate data record.
Response: The issue is correcting the buoys to ships, and the overall uncertainty of the data set and trend, in view of these large adjustments.

  5. ‘They had good data from buoys. And they threw it out […]’

v4 actually makes preferential use of buoys over ships (they are weighted almost 7 times in favour) as documented in the ERSSTv4 paper. The assertion that buoy data were thrown away as made in the article is demonstrably incorrect.
Response:  Verbiage used by David Rose is not the key issue here. The issue is the substantial adjustment of the buoy temperatures to match the erroneous ship values, and neglect of data from the Argo buoys.

  6. ‘they had used a ‘highly experimental early run’ of a programme that tried to combine two previously separate sets of records’

Karl et al used as the land basis the ISTI databank. This databank combined in excess of 50 unique underlying sources into an amalgamated set of holdings. The code used to perform the merge was publically available, the method published, and internally approved. This statement therefore is demonstrably false.
See response to #4.
What next?
What needs to happen next to clarify the issues raised by John Bates?
We can look forward to more revelations from John Bates, including documentation, plus more detailed responses to some of the issues raised above.
An evaluation of these claims needs to be made by the NOAA Inspector General. I’m not sure what the line of reporting is for the NOAA IG, and whether the new Undersecretary for NOAA will appoint a new IG.
Other independent organizations will also want to evaluate these claims, and NOAA should facilitate this by responding to FOIA requests.
The House Science Committee has an enduring interest in this topic and oversight responsibility.   NOAA should respond to the Committee’s request for documentation including emails. AGU and other organizations don’t like the idea of scientist emails being open to public scrutiny. Well, these are government employees and we are not talking about curiosity driven research here – at issue here is a dataset with major policy implications.
In other words, with the surface temperature data set we are in the realm of regulatory science, which has a very different playbook from academic, ‘normal’ science. While regulatory science is most often referred to in the context of food and pharmaceutical sciences, it is also relevant to environmental regulations. The procedures developed by John Bates are absolutely essential for certifying these datasets, as well as their uncertainties, for a regulatory environment.
