Email Dates in the Wikileaks DNC Archive

Yesterday, Scott Ritter published a savage and thorough critique of the role of Dmitri Alperovitch and Crowdstrike, who are uniquely responsible for the attribution of the DNC hack to Russia. Ritter calls it “one of the greatest cons in modern American history”.  Ritter’s article gives a fascinating account of an earlier questionable incident in which Alperovitch first rose to prominence – his attribution of the “Shady Rat” malware to the Chinese government at a time when there was a political appetite for such an attribution. Ritter portrays the DNC incident as Shady Rat 2.  Read the article.
My post today is a riff on a single point in the Ritter article, using analysis that I had in inventory but not written up.  I’ve analysed the dates of the emails in the Wikileaks DNC email archive: the pattern (to my knowledge) has never been analysed. The results are a surprise – standard descriptions of the incident are misleading.
Nov 7, 2017: story picked up by Luke Rosniak at Daily Caller here 

On April 29, DNC IT staff noticed anomalous activity and brought it to the attention of senior DNC officials: Chairwoman of the DNC, Debbie Wasserman-Schultz, DNC’s Chief Executive, Amy Dacey, the DNC’s Technology Director, Andrew Brown, and Michael Sussman, a lawyer for Perkins Coie, a Washington, DC law firm that represented the DNC. After dithering for a few days, on May 4, the DNC (Sussman) contacted Crowdstrike (Shawn Henry), who installed their software on May 5.
According to a hagiography of Crowdstrike’s detection by Thomas Rid last year, Crowdstrike detected “Russia” in  the network in the early morning of May 6:

At six o’clock on the morning of May 6, Dmitri Alperovitch woke up in a Los Angeles hotel to an alarming email. Alperovitch is the thirty-six-year-old cofounder of the cybersecurity firm CrowdStrike, and late the previous night, his company had been asked by the Democratic National Committee to investigate a possible breach of its network. A CrowdStrike security expert had sent the DNC a proprietary software package, called Falcon, that monitors the networks of its clients in real time. Falcon “lit up,” the email said, within ten seconds of being installed at the DNC: Russia was in the network.

In many accounts of the incident (e.g. Wikipedia here), it’s been reported that “both groups of intruders were successfully expelled from the  systems within hours after detection”. This was not the case, as Ritter pointed out: data continued to be exfiltrated AFTER the installation of Crowdstrike software, including the emails that ultimately brought down Wasserman-Schultz:

Moreover, the performance of CrowdStrike’s other premier product, Overwatch, in the DNC breach leaves much to be desired. Was CrowdStrike aware that the hackers continued to exfiltrate data (some of which ultimately proved to be the undoing of the DNC Chairwoman, Debbie Wasserman Schultz, and the entire DNC staff) throughout the month of May 2016, while Overwatch was engaged?

This is an important and essentially undiscussed question.

Distribution of Dates

The DNC Leak emails are generally said to commence in January 2015 (e.g. CNN here) and continue until the Crowdstrike expulsion. In other email leak archives (e.g. Podesta emails; Climategate), the number of emails per month tends to be relatively uniform (at least to within one order of magnitude). However, this is not the case for the DNC Leak, as shown in the graphic below of the number of emails per day:
Figure 1. Number of emails per day in Wikileaks DNC archive from Jan 1, 2015 to June 30, 2016. Calculated from monthly data through March 31, 2016, then weekly until April 15, then daily. No emails after May 25, 2016.
There are only a couple of emails per month (~1/day) through 2015 and up to April 18, 2016. Nearly all of these early emails were non-confidential emails involving DNCPress or innocuous emails to/from Jordan Kaplan of the DNC. There is a sudden change on April 19, 2016, when 425 emails appear in the archive. This is also the first day on which emails from hillaryclinton.com occur in the archive – a point that is undiscussed, but relevant given the ongoing controversy about security of the Clinton server (the current version of which was never examined by the FBI). The following week, the number of daily emails in the archive exceeded 1000, reaching a maximum daily rate of nearly 1500 in the third week of May. There is a pronounced weekly cycle to the archive (quieter on the week-ends).
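For readers who want to reproduce a per-day count of this kind, here is a minimal sketch in Python. It is not the procedure used for Figure 1. It assumes the raw Date headers of the archived emails have already been collected into a list of strings, and the function name `emails_per_day` and the sample headers are hypothetical. Each email is assigned to the calendar day in the header's own time zone, which is also an assumption.

```python
from collections import Counter
from email.utils import parsedate_to_datetime


def emails_per_day(date_headers):
    """Count emails per calendar day from raw RFC 2822 Date header strings."""
    counts = Counter()
    for header in date_headers:
        try:
            sent = parsedate_to_datetime(header)
        except (TypeError, ValueError):
            # Malformed or missing Date header: skip it rather than guess.
            continue
        # Assumption: bin by the date as written in the header (sender's offset).
        counts[sent.date()] += 1
    return counts


# Hypothetical sample headers, NOT taken from the archive.
sample = [
    "Tue, 19 Apr 2016 09:15:00 -0400",
    "Tue, 19 Apr 2016 17:42:10 -0400",
    "Wed, 25 May 2016 11:58:03 -0400",
    "not a date",
]

daily = emails_per_day(sample)
for day in sorted(daily):
    # The weekday name makes a weekly cycle easy to spot in the listing.
    print(day.isoformat(), day.strftime("%a"), daily[day])
```

Printing the weekday beside each count is a quick way to check for a week-end dip before plotting anything.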
Rid’s Esquire hagiography described a belated cleansing of the DNC computer system on June 10-12, following which Crowdstrike celebrated:

Ultimately, the teams decided it was necessary to replace the software on every computer at the DNC. Until the network was clean, secrecy was vital. On the afternoon of Friday, June 10, all DNC employees were instructed to leave their laptops in the office. Alperovitch told me that a few people worried that Hillary Clinton, the presumptive Democratic nominee, was clearing house. “Those poor people thought they were getting fired,” he says. For the next two days, three CrowdStrike employees worked inside DNC headquarters, replacing the software and setting up new login credentials using what Alperovitch considers to be the most secure means of choosing a password: flipping through the dictionary at random. (After this article was posted online, Alperovitch noted that the passwords included random characters in addition to the words.) The Overwatch team kept an eye on Falcon to ensure there were no new intrusions. On Sunday night, once the operation was complete, Alperovitch took his team to celebrate at the Brazilian steakhouse Fogo de Chão.

Curiously, the last email in the archive was dated noon, May 25 – about 16 days before Crowdstrike changed all the passwords on the week-end of June 10-12. Two days later (June 14), the DNC arranged for a self-serving article in the Washington Post in which they announced the hack and blamed it on the Russians. Crowdstrike published a technical report purporting to support the analysis and the story went viral.
There were no fewer than 14409 emails in the Wikileaks archive dating after Crowdstrike’s installation of its security software. In fact, more emails were hacked after Crowdstrike’s discovery on May 6 than before. Whatever actions were taken by Crowdstrike on May 6, they did nothing to stem the exfiltration of emails from the DNC.
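The before/after comparison can be sketched the same way. The snippet below is illustrative only: the function name `split_by_cutoff`, the sample headers and the choice of 00:00 UTC on May 6, 2016 as the cutoff are all assumptions, and a different cutoff (for example, the time of the May 5 installation) would shift the counts. Headers without a usable time zone are treated as UTC, which is a further assumption.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime

# Assumed cutoff: start of May 6, 2016 (UTC). The exact moment is an assumption.
CUTOFF = datetime(2016, 5, 6, tzinfo=timezone.utc)


def split_by_cutoff(date_headers, cutoff):
    """Return (before, after) counts of emails relative to a timezone-aware cutoff."""
    before = after = 0
    for header in date_headers:
        try:
            sent = parsedate_to_datetime(header)
        except (TypeError, ValueError):
            continue  # skip malformed Date headers
        if sent.tzinfo is None:
            # Assumption: treat headers with no usable offset as UTC.
            sent = sent.replace(tzinfo=timezone.utc)
        if sent >= cutoff:
            after += 1
        else:
            before += 1
    return before, after


# Hypothetical sample headers, NOT taken from the archive.
sample = [
    "Tue, 19 Apr 2016 09:15:00 -0400",
    "Tue, 03 May 2016 14:00:00 -0400",
    "Fri, 06 May 2016 08:00:00 -0400",
    "Wed, 25 May 2016 11:58:03 -0400",
]

before, after = split_by_cutoff(sample, CUTOFF)
print("before cutoff:", before, "after cutoff:", after)
```

Comparing timezone-aware datetimes, rather than date strings, keeps emails sent late in the evening in US time zones from being placed on the wrong side of the cutoff.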

Discussion

The lack of emails prior to April 19 is something that I don’t understand. This is more or less the date on which Fancy Bear was said to have entered the system, but, in other hacks (e.g. Podesta, Climategate), all emails in the account as of the penetration date were exfiltrated.  Was it an editorial decision on the part of the DNC hacker to exclude emails prior to April 19? If so, why were there any at all?
Nor does the razor-sharp end-date of noon May 25 tie into reported dates of Crowdstrike security measures. Does this reflect an editorial decision during the curation of the hacked emails or something else?
Hiring Crowdstrike to watch the exfiltration of data can hardly be what the DNC had in mind. It’s a bit reminiscent of the uniformed official in the (Lifelock) commercial who explains “I’m not a security guard, I’m a security monitor. I only notify people if there’s a robbery… There’s a robbery.” As Ritter observed, some of the most embarrassing emails were sent after Crowdstrike’s May 6 discovery of the hacks – an obvious point that has not been made in media discussion.
Recent articles about Crowdstrike continue to claim falsely that Crowdstrike “quickly closed” the leak, with the “damage already done”, e.g. Wired in March 2017:

The vulnerabilities were quickly closed, but the damage had already been done.

As discussed above, the opposite was the case.  Most of the damage was done after Crowdstrike installed its software.  Ritter further asked: “Did Overwatch detect the spread of malware into the servers of the DCCC? If the answer is yes, one must question the competence of a cyber security company whose job is to prevent just that kind of activity.”
Overall, the most serious question is the validity of Crowdstrike’s attribution of the DNC hack to the “Russians”. Alperovitch is an Atlantic Council associate who is vituperatively anti-Russian, with a questionable attribution history. Before being baked into government policy, any Alperovitch findings ought to be cross-checked in the most minute detail. However (unlike Climategate), the police (FBI here) never took possession of the hacked server and were thus unable to carry out their own forensic analysis. The intel assessments provided to the public consist of little more than assertions, repeated over and over, louder and louder, rather than objective evidence. The intel community hides behind a supposed need to protect “sources and methods”, but I seriously wonder whether these caveats are nothing more than a figleaf to prevent exposure of their own shortcomings (h/t David Niven).
 
