In early October, a story was published by the Wall Street Journal alleging Kaspersky Lab software was used to siphon classified data from an NSA employee’s home computer system.
Given that Kaspersky Lab has been at the forefront of fighting cyberespionage and cybercriminal activities on the Internet for over 20 years now, these allegations were treated very seriously.
To assist any independent investigators and all the people who have been asking us questions whether those allegations were true, we decided to conduct an internal investigation to attempt to answer a few questions we had related to the article and some others that followed it:
Was our software used outside of its intended functionality to pull classified information from a person’s computer?
When did this incident occur?
Who was this person?
Was there actually classified information found on the system inadvertently?
If classified information was pulled back, what happened to said data after? Was it handled appropriately?
Why was the data pulled back in the first place? Is the evidence this information was passed on to “Russian Hackers” or Russian intelligence?
What types of files were gathered from the supposed system?
Do we have any indication the user was subsequently “hacked” by Russian hackers and data exfiltrated?
Could Kaspersky Lab products be secretly used to intentionally siphon sensitive data unrelated to malware from customers’ computers?
Assuming cyberspies were able to see the screens of our analysts, what could they find on it and how could that be interpreted?
Answering these questions with factual information would allow us to provide reasonable materials to the media, as well as show hard evidence on what exactly did or did not occur, which may serve as a food for thought to everyone else.
To further support the objectivity of the internal investigation we ran our investigation using multiple analysts of non-Russian origin and working outside of Russia to avoid even potential accusations of influence.
The Wall Street Journal Article
The article published in October laid out some specifics that need to be documented and fact checked.
Important bullet points from the article include:
The information “stolen” provides details on how the U.S. penetrates foreign computer networks and defends against cyberattacks.
A National Security Agency contractor removed the highly classified material and put it on his home computer.
The data ended up in the hands of so called “Russian hackers” after the files were detected using Kaspersky Lab software.
The incident occurred in 2015 but wasn’t discovered until spring of last year .
The Kaspersky Lab linked incident predates the arrest last year of another NSA contractor, Harold Martin.
“Hackers” homed in on the machine and stole a large amount of data after seeing what files were detected using Kaspersky data.
Beginning of Search
Having all of the data above, the first step in trying to answer these questions was to attempt to identify the supposed incident.
Since events such as what is outlined above only occur very rarely, and we diligently keep the history of all operations, it should be possible to find them in our telemetry archive given the right search parameters.
The first assumption we made during the search is that whatever data was allegedly taken, most likely had to do with the so-called Equation Group, since this was the major research in active stage during the time of alleged incident as well as many existing links between Equation Group and NSA highlighted by the media and some security researchers. Our Equation signatures are clearly identifiable based on the malware family names, which contain words including “Equestre”, “Equation”, “Grayfish”, “Fanny”, “DoubleFantasy” given to different tools inside the intrusion set.
Taking this into account, we began running searches in our databases dating back to June 2014 (6 months prior to the year the incident allegedly happened) for all alerts triggered containing wildcards such as “HEUR:Trojan.Win32.Equestre.*”. Results showed quickly: we had a few test (silent) signatures in place that produced a LARGE amount of false positives.
This is not something unusual in the process of creating quality signatures for a rare piece of malware.
To alleviate this, we sorted results by count of unique hits and quickly were able to zoom in on some activity that happened in September 2014.
It should be noted that this date is technically not within the year that the incident supposedly happened, but we wanted to be sure to cover all bases, as journalists and sources sometimes don’t have all the details.
Below is a list of all hits in September for an “Equestre” signature, sorted by least amount to most. You can quickly identify the problem signature(s) mentioned above.
Detection name (silent)
Taking this list of alerts, we started at the top and worked our way down, investigating each hit as we went trying to see if there were any indications it may be related to the incident. Most hits were what you would think: victims of Equation or false positives.
Eventually we arrived at a signature that fired a large number of times in a short time span on one system, specifically the signature “HEUR:Trojan.Win32.Equestre.m” and a 7zip archive (referred below as “[undisclosed].7z”).
Given limited understanding of Equation at the time of research it could have told our analysts that an archive file firing on these signatures was an anomaly, so we decided to dig further into the alerts on this system to see what might be going on.
After analyzing the alerts, it was quickly realized that this system contained not only this archive, but many files both common and unknown that indicated this was probably a person related to the malware development.
Below is a list of Equation specific signatures that fired on this system over a period of approximately three months:
In total we detected 37 unique files and 218 detected objects, including executables and archives containing malware associated with the Equation Group. Looking at this metadata during current investigation we were tempted to include the full list of detected files and file paths into current report, however, according to our ethical standards, as well as internal policies, we cannot violate our users’ privacy.
This was a hard decision, but should we make an exception once, even for the sake of protecting our own company’s reputation, that would be a step on the route of giving up privacy and freedom of all people who rely on our products. Unless we receive a legitimate request originating from the owner of that system or a higher legal authority, we cannot release such information.
The file paths observed from these detections indicated that a developer of Equation had plugged in one or more removable drives, AV signatures fired on some of executables as well as archives containing them, and any files detected (including archives they were contained within) were automatically pulled back.
At this point in time, we felt confident we had found the source of the story fed to Wall Street Journal and others.
Since this type of event clearly does not happen often, we believe some dates were mixed up or not clear from the original source of the leak to the media.
Our next task was to try and answer what may have happened to the data that was pulled back. Clearly an archive does not contain only those files that triggered, and more than likely contained a possible treasure trove of data pertaining to the intrusion set.
It was soon discovered that the actual archive files themselves appear to have been removed from our storage of samples, while the individual files that triggered the alerts remained.
Upon further inquiring about this event and missing files, it was later discovered that at the direction of the CEO, the archive file, named “[undisclosed].7z” was removed from storage.
Based on description from the analyst working on that archive, it contained a collection of executable modules, four documents bearing classification markings, and other files related to the same project.
The reason we deleted those files and will delete similar ones in the future is two-fold; We don’t need anything other than malware binaries to improve protection of our customers and secondly, because of concerns regarding the handling of potential classified materials.
Assuming that the markings were real, such information cannot and will not consumed even to produce detection signatures based on descriptions.
This concern was later translated into a policy for all malware analysts which are required to delete any potential classified materials that have been accidentally collected during anti-malware research or received from a third party.
Again to restate: to the best of our knowledge, it appears the archive files and documents were removed from our storage, and only individual executable files (malware) that were already detected by our signatures were left in storage.
Also, it is very apparent that no documents were actively “detected on” during this process.
In other words, the only files that fired on specific Equation signatures were binaries, contained within an archive or outside of it.
The documents were inadvertently pulled back because they were contained within the larger archive file that alerted on many Equation signatures.
According to security software industry standards, requesting a copy of an archive containing malware is a legitimate request, which often helps security companies locate data containers used by malware droppers (i.e. they can be self-extracting archives or even infected ISO files).
An Interesting Twist
During the investigation, we also discovered a very interesting twist to the story that has not been discussed publicly to our knowledge.
Since we were attempting to be as thorough as possible, we analyzed EVERY alert ever triggered for the specific system in question and came to a very interesting conclusion.
It appears the system was actually compromised by a malicious actor on October 4, 2014 at 23:38 local time, specifically by a piece of malware hidden inside a malicious MS Office ISO, specifically the “setup.exe” file (md5: a82c0575f214bdc7c8ef5a06116cd2a4 – for detection coverage, see this VirusTotal link) .
Looking at the sequence of events and detections on this system, we quickly noticed that the user in question ran the above file with a folder name of “Office-2013-PPVL-x64-en-US-Oct2013.iso”. What is interesting is that this ISO file is malicious and was mounted and subsequently installed on the system along with files such as “kms.exe” (a name of a popular pirated software activation tool), and “kms.activator.for.microsoft.windows.8.server.2012.and.office.2013.all.editions”. Kaspersky Lab products detected the malware with the verdict Backdoor.Win32.Mokes.hvl.
At a later time after installation of the supposed MS Office 2013, the antivirus began blocking connections out on a regular basis to the URL “http://xvidmovies[.]in/dir/index.php”. Looking into this domain, we can quickly find other malicious files that beacon to the same URL.
It’s important to note that the reason we know the system was beaconing to this URL is because we were actively blocking it as it was a known bad site.
This does however indicate the user actively downloaded / installed malware on the same system around the same time frame as our detections on the Equation files.
To install and run this malware, the user must have disabled Kaspersky Lab products on his machine. Our telemetry does not allow us to say when the antivirus was disabled, however, the fact that the malware was later detected as running in the system suggests the antivirus had been disabled or was not running when the malware was run. Executing the malware would not have been possible with the antivirus enabled.
Additionally, there also may have been other malware from different downloads that we were unaware of during this time frame.
Below is a complete list of the 121 non-Equation specific alerts seen on this system over the two month time span:
At this point, we had the answers to the questions we felt could be answered.
To summarize, we will address each one below:
Q1 – Was our software used outside of its intended functionality to pull classified information from a person’s computer?
A1 – The software performed as expected and notified our analysts of alerts on signatures written to detect on Equation group malware that was actively under investigation.
In no way was the software used outside of this scope to either pull back additional files that did not fire on a malware signature or were not part of the archive that fired on these signatures.
Q2 – When did this incident occur?
A2 – In our professional opinion, the incident spanned between September 11, 2014 and November 17, 2014.
Q3 – Who was this person?
A3 – Because our software anonymizes certain aspects of users’ information, we are unable to pinpoint specifically who the user was.
Even if we could, disclosing such information is against our policies and ethical standards. What we can determine is that the user was originating from an IP address that is supposedly assigned to a Verizon FiOS address pool for the Baltimore, MD and surrounding area.
Q4 – Was there actually classified information found on the system inadvertently?
A4 – What is believed to be potentially classified information was pulled back because it was contained within an archive that fired on an Equation specific malware signatures.
Besides malware, the archive also contained what appeared to be source code for Equation malware and four Word documents bearing classification markings.
Q5 – If classified information was pulled back, what happened to said data after? Was it handled appropriately?
A5 – After discovering the suspected Equation malware source code and classified documents, the analyst reported the incident to the CEO.
Following a request from the CEO, the archive was deleted from all of our systems. With the archive that contained the classified information being subsequently removed from our storage locations, only traces of its detection remain in our system (i.e. – statistics and some metadata). We cannot assess whether the data was “handled appropriately” (according to US Government norms) since our analysts have not been trained on handling US classified information, nor are they under any legal obligation to do so.
Q6 – Why was the data pulled back in the first place? Is the evidence this information was passed on to “Russian Hackers” or Russian intelligence?
A6 – The information was pulled back because the archive fired on multiple Equation malware signatures. We also found no indication the information ever left our corporate networks.
Transfer of a malware file is done with appropriate encryption level relying on RSA+AES with an acceptable key length, which should exclude attempts to intercept such data anywhere on the network between our security software and the analyst receiving the file.
Q7 – What types of files were gathered from the supposed system?
A7 – Based on statistics, the files that were submitted to Kaspersky Lab were mostly malware samples and suspected malicious files, either stand-alone, or inside a 7zip archive.
The only files stored to date still in our sample collection from this incident are malicious binaries.
Q8 – Do we have any indication the user was subsequently “hacked” by Russian actors and data exfiltrated?
A8 – Based on the detections and alerts found in the investigation, the system was most likely compromised during this time frame by unknown threat actors. We asses this from the fact that the user installed a backdoored MS Office 2013 illegal activation tool, detected by our products as Backdoor.Win32.Mokes.hvl.
To run this malware, the user must have disabled the AV protection, since running it with the antivirus enabled would not have been possible.
This malicious software is a Trojan (later identified as “Smoke Bot” or “Smoke Loader”) allegedly created by a Russian hacker in 2011 and made available on Russian underground forums for purchase.
During the period of September 2014-November 2014, the command and control servers of this malware were registered to presumably a Chinese entity going by the name “Zhou Lou”, from Hunan, using the e-mail address “firstname.lastname@example.org”. We are still working on this and further details on this malware might be made available later as a separate research paper.
Of course, the possibility exists that there may have been other malware on the system which our engines did not detect at the time of research.
Given that system owner’s potential clearance level, the user could have been a prime target of nation states.
Adding the user’s apparent need for cracked versions of Windows and Office, poor security practices, and improper handling of what appeared to be classified materials, it is possible that the user could have leaked information to many hands. What we are certain about is that any non-malware data that we received based on passive consent of the user was deleted from our storage.
Q9 – Could Kaspersky Lab products be secretly used to intentionally siphon sensitive data unrelated to malware from customers’ computers?
A9 – Kaspersky Lab security software, like all other similar solutions from our competitors, has privileged access to computer systems to be able to resist serious malware infections and return control of the infected system back to the user.
This level of access allows our software to see any file on the systems that we protect. With great access comes great responsibility and that is why a procedure to create a signature that would request a file from a user’s computer has to be carefully handled. Kaspersky malware analysts have rights to create signatures. Once created, these signatures are reviewed and committed by another group within Kaspersky Lab to ensure proper checks and balances.
If there were an external attempt to create a signature, that creation would be visible not only in internal databases and historical records, but also via external monitoring of all our released signatures by third parties.
Considering that our signatures are regularly reversed by other researchers, competitors, and offensive research companies, if any morally questionable signatures ever existed it would have already been discovered. Our internal analysis and searching revealed no such signatures as well.
In relation to Equation research specifically, our checks verified that during 2014-2016, none of the researchers working on Equation possessed the rights to commit signatures directly without having an experienced signature developer verifying those.
If there was a doubtful intention in signatures during the hunt for Equation samples, this would have been questioned and reported by a lead signature developer.
Q10 – Assuming cyberspies were able to see screens of our analysts, what could they find on it and how could that be interpreted?
A10 – We have done a thorough search for keywords and classification markings in our signature databases.
The result was negative: we never created any signatures on known classification markings. However, during this sweep we discovered something interesting in relation to TeamSpy research that we published earlier (for more details we recommend to check the original research at https://securelist.com/the-teamspy-crew-attacks-abusing-teamviewer-for-cyberespionage-8/35520/).
TeamSpy malware was designed to automatically collect certain files that fell into the interest of the attackers.
They defined a list of file extensions, such as office documents (*.doc, *.rtf, *.xls, *.mdb), pdf files (*.pdf) and more.
In addition, they used wildcard string pattern based on keywords in the file names, such as *pass*, *secret*, *saidumlo* (meaning “secret” in Georgian) and others.
These patterns were hardcoded into the malware that we discovered earlier, and could be used to detect similar malware samples. We did discover a signature created by a malware analyst in 2015 that was looking for the following patterns:
These strings had to be located in the body of the malware dump from a sandbox processed sample.
In addition, the malware analyst included another indicator to avoid false positives; A path where the malware dropper stored dropped files: ProgramData\Adobe\AdobeARM.
One could theorize about an intelligence operator monitoring a malware analyst’s work in the process of entering these strings during the creation of a signature. We cannot say for sure, but it is a possibility that an attacker looking for anything that can expose our company from a negative side, observations like this may work as a trigger for a biased mind.
Despite the intentions of the malware analyst, they could have been interpreted wrongly and used to create false allegations against us, supported by screenshots displaying these or similar strings.
Many people including security researchers, governments, and even our direct competitors from the private sector have approached us to express support.
It is appalling to see that accusations against our company continue to appear without any proof or factual information being presented. Rumors, anonymous sources, and lack of hard evidence spreads only fear, uncertainty and doubt. We hope that this report sheds some long-overdue light to the public and allows people to draw their own conclusions based on the facts presented above. We are also open and willing to do more, should that be required.
Appendix: Analysis of the Mokes/SmokeBot backdoor from the incident