How adding “supervised” machine learning to the development of n-dimensional signature engines is moving the detection odds back to the defender.

Second in a series of two articles about the history of signature-based detections and how the methodology has evolved to identify different types of cybersecurity threats.
Many security vendors are now building increasingly sophisticated machine learning elements into their cloud-based analysis and classification systems, and into their products.

All of these techniques have already proven their value in Internet search, targeted advertising and social networking business arenas.
For example, supervised learning models lie at the heart of ensuring that the best and most applicable results are returned when searching for the phrase “never going to give you up.”
In the information security world, supervised learning models are a natural progression of the one, two, and multi-dimensional signature systems discussed in my earlier article.

At its core, the approach uses mathematics and science, rather than humans arguing over which features and attributes of a threat are most relevant to a detection, to find and evaluate the most important artifacts and to automatically construct a sophisticated signature.
N-dimensional Signatures
Multidimensional signatures and the security products that use them rely heavily on human researchers and analysts to observe and classify each behavior for efficacy.
If a threat exhibits a new malicious behavior (or a false positive behavior has been identified in the field), the analyst must manually create or edit a new signature element and its classification, and include it as an update.

The assumption is that humans will know the most relevant elements of a threat and can label them.
The application of machine learning to the problem largely removes humans and their biases from the development of an n-dimensional signature (often called a “classification model”).
Instead of manually trying to figure out and label all the good, bad, and suspicious behaviors, a machine is fed a bunch of “known bad” and “known good” samples, which could be binary files, network traffic, or even photographs. 
It then takes and compares all the observable behaviors of the collected samples, automatically determines which behaviors were more or less prevalent in each class of samples, calculates a weighting factor for each behavior, and combines all that intelligence into a single model of n dimensions, where n varies with the type and number of samples and behaviors the machine used.
Enter ‘Supervised Learning’
Different sample volumes and differing samples supplied over time will often affect n.
In machine learning terminology, this process is called “supervised learning.” 
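A minimal sketch of that process, assuming each sample has already been reduced to a set of observed behaviors. The function names, the behavior labels, and the smoothed log-odds weighting are illustrative assumptions, not the method of any particular product; real classification models are far richer.

```python
import math
from collections import Counter


def train_weights(samples):
    """Learn one weight per observed behavior from labeled samples.

    samples: list of (behaviors, label) pairs, where behaviors is a set of
    strings and label is "bad" or "good" (hypothetical labels).
    Returns a dict mapping behavior -> weight; its length is the "n" of the
    resulting n-dimensional model.
    """
    bad = [b for b, label in samples if label == "bad"]
    good = [b for b, label in samples if label == "good"]
    bad_counts = Counter(f for behaviors in bad for f in behaviors)
    good_counts = Counter(f for behaviors in good for f in behaviors)

    weights = {}
    for feature in set(bad_counts) | set(good_counts):
        # How prevalent is this behavior in each class? Add-one smoothing
        # keeps the ratio finite when a behavior appears in only one class.
        p_bad = (bad_counts[feature] + 1) / (len(bad) + 2)
        p_good = (good_counts[feature] + 1) / (len(good) + 2)
        # Positive weight: more typical of "bad". Negative: more typical of "good".
        weights[feature] = math.log(p_bad / p_good)
    return weights


def score(weights, behaviors):
    """Sum the weights of the behaviors seen in a new, unlabeled sample."""
    return sum(weights.get(f, 0.0) for f in behaviors)
```

A positive score suggests the new sample looks more like the “known bad” set, a negative score more like the “known good” set. Retraining with different samples changes both the weights and the number of dimensions.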
Historically, there existed a class of threat detection referred to as “Anomaly Detection Systems” (ADS) that effectively operated on the premise of baselining a network or host activity.
In the case of network ADS (i.e. NADS), the approach would entail constructing a map of network devices, identifying who talks to whom over what ports and protocols, how often, and in what kind of volume.
Once that baseline was established (typically over a month), any new chatter that was an anomaly to that model (e.g. a new host added to the network) generated an alert, subject to certain thresholds being defined. Unsurprisingly, that approach generated incredibly high volumes of alerts, and detection was governed by those threshold settings.
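The classic baselining idea can be sketched in a few lines. This is an illustration only: the conversation tuple layout, the per-day windows, and the `volume_factor` threshold are assumptions made for the example, not a description of any specific ADS product.

```python
from collections import Counter


def build_baseline(daily_windows):
    """Record the peak daily count seen for each conversation.

    daily_windows: list of days, each a list of (src, dst, port, protocol)
    tuples observed during the baselining period.
    """
    baseline = {}
    for window in daily_windows:
        for conversation, count in Counter(window).items():
            baseline[conversation] = max(baseline.get(conversation, 0), count)
    return baseline


def find_anomalies(baseline, window, volume_factor=3.0):
    """Compare one new day of traffic against the fixed baseline.

    Flags conversations never seen during baselining, and known
    conversations whose count exceeds the baseline peak times volume_factor.
    """
    alerts = []
    for conversation, count in Counter(window).items():
        peak = baseline.get(conversation, 0)
        if peak == 0:
            alerts.append((conversation, "new conversation"))
        elif count > peak * volume_factor:
            alerts.append((conversation, "volume above threshold"))
    return alerts
```

Because the baseline is frozen after the learning period, every legitimate change to the network shows up as a "new conversation", and the number of volume alerts depends entirely on the chosen threshold.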

As a technology, ADS represented a failed branch of the threat detection evolutionary tree.
Without getting into the math, unsupervised machine learning has allowed security vendors to revisit the ADS path and detection objectives – and overcome most of the alerting and threshold problems.

The detection models and engines that use unsupervised machine learning still require an element of baselining, but continually learn and reassess that baseline on an hourly or daily basis. 
As such, these new detection systems are capable of identifying attack vectors such as “low-and-slow” data exfiltration, lateral movement, and staging servers.
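One simple way to picture a baseline that is continually relearned is an exponentially weighted running mean and variance. The sketch below is a stand-in chosen for brevity, not the unsupervised technique any vendor actually uses; the class name and the `alpha`, `threshold`, and `warmup` parameters are hypothetical.

```python
import math


class RollingBaseline:
    """Baseline for one metric (e.g. bytes sent per hour by a host)."""

    def __init__(self, alpha=0.1, threshold=3.0, warmup=5):
        self.alpha = alpha          # how quickly old behavior is forgotten
        self.threshold = threshold  # deviations (in std units) that count as anomalous
        self.warmup = warmup        # observations to absorb before alerting
        self.mean = None
        self.var = 0.0
        self.seen = 0

    def observe(self, value):
        """Return True if value is anomalous, then fold it into the baseline."""
        self.seen += 1
        if self.mean is None:
            self.mean = float(value)
            return False
        deviation = value - self.mean
        std = math.sqrt(self.var)
        anomalous = (
            self.seen > self.warmup
            and std > 0
            and abs(deviation) > self.threshold * std
        )
        # Reassess the baseline on every observation instead of freezing it.
        self.mean += self.alpha * deviation
        self.var = (1 - self.alpha) * (self.var + self.alpha * deviation ** 2)
        return anomalous
```

Feeding the object one measurement per hour or per day means the baseline drifts with normal changes in behavior, while a sudden departure from recent history is still flagged.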

These threats are difficult or cumbersome to detect using signature systems.
Signature-based detection systems will nonetheless continue to be valuable into the future, not as a replacement for the new advancements in unsupervised machine learning, but as a companion to them.
In other words, what the current generation of unsupervised machine learning brings to security is the ability to detect threats that are anomalies or unclassified events and behaviors.
It is inevitable that machine learning approaches will play an increasingly important role in future generations of threat detection technology. Just as their use has been critical to the advancement of Internet search and social media applications, their impact on information security will be just as great.
Signature-based threat detection systems have been evolving for more than two decades, and the application of supervised machine learning to the development of n-dimensional signature engines over the last couple of years is already moving the detection odds back to the defender. When combined with the newest generation of unsupervised machine learning systems, we can expect that needle to shift more rapidly in the defender’s favor.
Return to part 1: Machine Learning In Security: Good & Bad News About Signatures
Gunter Ollmann is chief security officer at Vectra. He has nearly 30 years of information security experience in an array of cyber security consulting and research roles.

Before joining Vectra, Günter was CTO of Domain Services at NCC Group, where he drove strategy …