Researchers merged artificial intelligence with ‘analyst intuition’ to create AI2.
Cyber crime never sleeps, but researchers from MIT’s Computer Science and Artificial Intelligence Laboratory (CSAIL) and machine-learning start-up PatternEx are working to thwart the next big attack.
Their joint effort, known as AI2, merges artificial intelligence with what the researchers call “analyst intuition” to predict future attacks.
Like a parent and child tackling a homework assignment, the machine first works unsupervised, combing through data and detecting suspicious activity.
It then presents that activity to human analysts, who confirm which events are actual attacks.
AI2 incorporates that feedback into its models for the next dataset, learning as it goes.
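The loop described above (unsupervised detection, analyst review, feedback into the next round) can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the actual AI2 models, which this article does not describe: events are reduced to a single hypothetical number (say, kilobytes transferred), the unsupervised step is a distance from the median, the "learning" step is a nearest-labeled-neighbor lookup, and the `analyst` function stands in for a human expert.

```python
from statistics import median

def outlier_score(value, batch):
    """Unsupervised step: how far an event sits from the batch median."""
    return abs(value - median(batch))

def attack_likelihood(value, labeled):
    """Feedback step: 1.0 if the nearest analyst-labeled event was an attack,
    0.0 if it was benign, and a neutral 0.5 before any feedback exists."""
    if not labeled:
        return 0.5
    _, is_attack = min(labeled, key=lambda pair: abs(pair[0] - value))
    return 1.0 if is_attack else 0.0

def run_rounds(batches, analyst, budget=2):
    """For each batch, show the analyst the `budget` highest-ranked events
    and carry the resulting labels into the next batch's ranking."""
    labeled = []          # (event, is_attack) pairs gathered so far
    shown_per_round = []  # what the analyst saw in each round
    for batch in batches:
        ranked = sorted(
            batch,
            key=lambda v: outlier_score(v, batch) * attack_likelihood(v, labeled),
            reverse=True,
        )
        shown = ranked[:budget]
        labeled.extend((v, analyst(v)) for v in shown)
        shown_per_round.append(shown)
    return labeled, shown_per_round

def analyst(value):
    """Hypothetical stand-in for the human expert: large transfers are attacks."""
    return value > 1000

day1 = [10, 12, 11, 900, 5000]
day2 = [9, 13, 950, 4800, 11]
labeled, shown = run_rounds([day1, day2], analyst)
print(shown)
```

On the first day the sketch surfaces both 5000 and 900, because it has nothing but abnormality to go on. On the second day 4800 ranks first, while 950, which resembles the event the analyst marked benign on day one, no longer outranks ordinary traffic.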
“It continuously generates new models that it can refine in as little as a few hours, meaning it can improve its detection rates significantly and rapidly,” says CSAIL research scientist Kalyan Veeramachaneni, who co-developed AI2 with Ignacio Arnaldo, a chief data scientist at PatternEx.
The program was tested on 3.6 billion pieces of data, or “log lines,” generated by millions of users over three months.
The approach is a tricky one, however: the MIT researchers detail challenges such as manually labeling cybersecurity data for the algorithms. “For a cybersecurity task, the average person on a crowdsourcing site like Amazon Mechanical Turk simply doesn’t have the skillset to apply labels like ‘DDOS’ or ‘exfiltration attacks,’” says Veeramachaneni. “You need security experts.”
It can be difficult, however, to find experts with enough time to review possibly suspicious data.
But that’s where AI2 and its “secret weapon” come in. On day one, the machine picks the 200 most abnormal events and gives them to the expert.
As it improves, it identifies more of the events as actual attacks; in a matter of days, the analyst may be looking at only 30 or 40 events a day.
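Handing an expert a fixed number of the most abnormal events amounts to a top-k selection over anomaly scores. A minimal sketch follows, assuming a single hypothetical feature (login attempts per user per hour) and a plain z-score as the abnormality measure. The article does not say how AI2 scores events, so both choices are illustrative only.

```python
import heapq
from statistics import mean, pstdev

def anomaly_scores(values):
    """Score each event by its absolute z-score
    (distance from the mean, in standard deviations)."""
    mu = mean(values)
    sigma = pstdev(values) or 1.0  # fall back to 1.0 when all values are equal
    return [abs(v - mu) / sigma for v in values]

def top_k_abnormal(values, k=200):
    """Return the indices of the k highest-scoring events, most abnormal first."""
    scores = anomaly_scores(values)
    return heapq.nlargest(k, range(len(values)), key=scores.__getitem__)

# Hypothetical data: login attempts per user per hour
logins = [3, 4, 2, 5, 3, 250, 4, 3, 120, 2]
print(top_k_abnormal(logins, k=2))  # -> [5, 8]
```

The default `k=200` mirrors the day-one review budget mentioned above; lowering it corresponds to the smaller daily workload the analyst sees once the system has improved.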
“The more attacks the system detects, the more analyst feedback it receives, which, in turn, improves the accuracy of future predictions,” Veeramachaneni said. “That human-machine interaction creates a beautiful, cascading effect.”
According to its creators, the system boasts “significantly better” results than existing programs.
It can scale to billions of log lines per day, and detect 85 percent of attacks—roughly three times more than previous benchmarks—while reducing false positives by a factor of five.
Veeramachaneni presented a paper about the system at last week’s IEEE International Conference on Big Data Security in New York City.
“This research has the potential to become a line of defense against attacks such as fraud, service abuse, and account takeover, which are major challenges faced by consumer-facing systems,” Nitesh Chawla, professor of computer science at the University of Notre Dame, said in a statement.