Data Science Meets Cyber Security

Verizon recently released its 2014 Data Breach Investigations Report, which presents results from an analysis of over 63,000 cyber security incidents that have occurred during the past ten years. Using tools from the rapidly expanding field of Data Science, the authors group the incidents by their similarities, an analysis technique called clustering. From these similarity clusters, the authors conclude that 92% of the security incidents they studied fall into one of nine categories: point-of-sale intrusions, web app attacks, insider and privilege misuse, physical theft and loss, human error, crimeware, payment card skimming, denial of service, and cyber espionage. The report includes a section on each of these nine categories that identifies its primary targets, the industries in which it is most prevalent, how frequently it occurs, and the techniques attackers use to carry it out.
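To make the idea of clustering concrete, here is a minimal sketch in Python. It is not the method the report's authors used, which this post does not describe. It assumes each incident can be summarized as a set of categorical attributes, measures similarity with Jaccard distance, and merges the closest groups with single-linkage agglomerative clustering. The incident records, attribute labels, function names, and the 0.5 threshold are all invented for illustration.

```python
from itertools import combinations

# Toy, invented incident records (NOT data from the Verizon report).
# Each incident is described by a set of categorical attributes.
incidents = {
    "inc1": {"actor:external", "action:malware", "asset:pos-terminal"},
    "inc2": {"actor:external", "action:malware", "asset:pos-terminal", "vector:ram-scraper"},
    "inc3": {"actor:external", "action:hacking", "asset:web-app"},
    "inc4": {"actor:external", "action:hacking", "asset:web-app", "vector:sqli"},
    "inc5": {"actor:internal", "action:misuse", "asset:database"},
    "inc6": {"actor:internal", "action:misuse", "asset:database", "vector:privilege-abuse"},
}


def jaccard_distance(a, b):
    """Return 1 - |a & b| / |a | b|: 0.0 for identical sets, 1.0 for disjoint ones."""
    return 1.0 - len(a & b) / len(a | b)


def single_linkage(c1, c2, data):
    """Distance between two clusters = distance between their closest members."""
    return min(jaccard_distance(data[x], data[y]) for x in c1 for y in c2)


def cluster(data, threshold):
    """Merge the two closest clusters repeatedly until none are within `threshold`."""
    clusters = [frozenset([name]) for name in data]
    while len(clusters) > 1:
        # Find the closest pair of clusters under single linkage.
        (i, j), dist = min(
            (((i, j), single_linkage(clusters[i], clusters[j], data))
             for i, j in combinations(range(len(clusters)), 2)),
            key=lambda pair: pair[1],
        )
        if dist > threshold:
            break  # remaining clusters are too dissimilar to merge
        merged = clusters[i] | clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)] + [merged]
    return clusters


groups = cluster(incidents, threshold=0.5)
for group in groups:
    print(sorted(group))
```

With this toy data, the point-of-sale, web application, and insider-misuse incidents end up in three separate groups, because incidents within a group share most of their attributes while incidents across groups share at most one. A real analysis would involve far more attributes and incidents, and the choice of distance measure and threshold would need to be justified against the data.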

The report includes numerous graphs that illustrate how cyber attacks have evolved over time. For example, the report shows that point-of-sale attacks, despite grabbing many of this year's headlines because of the breach at retail giant Target, have actually become less common over the past few years. Meanwhile, web application attacks have become more common, perhaps because successful attacks against web-based payment processing and data management systems can produce a much greater yield of customers’ personal information. The report also documents a dramatic increase in attacks that come from outside an organization. Five years ago, attacks against an organization were rather evenly split between those started by people working for the organization, either maliciously or accidentally, and those launched by external adversaries. The picture is quite different today, as Figure 4 and Figure 5 on page 8 of the report show.

Verizon’s report is very valuable. Grouping attacks into clusters and describing them in terms of numerous variables can help an organization understand which of its systems are most vulnerable to attack and who is likely to be behind those attacks, based on what the industry as a whole is experiencing. Organizations can then plan their cyber security efforts based on verified data rather than on intuition. There is no substitute for good, actionable data that provide a comprehensive picture of the threat landscape. Verizon’s report is a treasure trove of cyber security intelligence that was made possible both by a concerted effort to collect all of this data and by the ability to analyze it using statistical techniques from Data Science.

The report is also valuable for cyber security and data science educators. Our Computer Science department offers graduate degrees in both information security and data science. (These programs are also available online.) The nine categories the Verizon report identifies provide an interesting framework for organizing the curriculum to ensure that students understand how various attacks work, what they target, how they can be counteracted from a technical perspective, and what kinds of resources and controls need to be established to neutralize them. For our data science students, the volumes of data generated by security appliances offer an important area in which to apply their data analysis acumen and influence decisions on how to keep our systems secure. Studies like Verizon’s show the intriguing opportunities for experts from these two critical fields to collaborate.

The Department of Homeland Security has dubbed October National Cyber Security Awareness Month. The more actionable data we have about cyber security incidents, the more aware we become of how best to protect our systems. The industry needs more analyses like this one so that it can keep pace with the rapidly evolving cyber threat.


About Ray Klump

Associate Dean, College of Aviation, Science, and Technology at Lewis University; Director, Master of Science in Information Security, Lewis University. You can find him on Google+.
