Two of the most exciting areas of Computer Science these days are cyber security and data science. The recently released Computer Science Curricula 2013: Curriculum Guidelines for Undergraduate Degree Programs in Computer Science, issued jointly by the ACM and IEEE, places a large focus on these two increasingly important areas of the field. Undergraduate programs across the country, including at Lewis, have been creating courses and programs to prepare students for the many exciting opportunities in cyber security and data science. Just in the last two years, Lewis’ Computer Science program has added Cloud and Virtualization, Cyber Security and Forensics Tools, Introduction to Data Mining, and Machine Learning to our list of course options for Computer Science students. We’ll also be launching a Master’s degree in Data Science starting in Fall 2014, which will join our Master of Science in Information Security program in our stable of graduate offerings. Clearly, there is a lot of activity in these two vital areas.
It is not uncommon for emerging fields to find common ground and start to influence each other as they mature. We’re seeing that now with cyber security and data science. As described in this article, there are both concerning challenges and exciting opportunities in the overlap of cyber security and big data.
In terms of opportunities, collecting and processing huge additional amounts of data will give us better visibility into any system. For example, by installing and collecting data from sensors in a battlefield, military personnel can keep better track of the enemy and make more informed decisions about deploying troops to combat them. On the cyber battlefield, collecting more data about ongoing and evolving attacks can help cyber security personnel deploy the best strategies for combating those attacks and minimizing their impact. The more you know, the better prepared you are, provided you have a good way to organize and make sense of what you know. That is what data mining is all about: figuring out how to make sense of all the data that have been collected so that they can help inform decision making.
There are challenges to this, however. What if the software that has been written to mine the data is compromised? What if there are bugs in it that lead users to draw incorrect conclusions from the data? What if the data sets themselves are compromised so that they contain incorrect or falsified data? What if the communication channels these systems depend upon to share data in real time are interrupted?
Big Data systems present a rather large attack surface, and it is up to software developers, data scientists, and cyber security specialists to protect it. Programs such as the military’s Mining and Understanding Software Enclaves (MUSE), which is described in the article, are important initiatives because they seek to establish best practices for ensuring that the software, storage, communications, and protection systems work together in ways that help shore up the holes.
The stakes are high and growing. To make best use of big data, we need to be able to protect data from being breached, because we need to be able to trust the data before we can use it. To establish trust in a data system, we need to collect as much intelligence as we can about possible attacks against it. The two problems are intertwined and must be solved simultaneously.
That’s your cue, Computer Science majors. Put on your super suits.