This debate has considerable relevance for the world of computer security. Many of the systems we build to protect our networks work automatically to quarantine virus-infected files or block attacks, and indeed, automated attacks often happen more quickly than human beings can react. However, sophisticated attackers have proven that they can effectively outsmart our machines. Obfuscated malware evades detection by anti-virus software, while exploits that target zero-day vulnerabilities slip past intrusion detection systems. Perhaps, in order to surmount these problems, we need to bring people back into the loop on the defensive side.
The VizSec Workshop is an international academic conference that explores the intersection of human-machine interfaces and cyber security challenges in search of the right balance between automation and human insight. These subjects are particularly interesting to those of us at Lancope, where I work as Director of Security Research. We build systems that enable human operators to better understand what is going on in their computer networks, with the ultimate goal of detecting and analyzing malicious activity that fully automated security systems have missed.
For this year’s VizSec Workshop, Lancope prepared some interesting visualizations of malware command and control behavior. The goal is to see if we can visually differentiate certain kinds of malware behavior from legitimate network traffic. The data available from Lancope’s malware research suggests that 85% to 95% of malware samples use TCP port 80 to communicate with their command and control servers. We decided to investigate the other TCP and UDP ports chosen by the remaining samples to see if there are any interesting patterns that emerge.
We took a look at the command and control behaviors of a collection of nearly two million unique malware samples that were active between 2010 and 2012. These samples reached out to nearly 150,000 different command and control servers on over 100,000 different TCP and UDP ports. We created heat maps representing the relative popularity of each port. Each pixel in the images we generated represents a single port number, and the color of each pixel represents the number of command and control hosts in our sample set utilizing that port.
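The general approach can be sketched in a few lines of Python. This is not the code we used, just a minimal illustration of the idea: the 65,536 possible port numbers are laid out on a 256x256 grid, one pixel per port, and each pixel's intensity reflects how many command and control hosts used that port. The sample counts below are made up for the example.

```python
import numpy as np

# Hypothetical counts of command and control hosts per destination port.
# (The real data set tracked TCP and UDP ports separately.)
port_counts = {80: 12000, 443: 3100, 8080: 950, 53: 610, 1337: 42}

# Lay the 65,536 possible port numbers out on a 256x256 grid,
# one pixel per port: row = high byte, column = low byte.
grid = np.zeros((256, 256))
for port, hosts in port_counts.items():
    grid[port // 256, port % 256] = hosts

# A log scale keeps the dominant port 80 from washing out rarer ports.
heat = np.log1p(grid)

# Rendering the heat map (requires matplotlib):
# import matplotlib.pyplot as plt
# plt.imshow(heat, cmap="hot", interpolation="nearest")
# plt.savefig("c2_ports.png")
```

With a log color scale, a handful of very popular ports still stand out without hiding the long tail of rarely used ones.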
To create a baseline of legitimate traffic to compare this data against, we monitored a small office network over the course of one month and collected information about the ports that computers on that network contacted. We generated images from that data as well, and certain distinctions were immediately visible.