"The ThreatData framework is comprised of three high-level parts: feeds, data storage, and real-time response," Mark Hammell, threat researcher at Facebook, explained in a blog post on Tuesday.
"Feeds collect data from a specific source and are implemented via a light-weight interface. The data can be in nearly any format and is transformed by the feed into a simple schema we call a ThreatDatum. The datum is capable of storing not only the basics of the threat (e.g., evil-malware-domain.biz) but also the context in which it was bad. The added context is used in other parts of the framework to make more informed, automatic decisions."
The information fed into the framework includes malware file hashes from VirusTotal, malicious URLs from malware tracking sites, threat intelligence bought from various vendors, and so on. Each feed transforms its raw data into the ThreatDatum schema.
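To make the feed idea concrete, here is a minimal sketch in Python of what such a light-weight interface might look like. Facebook has not published its schema or code, so every name here (the `ThreatDatum` fields, the `Feed` protocol, the `UrlListFeed` class and its comma-separated input format) is an assumption made purely for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterable, Iterator, Protocol


@dataclass
class ThreatDatum:
    """Hypothetical schema: the threat itself plus the context it was seen in."""
    indicator: str                      # e.g. a URL, domain, or file hash
    indicator_type: str                 # e.g. "url" or "file_hash"
    source: str                         # which feed produced this datum
    context: Dict[str, str] = field(default_factory=dict)


class Feed(Protocol):
    """Assumed interface: a feed only has to yield ThreatDatum objects."""

    def fetch(self) -> Iterator[ThreatDatum]:
        ...


class UrlListFeed:
    """Illustrative feed that parses lines of the form 'url, reason'."""

    def __init__(self, source: str, lines: Iterable[str]) -> None:
        self.source = source
        self.lines = lines

    def fetch(self) -> Iterator[ThreatDatum]:
        for line in self.lines:
            line = line.strip()
            # Skip blank lines and comments in the raw listing.
            if not line or line.startswith("#"):
                continue
            url, _, reason = line.partition(",")
            yield ThreatDatum(
                indicator=url.strip(),
                indicator_type="url",
                source=self.source,
                context={"reason": reason.strip() or "unknown"},
            )


raw_listing = [
    "# made-up tracker export",
    "http://evil-malware-domain.biz/payload, drive-by download",
    "",
    "http://another-bad-host.example",
]
feed = UrlListFeed("example-tracker", raw_listing)
data = list(feed.fetch())
```

The point of the sketch is that the source format is handled entirely inside the feed, so the rest of the pipeline only ever sees one schema.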
This data is then checked against two internal Facebook data repositories: Hive (which contains historic data about past and long-term threats) and Scuba (which holds data about the most recent threats).
The results are acted upon immediately by a processor that examines each ThreatDatum in real time and automatically takes one or more actions: feeding the malicious URL blacklist that protects Facebook users, feeding Facebook's internal security event management system to protect the company's networks, flagging and forwarding interesting malware files for further analysis, and so on.
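A real-time response step of this kind could be sketched as a small rule dispatcher. The Python below is only an illustration under stated assumptions: the `Processor` class, its `register` and `process` methods, the `novel` context key, and the in-memory blacklist and analysis queue are all hypothetical stand-ins, not Facebook's actual implementation.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Set, Tuple


@dataclass
class ThreatDatum:
    """Hypothetical schema, repeated here so the sketch is self-contained."""
    indicator: str
    indicator_type: str
    context: Dict[str, object] = field(default_factory=dict)


Predicate = Callable[[ThreatDatum], bool]
Handler = Callable[[ThreatDatum], None]


class Processor:
    """Assumed design: run every handler whose predicate matches the datum."""

    def __init__(self) -> None:
        self._rules: List[Tuple[Predicate, Handler]] = []

    def register(self, predicate: Predicate, handler: Handler) -> None:
        self._rules.append((predicate, handler))

    def process(self, datum: ThreatDatum) -> int:
        """Apply all matching rules and return how many actions fired."""
        fired = 0
        for predicate, handler in self._rules:
            if predicate(datum):
                handler(datum)
                fired += 1
        return fired


# Stand-ins for the URL blacklist and the malware analysis queue.
url_blacklist: Set[str] = set()
analysis_queue: List[ThreatDatum] = []

processor = Processor()
# Malicious URLs go onto the blacklist.
processor.register(
    lambda d: d.indicator_type == "url",
    lambda d: url_blacklist.add(d.indicator),
)
# File hashes marked as novel are forwarded for further analysis.
processor.register(
    lambda d: d.indicator_type == "file_hash" and bool(d.context.get("novel")),
    analysis_queue.append,
)

processor.process(ThreatDatum("http://evil-malware-domain.biz", "url"))
processor.process(ThreatDatum("0123abcd", "file_hash", {"novel": True}))
```

Because one datum can match several rules, a single indicator can trigger more than one action, which matches the "one or more things" behavior described above.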
Hammell says that ThreatData has successfully spotted many threats that conventional anti-virus solutions, and especially those used internally by Facebook, failed to detect.
It has also allowed the team to understand where threats come from, what types of attacks are used, how frequent they are, and when they happen.
"We're constantly finding new ways to improve and extend the ThreatData framework to encompass new threats and make smarter decisions with the ones we've already identified. We realize that not all aspects of this approach are entirely novel, but we wanted to share what has worked for us to help spark new ideas," he concluded.
And, who knows, maybe the system will be so successful that the company will consider marketing it to the rest of us.