Researchers work on self-healing cloud infrastructure

Cloud computing has become ubiquitous, spawning hundreds of new web-based services, platforms for building applications, and new types of businesses and companies. However, the freedom, fluidity and dynamism that cloud computing provides also make it particularly vulnerable to cyber attacks. And because the cloud is a shared infrastructure, the consequences of such attacks can be extremely serious.

With funding from DARPA, researchers from the MIT Computer Science and Artificial Intelligence Laboratory (CSAIL) aim to develop a new system that would help the cloud identify and recover from an attack almost instantaneously.

Typically, cyber attacks force the shutdown of the entire infiltrated system, regardless of whether the attack is on a personal computer, a business website or an entire network. While the shutdown prevents the attack from spreading, it effectively disables the underlying infrastructure until cleanup is complete.

Professor Martin Rinard, a principal investigator at CSAIL and leader of the Cloud Intrusion Detection and Repair project, and his team of researchers aim to develop a smart, self-healing cloud computing infrastructure that would be able to identify the nature of an attack and then, essentially, fix itself.

Their approach is to examine the normal operations of the cloud to create guidelines for how it should look and function, then draw upon this model so that the cloud can identify when an attack is underway and return to normal as quickly as possible.
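To make the idea concrete, the observe-then-detect loop can be illustrated with a minimal Python sketch. This is not the CSAIL system, whose internals the article does not describe; it is a simple statistical stand-in that assumes a single numeric metric (a hypothetical requests-per-second reading), a baseline summarized as a mean and standard deviation, and a caller-supplied `repair` callback. All function names and data values here are invented for illustration.

```python
from statistics import mean, stdev


def build_baseline(samples):
    """Summarize observed normal behavior as (mean, standard deviation)."""
    return mean(samples), stdev(samples)


def is_anomalous(value, baseline, threshold=3.0):
    """Flag a reading more than `threshold` standard deviations from the mean."""
    mu, sigma = baseline
    if sigma == 0:
        # A perfectly constant baseline: any deviation at all is unusual.
        return value != mu
    return abs(value - mu) / sigma > threshold


def monitor(readings, baseline, repair):
    """Check each reading against the baseline; call `repair` on anomalies.

    Returns the indices of the readings that triggered a repair.
    """
    flagged = []
    for index, value in enumerate(readings):
        if is_anomalous(value, baseline):
            repair(index, value)
            flagged.append(index)
    return flagged


# Hypothetical data: readings collected while the system is known to be healthy.
normal_requests_per_sec = [100, 102, 98, 101, 99, 103, 97, 100]
baseline = build_baseline(normal_requests_per_sec)

# Hypothetical repair action: here it only records what it was asked to fix.
events = []
flagged = monitor([101, 99, 450, 100], baseline,
                  lambda index, value: events.append((index, value)))
print(flagged)
```

In this toy run only the third reading, far outside the range seen during normal operation, triggers the repair callback. A real system would presumably model far richer behavior than one metric, and its repair step would have to restore state rather than simply log the event.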

“Much like the human body has a monitoring system that can detect when everything is running normally, our hypothesis is that a successful attack appears as an anomaly in the normal operating activity of the system,” said Rinard. “By observing the execution of a ‘normal’ cloud system, we’re going to the heart of what we want to preserve about the system, which should hopefully keep the cloud safe from attack.”

Rinard believes that a major problem with today’s cloud computing infrastructures is the lack of a thorough understanding of how they operate. His research aims to identify systemic effects of different behavior on cloud computing systems for clues about how to prevent future attacks.

“Our goal is to observe and understand the normal operation of the cloud, then when something out of the ordinary happens, take actions that steer the cloud back into its normal operating mode,” said Rinard. “Our expectation is that if we can do this, the cloud will survive the attack and keep operating without a problem.”
