Nabeel's Blog: Clouds and Storms [Part 1]

Decoded title: Security/Privacy issues in the Cloud (from the PoV of utility computing)

The objective of this post is to raise awareness about the privacy/security issues that may arise in the cloud computing paradigm: if you are a security researcher, where you may want to focus; if you are a cloud provider, what you need to safeguard against and what customers will expect; if you are a cloud user (technical or non-technical), what you can expect and what you can do to minimize the risks.

(Note that there could also be security benefits in moving to the clouds, as opposed to maintaining your own infrastructure. For example, cloud providers may offer a more secure infrastructure and can afford security expertise that may not be a viable option for small and medium businesses in particular. Also, the virtual machines deployed may be better configured, and virtualization inherently provides a certain level of security.)

As we all know, there is, in general, a gap between research solutions and the solutions industry implements; it is partly due to the fact that some elegant theoretical solutions (published even in top conferences or journals) are not practical. However, there is a great deal of useful research that could well be utilized but is not implemented for one reason or another. One of the main reasons is that most businesses/agencies/users don't see an ROI in security/privacy because the effect is not immediate. In this post I'd also like to encourage you to think about security/privacy up front, no matter which stakeholder you are.

It is not a secret that cloud computing is getting a lot of attention these days. I think that economies of scale (or on-demand elasticity) are the biggest driver for this compared to conventional IT outsourcing - money matters! You pay only for the amount of storage or computational power you use. Not only businesses but also government agencies are moving to what appears to be the current big thing. (I wonder what the next big thing might be?)

If you can relate to, for example, Amazon EC2, S3, Google Apps, free email services, chat services, Yahoo Pipes, Flickr, Facebook, YouTube, Hulu, Zoho, 3Tera AppLogic, etc., you are living in the cloud, irrespective of whether the service is free, consumption based or subscription based. It's pretty much everything we currently do. (I am not a big fan of cloud definitions; further, there are disagreements about the origin of clouds [4], which I am not going to look at here.)

Quote:
The Pew Internet & American Life Project released survey results in September 2008 reporting that 69 percent of Americans who are online use Web-based e-mail, store data or use software applications over the Internet. In October 2008, the market research firm IDC forecast that spending on IT cloud services would reach $42 billion by 2012 [2].

Let me start the $subject with the following quote:
"Privacy and security are the number one concern of organizations that are thinking about going into the cloud space." said Brendon Lynch, senior director of privacy strategy for Microsoft's trustworthy computing group [1].

What are these privacy/security concerns? The rest of the post aims to look into them. As we know, privacy/security can only be as good as its weakest link. The goal is to identify those weakest links.

In all cloud arrangements (SaaS, PaaS or IaaS), your data ends up in someone else's hands, outside of your security perimeter. (I am still a free cloud user; I use Gmail a lot, upload my documents to Google Apps, occasionally share some photos on Yahoo's Flickr, and share on Facebook. I don't know where all my online data, including sensitive data, physically resides, but my desire to have the data available from anywhere and to connect with people has overridden the perceived risks.) Is the issue new? Not really. Well before the current clouds, there were services for outsourcing network storage, databases, web site hosting and IT services, which also move your data out of your organization. What's different here? I see the following differences.

1. In clouds, we know that the data resides in one or more data centers, but we don't know which ones; it is not limited by space or geography. What are the legal/privacy/security implications?
2. Outsourcing has never been this cheap, which is an incentive to use the cloud that traditional outsourcing did not offer. What could go wrong if it becomes pervasive?

Does the locality of the data in the cloud matter? The absence of a physical boundary is an interesting outcome of the cloud. Note that different countries have their own legal frameworks. For example, data protection laws in the US are very different from those in European countries (the EU is stricter). In other words, depending on where your data is located, you'll have a different expectation of privacy. Suppose a company X resides in an EU country, but its customers are mainly in the USA. Given the differences in legal protection, I am not sure whether X (a consumer of the cloud) can ask the cloud provider to host its data/service in the USA. Even if that is allowed, is X making an informed decision? Since the data would reside closer to the customers, access would be faster; but what about security/privacy protection? If there's a data breach, there will be less protection in the USA than in the EU. [On a positive note, the inability to trace data to a specific location is a good thing from the security PoV; it provides some level of anonymity: if attackers do not know where the target is, they have nothing to attack.]

How can a cloud provider efficiently identify whether a consumer sticks to the terms of service (e.g., the AWS ToS prohibits illegal uses)? Further, if a consumer uses the cloud services in a way that threatens, say, national security, but the cloud provider is unaware of it, who is held liable for the threat? The consumer, the service provider, or both? The extensive work on anomaly detection in the IDS research area could be very useful in this regard. But even if good anomaly detection techniques are available, how do we define anomalous patterns in the cloud?
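To make the anomaly detection idea a little more concrete, here is a minimal sketch in Python. It is only an illustration under my own assumptions, not how any cloud provider actually works: the function name `flag_anomalies`, the usage numbers and the threshold of 3 standard deviations are all hypothetical. It flags observations that deviate strongly from a consumer's own baseline; the hard part raised above, deciding what the baseline and the "anomalous pattern" should be in a cloud, is exactly what this sketch does not solve.

```python
from statistics import mean, stdev


def flag_anomalies(baseline, observed, threshold=3.0):
    """Return the indices of observed values that lie more than
    `threshold` standard deviations from the baseline mean.

    `baseline` is a (hypothetical) history of normal usage for one
    consumer, e.g. outbound requests per hour.
    """
    mu = mean(baseline)
    sigma = stdev(baseline)
    if sigma == 0:
        # A perfectly flat baseline: anything different is suspicious.
        return [i for i, x in enumerate(observed) if x != mu]
    return [i for i, x in enumerate(observed)
            if abs(x - mu) / sigma > threshold]


# Hypothetical usage history and new observations for one consumer.
baseline = [100, 110, 95, 105, 98, 102, 107, 99]
observed = [101, 104, 950, 97]

print(flag_anomalies(baseline, observed))  # only the spike at index 2 is flagged
```

A per-consumer baseline is just one possible design choice; a provider could equally compare a consumer against its peers, which raises its own questions about what "normal" means.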

Can misbehaving consumers affect the benign ones? AFAIK, many cloud providers use Xen virtual machines. However, unless you pay extra, you, as a consumer, have to share the same physical machine, or even the same virtual machine, with other consumers. Virtualization techniques provide a certain level of isolation, but I am sure there is increasing interest in this area as a way of improving security in the context of cloud computing.

How long can a cloud provider retain my data after I delete it? Also, once you put your data in someone else's facility, can you ever be sure that they removed it completely? Deleting data as soon as you tell the provider your intention is trivial if the provider manages only a few consumers. However, with the exploding use of the clouds, this has become a challenging problem. The report by Joseph Bonneau backs this up with results from real applications [6]. For performance reasons, most providers delete data much like the recycle bin on your computer does: the data is marked as deleted but not removed right away. For example, Facebook's retention policy says that "When you update information, we usually keep a backup copy of the prior version for a reasonable period of time to enable reversion to the prior version of that information". Like Facebook, most vendors do not state a specific time for how long they retain your data, in order to prevent legal action against them. Clouds are increasingly used to store PII (Personally Identifiable Information), and the failure to delete it promptly could violate user privacy when the PII remains available long enough for an attacker to obtain it. It is a challenging task for cloud providers to balance performance and privacy/security in this regard - architecture/design should consider these concerns together, not in isolation. For consumers, it is better to choose a provider whose policies make quantitative, or at least more specific, claims about the data retention period.
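The recycle-bin style of deletion described above can be sketched as follows. This is a toy model under my own assumptions, not any provider's real implementation: the class `RetentionStore`, its method names and the 30-day retention window are all hypothetical. The point is only to show the gap between "the user can no longer see the data" and "the data is physically gone".

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Dict, Optional


@dataclass
class Record:
    data: bytes
    deleted_at: Optional[datetime] = None  # None means "not deleted"


class RetentionStore:
    """Toy store with soft delete and a stated retention period."""

    def __init__(self, retention: timedelta):
        self.retention = retention
        self._records: Dict[str, Record] = {}

    def put(self, key: str, data: bytes) -> None:
        self._records[key] = Record(data)

    def get(self, key: str) -> Optional[bytes]:
        # Soft-deleted records are hidden from the user.
        rec = self._records.get(key)
        if rec is None or rec.deleted_at is not None:
            return None
        return rec.data

    def delete(self, key: str, now: datetime) -> None:
        # Cheap "recycle bin" delete: only mark the record.
        rec = self._records.get(key)
        if rec is not None and rec.deleted_at is None:
            rec.deleted_at = now

    def purge(self, now: datetime) -> int:
        # Physically remove records whose retention period has passed.
        expired = [k for k, r in self._records.items()
                   if r.deleted_at is not None
                   and now - r.deleted_at >= self.retention]
        for k in expired:
            del self._records[k]
        return len(expired)

    def physically_present(self, key: str) -> bool:
        return key in self._records


store = RetentionStore(retention=timedelta(days=30))
t0 = datetime(2009, 11, 16)
store.put("photo", b"holiday.jpg bytes")
store.delete("photo", now=t0)

print(store.get("photo"))                  # hidden from the user
print(store.physically_present("photo"))   # but still on the provider's disks
print(store.purge(now=t0 + timedelta(days=30)))
print(store.physically_present("photo"))
```

A provider that publishes the retention period (the `retention` parameter here) gives consumers the kind of quantitative claim recommended above; without it, the window between `delete` and `purge` is simply unknown.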

How can the cloud live up to the perimeter security expectations of consumers? Nico Popp at VeriSign raises the interesting question "what does perimeter security mean when the perimeter extends beyond the familiar boundaries of today's corporate network?" [7] In the conventional setting, enterprises have their perimeter security controls (firewalls, IDS, etc.) placed on-premise, either managed by themselves or outsourced; these controls protect the enterprise infrastructure and data from malicious traffic, malware, unauthorized access, etc. With cloud computing, an enterprise's mobile users will be accessing the organization's resources without going through the in-house perimeter security controls. It should be clear that cloud computing creates the need for some kind of proxy sitting between the cloud and the mobile users. Who should provide this proxy service? One approach is to have cloud providers, such as Google App Engine, Microsoft Azure or Amazon EC2, themselves provide a security layer over the cloud. This may require them to go beyond their core competencies. An attractive alternative is Security-as-a-Service provided by a third party who already has expertise in conventional perimeter security. In fact, Gartner predicts that by 2013 cloud-based services will account for 60 percent of the revenue in messaging security controls [9]. For example, Zscaler does exactly that. The following diagram shows how it works:

[Diagram omitted. Courtesy: Zscaler]

Can enterprises let go of on-premise perimeter controls? They will still need to have some controls in place. This brings the burden of maintaining two sets of security controls (cost, management, etc.). Can we combine the two? What are the challenges in doing so?
[On a positive note, there has been some research indicating that computers can be better protected against viruses if the anti-virus software is moved to the cloud [10].]

In Part 2, I am hoping to discuss the ownership of data, the control over it, what you can expect from free and commercial cloud services, and some generic issues such as confidentiality and integrity (in light of insider attacks). So stay tuned. And feel free to comment on, criticize or correct anything I have mentioned here.

References:
[1] http://www.informationweek.com/news/windows/security/showArticle.jhtml?articleID=221600544
[2] http://www.govtech.com/gt/727301?topic=117671
[3] Privacy in the Cloud Computing Era - A Microsoft Perspective
[4] http://www.cerias.purdue.edu/site/blog/post/a_quick_note_about_cloud_computing/
[5] http://blogs.cisco.com/security/comments/data_security_and_the_cloud/
[6] http://www.lightbluetouchpaper.org/2009/05/20/attack-of-the-zombie-photos/
[7] http://blogs.verisign.com/innovation/2009/06/are_clouds_of_change_looming_o.php
[8] http://www.technologyreview.com/computing/21303/
[9] http://www.gartner.com/it/page.jsp?id=722307
[10] http://www.eecs.umich.edu/fjgroup/cloudav/

Monday, November 16, 2009
