Nabeel's Blog: 2010

Sunday, December 12, 2010

Thinking security geometrically

Recently, I had an interesting conversation where the discussion diverged into the application of geometry in security protocols. Until then, I hadn't paid much attention to how we derive security protocols. Most of the time we think about security constructs from an algebraic point of view; we hardly ever think in terms of geometric shapes. At the time, I couldn't think of a security protocol whose basis is geometry, so I did some research on the topic. Here's one example: one of the very first secret sharing schemes, Blakley's scheme, is based on the really cool idea of intersecting (hyper)planes. The idea is quite simple. Any n hyperplanes in general position in n-dimensional space intersect at a unique point; that point corresponds to the master secret and the n hyperplanes correspond to the secret shares. For example, two non-parallel lines on the same plane intersect at a specific point, and three planes in general position in space intersect at a specific point. Even the idea behind Shamir's secret sharing scheme could be considered a geometric construction.

(Source: wikipedia)

In the above diagram, each plane represents a secret share and the intersection point represents the master secret.
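To make the hyperplane idea concrete, here is a minimal sketch in Python. It is only an illustration under my own assumptions: the prime modulus P, the function names and the choice to hide the secret in the first coordinate are all hypothetical, and this is the simple n-out-of-n case rather than a full threshold scheme.

```python
import random

P = 2_147_483_647  # prime modulus (2**31 - 1); an arbitrary choice for this sketch


def make_shares(secret, n, rng):
    """Hide `secret` as the first coordinate of a random point in n dimensions,
    then create n hyperplanes a.x = b (mod P) that all pass through that point."""
    point = [secret] + [rng.randrange(P) for _ in range(n - 1)]
    shares = []
    for _ in range(n):
        a = [rng.randrange(1, P) for _ in range(n)]
        b = sum(ai * xi for ai, xi in zip(a, point)) % P
        shares.append((a, b))
    return shares


def recover_secret(shares):
    """Intersect the hyperplanes with Gauss-Jordan elimination mod P and
    return the first coordinate of the intersection point."""
    n = len(shares)
    m = [list(a) + [b] for a, b in shares]  # augmented matrix [A | b]
    for col in range(n):
        pivot = next((r for r in range(col, n) if m[r][col] % P != 0), None)
        if pivot is None:
            raise ValueError("hyperplanes are not in general position")
        m[col], m[pivot] = m[pivot], m[col]
        inv = pow(m[col][col], -1, P)  # modular inverse (Python 3.8+)
        m[col] = [v * inv % P for v in m[col]]
        for r in range(n):
            if r != col and m[r][col]:
                f = m[r][col]
                m[r] = [(vr - f * vc) % P for vr, vc in zip(m[r], m[col])]
    return m[0][n]
```

With fewer than n of these hyperplanes the intersection is a line or a larger subspace, so the point is not pinned down. As far as I understand, this geometric scheme still narrows the candidates somewhat, which is one reason Shamir's polynomial-based scheme is usually preferred in practice.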

Monday, October 18, 2010

GWT

I have been doing a lot of AJAX-based programming lately. Thanks to Google Web Toolkit, I don't need to worry about the underlying AJAX calls; it works seamlessly. However, I had some issues with layouts and laying out widgets. GWT (2.0) works well when you set an explicit height and width for widgets, but acts weirdly when you want to set a variable (percentage) value, especially in standards mode. After a lot of trial and error, I changed most of my outer layout panels to FlowPanels to get the scalable width working.

Tuesday, September 21, 2010

Where do good ideas come from?

The key message from Steven is that ideas come from collaboration. Collaborations/discussions do create Eureka moments, but IMO a good amount of time spent alone can also create Eureka moments. I think the key is the passion to connect the dots.

Saturday, July 10, 2010

Research and camera lenses

There is a resemblance between the way we do research and the camera lenses. I am just trying to tie them together :)

Macro lens - Go deep into the details
Telephoto lens - Foresee the trends that drive the technology and demand new technology/solutions
Wide angle lens - Get the big picture
Prime lens - Focus on a topic

Where to go to Friday Jummah prayers in Menlo Park?

As I am working in Menlo Park during this summer, I had to find a new place to go to Friday Jummah (congregational) prayers. Menlo Park itself does not have a mosque and there are hardly any Muslims around the area. Fortunately, I found two options which worked out for me. I take 2 - 3 hours off every Friday to make it to the prayers. It's good to get away from the busy life for at least a few hours and be with the community every week.

Option1:
Islamic Society of Stanford University holds a Friday prayer starting at 12.15 pm (till 1.10 pm). This one is the closest to Menlo Park (about 2 miles away). It is managed by the students at Stanford. Stanford does not have a mosque though. (The good thing about Purdue, where I study, is that it has a mosque on the campus itself and there is a large Muslim crowd.) They hold the Friday prayers on the 3rd floor of the old Union building (they do have special wudu areas, however). You get about 50-60 people altogether. Where to park your car? There is a paid parking lot at the intersection of Mayfield Avenue and Lagunita Drive. You can pay with either a credit card or coins. During May/June (when the Spring semester was still in progress), it was somewhat difficult to find a parking spot in that car park (you had better get there about half an hour early). But now, it being summer, there is ample parking available in that lot.

Option2:
Muslim Community Association (MCA) mosque in Santa Clara (Yelp). It's about 15 miles (a half-hour drive) from Menlo Park, but I very much like this place - it's quite a big mosque with a large gathering. If I have more time, I usually drive to this place instead of Stanford. They have two Jummah prayers, one at 12.15 pm (I usually go to this one) and the other at 1.30 pm. In addition to the large prayer areas, the mosque has many other specialized rooms. As for parking, it does have a big parking lot, but it fills up pretty quickly on Fridays - so you had better get there a few minutes early if you want to park at the mosque itself. There is also a parking lot close to the mosque which you can use.

Other useful links:
South Bay Islamic Association
Muslim Community Association

Hope this information might be useful to those who are new to this area.

Wednesday, July 7, 2010

[Security/Privacy] Can we bridge the gap?

I was wondering how we may apply secure computing (e.g. computation over encrypted data) in real life scenarios where you have to interact with real objects as opposed to bits and bytes. It seems to me quite difficult, if not impossible, to achieve the same "invisibility" in the physical world; the very nature of the tangibility makes it hard to do so.

Consider the example where I want to mail my digital photos to Walgreens and get them printed. However, I want Walgreens to see neither the photos nor the printed copies. You see similar privacy/security problems in getting something printed through a courier service such as UPS. I am not aware of any technology that we could use to solve this problem. One important requirement is that the solution be economical for me (the service requester): the amount of work I need to do (hence the cost) to recover the actual thing (actual photos from printed copies) should be cheaper, in the long run, than the service I want (getting the photos printed). Otherwise, I might as well buy my own printing machine and do the printing myself, which would eliminate the privacy/security problem.

Tuesday, July 6, 2010

Driving/Hiking in the Big Basin Redwoods park

For the Memorial Day long weekend, I drove to the Big Basin Redwoods state park. The drive around the park area was really nice as it is covered with big redwood trees. I spotted a few deer as well. The road is very curvy and narrow; at times you have only one lane - so you have to be careful when driving and enjoying the view around you.

I parked the car at the park headquarters ($10 for parking, $5 for the map) and hiked about 10 miles. It was a good workout for me, and the trails are spectacular, covered with large trees and with a water stream flowing close by. Make sure to take the Skyline-to-the-Sea Trail to Berry Creek Falls and then the Sunset Trail if you want to see the beautiful waterfall. The total round-trip distance is about 10 miles. You'll need 6 to 8 hours to cover this particular route and enjoy the surroundings.

It was a little hot and humid on that day. You better take a big bottle of water with you as you get dehydrated quickly there.

Driving/Hiking around the Golden Gate Bridge

I recently visited the Golden Gate Bridge and stopped at both the north and south sides. One of the main stops on the north side is the vista point, which you find immediately to the right after you cross the bridge from the south. You get a good view of the bridge from there. You'll also find the statue of the lonely traveler and a distant view of the Alcatraz prison island. The parking lot is somewhat small and you might have to wait a little bit to find a parking spot. There is also a sidewalk along the bridge.

From the south side, you can drive/hike along the beach. We drove along Marina Blvd and Manson St (North East) and Lincoln Blvd (North West). If you want to take photos, it is better to go there in the morning or late in the afternoon close to sunset. In the afternoon, the sun falls directly on the bridge and it is hard to take any good photos.

Wednesday, June 30, 2010

'The law of fishes'

This is what is happening everywhere!

Impact of a tweet

I was just thinking about whether we can come up with a formula to measure the impact of a tweet, just like the formulas for velocity (v = u + at), force (F = ma), etc. in high-school Physics :-)

The following looks to hold:

impact ∝ ((rate of tweeting) * (quality of the tweets) * (number of followers) * (average frequency of checking tweets by followers)) / ((avg. number of people your followers follow) * (avg. rate of tweets your followers see))

It is not as simple as I initially thought. The number of retweets and the @ tags or # tags can also have a very positive effect. And the rate of tweeting could have a negative effect as well - for example, if you are a fast tweeter, people may simply ignore your tweets as junk. Quality is also a very subjective term. Also, the relationships could well be nonlinear.
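Just for fun, the proportionality above can be written as a tiny Python function. This is only a sketch: the function name, the parameter names and the choice of 1 as the constant of proportionality are my own assumptions, and it inherits all the caveats just mentioned (subjective quality, nonlinear effects, retweets not modeled).

```python
def tweet_impact(tweet_rate, quality, followers, check_freq,
                 avg_following, avg_feed_rate):
    """Relative impact score following the proportionality above.

    The constant of proportionality is taken to be 1, so only ratios
    between scores are meaningful, not the absolute values.
    """
    if avg_following <= 0 or avg_feed_rate <= 0:
        raise ValueError("denominator terms must be positive")
    numerator = tweet_rate * quality * followers * check_freq
    return numerator / (avg_following * avg_feed_rate)
```

Under this linear form, doubling the number of followers doubles the score, while followers who follow twice as many people halve it.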

Another thing I was thinking about was how the rate at which information reaches us has evolved over time. 10 years ago, we used to rely mainly on the morning newspaper, but now within minutes we have access to tweets and blogs to get our hands on the latest. So, how much is the acceleration of information? Roughly speaking, the acceleration ∝ (a day - a few minutes) / 10 years. The velocity at which information reaches us keeps on rising - I feel like the current velocity is already higher than our brains can 'run' - we are overloaded! We will need to add some 'friction' to slow it down :-)

Sunday, June 20, 2010

[Hiking] Muir Woods National Monument

A friend and I drove from Palo Alto to Muir Woods, which is located 11 miles from the Golden Gate Bridge. The designated parking lot is quite small and we were not able to park there (we got there around 10 am). But there is plenty of roadside parking if you drive a few minutes past the parking lot.

Muir Woods has over 6 miles of hiking trails (pay just $1 and get the hiking map - it's very useful if you are not a frequent hiker in Muir Woods). It was one of the best hikes I have ever done - you get to walk among giant redwood trees with the sounds of nature, and the less hiked trails give you a calm and peaceful feeling. I would definitely go back there again when I get a chance. It is not an exaggeration to say that you sometimes get the feeling of being on Pandora from the movie Avatar.

The temperature was around 60-70 F. I took my jacket with me in case it was cold inside the woods, but I did not have to use it. Make sure you take a bottle of water with you if you plan to do a longer trail - we hiked for about 3 hours - but you need a full day or more to cover all the trails.

Talking about the trails, there is one main trail which is flat and even wheelchair accessible; most people walk along this one, so it is somewhat crowded. The other trails are either longer or have some rough paths and slopes (medium to difficult). They are not hiked by many people and you get to experience a calm and peaceful environment. We hiked along Fern creek trail, Camp East Wood trail and Hill Side trail - all of which I would say are of easy to medium difficulty. Most of the time we were hiking along a path with a small water stream flowing next to the trail. There are several other longer trails which we did not have time to cover - Lost trail, Ocean view trail, Redwood trail and Coastal view trail are some of them.

If you are visiting the bay area and are a nature lover, I would definitely recommend this place.

Saturday, June 19, 2010

[Photography] Testing the chromatic aberration in Sigma 10-20mm F4.0-5.6

In reviews, it is said that the Sigma 10-20mm F4.0-5.6 (a UWA - Ultra Wide Angle - lens) has low chromatic aberration (CA), which is a good thing for people like me who mainly do landscape photography. However, I noticed that the aberration varies by quite a margin with different F (aperture) values.

First some background info:
Chromatic aberration is a kind of distortion where the lens fails to focus all component waves of white light to a single focal point. White light consists of component waves of the colors red, yellow, green, blue and violet (in increasing order of frequency). The refractive index of lens glass varies with the wavelength; shorter waves bend more than longer ones. See the following diagram:

When this happens, the sensor averages them and the edges in the image become distorted. See the Wikipedia page for an example. I was told that this happens mainly with light entering closer to the edge of the lens. So, in theory, if you have a smaller aperture (i.e. a larger F value), the aberration should be small. A simple technique to correct this distortion is to place a concave lens right next to the convex lens so that the uneven bending of the different wavelengths is compensated by dispersing them in the opposite direction. But at very short focal lengths, it seems that such a simple technique does not work - it's much more difficult to correct this effect in UWA lenses.

Test setting:
I used a Canon EOS T1i body and took multiple pictures at the same focal length (10 mm) while varying the aperture (F values of 4/4.5/5, 5.6, 8, 11, 22, 29) under the same lighting conditions. Here are some sample photos.

Unprocessed JPEG image for focal length = 10 mm, F = 4.0:

Unprocessed JPEG image for focal length = 10 mm, F = 11.0:

Unprocessed JPEG image for focal length = 10 mm, F = 22.0:

Note that due to the crop factor (1.6 in EOS T1i), the effective (35 mm equivalent) focal length is close to 16 mm.

I was expecting the CA to decrease monotonically with increasing F value (the thinking being that with larger F values we have a smaller aperture, and light passes mostly through the middle of the lens). However, that was not the case. Maybe the complex interleaved lens elements have other effects. The CA visible to the naked eye decreased from F = 4 (highest CA) through 5.6 and 8 to 11 (lowest CA); 11 and 16 had similar quality. At 22, CA was quite high and the image quality was lower than at 4. I am not in a position to explain this bell-curve-like behavior. I would be very interested to know the technical details behind it.

I repeated the above experiment for focal lengths of 12, 14 and 16 mm and found a similar pattern. The visible CA decreased as the F value was increased up to, or a little above, the current focal length (in mm), but beyond that the visible CA increased with increasing F values.

So, with Sigma 10-20mm F4.0-5.6 on a Canon EOS T1i camera, if you want to take landscape photos with minimal CA, set the F value closer to the current focal length.

Friday, June 18, 2010

What's your identity (and religion)? [not a security post]

Yes, what is your identity? We immediately think of the country where we were born (the nationality), the religion (or a subdivision of it) we follow, the ethnicity, the caste to which we belong, our parents, our siblings, the native languages we speak, our skin color, our height, etc., don't we? But wait...if you come to think about it, we pretty much don't have control over any of the above attributes; we don't have control over who our parents are, where we are born and so on; YET we not only label ourselves using those attributes but sometimes go so far as to start an armed struggle based on the differences in these attributes. Look at the current news -- most of the conflicts are due to these attributes -- the attributes we did not earn ourselves, but that were GIVEN to us (different religions interpret how this inheritance works differently -- but the underlying core is the same -- there should be some source of energy which does everything in such an orderly manner -- some of which is beyond human imagination). It is also sad to see that we discriminate against people based on these labels that they don't have control over: high caste or low caste, black or white, short or tall, and so on. Your nationality is not your identity, and neither are your skin color, your religion (literally), your mother tongue, etc.

So, what exactly is your identity? Identity is something that you build yourself with a good intention and over which you have control. And that serves the greater good. Most religions I know of teach us to build this identity. However, looking at current affairs, the religious identity is gravely misunderstood. This religious label is not your identity. Your religion becomes your identity only when you are truly honest with yourself (for example, treat everyone with the same spoon irrespective of the uncontrollable attributes) and truly care about making the world a better place for everyone (for example, by sharing your knowledge, wealth, etc., by raising your voice in a peaceful manner for the oppressed, the weak, etc.). In short, don't be evil. I hope this post gave you some food for thought.

Wednesday, June 16, 2010

Twitter

After a long wait, I have debuted on Twitter :) My twitter id is nabeel_yoosuf. Hoping to share interesting/useful links and events there.

Tuesday, June 15, 2010

Tracking patients remotely

When we talk about GPS, we immediately think about going from point A to B. Technologies similar to GPS have also been used to track patients remotely. The basic idea is that these devices, which are, most of the time, attached to the patient, report the location information to a central location, and if the movement patterns deviate from the normal patterns, the system detects an anomaly. That anomaly could be something good or bad (for example, a recovering patient making some movement could be a positive sign, while no movement at all could be a negative sign).

There have been commercial as well as research projects in this regard. For example,

Remotely monitoring the location of the elderly: here, here
A research project to track the recovery from a surgery: here
A device to track dangerous psychiatric patients: here
And many more

Even though these devices/techniques are designed/deployed with good intentions, one concern here is that the people who are being monitored have no control over their own data, i.e. their movement information. And they don't have control over who can view their data. Hence, it could lead to serious privacy breaches. I'd like to see a system that gives the target (or someone acting on behalf of the target) more control over their information.

Monday, June 14, 2010

Security :-)

Hi & Lois (via Schneier on security)

Security by deterrence

When no one can watch (or trace what one does), the possibility of someone doing something bad (stealing, breaking in, erasing/modifying/adding data, etc.) is quite high. I was thinking about security cameras in supermarkets and shops. Is it more effective to have those cameras clearly visible to everyone or to have them hidden? In my opinion they should be installed in visible locations; if they are not visible, there will be more bad people attempting to do bad things. It's true that you can track them down by going through the surveillance videos and prosecute them - but think about the cost you have to incur; it is far more economical to display some sort of warning signal. This will reduce the number of such incidents and, yes, you can take the necessary action against those few incidents where the bad guys dare to ignore the warning. Thinking along these lines, you actually don't need real cameras installed everywhere - you can safely have a few fake ones installed along with the real ones - they will act as a deterrent. (If you cannot afford a video surveillance system, it is at least good to have some fake cameras installed.)

What about firewalls, ID (Intrusion Detection) systems? I think we can make a similar argument about them.

Another side note: have you ever come across a situation where you cannot keep your lunch packets or any other food items safe from your co-workers in an office or classmates in a school? One crude way to do that is to take a bite and leave it :-) It'll surely act as a deterrent. A better way is to have it packaged as if it's not a lunch packet - most hungry people won't bother to open that. This sandwich bag seems to be a good idea as well (though it won't work after others figure out your trick).

Thursday, June 3, 2010

[Math] Transitivity of relationships

Most of the relationships in the real world that we know are transitive. For example, if Bob is taller than Tom and Tom is taller than Alice, it naturally implies that Bob is taller than Alice. At the same time, there are many other relationships where the transitivity is not clear. Take, for example, a triangular series among Sri Lanka, India and England. Let's say that India beat England and England beat Sri Lanka. Does that mean India will beat Sri Lanka (comprehensively)? Not necessarily - in fact, it has been shown numerous times in the past that the transitive relationship does not hold (a tournament would be boring if it did). In other words, the relationship is probabilistic in nature.

Some more not necessarily transitive examples in the technology/science field:
(Social networks) Bob is a friend of Sam. Sam is a friend of Tom. It does not necessarily imply that "Bob is a friend of Tom".
(Trust relationships in security) Alice trusts Bob to keep a secret. Bob trusts Mary to keep a secret. It does not necessarily imply that "Alice trusts Mary to keep a secret", since Alice needs to trust something else to make the transitivity work. That something is Bob's ability to judge Mary's trustworthiness to keep a secret.

I also find that some relationships can never be transitive. For example:
(Family relationships) Mary is mother of Alice; Alice is mother of Eve. It is incorrect to say "Mary is mother of Eve".

In computer science, we mostly deal with deterministic transitivity. Take for example, Lamport's clock; if an event A occurs before an event B and an event C occurs before the event A, we safely conclude that the event C occurs before the event B. And the transitivity always holds. But, what about probabilistic transitivity?
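The deterministic cases above can be checked mechanically. Here is a minimal sketch, assuming a relation is represented as a set of (x, y) pairs; the helper name is_transitive is my own and not from any library.

```python
from itertools import product


def is_transitive(pairs):
    """Return True if, whenever (a, b) and (b, d) are in the relation,
    (a, d) is in the relation as well."""
    rel = set(pairs)
    return all((a, d) in rel
               for (a, b), (c, d) in product(rel, rel)
               if b == c)


# "is taller than": transitive
taller = {("Bob", "Tom"), ("Tom", "Alice"), ("Bob", "Alice")}
# "is mother of": never transitive
mother = {("Mary", "Alice"), ("Alice", "Eve")}
# "occurs before" for events C, A, B as in the Lamport clock example
before = {("C", "A"), ("A", "B"), ("C", "B")}
```

Here is_transitive(taller) and is_transitive(before) hold, while is_transitive(mother) does not. The probabilistic case does not fit this yes/no check; one would need something like a probability attached to each pair, which this sketch does not attempt.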

Tuesday, April 20, 2010

Facebook - new advertising model?

I heard the news that Facebook is going to launch a new advertising model where they target ads based on users' browsing history. [1,2,3] From what I understood, FB is not going to (and is unable to) track your complete browsing history; rather, FB is going to build a browsing profile for you based on what you explicitly choose to "like" by clicking a button placed on a web page you browse. I think they already get some amount of browsing history information whenever you click the "f-share" button on a web page, which sends the request to http://facebook.com/.

The question is whether this behavioral targeting is an invasion/violation of privacy. IMO, it's NOT a violation of privacy, as opposed to what the links above try to indicate. Privacy is more about the control YOU have and less about secrecy. Unless YOU explicitly decide to like or share (by clicking), FB will not be able to do any meaningful behavioral targeting. It's still under YOUR control.

Of course, it is a violation of privacy if FB tries to show ads to someone else based on YOUR browsing history, which they tried to do with the Beacon system and failed miserably; YOU lose control over YOUR data in this case. I think FB is not going to do something similar to that with the new behavioral targeting.

Waiting to see how their system actually works!

Thursday, April 15, 2010

write once, remain forever

Here's an example:

Have you ever sent out a “tweet” on the popular Twitter social media service? Congratulations: Your 140 characters or less will now be housed in the Library of Congress.

That’s right. Every public tweet, ever, since Twitter’s inception in March 2006, will be archived digitally at the Library of Congress. That’s a LOT of tweets, by the way: Twitter processes more than 50 million tweets every day, with the total numbering in the billions.

Monday, March 29, 2010

People mining ;-)

~~data~~ people mining - could be used for good or bad purposes just like everything else in life.

www.pipl.com
www.reunion.com
www.classmates.com
www.facebook.com
www.twitter.com
www.linkedin.com
www.myspace.com
www.switchboard.com
www.jigsaw.com
www.google.com
www.bing.com
www.rootsweb.com
www.tributes.com
www.legacy.com

(source)

Sunday, March 28, 2010

Securing systems dealing with sensitive information

I went through the executive summary of the audit report of a popular clinical information system in Canada, which assessed the security measures in place. The 10 recommendations the report makes are quite useful when implementing any access-controlled information system; they are not new, but rather well-known principles (need-to-know, defense-in-depth, leakage prevention, auditing, etc.) that are largely neglected in practice.

Friday, March 26, 2010

SEO poisoning on the rise

Source:

The people who push malware love to trap victims via search. Security companies refer to what they do as "SEO (Search Engine Optimization) poisoning." They identify popular search terms, figure out which ones are likely to bring them suitable targets, and then optimize pages so engines like Google and Bing display their results on the first page -- mixed in amongst the non-malicious pages you actually wanted to find.

So what search words are most likely to get you into trouble? Bearshare (46% malicious sites) and screensaver (42% malicious sites).

The blog post here gives an idea of what kinds of black hat SEO techniques are frequently employed by cyber criminals.

Search engine optimization (SEO) is a collection of techniques used to achieve higher search rankings for a given website. "Black hat SEO" is the method of using unethical SEO techniques in order to obtain a higher search ranking. These techniques include things like keyword stuffing, cloaking, and link farming, which are used to "game" the search engine algorithms.

Cyber criminals also exploit the hot news of the moment (celebrity affairs, deaths, etc.) to get malicious pages ranked high in search results, as people are likely to search for such news.

It is a good idea to make your web sites XSS-safe. If you are a PHP developer, htmlspecialchars and htmlentities are two very useful functions in this regard.
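For readers who don't use PHP, a rough analogue in Python is html.escape from the standard library. The sketch below only illustrates output escaping; the function render_comment is a hypothetical helper of mine, and escaping alone is not a complete XSS defense (it does not cover, say, JavaScript or URL contexts).

```python
import html


def render_comment(user_text):
    """Escape user-supplied text before placing it inside an HTML element,
    so that markup in the input is displayed rather than executed."""
    return "<p>" + html.escape(user_text) + "</p>"
```

With this, an input such as <script>alert("x")</script> is shown to the visitor as literal text instead of being run by the browser.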

If you are a user, think before you click!

Wednesday, March 24, 2010

Learning/thinking by analogies

For people with a computer science background (but not limited to them), working with/thinking in analogies is part of life. For example, take the design patterns Adapter, Bridge, Observer, Factory, etc.; they are all analogies. Analogies help us understand/solve the problem at hand.

I found the following analogy, which appeared in an article in the March 2010 issue of Communications of the ACM, interesting:

Alice owns a jewelry store. She has raw precious materials—gold, diamonds, silver, etc.—that she wants her workers to assemble into intricately designed rings and necklaces. But she distrusts her workers and assumes that they will steal her jewels if given the opportunity. In other words, she wants her workers to process the materials into finished pieces, without giving them access to the materials. What does she do?

Here is her plan. She uses a transparent impenetrable glovebox, secured by a lock for which only she has the key. She puts the raw precious materials inside the box, locks it, and gives it to a worker. Using the gloves, the worker assembles the ring or necklace inside the box. Since the box is impenetrable, the worker cannot get to the precious materials, and figures he might as well return the box to Alice, with the finished piece inside. Alice unlocks the box with her key and extracts the ring or necklace. In short, the worker processes the raw materials into a finished piece, without having true access to the materials.

Cryptographically speaking ;), this is what we try to achieve with computation over encrypted data! (Note: this analogy does NOT fully represent this goal as the authors themselves point out)

Monday, March 22, 2010

rebuff huff 'n puff

Creative!
(decoded title: say no to smoking)

How will the healthcare bill affect medicine?

(The traditional way of managing medical records)

From "10 things you need to know about the healthcare bill":

The bill includes incentives to use more electronic medical records, which should make healthcare more efficient and effective. It would set up pilot programs for medical malpractice tort reform. Community health clinics, which help serve people who often don't have access to other forms of care, would get more funding. Medicare payments would be linked to quality of care, which should shift more providers toward evidence-based standards to see how well treatments work.
Other pilot programs would be set up to study how to improve public health in general, and improve care for people with chronic diseases, rural patients and other groups. The goal is to improve the quality of care while holding the costs down.

Friday, March 19, 2010

spring..

When the snow vanishes from the ground
And no cold breeze is to be found
The feeling of thankfulness is profound
As I know that the spring is around
The corner with fresh hope
And I feel like nothing is out of my scope
Trees will slowly and surely start to blossom
Reminding me how awesome
It is to be alive
And a convertible can I drive :)
~Nabeel

Wednesday, March 17, 2010

To friend or not to

I don't mean to be paranoid here, but you had better think twice before you become friends with someone in a social network.

It may be an undercover agent that you are accepting as a friend; this could lead to privacy violations if you are an innocent party.

Law enforcement agents are following the rest of the Internet world into popular social-networking services, even going undercover with false online profiles to communicate with suspects and gather private information, according to an internal Justice Department document that surfaced in a lawsuit.

Want to know how they do it and what they can obtain? Read up here.
I don't mind if they use social networks to uncover only those who did something wrong or really questionable, but it would be naive of me to think so.

Facebook's rules, for example, specify that users "will not provide any false personal information on Facebook, or create an account for anyone other than yourself without permission." Twitter's rules prohibit users from sending deceptive or false information. MySpace requires that information for accounts be "truthful and accurate."

I am confused now; can I prosecute an undercover agent on the above grounds?

It may be someone impersonating someone else for a totally different reason:

Around September 20, 2006, Lori Drew created the Myspace account for the "Josh Evans" alias. At the time Drew operated the Josh Evans MySpace account, she was aware that Meier had been taking antidepressant medication. Meier committed suicide as a result of the bullying.

It may be someone who tries to defame you by associating you with something that you are not. For example, tagging you in an image that is not socially acceptable or writing defamatory/incorrect remarks about you on your wall.

How do you know if a person is who he/she claims to be in a social network? Well, there's no formula for that. But it is in general a good idea to check the mutual friends a person has before accepting the request. It may not work in some cases. What if some of your friends have already been fooled into being friends with that person? (I have encountered this at least a few times already.)

How privacy vanishes online and some thoughts

Very timely article:
"If a stranger came up to you, would you say your email address, your phone number?
If you have a not so close friend would you tell your DoB to him/her?
Probably not..yet people say it on the Internet."

“Personal privacy is no longer an individual thing: In today’s online world, what your mother told you is true, only more so: people really can judge you by your friends.”

As the article also briefly mentions, you may think that innocuous attributes such as where you work, your current location, where and what you studied, etc. will not lead to identify you as a unique individual. However, there is research indicating that the aggregation of these small small things can lead to something powerful even to the extent to identify your social security number. Actually, one of my research goals is to minimize the revelation of use innocuous credentials used as part of access controlling in service consumption scenarios. In other words, the question is "how do I get the service with no or minimal disclosure of credentials yet convincing the service provider?"

Another question I am trying to answer is "how much privacy do I lose by revealing different bits of information in different places on the Internet?". Intuitively, as you reveal more attributes about yourself, you become easier to identify. How does this relationship vary - is your identifiability proportional to something about your attributes? Some attributes reveal more than others. My next question is about identifying that "something": "Can we capture this notion in an information theoretic way?"
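One possible way to make the question concrete, offered only as a sketch: treat each revealed attribute as carrying -log2(p) bits, where p is the fraction of people who share that attribute value, and add the bits up. The additivity assumes the attributes are statistically independent, which real attributes rarely are, and the fractions used below are made-up illustrative numbers, not measurements.

```python
import math

def surprisal_bits(fraction: float) -> float:
    """Bits revealed by an attribute value shared by `fraction` of the population."""
    return -math.log2(fraction)

def combined_bits(fractions) -> float:
    """Total bits revealed, ASSUMING the attributes are independent."""
    return sum(surprisal_bits(f) for f in fractions)

# Hypothetical fractions: half the population shares attribute A,
# a quarter shares attribute B.
print(combined_bits([0.5, 0.25]))  # 1 bit + 2 bits
```

Under this view, rarer attribute values contribute more bits, which matches the intuition that some attributes reveal more than others.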

Computation over encrypted data [Crypto]

The following diagram shows the ideal situation:

The objective is to perform a general computation over encrypted data so that the party that performs the computation learns neither the input values nor the result of the computation (over a finite field). The computation to be performed (e.g. eigenvalue computation, null space computation, Gaussian elimination, etc.) is public (i.e. known to everyone). Theoretically speaking, one can achieve the above objective using SMC (Secure Multiparty Computation) protocols that evaluate a scrambled Boolean circuit. However, this is not practical.

Two popular practical techniques that we can use:
1. Commutative encryption (Pohlig-Hellman)
2. Homomorphic encryption (Paillier, Damgard, Unpadded RSA, Benaloh, ElGamal, etc.)

Since I am interested in one-off computation, IMO, homomorphic encryption is the most suitable here. Computations over finite fields, in general, involve two binary operations (e.g. addition and multiplication). However, all the practical homomorphic cryptosystems are homomorphic with respect to only one operation (e.g. addition: Paillier, Damgard, Benaloh; multiplication: Unpadded RSA, ElGamal). It should be noted that in the middle of last year, IBM published a paper on fully homomorphic encryption using ideal lattices, but it is computationally intensive and thus not suitable for real applications. So, inventing a practical fully homomorphic encryption scheme is still an open problem. Until such an invention, we need to rely on specialized protocols to solve the aforementioned problem.
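To illustrate what "homomorphic with respect to one operation" means, here is a minimal Python sketch with toy parameters. The tiny primes, the function names and the constants are illustrative assumptions; textbook RSA and Paillier at this size are completely insecure and serve only to show the algebra.

```python
import math
import secrets

# --- Toy Paillier: multiplying ciphertexts ADDS the plaintexts ---
P, Q = 47, 59                      # toy primes (assumed, far too small for real use)
N, N2 = P * Q, (P * Q) ** 2
LAM = (P - 1) * (Q - 1) // math.gcd(P - 1, Q - 1)   # lcm(P-1, Q-1)
MU = pow(LAM, -1, N)               # valid because the generator is g = N + 1

def paillier_encrypt(m: int) -> int:
    while True:                    # pick random r coprime to N
        r = secrets.randbelow(N - 1) + 1
        if math.gcd(r, N) == 1:
            break
    return pow(N + 1, m, N2) * pow(r, N, N2) % N2

def paillier_decrypt(c: int) -> int:
    return (pow(c, LAM, N2) - 1) // N * MU % N

# --- Toy unpadded RSA: multiplying ciphertexts MULTIPLIES the plaintexts ---
RSA_N, RSA_E, RSA_D = 3233, 17, 2753   # classic textbook key (61 * 53)

def rsa_encrypt(m: int) -> int:
    return pow(m, RSA_E, RSA_N)

def rsa_decrypt(c: int) -> int:
    return pow(c, RSA_D, RSA_N)

# The party doing the computation only ever touches ciphertexts.
c_sum = paillier_encrypt(20) * paillier_encrypt(22) % N2
c_prod = rsa_encrypt(6) * rsa_encrypt(7) % RSA_N
print(paillier_decrypt(c_sum), rsa_decrypt(c_prod))
```

Neither toy scheme supports both operations on ciphertexts, which is exactly the limitation described above.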

Sir Ken Robinson: Do schools/universities kill creativity?

Very valid points!

The points that made me think most were the facts that our education system stigmatizes mistakes and schools/universities are like factories that produce people to work in the industry.

Monday, March 15, 2010

Why DRM doesn't work ;)

here and here ;)

Monday, March 8, 2010

TBL on linked data

The talk is about one year old, but still interesting and current. This year's ICDE conference also had some interesting papers on topics related one way or the other to linked data.

Slides of my talk at ICDE 2010

Last week, we had the ICDE 2010 conference in Long Beach, CA. Here are the slides of my talk.

Wednesday, February 24, 2010

Another great presentation

"Cloud computing - why it matters?" by Simon Wardley (OSCON '09). I like the presentation style and the presentation itself.

Saturday, February 20, 2010

Identity 2.0 keynote

The following keynote (found thanks to a friend of mine) is quite old, but I thought of adding it here as the presentation style used is quite interesting (the content is very useful as well). I liked it so much that I watched it twice. I think I am going to copy some of his style in my presentations.

Monday, February 15, 2010

Health care identity theft

A good news article on health care identity fraud and its current status.

Theft of health care identity is relatively new, partly because people are only now starting to use electronic health care records. Last year, in the stimulus package, the US government allocated billions of dollars to start building a nationwide online health care record system over the next couple of years. So, I think there will be more such incidents than what we currently see.

Some stats:
...It is estimated that the number of identity fraud victims in the United States increased by 12 percent, to 11.1 million adults in 2009, while the total annual fraud amount increased by 12.5 percent, to $54 billion.

"Health insurance-related identity fraud is particularly troublesome because of the relative costs. The average identity fraud victim pays $373, while a health insurance fraud victim pays $2,228, and a health insurance fraud typically is about $12,100 in total, compared with $4,841 for an average identity fraud case."

A simple solution to minimize such fraud is to ask for multiple credentials (driver's license, student photo ID, etc.) along with the health insurance card; it is unlikely that an impersonator possesses all of these.

This is good news for those who do research in protecting medical records - there is a real need.

Sunday, February 14, 2010

Some creative Flickr photos marking the day..

(Source: link)

Friday, February 12, 2010

The Good, the Bad and the Ugly at the same time..

It is sad to see that if you criticize or are open minded about the current ruling, they think you are a conspirator..and if you speak for the people in the north and east, they think you are a traitor. I agree with most of the things that Shahani mentions in her blog .. whatever happens politically, still Sri Lanka is one of the best (my public photos bear witness :-)

Thursday, February 11, 2010

Google buzz is criticized for privacy concerns

After setting up Buzz, if you don't change the default settings, others can see whom you most frequently chat with or email, due to the default automatic friends feature. (I am not sure about the "most frequent" part; I guess they pick almost all the contacts you have ever had a conversation with if your contact list is not too long.) Looks like they have not learned from the Facebook Beacon experience -- when it comes to information sharing, opt-in is safer than opt-out.

The above link mentions that:
"Imagine ... a wife discovering that her husband emails and chats with an old girlfriend,"

(Btw, if you are honest, you probably don't need to hide anything. Are we encouraging people to be dishonest by allowing them to hide behind the screen in the name of privacy??)

Also mentions that:
"Imagine ... a boss discovers a subordinate emails with executives at a competitor."

(When you use a free service like Google mail/chat, you don't have much control over your information - your profile, your chat logs, your contacts, your emails ... this raises the question if we should use such services for business purposes or highly private matters??)

There could be other damaging inferences as well. For example, if Bob frequently communicates with one of his doctors, John, who specializes in cancer treatment, others will be able to infer that Bob possibly has some sort of cancer.

Mitigating factors:
There are some mitigating factors, however. Buzz only shares information about other people who are using Buzz and have set up public profiles in Google. So currently, most Gmail users are not publicly listed by the service. Users can also "unfollow" people who they don't want to be linked to.

You can follow the steps in this to change the default settings.

Saturday, February 6, 2010

Alice, Bob, Mallory, Jared, Tim and Eve

Yesterday I was at a short talk on watermarking. I thought of checking out some recent work on the subject, and I was thinking about how I would explain it to someone who is not interested in technical stuff. The following description is adapted from a relatively old paper, with the usual security characters:

Data hiding aims at enabling Alice and Bob to exchange messages in a manner as resilient and stealthy as possible, through a medium controlled by evil Mallory. Alice and Bob don't care if Mallory sees the hidden message.

On the other hand, digital watermarking is deployed by Alice to prove ownership over a piece of data (a music album, movie, photo, document, etc.) to Jared the Judge, usually in the case when Tim the Thief benefits from using/selling that very same piece of data (or maliciously modified versions of it). In order to convince Jared, the piece of data should contain something unique whose existence only Alice can demonstrate. (Ideally, Alice should be able to challenge Tim to show how to get that unique thing from the data; Tim fails to do so since he does not possess a secret that only Alice knows. This will strengthen Alice's claim in Jared's eyes, and Jared is most likely to send Tim to jail.) Jared does not care what that unique thing is - it just needs to be unique. To be effective, Tim should not be able to remove that unique thing from the piece of data (better still if Alice can prove that Tim tried to tamper with the piece of data). From a usability point of view, the unique thing that Alice has attached to the piece of data should not affect the quality or any other desirable property of that piece of data.

Now in another scenario, Alice wants to send a message to Bob through a communication channel controlled by Eve, and she wants to hide the very existence of that message from Eve (she does not even want to show an encrypted message that Eve cannot decipher anyway). So, Alice uses steganographic techniques here. Unlike watermarking, here the hidden message is the main data. Alice takes some public piece of data (e.g. an image) and embeds the message. To Eve, it all looks normal. Alice and Bob share a secret so that once Bob gets the public piece of data, he can extract the hidden message. It would be even better if Eve cannot tell whether any communication took place between Alice and Bob. In certain situations (like in a war), knowing that two parties communicated with one another could be valuable information.
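As a rough illustration of the Alice/Bob/Eve scenario, the sketch below hides message bits in the least significant bits of a cover byte string, at positions derived from a shared secret. Least-significant-bit embedding is just one simple technique chosen for illustration, not the method of the paper mentioned above. The function names are hypothetical, Bob is assumed to know the message length in advance, and Python's `random` module is used only to get repeatable positions; it is not a cryptographically secure generator.

```python
import random

def _positions(secret: str, cover_len: int, nbits: int):
    # Both sides derive the same pseudo-random positions from the shared secret.
    return random.Random(secret).sample(range(cover_len), nbits)

def embed(cover: bytes, message: bytes, secret: str) -> bytes:
    """Write each message bit into the lowest bit of a secret-chosen cover byte."""
    bits = [(byte >> i) & 1 for byte in message for i in range(8)]
    stego = bytearray(cover)
    for pos, bit in zip(_positions(secret, len(cover), len(bits)), bits):
        stego[pos] = (stego[pos] & 0xFE) | bit
    return bytes(stego)

def extract(stego: bytes, nbytes: int, secret: str) -> bytes:
    """Read the bits back from the same positions and reassemble the bytes."""
    pos = _positions(secret, len(stego), nbytes * 8)
    out = bytearray()
    for k in range(nbytes):
        byte = 0
        for i in range(8):
            byte |= (stego[pos[8 * k + i]] & 1) << i
        out.append(byte)
    return bytes(out)

cover = bytes(range(256)) * 2          # stand-in for image pixel data
stego = embed(cover, b"hi Bob", "shared secret")
print(extract(stego, 6, "shared secret"))
```

Each cover byte changes by at most its lowest bit, which is why the carrier still looks normal to a casual observer.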

Thursday, February 4, 2010

Imagination is the limit

Ref: link

Tuesday, February 2, 2010

Funny..

You probably have watched this video earlier. I happened to watch it again. It's so funny :) .. there's a message as well - I don't like people bragging about their personal life in Twitter/Facebook or any other social media, but Twitter could be a useful tool if it is used in the right way.

This one is not only funny, but also very creative :) .. there is some reality as well.

Sunday, January 31, 2010

Minority aspirations and the failed democracy..

I usually don't write about politics. But I couldn't help writing about this one.

The following figure shows the vote distribution of the recently concluded presidential election.

(Source: http://www.srilankanelections.com/)

For me, the green districts (in north and east) do not mean that MR lost or SF won (barring some green spots in the hill country, Colombo and some more urban areas); but rather they mean that minority aspirations are not met by the government ruled by the majority. Look who are living in these areas; either minority Tamils or Tamil-speaking Muslims. If we look at the world history, it is a usual thing that the majority is not sensitive to minority issues. The real challenge is whether MR can reverse this and be sensitive to minority issues. I think it'll take a lot more to heal the ethnic division than winning the war. That's when I will say Sri Lanka has real peace. I hope that day is not far away.

Update [2/2/2010]: First of all, I am neither supporting nor in favor of any political party or any political leader in Sri Lanka. From the point of view of democracy, MR should not be allowed to extend his next term more than what is stipulated; it is he who called for an early election (which should not have been done in the first place; if he's truthful, not power hungry and had no hidden agenda, why an early election??) and there should be consequences for it. But to my disappointment, he's allowed to extend his tenure by almost one year. I don't know much about politics, but I know that when a country does not have a strong opposition (like the current situation), it is unfortunate that, at the end of the day, it's the civilians who have to bear the consequences.

Update [2/8/2010]: It is disturbing to hear that SF has been arrested. As I mentioned before I am not a supporter of SF, but where is democracy?? Would he be arrested under war crimes, had he not stood against MR?? (A few days before, some army officials were also fired under the same ground.)

First they came for the communists, and I did not speak out—because I was not a communist;
Then they came for the trade unionists, and I did not speak out—because I was not a trade unionist;
Then they came for the Jews, and I did not speak out—because I was not a Jew;
Then they came for me—and there was no one left to speak out for me.
~ Martin Niemoller

Substitute the above with brave journalists, impartial newspapers and other media, people who question the current ruling party, etc... is history repeating itself??

Saturday, January 30, 2010

How unique/trackable is your browser?

I've just got a "fingerprint" of my browser through the Panopticlick tool. The result is as follows:

Your browser fingerprint appears to be unique among the 389,007 tested so far.

Currently, we estimate that your browser has a fingerprint that conveys at least 18.57 bits of identifying information.

This is a worrying fact; browser fingerprinting is a very effective way of tracking users on the Internet. Why should you take defensive measures against such tracking? It clearly invades your privacy. You probably don't want someone to profile your online activity without your consent or knowledge. If your browser sends out too much unnecessary information (increasing the likelihood of uniqueness), multiple visits to not only the same site but also different sites can be linked. So, with these fingerprints, systems providing anonymous access to digital content or digital cash become ineffective, since these systems implicitly assume that the attacker does not use the background information available through the communication channel itself.
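As a quick sanity check on the numbers quoted above: singling out one browser among 389,007 equally likely candidates takes log2(389,007) bits, which matches the reported 18.57. That this is how the tool arrives at its "at least" figure for a unique fingerprint is an assumption here; the function name is made up for the sketch.

```python
import math

def identifying_bits(anonymity_set_size: int) -> float:
    """Bits needed to single out one item among equally likely candidates."""
    return math.log2(anonymity_set_size)

print(round(identifying_bits(389_007), 2))  # 18.57
```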

It should be noted that the same browser fingerprinting technique is used to provide protective measures as well. For example, my bank won't ask for additional credentials when I log in through the browser I use every day, but when I log in from a new browser/new location/new computer, they will ask for additional credentials. The challenge is to protect user privacy without compromising security.

Another challenge is to protect user privacy without limiting usability. For example, one technique to minimize the risk of fingerprinting is to disable JavaScript, but most sites require JavaScript to work.

Update [2/2/2010]: The above work makes it possible to identify browsers, but not exact users. Researchers from the Isec lab have devised a method to identify users using social network group membership as background knowledge. It's a two step process:
1. Generate a group membership fingerprint for each user (their thesis is that the collection of groups a user is a member of is more or less unique).
2. Use a history stealing technique to identify the links the user previously visited. Their TR is available here (A practical attack to de-anonymize social network users).

Friday, January 29, 2010

ZERO: How was it discovered?

The history behind the concept of zero is an interesting one. We are so used to the number zero that we cannot live without it (what if a zero is dropped from your salary and appended to your electricity bill ;).

On a serious note, in the early history of counting, the number zero was neither required nor well understood, and the same was true of negative numbers. Why? Early mathematics was based on counting real things as opposed to abstract ideas. It is fascinating to see that things we take for granted, like the concept of zero, took centuries and many great minds to discover. The following timeline of historical events is an indication of this fact. I hope this will be a good reminder to all of us that we will never achieve perfection and we progress through our mistakes/needs. No mistakes/needs, no progress!

3000 BC [3] : Sumerian numerical system - Separated numbers from goods, but no concept of zero (zero loaves of bread, zero cows, etc. did not make much sense at that time :D). Some details of their progress:
Version 1: Different types of goods were represented by different symbols, and multiple quantities represented by repetition.
Examples: two units of grain were represented by two grain-marks. Four oil cans were represented by four oil-can-marks.
Version 2: Separated the quantity of the good from the symbol for the good. That way a great amount of redundancy was prevented. They introduced a sexagesimal system (that is, base 60). Not sure why it's base 60 instead of any other base.
Example: two units of grain were written as the symbol for two followed by the symbol for grain.

(Sumerian system)

Around 3000 BC - Egyptians introduced the earliest fully developed base 10 numeration system. It's not a positional number system like the decimal number system we have, but it can represent large numbers. (For example, to represent 45, they used four copies of the symbol for 10 and five copies of the symbol for 1.) This is similar to Roman numerals.

(Egyptian hieroglyphics)

Egyptians also did not have the concept of zero since they also mainly thought of numbers as concrete concepts for measurement of length, trading, etc. Still, I am amazed by Egyptian math; even without the concept of zero, they were able to build precisely calculated colossal pyramids and other structures. Also, through their math skill, they were one of the ancient nations that got close to calculating the correct number of days per year.

2700-2300 BC - Sumerians/Akkadians invented the abacus. They used it with their sexagesimal number system.

Babylonian number system [2]: First to introduce the place value system (just like the decimal number system we have) but still no concept of zero.
The Sumerian number system still needed many symbols to represent numbers. Influenced by this number system, Babylonians, towards the end of the 3rd millennium BC, introduced the place value system. They just needed two symbols to count.

(Babylonian system)

700 - 400 BC - Use of zero to denote an empty position in the notational system.
Babylonians put two wedge symbols where we would put zero in the decimal notation. These wedge symbols only occurred within a number (as in 5403 in decimal) and were never placed at the end (as in 5430 in decimal); zero was never used as a number, but rather as a punctuation sign.

During this time Greek mathematicians did not use a positional number system. They developed their theories/abstract concepts through shapes/geometry. It was during this period that great mathematicians like Euclid lived. Even without the concept of zero, people like Euclid worked on number theory (which led to the fundamental theorem of arithmetic, the Euclidean algorithm, etc.), but it was based on geometry. However, Greek astronomers used a notation for zero, and it is believed to be similar to how we currently use zero. But they did not appear to have devised a number system based on zero.

(Euclid)

5th century - Indians (mainly Aryabhata) were the first to develop a base 10 positional numeral system (remember that Babylonians invented a base 60 positional numeral system a long time before that), which closely resembles our current decimal system. These dates are still disputed, but I think it's fair to credit Indians for the number system we currently have.

(Aryabhata)

People started to think about numbers as an abstract concept. As a result, the number zero as we use today was born.

7th century (dates are disputed) - The first appearance of zero as a number, by Indians. (There is some dispute about the origin as well - there appears to be some Chinese connection, but I am not sure about it.)

876 AD - The first record of the Indian use of zero which is dated and agreed by all to be genuine.

Indians formulated arithmetic rules involving zero and negative numbers although they did not get it right in the first few attempts. Brahmagupta, in his book "Brahmasphutasiddhanta", got most of the arithmetic operations right except the division by zero: "0/0 is 0 and a number divided by zero is that number". For many centuries after this, division by zero remained a mystery to people; they simply did not know how to explain it. During the same time, Islamic/Arabic mathematicians, especially Al-Khawarizmi, studied the Indian number system and contributed to arithmetic with numbers. The combined work led to the Hindu-Arabic numeral system we are using today. In the 12th century, this system spread to Europe, mainly through the Italian mathematician Fibonacci. There is no doubt that the development of zero is a very important milestone in human civilization and it paved the way to many new concepts.

(There is evidence that in the 6th century, Mayans used a base 20 number system with a number zero. Also, they appear to have used the number zero long before that. However, their knowledge did not influence others.)

There have been many developments and rules about division by zero in history, but let's not go into that. Currently, division by zero is considered undefined in any system that obeys the axioms of a field (e.g. real numbers, complex numbers, etc.).

In the 17th century, Newton and Leibniz, fathers of calculus, played a key role in understanding "division by zero" and its applicability to real life. Instead of considering absolute values, they worked with numbers approaching zero, and were thus able to develop a new branch of mathematics, calculus. I think this is another very important milestone for the number zero.

At present, we cannot imagine Math, Physics, Chemistry or any other branch of science without the value zero, yet it took many centuries to develop the idea of having a zero in the number system, and people had been working with numbers well before zero came into the picture. Empty sets (cardinality zero sets), zero gravity, freezing point, zero probability, accounting, modular arithmetic, calculus, the Cartesian coordinate system and indexing are just a few examples.

We would not have progressed this far, had the concept of zero not been understood.

References:
[1] http://yaleglobal.yale.edu/about/zero.jsp
[2] http://www-groups.dcs.st-and.ac.uk/~history/HistTopics/Zero.html
[3] http://it.stlawu.edu/~dmelvill/mesomath/sumerian.html

Thursday, January 28, 2010

The 3rd data privacy day (Jan 28th)

Today is the data privacy day.

Thursday, January 21, 2010

Interesting broadcast group key management schemes

In this post, I focus on some really neat broadcast group key management (BGKM) schemes/protocols. The ideas behind the schemes are based on quite simple algebraic constructs, but they are very elegant.

In addition to other properties, a good GKM scheme should provide the following two properties.
1. Forward secrecy - a user who left the group should not be able to access new keys
2. Backward secrecy - a user who joined the group should not be able to access old keys

In my opinion, these are the two most difficult properties to satisfy in a group communication setting. In order to satisfy these two properties, it is required to initiate a rekeying operation (i.e. change the existing keys). There have been many GKM schemes proposing various ways of doing the rekeying operation. What sets them apart is the communication cost of the rekeying operation. Earlier GKM schemes had O(n) communication overhead, where n is the number of users in the group. Later, schemes incorporating a hierarchy reduced the communication overhead to O(log n). Further, these rekeying operations are not transparent to users; every time a user joins/leaves, other users need to update their keys. Can we make the communication overhead O(1) (i.e. independent of the number of users in the group)? This is where the BGKM schemes fit.

BGKM schemes make the rekeying operation transparent to existing users at the expense of additional computational cost at the server which manages the BGKM scheme. There are three noteworthy BGKM schemes, which I will go over in some detail, giving the key ideas, in the rest of this post.

1. The secure lock (SL) approach based on the Chinese Remainder Theorem (CRT) [SE 1989]
2. The access control polynomial (ACP) approach based on special polynomials [INFOCOM 2008]
3. The access control vector (ACV) approach based on matrix null spaces [ICDE 2010]

1. The Secure Lock (SL) approach
(Slightly modified the original version)
Each user u_i is given a random secret value s_i and a unique secret number n_i at the time of joining. These n_i's are pairwise relatively prime. Whenever there is a leave or join, the server constructs the following congruences, computes the common solution using the CRT and broadcasts it to the group.

x ~ r_1 (mod n_1)
x ~ r_2 (mod n_2)
...
x ~ r_n (mod n_n)

where ~ is the congruence symbol, r_i = K XOR s_i, K is the actual key. (r_i < n_i)

Then construct the common solution, using the CRT:
x ~ \sum_{i=1}^{n} (N/n_i) * r_i * f_i (mod N)

where N = n_1 * n_2 * ... * n_n,
and f_i is the multiplicative inverse of N/n_i modulo n_i, i.e. f_i * (N/n_i) ~ 1 (mod n_i).

Let the standard representative of the common solution be C.

A user u_r with the secret s_r and the public value n_r derives the key by (C mod n_r) XOR s_r. (Note that C mod n_r gives the value r_r)

Can a newly joined user u_r get its hand on old keys? No, because the old common solution did not consider the congruence x ~ r_r (mod n_r).

Can a user u_r who left the group access new keys? No, because the server removes the congruence x ~ r_r (mod n_r) from the CRT calculation.
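The secure lock construction above can be sketched in a few lines of Python. This is only a toy illustration: the 8-bit key and secrets, the three small prime moduli and the function names are assumptions made for the example, and `math.prod` and `pow(x, -1, n)` need Python 3.8 or later.

```python
from math import prod

def secure_lock(key: int, members) -> int:
    """Solve x ~ (key XOR s_i) (mod n_i) for all members via the CRT.

    `members` is a list of (s_i, n_i) pairs with pairwise coprime n_i.
    """
    N = prod(n for _, n in members)
    x = 0
    for s, n in members:
        r = key ^ s
        assert r < n, "each residue r_i must be smaller than its modulus n_i"
        Ni = N // n
        f = pow(Ni, -1, n)          # inverse of N/n_i modulo n_i
        x += Ni * r * f
    return x % N                    # the standard representative C

def derive_key(lock: int, s: int, n: int) -> int:
    """A member recovers K as (C mod n_r) XOR s_r."""
    return (lock % n) ^ s

# Toy group: 8-bit secrets, moduli are distinct primes larger than 255.
members = [(0x3A, 257), (0xC5, 263), (0x7E, 269)]
C = secure_lock(0xA7, members)
print([hex(derive_key(C, s, n)) for s, n in members])
```

On a join or leave, the server would pick a fresh key and call `secure_lock` again with the updated member list; existing members keep their (s_i, n_i) unchanged.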

2. The Access Control Polynomial (ACP) approach
Each user u_i is given a random secret value s_i at the time of joining. Whenever there is a leave or join, the server constructs the following polynomial of degree n+m and broadcasts it to the group.

f(x) = K + (x - H(s_1 || z))(x - H(s_2 || z))..(x - H(s_n || z))(x - H(a_1 || z))...(x - H(a_m || z))
where K - is the actual key, s_1,.., s_n are the random secret values given to n users, a_1,.., a_m are random fake values used to increase the entropy of f(x), z is a public random value, H is a hash function.

A user u_r with the secret s_r derives the key as f(H(s_r || z)).

Can a newly joined user u_r get its hand on old keys? No, because the old polynomials do not contain the factor (x - H(s_r || z)).

Can a user u_r who left the group access new keys? No, because the server removes the factor (x - H(s_r || z)) from the new polynomials.

The scheme is quite simple. However the security of this scheme is neither formally analyzed nor proved.
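A minimal Python sketch of the ACP idea follows. The choice of SHA-256 as H, the prime 2^61 - 1 as the field modulus, byte-string secrets and the function names are all assumptions for illustration; they are not taken from the original paper.

```python
import hashlib

Q = 2**61 - 1   # prime modulus of the toy field (assumed)

def H(value: bytes, z: bytes) -> int:
    """Hash value || z into the field."""
    return int.from_bytes(hashlib.sha256(value + z).digest(), "big") % Q

def build_acp(key: int, secrets_, z: bytes, fakes=()):
    """Coefficients (low to high) of f(x) = key + prod(x - H(v || z))."""
    coeffs = [1]
    for value in list(secrets_) + list(fakes):
        root = H(value, z)
        nxt = [0] * (len(coeffs) + 1)
        for i, c in enumerate(coeffs):      # multiply by (x - root)
            nxt[i] = (nxt[i] - root * c) % Q
            nxt[i + 1] = (nxt[i + 1] + c) % Q
        coeffs = nxt
    coeffs[0] = (coeffs[0] + key) % Q
    return coeffs

def derive_key(coeffs, secret: bytes, z: bytes) -> int:
    """Evaluate f at H(secret || z) with Horner's rule."""
    x, acc = H(secret, z), 0
    for c in reversed(coeffs):
        acc = (acc * x + c) % Q
    return acc

z = b"public-nonce-1"
acp = build_acp(123456789, [b"alice-secret", b"bob-secret"], z, fakes=[b"pad"])
print(derive_key(acp, b"alice-secret", z))   # a member recovers the key
```

A user whose factor is absent from the product evaluates f at a non-root and gets K plus a nonzero product term, so the result differs from K unless a hash value happens to collide.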

3. The Access Control Vector (ACV) approach
Each user u_i is given a random secret value s_i at the time of joining. The server constructs the following n by (n + 1) matrix X and, whenever there is a leave or join, broadcasts to the group a vector derived from a random vector in the null space of X.

X =
| 1  a_{1,1}  ...  a_{1,n} |
| 1  a_{2,1}  ...  a_{2,n} |
| ...                      |
| 1  a_{n,1}  ...  a_{n,n} |

where a_{i,j} = H(s_i || z_j), H is a hash function, the z_j's are public random values, and s_i is the random secret value given to the user u_i.

The null space of such a matrix is always guaranteed to be nontrivial: X has n rows but n + 1 columns, so its rank is at most n. Hence, there exists a non-trivial column vector Y such that XY = 0. The server picks a random vector Y from the null space, computes the ACV (Access Control Vector) and broadcasts it.

ACV = (K, 0, ..., 0)^T + Y

A user u_r with the secret s_r derives K by taking the dot product of the ACV and her KEV_r (Key Extraction Vector). A user can construct her KEV using her secret and the public random z_j's as follows.

KEV_r = (1, H(s_r || z_1), ..., H(s_r || z_n))^T

Can a newly joined user u_r get its hand on old keys? No, because the vector Y in the old ACVs is not orthogonal to her KEV, so the dot product does not yield the key K.

Can a user u_r who left the group access new keys? No, because the server removes the corresponding row from X, so the vector Y in the new ACVs is not orthogonal to her old KEV.

The scheme is quite elegant. The security of this scheme is analyzed and proved. A downside of this approach is the computational cost at the server when the group size is large.
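The ACV construction can be sketched as follows. As with the other examples, this is a toy under stated assumptions: SHA-256 as H, arithmetic modulo the prime 2^61 - 1, byte-string secrets, and hypothetical function names. The null space vector is found with plain Gauss-Jordan elimination, which also hints at why the server-side cost grows with the group size.

```python
import hashlib
import secrets

Q = 2**61 - 1   # prime modulus of the toy field (assumed)

def H(secret: bytes, z: bytes) -> int:
    return int.from_bytes(hashlib.sha256(secret + z).digest(), "big") % Q

def kev(secret: bytes, zs):
    """Key extraction vector (1, H(s || z_1), ..., H(s || z_n))."""
    return [1] + [H(secret, z) for z in zs]

def null_vector(rows):
    """Random nonzero Y with X * Y = 0 (mod Q), via Gauss-Jordan elimination."""
    m = [row[:] for row in rows]
    ncols, pivots, r = len(m[0]), [], 0
    for c in range(ncols):
        p = next((i for i in range(r, len(m)) if m[i][c]), None)
        if p is None:
            continue
        m[r], m[p] = m[p], m[r]
        inv = pow(m[r][c], -1, Q)
        m[r] = [v * inv % Q for v in m[r]]
        for i in range(len(m)):
            if i != r and m[i][c]:
                f = m[i][c]
                m[i] = [(a - f * b) % Q for a, b in zip(m[i], m[r])]
        pivots.append(c)
        r += 1
        if r == len(m):
            break
    free = [c for c in range(ncols) if c not in pivots]
    y = [0] * ncols
    for c in free:                       # random nonzero free variables
        y[c] = 1 + secrets.randbelow(Q - 1)
    for row, c in zip(m, pivots):        # solve for the pivot variables
        y[c] = -sum(row[j] * y[j] for j in free) % Q
    return y

def build_acv(key: int, member_secrets, zs):
    """ACV = (K, 0, ..., 0)^T + Y."""
    acv = null_vector([kev(s, zs) for s in member_secrets])
    acv[0] = (acv[0] + key) % Q
    return acv

def derive_key(acv, secret: bytes, zs) -> int:
    return sum(a * b for a, b in zip(kev(secret, zs), acv)) % Q

zs = [secrets.token_bytes(16) for _ in range(3)]
group = [b"alice-secret", b"bob-secret", b"carol-secret"]
acv = build_acv(42, group, zs)
print([derive_key(acv, s, zs) for s in group])
```

For a member, the dot product is K + KEV . Y = K because her row of X annihilates Y; for anyone whose row was not in X, the extra term is a random field element and the result is wrong except with negligible probability.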

Wednesday, January 20, 2010

'Sights unseen' photography

When I first saw the subject of an ongoing photography exhibition 'sights unseen' in the news, I was so excited and was like 'this must be a collection of really cool photos I have hardly seen before'. But my first impression about the exhibition was wrong. However, I found it not only EVEN MORE exciting, but it's inspiring! It's a collection of really cool photos taken by people who are not fortunate enough to see. This bbc audio slideshow tells it all.

(A photo taken by a blind person)

All this time it was wired into my brain that, if you are blind you cannot take photographs. I was proved wrong! And it goes well with good old proverbs.

As I was Googling about the subject, I found the link that photography can be a tool for social change quite interesting.

Sunday, January 17, 2010

"Never, never, never quit"

I truly admire the spirit of this story. Life is full of challenges; stories like this (and also this) remind us that we can overcome those challenges if we believe and act. Some people may not end up being the winners always, but we have to admire their spirit.

Friday, January 15, 2010

Effects of the failed christmas bomber..

(Source: http://www.cartoonistgroup.com)