Nabeel's Blog: March 2009

Tuesday, March 31, 2009

Medical Identity Theft

Interesting:

The soaring cost of health care is spawning a new crime: medical identity theft, in which someone uses your insurance information and health records to obtain medication or even surgery. It happens to 250,000 people each year, says the World Privacy Forum. To protect yourself, the WPF recommends that you: 1) closely review all "explanation of benefits" letters from your health insurer, 2) annually request a list of benefits paid by your insurer in your name (sometimes thieves alter billing info), and 3) check your medical file every time you visit the doctor.

Google, Microsoft, and dozens of other companies will also store your personal health records (PHRs) online. While there are advantages to having a complete medical history in one convenient location, some companies have "de-identified" these records and sold them to marketers. Pam Dixon, executive director of the WPF, does not generally recommend PHRs that are not maintained by health care providers. She stresses looking for a service that is "HIPAA covered" rather than "HIPAA compliant" in order to retain confidentiality. Look for that exact wording in the privacy statement. (Electronic health records, or EHRs, are maintained exclusively by health care providers.)

To tell the truth, I hardly follow any of the three steps mentioned above. In other words, I really don't know whether someone else is misusing my medical insurance.

More info: here

(Note for those who are not familiar with medical insurance, which is not required in Sri Lanka, for example: some countries, such as the USA, require you to hold valid medical insurance if you are under a certain visa status. It is just like automobile insurance.)

Sunday, March 29, 2009

The Availability Heuristic

When the devastating tsunami (Indian Ocean earthquake) struck in 2004, it was our number one fear at the time, as we had the feeling that something similar or worse would happen again. Now we don't have the same level of concern over it, do we?

When that horrible shooting incident happened at Virginia Tech, we were more fearful at the time than we are now, as we felt there was an increased probability of becoming a victim of such an incident.

When there is a suicide bomb attack, we see extra checkpoints and increased security measures, which tend to fade away as time passes. Why is it that we are more concerned immediately after the attack? Again, we get the feeling that something similar or worse is about to happen.

We use the availability heuristic to estimate the frequency of something (good or bad) happening. Often we don't have, or don't bother to find, solid evidence on which to base our estimate. "For example, what is the probability that the next plane you fly on will crash? The true probability of any particular plane crashing depends on a huge number of factors, most of which you're not aware of and/or don't have reliable data on. What type of plane is it? What time of day is the flight? What is the weather like? What is the safety history of this particular plane? When was the last time the plane was examined for problems? Who did the examination and how thorough was it? Who is flying the plane? How much sleep did they get last night? How old are they? Are they taking any medications? You get the idea." Our decisions are based on other, non-rational factors!

Usually, our estimate is shaped by our recent memory. It is easier for us to recall events that happened in the recent past than those that happened in the distant past.

The availability heuristic often leads people to lose sight of "real" dangers: "Psychologist Gerd Gigerenzer, for example, conducted a fascinating study that showed in the months following September 11, 2001, Americans were less likely to travel by air and more likely to instead travel by car. While it is understandable why Americans would have been fearful of air travel following the incredibly high profile attacks on New York and Washington, the unfortunate result is that Americans died on the highways at alarming rates following 9/11. This is because highway travel is far more dangerous than air travel. More than 40,000 Americans are killed every year on America's roads. Fewer than 1,000 people die in airplane accidents, and even fewer people are killed aboard commercial airlines. The bottom line is that being a passenger on a plane being flown by trained professionals who are being guided by a team of professionals (i.e., air traffic control) is much safer than driving your own car on streets surrounded by other amateur drivers who may or may not follow the rules of the road (and whose cars may or may not be fit to drive)."

Another interesting fact:
"Consider, for example, that the 2009 budget for homeland security (the folks that protect us from terrorists) will likely be about $50 billion. Don't get us wrong, we like the fact that people are trying to prevent terrorism, but even at its absolute worst, terrorists killed about 3,000 Americans in a single year. And less than 100 Americans are killed by terrorists in most years. By contrast, the budget for the National Highway Traffic Safety Administration (the folks who protect us on the road) is about $1 billion, even though more than 40,000 people will die this year on the nation's roads. In terms of dollars spent per fatality, we fund terrorism prevention at about $17,000,000/fatality (i.e., $50 billion/3,000 fatalities) and accident prevention at about $25,000/fatality (i.e., $1 billion/40,000 fatalities). This huge imbalance tells us that our priorities are seriously out of whack. (And don't even get us started on bigger killers like heart disease!)"

Is our risk assessment model flawed? IMHO it is not the model that is problematic here; rather, natural disasters aside, we as a society have failed in the first place (why do we have terrorist attacks? why are there mass shooting incidents?). Why do we allocate more resources to possible events that have a low frequency but a high impact? I argue that this is due to human nature: we feel a higher impact when something happens in a burst rather than gradually, even if the latter causes more damage in the long run. Can we (as citizens or as governments) change our perceptions (be a lot less afraid of recent bad incidents) and focus on the latter? I think the real problem is not the fear factor or the actual damage caused, but the fact that most burst incidents are caused by extremist elements, and the victims have neither control over them nor any involvement in them (in other words, there are unfortunate reactions without any actions (involvement) - is this what we call "fate"?). We can make a similar argument about natural disasters.

Monday, March 23, 2009

Examples (I)

Some interesting database/security/privacy concerns, illustrated through examples taken from various papers. This is how new ideas for interesting papers are found:

A problem with k-anonymity:

Table 1

The above table shows a list of patient records. The following table shows the 3-anonymous version of the above table.

Table 2

The Disease attribute is sensitive here. If a user has some background information, she can infer the sensitive attributes of patients. Suppose Alice knows that Bob is 25 years old and lives in ZIP 47625. She then knows that Bob's record is one of the first three records in table 2. Since all of them have the same disease, Alice can deduce that Bob has heart disease. l-diversity was introduced to overcome this problem: it requires at least l distinct sensitive values in each equivalence class. Had table 2 satisfied 2-diversity, Alice could no longer be certain of Bob's illness (for example, with two equally frequent diseases in his class, her guess would be right with probability only 0.5). (Note: even l-diversity has issues. t-closeness was introduced to address one such issue.)
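
Both properties are easy to check mechanically. Here is a minimal Python sketch of such a check. The rows are hypothetical stand-ins that I made up to mirror the situation described above (they are not the actual contents of table 2), and the helper names `equivalence_classes`, `k_anonymity` and `l_diversity` are my own, not taken from any paper or library. It uses the simplest notion of l-diversity, namely counting distinct sensitive values per class.

```python
from collections import defaultdict

# Hypothetical generalized records: (zip, age, disease).
# Illustrative only; these are NOT the rows of table 2.
RECORDS = [
    ("476**", "2*", "Heart Disease"),
    ("476**", "2*", "Heart Disease"),
    ("476**", "2*", "Heart Disease"),
    ("4790*", ">=40", "Flu"),
    ("4790*", ">=40", "Heart Disease"),
    ("4790*", ">=40", "Cancer"),
]

def equivalence_classes(records):
    """Group sensitive values by quasi-identifier (every column but the last)."""
    classes = defaultdict(list)
    for *quasi_id, sensitive in records:
        classes[tuple(quasi_id)].append(sensitive)
    return classes

def k_anonymity(records):
    """Largest k such that every equivalence class has at least k records."""
    return min(len(values) for values in equivalence_classes(records).values())

def l_diversity(records):
    """Largest l such that every class has at least l distinct sensitive values."""
    return min(len(set(values)) for values in equivalence_classes(records).values())

print(k_anonymity(RECORDS))  # 3
print(l_diversity(RECORDS))  # 1
```

With these made-up rows the table is 3-anonymous but only 1-diverse: the first class contains a single disease, which is exactly what lets Alice's inference go through.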

Table 3

Table 4

Table 4 shows the 3-diverse version of table 3. Here both salary and disease are sensitive attributes. Notice that each equivalence class has 3 distinct sensitive values.

Even when the sensitive values are different, if they are semantically similar (a low salary range, some specific types of diseases, etc.), one can perform a similarity attack. For example, if one knows that Bob is one of the first three records, one can deduce that Bob is in the low income range of 3K-5K and has a stomach-related disease. To account for the semantic closeness of values (i.e. to prevent similarity attacks), t-closeness was introduced. (I leave it for you to read.)

A problem with replacing non-disclosed values with NULL:

(a) shows a customer table with attributes ID, name, age and phone number. (Y) and (N) indicate the users' consent to disclosing those attribute values. For example, Linda and Mary don't mind disclosing everything, but Nick does not want the organization to disclose his age and phone number to any third party. It is common practice to replace (N) values with NULL before executing the query.

Q1: “SELECT name, phone FROM Customer”
Q2: “SELECT name, phone FROM Customer WHERE age >= 25”
Q = Q1 - Q2

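Intuitively, Q should return only those records with age < 25, as in (e). On the contrary, it returns two results, as in (d): a comparison with NULL never evaluates to true, so Q2 skips every record whose age has been replaced with NULL, and those records survive the difference. Therefore, using NULL is not a sound solution. A variable-based approach was introduced to solve this problem. (I leave it for you to read.)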
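
The effect is easy to reproduce. Below is a small sketch using Python's built-in `sqlite3` module. The ages and phone numbers are hypothetical values I made up (the original figure is not reproduced here), the helper name `run_difference` is my own, and I am assuming that the difference Q1 - Q2 corresponds to SQL's `EXCEPT`.

```python
import sqlite3

# Hypothetical rows; Nick's age and phone are already replaced with NULL
# (None) because he did not consent to disclosing them.
ROWS = [
    (1, "Linda", 32, "111-1111"),
    (2, "Mary", 23, "222-2222"),
    (3, "Nick", None, None),
]

def run_difference(rows):
    """Return the rows of Q1 - Q2 over an in-memory Customer table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE Customer (id INTEGER, name TEXT, age INTEGER, phone TEXT)"
    )
    conn.executemany("INSERT INTO Customer VALUES (?, ?, ?, ?)", rows)
    result = conn.execute(
        "SELECT name, phone FROM Customer "                   # Q1
        "EXCEPT "
        "SELECT name, phone FROM Customer WHERE age >= 25 "   # Q2
        "ORDER BY name"
    ).fetchall()
    conn.close()
    return result

print(run_difference(ROWS))
```

With this data the result contains Mary (age < 25, as expected) and also Nick, whose NULL age made Q2 skip him. So a third party learns that Nick is a customer whose data was suppressed, even though nothing about his age justified returning him.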

Zero-Knowledge Proof of Knowledge (ZKPK):

The Discrete Logarithm Problem (DLP) is hard to solve. We can use this hardness assumption to hide a secret.
(DLP: given a generator g of a group of order p and a value c = g^x, find x.)

If you can solve the problem, you can convince another party you possess a secret. How do you actually convince the party that you know the secret without revealing the secret?

Alice: I know the value x corresponding to c. (but I don't want to tell you x)
Bob: Show me.

Alice: Chooses a random y from Z_p and sends g^y to Bob.
Bob: Sends a challenge r, chosen at random from Z_p, to Alice.

Alice: Computes s = y + r·x (mod p) and sends s to Bob.
Bob: Checks whether g^s = g^y · c^r and, if the two are equal, is convinced that Alice knows x. (The check works because g^s = g^(y + r·x) = g^y · (g^x)^r.)
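
The exchange above can be sketched in a few lines of Python. This is only a toy illustration under my own assumptions: the group is the order-1019 subgroup of the integers modulo the prime 2039 with generator 4, which is far too small to be secure, and the names `MODULUS`, `ORDER`, `G`, `commit`, `respond` and `verify` are mine. `ORDER` plays the role of p in the text.

```python
import secrets

# Toy parameters, for illustration only (much too small for real use):
# MODULUS is prime, ORDER = (MODULUS - 1) / 2 is prime, and G generates
# the subgroup of order ORDER.
MODULUS = 2039
ORDER = 1019
G = 4
assert G != 1 and pow(G, ORDER, MODULUS) == 1

def commit(y):
    """Alice's first message: g^y."""
    return pow(G, y, MODULUS)

def respond(x, y, r):
    """Alice's answer s = y + r*x, reduced modulo the group order."""
    return (y + r * x) % ORDER

def verify(c, t, r, s):
    """Bob's check: g^s == g^y * c^r, where t is the received g^y."""
    return pow(G, s, MODULUS) == (t * pow(c, r, MODULUS)) % MODULUS

x = secrets.randbelow(ORDER)   # Alice's secret
c = pow(G, x, MODULUS)         # public value c = g^x
y = secrets.randbelow(ORDER)   # Alice's fresh random value
t = commit(y)                  # sent to Bob
r = secrets.randbelow(ORDER)   # Bob's challenge
s = respond(x, y, r)           # sent to Bob
print(verify(c, t, r, s))      # True
```

Bob only ever sees g^y, r and s. Since y is random and used once, s on its own tells him nothing about x, yet someone who does not know x cannot produce an s that passes the check for an unpredictable r (except by luck).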

Sunday, March 22, 2009

How to differentiate a normal person from a mathematician?

Two people, P and Q, each had to babysit on a particular day, independently of each other. They were given the following instruction:

"If the baby cries, feed it with the milk in the bottle."

Now it's P's turn. P found that the baby was not crying and attended to his/her work while keeping an eye on the baby.

Then Q gets his/her turn. Q also found that the baby was not crying, but he/she made it cry so that he/she could feed it with the milk in the bottle.

Who is the mathematician? Make a guess..

It's Q. Why?

Q found that when the baby was not crying, the action was undefined, hence an unresolved situation. Therefore, he/she made the baby cry so that the situation was reduced to an already solved problem. :D

Search like the brain does?

Interesting search engine: Wolfram Alpha - A new paradigm for using the web:

"In a nutshell, Wolfram and his team have built what he calls a “computational knowledge engine” for the Web. OK, so what does that really mean? Basically it means that you can ask it factual questions and it computes answers for you.

It doesn’t simply return documents that (might) contain the answers, like Google does, and it isn’t just a giant database of knowledge, like the Wikipedia. It doesn’t simply parse natural language and then use that to retrieve documents, like Powerset, for example.

Instead, Wolfram Alpha actually computes the answers to a wide range of questions — like questions that have factual answers such as “What country is Timbuktu in?” or “How many protons are in a hydrogen atom?” or “What is the average rainfall in Seattle this month?,” “What is the 300th digit of Pi?,” “where is the ISS?” or “When was GOOG worth more than $300?”

Think about that for a minute. It computes the answers. Wolfram Alpha doesn’t simply contain huge amounts of manually entered pairs of questions and answers, nor does it search for answers in a database of facts. Instead, it understands and then computes answers to certain kinds of questions."

I am not sure how a software system can understand something without a database of facts. We, human beings, answer questions based on the vast amount of information we have gathered throughout our lives. Our ability to answer certain questions better than others is shaped by our past experience and the knowledge we have gained - in other words, it depends on the database of facts stored in the brain.

In any case, if it lives up to its billing, I think it should be able to differentiate facts from opinions with a high degree of confidence, and also to filter out irrelevant results based on the context, just as our brain does unconsciously.