Tuesday, February 22, 2011

Encryption in the cloud

Good read on the $subject

There are solutions for data-at-rest, but they are far from perfect for cloud scenarios. In my opinion, the most challenging one is the data-in-process. There are some basic solutions to perform computation over encrypted data, but they are quite limited. This is mainly due to the fact that we still don't have efficient fully homomorphic encryption schemes. I think in the next decade, we will see better techniques to perform computation over encrypted data without decrypting them.

Monday, February 21, 2011

Homomorphism in Shamir's secret sharing

The homomorphism property in the Shamir's secret sharing scheme is a simple but a very powerful idea that could be useful to reduce the trust we have to put on the dealer to correctly distribute secret shares to users. (See my previous post for an introduction on verifiable secret sharing).

Let's look a simple example to see what this property means.

Assume that we have two degree two polynomials g and f which we can use to create two 3-out-of-n secret sharing schemes where at least 3 out of n users should get together to obtain the secret.

f(x) = 2 + x + x2
g(x) = 3 + 2x + 3x2

We have three users with the following secret shares given from the two secret sharing polynomials where the secrets are 2 and 3 for f and g respectively. Let's call them sub-secrets.













Userfg
u1(1,4)(1,8)
u2(2,8)(2,19)
u3(3,14)(1,36)

Let's call the combination of the two sub-secrets as the super-secret, which is equal to 5. The problem we have at hand is "how can the users construct the super-secret revealing as little information about the sub-secrets as possible?".

It is not difficult to see that if the three users releases the sum of their secret shares, that is (1, 12), (2, 27) and (3, 50), and do the polynomial interpolation on in it, the combined function gives the super-secret 5. So, the sum of the shares of the secrets are the shares of the sum of the secrets. This is what the homomorphism property is.

The above relationship is defined as the (+, +)-homomorphism. It could be two different operators. For example, Shamir's secret sharing scheme can be converted to (*, +)-homomorphism by using discrete log of secret shares.

References:
Josh Cohen Benaloh, Secret Sharing Homomorphisms: Keeping Shares of a Secret, Proceedings on Advances in cryptology---CRYPTO, 1986

Friday, February 18, 2011

A simple construction of a veribable secret sharing scheme

In a SS (Secret Sharing) scheme a secret is split among many people and some of them need to corporate to build the secret from their shares of the secret. In in it is basic form, a dealer who possesses the secret s splits the secret and gives the secret share si to each party Pi, i = 1, 2, ..., n. The protocol makes sure that there should be at least t+1 parties to derive the master secret s. In other words, t or less number of parties cannot derive the secret.

In a normal SS scheme, the dealer is assumed to be honest. What if the dealer can be malicious? That's where VSS (Verifiable Secret Sharing) comes to play. VSS is similar to SS except that it allows each party to verify that the share given to them, si, is a valid one.

Here's a simple VSS scheme using Shamir's secret sharing and additive homomorphic encryption.

An additive homomorphic encryption scheme (e.g. ElGamal cryptosystem) allows us to perform addition over encrypted data. In particular, we use the following properties. Let E be the encryption algorithm and m1, m2 two numbers.

E(m1)E(m2) = E(m1 + m2)
E(m1)m2 = E(m1m2)


The dealer generates secret shares just like in the Shamir's SS scheme. For (n, t+1) (i.e. out of n parties, there should be at least t + 1 parties to construct the secret, the dealer selects a random degree t polynomial f with f(0) = s:

f(x) = s + a1x + a2x2 + ... + atxt

The dealer sends the secret share (i, f(i)) (f(i) = si) to each party Pi through a private communication channel. Now t + 1 parties can perform a polynomial interpolation and obtain a unique polynomial which is equal to f and get the secret f(o).

In order to verify their secret shares, the dealer distribute an additional piece of information. The dealer generates public and private key pair for the additive homomorphic encryption E and broadcasts the encrypted coefficients of f, that is, E(s), E(a1), E(a2), ..., E(at). A party Pi, first encrypts it share and compare if the following holds. If it is true, it concludes that the dealer has given a valid secret share.

E(si) ==? E(s)E(a1)iE(a2)i2 ... E(at)it

References:
How to Share a Secret, 1979, Adi Shamir
A Practical Scheme for Non-interactive Verifiable Secret Sharing, 1987, Paul Feldman

Tuesday, February 15, 2011

Flat table based group key management

Flat table (FT) key management is a simple way to manage group key. In it is basic form, it is quite efficient, but it is susceptible to collusion attacks.

First the notations:
GC - Group Controller - the centralized entity that manages the group key
GM - Group Member - a participant who wants to obtains the group key
K - the group key, it is also called DEK (Data Encryption Key) and is used to encrypt broadcast messages to the group.
KEK - Key Encryption Key - a key used to encrypt K.

Each GM in the group is assiged an identifier, ID. ID is an n-bit binary number which is denoted as Xn-1Xn-2...X0, where Xi is either 0 or 1. The maximum size of the group N = 2n.

GC maintains 2n KEKs, {ki,b | i ∈ Zn, b ∈ Z2}. That is for each bit position, there are two keys, one for 0 and one for 1.

Each GM is given n + 1 keys. Half of the KEKs (i.e. n) which correspond to the n bits in its ID and K.

Join:
Let's assume that the ID of the joing GM is Xn-1Xn-2...X0. In order to provide backward secrecy (i.e. not allow the joining GM to access previous keys), we need to update K and n KEKs {ki,Xi | i ∈ Zn} corresponding to this ID. New K and KEKs are encrtypted with their old K and KEKs respectively and broadcast to the group. GC gives the new K and KEKs to the joining GC though a secure private communication channel.

Leave:
Let's assume that the ID of the leaving GM is Xn-1Xn-2...X0. In order to provide to provide forward secrecy (i.e. not allowing the leaving GM to access new keys), again K and the n KEKs held by the leaving GM are updated.
However, we need to use a different approach to update the keys since we cannot reuse the old keys to do the job. GC decides a new group key K' and encrypts with the remaining n keys that the leaving GM does not possess and broadcasts the encrypted messages. An existing group member can decrypt at least one of those messages and obtain the new group key K'. Then GC encrypts each new KEK corresponding to those of the leaving GM twice. First GC encrypts with K' and then encrypts with the old KEK, and then broadcast to the group. It make sure only the existing users can access new KEKs who already had old KEKs. Notice that the leaving GM cannot access since it does not have K' to remove one layer of encryption.

There are proposals for optimizing for multiple leaves. In one work, the use Boolean function minimization techniques to reduce the complexity.

Collusion attacks:
When multiple users are removed at once, the FT key management approach is not resistant to collusion attacks.


References:
Waldvogel et. al. The VersaKey framework: versatile group key management, 1999

Chang et. al. Key management for secure internet multicast using Boolean function minimization techniques, INFOCOM 1999

Wednesday, February 9, 2011

What should we have in a security definition/protocol?

There are three key components that should be specified in any security definition/protocol.

1. Adversarial power under which the protocol is secure
2. Network model in which the protocol operates
3. Security guarantees that the protocol provides

Security is not something absolute. We may not be able to protect a protocol against any type of adversary. A security protocol is meaningless if it does not specify the capabilities of the adversaries under which the protocol is still secure. The capabilities of adversaries are specified in the adversarial model. For example, if the adversary is computationally bounded or have unbounded computational power, if the adversary is honest, semi-honest or malicious, etc.

Your protocol may need a trusted third party to work. It may not be secure without a trusted third party. Your protocols may need certain parties to online at all times. Such information is specified in the network model. Like the adversarial power, your protocol is meaningless without a proper network model.

As I mentioned earlier, security is moving line; your protocol may solve only certain problems. It is crucial to specify what problems it solves up front as part of the security guarantees.

In short, you need to specify what security guarantees your security protocol provides under which adversarial and network model. Otherwise, a security protocol is as useless as zero security protocol.

A big mistake people, especially in the database community, make is to say "our protocol is secure for this data set and it does not reveal any useful information to the adversary". This is very wrong. You cannot define security based on a input data set. What happens when you have a different data set? Will your protocol still be secure? Probably not. Security should never be analyzed by a set of examples. Rather, it should be based on the above 3 objective measures.