Super Ninja Privacy Techniques for Web App Developers
by Marc Hedlund and Brad Greenlee - Developers at Wesabe - Wednesday, 22 August 2007.
The idea of a privacy wall is simple: don't have any direct links in your database between your users' "public" data and their private data. Instead of linking tables directly via a foreign key, use a cryptographic hash that is based on at least one piece of data that only the user knows, such as their password. The user's private data can be looked up when the user logs in, but otherwise it is completely anonymous. Let's go through a simple example.

Let's say we're designing an application that lets members keep a list of their deepest, darkest secrets. We need a database with at least two tables: 'users' and 'secrets'. The first pass database model looks like this:
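As a rough sketch of what such a first-pass model might look like, assuming SQLite and hypothetical column names (only the 'users' and 'secrets' tables come from the text above), the secrets table would point straight at its owner:

```python
import sqlite3

# Hypothetical first-pass schema: each secret carries its owner's id.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (
        id            INTEGER PRIMARY KEY,
        username      TEXT NOT NULL UNIQUE,
        password_hash TEXT NOT NULL
    );
    CREATE TABLE secrets (
        id      INTEGER PRIMARY KEY,
        user_id INTEGER NOT NULL REFERENCES users(id),
        body    TEXT NOT NULL
    );
""")
conn.execute(
    "INSERT INTO users (id, username, password_hash) VALUES (1, 'alice', 'placeholder-hash')"
)
conn.execute("INSERT INTO secrets (user_id, body) VALUES (1, 'I never floss')")

# Anyone who can read the database can join the two tables on user_id.
leaked = conn.execute(
    "SELECT u.username, s.body FROM users u JOIN secrets s ON s.user_id = u.id"
).fetchall()
```

A single join is enough to pair every username with every secret.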

The problem with this schema is that anyone with access to the database can easily find out all the secrets of a given user. With one small change, however, we can make this extremely difficult, if not impossible:
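A sketch of the revised model, again assuming SQLite and hypothetical column names: the foreign key from secrets to users is gone, and the only handle on a row is a 'secret_key' column.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (
        id            INTEGER PRIMARY KEY,
        username      TEXT NOT NULL UNIQUE,
        password_hash TEXT NOT NULL
    );
    -- No user_id column: nothing in this table points back at users.
    CREATE TABLE secrets (
        id         INTEGER PRIMARY KEY,
        secret_key TEXT NOT NULL,
        body       TEXT NOT NULL
    );
    CREATE INDEX secrets_by_key ON secrets (secret_key);
""")

# Column names of the secrets table, read back from SQLite itself.
secrets_columns = [row[1] for row in conn.execute("PRAGMA table_info(secrets)")]
```

The index is an assumption added for lookup speed; it does not change what an attacker can learn.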

The special sauce is the 'secret_key', which is nothing more than a cryptographic hash of the user's username and their password (plus a salt). When the user logs in, we can generate the hash and store it in the session. Whenever we need to query the user's secrets, we use that key to look them up instead of the user id. Now, if some attacker gets ahold of the database, they will still be able to read everyone's secrets, but they won't know which secret belongs to which user, and there's no way to look up the secrets of a given user.
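The login-time flow can be sketched as follows. The hash function is not named in the text, so PBKDF2-HMAC-SHA256 is an assumed choice here, and the salt value, the `session` dict, and the function names are all hypothetical stand-ins.

```python
import hashlib
import sqlite3

def derive_secret_key(username: str, password: str, salt: bytes) -> str:
    """Hash username + password with a salt (PBKDF2 is an assumed choice)."""
    material = f"{username}:{password}".encode("utf-8")
    return hashlib.pbkdf2_hmac("sha256", material, salt, 100_000).hex()

def secrets_for(conn, secret_key):
    """Look secrets up by key only; no user id is involved."""
    rows = conn.execute("SELECT body FROM secrets WHERE secret_key = ?", (secret_key,))
    return [body for (body,) in rows]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE secrets (secret_key TEXT NOT NULL, body TEXT NOT NULL)")

# Hypothetical salt. It must differ from whatever salt protects the stored
# login hash, otherwise users.password_hash would equal the secret key.
KEY_SALT = b"example-secret-key-salt"

# At login: derive the key from the submitted credentials, keep it in the session.
session = {"secret_key": derive_secret_key("alice", "correct horse", KEY_SALT)}
conn.execute("INSERT INTO secrets VALUES (?, ?)", (session["secret_key"], "I never floss"))

mine = secrets_for(conn, session["secret_key"])
wrong = secrets_for(conn, derive_secret_key("alice", "wrong guess", KEY_SALT))
```

Without the password, the username alone is not enough to reproduce the key, so the rows stay unlinked.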

So what do you do if the user forgets their password? The recovery method we came up with was to store a copy of their secret key, encrypted with the answers to their security questions (which aren't stored anywhere in our database, of course). Assuming that the user hasn't forgotten those as well, you can easily find their account data and "move it over" when they reset their password (don't forget to update the encrypted secret key); if they do forget them, well, there's a problem.
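A minimal sketch of that recovery flow, using only the Python standard library. The text does not say which cipher is used, so the "encryption" here is a deliberately simple stand-in: the key is XORed with a pad stretched from the answers via PBKDF2. A real system would more likely use an authenticated cipher from a vetted library. All names, salts, and sample answers are hypothetical.

```python
import hashlib
import os
import sqlite3

def answer_pad(answers, salt, length):
    """Stretch the normalized answers into `length` bytes (PBKDF2 is assumed)."""
    normalized = "|".join(a.strip().lower() for a in answers).encode("utf-8")
    return hashlib.pbkdf2_hmac("sha256", normalized, salt, 100_000, dklen=length)

def wrap_key(secret_key, answers, salt):
    """XOR the key with the answer-derived pad. Toy cipher: fresh salt per wrap."""
    pad = answer_pad(answers, salt, len(secret_key))
    return bytes(k ^ p for k, p in zip(secret_key, pad))

unwrap_key = wrap_key  # XOR is its own inverse

# At signup: store only the salt and the wrapped key, never the answers.
old_key = hashlib.sha256(b"app-salt" + b"alice" + b"old password").digest()
recovery_salt = os.urandom(16)
wrapped = wrap_key(old_key, ["Rex", "Springfield"], recovery_salt)

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE secrets (secret_key BLOB NOT NULL, body TEXT NOT NULL)")
conn.execute("INSERT INTO secrets VALUES (?, ?)", (old_key, "I never floss"))

# At password reset: the user re-enters the answers and picks a new password.
answers = [" rex ", "SPRINGFIELD"]
recovered = unwrap_key(wrapped, answers, recovery_salt)
new_key = hashlib.sha256(b"app-salt" + b"alice" + b"new password").digest()

# "Move it over": re-key every row that was stored under the old key.
moved = conn.execute(
    "UPDATE secrets SET secret_key = ? WHERE secret_key = ?", (new_key, recovered)
).rowcount

# Update the encrypted copy too, under a fresh salt so the pad is never reused.
recovery_salt = os.urandom(16)
wrapped = wrap_key(new_key, answers, recovery_salt)
```

With this stand-in, wrong answers silently produce a wrong key, and the update simply matches no rows.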

The privacy wall technique has a number of possible weaknesses. As mentioned earlier, we store the secret key in the user's session. If you're storing your session data in the database and your database is hacked, any users that are logged in (or whose sessions haven't yet been deleted) can be compromised. The same is true if sessions are stored on the filesystem. Keeping session data in memory is better, although it is still hackable (the swapfile is one obvious target). However you're storing your session data, keeping your sessions reasonably short and deleting them when they expire is wise. You could also store the secret key separately in a cookie on the user's computer, although then you'd better make damn sure you don't have any cross-site scripting (XSS) vulnerabilities that would allow a hacker to harvest your users' cookies.

Other holes can be found if your system is sufficiently complex and an attacker can find a path from User to Secret through other tables in the database, so it's important to trace out those paths and make sure that the secret key is used somewhere in each chain.
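One way to trace those paths is to treat the schema as a graph and search it. The sketch below assumes you can list every plain (non-secret-key) link between tables by hand; the table names other than 'users' and 'secrets' are hypothetical. Links are treated as undirected, since an attacker can join in either direction.

```python
from collections import deque

def find_plain_path(links, start, goal):
    """Breadth-first search for a chain of plain links from start to goal."""
    graph = {}
    for a, b in links:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        if path[-1] == goal:
            return path
        for nxt in sorted(graph.get(path[-1], ())):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None

# Hypothetical schema: uploads is linked to both users and secrets by plain ids.
leaky_links = [("users", "profiles"), ("users", "uploads"), ("uploads", "secrets")]
leak = find_plain_path(leaky_links, "users", "secrets")

# After re-keying the uploads-to-secrets link with the secret key, it drops out.
fixed_links = [("users", "profiles"), ("users", "uploads")]
still_leaks = find_plain_path(fixed_links, "users", "secrets")
```

Any path the search returns is a chain that needs the secret key inserted somewhere along it.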

A harder problem to solve is when the secrets themselves may contain enough information to identify the user, and with the above scheme, if one secret is traced back to a user, all of that user's secrets are compromised. It might not be possible or practical to scrub or encrypt the data, but you can limit the damage of a secret being compromised. We later came up with the following as an extra layer of security: add a counter to the data being hashed to generate the secret key:

secret key 1 = Hash(salt + password + '1')
secret key 2 = Hash(salt + password + '2')
...
secret key n = Hash(salt + password + 'n')

Getting a list of all the secrets for a given user when they log in is going to be a lot less efficient, of course; you have to keep generating hashes and doing queries until no secret with that hash is found, and deleting secrets may require special handling. But it may be a small price to pay for the extra privacy.
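A sketch of the counter scheme, with SHA-256 standing in for the unspecified hash and hypothetical function names. The "special handling" for deletes is not described in the text; blanking the row rather than removing it is one assumed approach that keeps the scan from stopping early at a gap.

```python
import hashlib
import sqlite3

def secret_key_n(salt, password, n):
    """Hash(salt + password + 'n'), with SHA-256 as an assumed hash."""
    data = salt + password.encode("utf-8") + str(n).encode("ascii")
    return hashlib.sha256(data).hexdigest()

def fetch(conn, key):
    return conn.execute(
        "SELECT body FROM secrets WHERE secret_key = ?", (key,)
    ).fetchone()

def add_secret(conn, salt, password, body):
    """Store the secret under the first counter value that is not yet in use."""
    n = 1
    while fetch(conn, secret_key_n(salt, password, n)) is not None:
        n += 1
    conn.execute(
        "INSERT INTO secrets VALUES (?, ?)", (secret_key_n(salt, password, n), body)
    )
    return n

def delete_secret(conn, salt, password, n):
    """Assumed handling: blank the row instead of removing it, leaving no gap."""
    conn.execute(
        "UPDATE secrets SET body = NULL WHERE secret_key = ?",
        (secret_key_n(salt, password, n),),
    )

def all_secrets(conn, salt, password):
    """Keep hashing and querying until no row exists for the next counter."""
    found, n = [], 1
    while True:
        row = fetch(conn, secret_key_n(salt, password, n))
        if row is None:
            return found
        if row[0] is not None:  # skip blanked (deleted) rows
            found.append(row[0])
        n += 1

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE secrets (secret_key TEXT PRIMARY KEY, body TEXT)")
SALT = b"example-salt"  # hypothetical
for text in ("first", "second", "third"):
    add_secret(conn, SALT, "correct horse", text)
delete_secret(conn, SALT, "correct horse", 2)
remaining = all_secrets(conn, SALT, "correct horse")
```

Every secret now has its own unrelated-looking key, so tracing one secret back to a user exposes only that row.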

