A flawed random-number theory

Wednesday, 26 June 2002, 11:15 AM EST

IBM's Privacy Research Institute recently revealed techniques that aim to preserve individual privacy while giving e-businesses information to generate data models. These techniques scramble or "randomize" private information and reconstruct data distributions at an aggregate level to perform data mining. This means that Web site administrators and merchants can use scrambled data without knowing the underlying private information.

Let's say I enter 45 in a forthcoming Java application that uses the IBM techniques to provide a merchant with age information in return for a music sample. The Java app takes my age and adds or subtracts a random value. The value would differ with each user. Then it sends the new number to the merchant. So, my 45 years may be reduced to 32. This program may also increase my net worth in a single keystroke! I like it already, but what's the value to the merchant?
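The add-or-subtract step described above can be sketched in a few lines of Java. This is an illustration only, not IBM's code: the class name `AgeRandomizer`, the method `randomize`, the uniform distribution of the offset, and the 20-year maximum offset are all assumptions made for the example.

```java
import java.util.Random;

public class AgeRandomizer {
    // Hypothetical helper (not from IBM): returns the true value shifted by a
    // uniformly chosen whole-number offset in [-maxOffset, +maxOffset].
    public static int randomize(int value, int maxOffset, Random rng) {
        int offset = rng.nextInt(2 * maxOffset + 1) - maxOffset;
        return value + offset;
    }

    public static void main(String[] args) {
        Random rng = new Random();   // a fresh offset is drawn for each user
        int trueAge = 45;
        int maxOffset = 20;          // assumed range; the article gives no figure
        int reported = randomize(trueAge, maxOffset, rng);
        // Only the scrambled value would be sent to the merchant.
        System.out.println("True age " + trueAge + " sent as " + reported);
    }
}
```

With a maximum offset of 20, a true age of 45 is always reported as something between 25 and 65, so a value such as 32 is one possible outcome.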

Although the numbers change, the allowed range of randomization does not. That range is linked to an acceptable range of data at an aggregate level, and to a level of privacy. The merchant might not care about my exact age, but it might like to know I'm between 30 and 50. Large randomizations will increase the personal privacy for users but reduce accuracy for merchants. If my age were randomized to 17, that would hardly be valuable to a merchant if it were used in conjunction with the title of the music I requested. Not too many 17-year-olds are into Bob Dylan.
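The trade-off between privacy and aggregate accuracy can be illustrated with a small simulation. It is a simplified sketch: it compares only the average age before and after scrambling, whereas the IBM techniques are said to reconstruct whole distributions. The class `AggregateDemo`, the method `meanError`, the synthetic ages of 30 to 50, and the uniform offsets are all assumptions for the example.

```java
import java.util.Random;

public class AggregateDemo {
    // Hypothetical helper: shifts value by a uniform offset in [-maxOffset, +maxOffset].
    public static int randomize(int value, int maxOffset, Random rng) {
        return value + rng.nextInt(2 * maxOffset + 1) - maxOffset;
    }

    // Returns the absolute difference between the true average age and the
    // average of the scrambled ages, for a synthetic population.
    public static double meanError(int users, int maxOffset, long seed) {
        Random rng = new Random(seed);
        long trueSum = 0;
        long noisySum = 0;
        for (int i = 0; i < users; i++) {
            int age = 30 + rng.nextInt(21);   // synthetic true age, 30 to 50
            trueSum += age;
            noisySum += randomize(age, maxOffset, rng);
        }
        return Math.abs(trueSum - noisySum) / (double) users;
    }

    public static void main(String[] args) {
        // Wider offsets hide each individual better, but the aggregate
        // estimate becomes noisier for the same number of users.
        for (int maxOffset : new int[] {5, 15, 30}) {
            System.out.println("maxOffset=" + maxOffset
                    + " error in average age=" + meanError(1000, maxOffset, 42L));
        }
    }
}
```

Because the assumed offsets average out to zero, the scrambled average tends toward the true average as the number of users grows, even though any single reported age may be far from the real one.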


