Using Fake Data to Protect Real Privacy

A fascinating article from The Atlantic looks at using fake data to protect real privacy: “There are basically two ways to reduce the risk of a confidentiality breach, [John] Abowd explained. The familiar approach is to perform an analysis on confidential data and then add random error to the output of the analysis. Introducing random error in the output is necessary to reduce the chance that information about any individual will be revealed. But sometimes the random error precisely masks the features that researchers are interested in. Another way, that gets around this problem, is to implement privacy protections on the input of an analysis, by modifying the dataset itself.”
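To make the contrast between the two approaches concrete, here is a minimal Python sketch. It is only an illustration, not the method the article or Abowd describes: the income figures are made up, the function names are hypothetical, and adding Laplace noise to each record is just one simple stand-in for "modifying the dataset itself". The noise scale is arbitrary and carries no formal privacy guarantee.

```python
import random
import statistics


def laplace_noise(rng, scale):
    # The difference of two i.i.d. exponential draws is Laplace-distributed
    # with the given scale, so the standard library is enough here.
    return rng.expovariate(1 / scale) - rng.expovariate(1 / scale)


def noisy_output_mean(values, scale, rng):
    """Output approach: analyze the confidential data, then perturb the result."""
    return statistics.mean(values) + laplace_noise(rng, scale)


def noisy_input_dataset(values, scale, rng):
    """Input approach: perturb every record, then release the modified dataset."""
    return [v + laplace_noise(rng, scale) for v in values]


rng = random.Random(0)  # fixed seed so the sketch is repeatable
incomes = [31_000, 42_500, 58_000, 61_250, 75_000]  # made-up values (assumption)

# Approach 1: only the noisy statistic leaves the secure environment.
released_mean = noisy_output_mean(incomes, scale=1_000, rng=rng)

# Approach 2: a modified dataset is released; anyone can analyze it afterwards.
modified = noisy_input_dataset(incomes, scale=1_000, rng=rng)
mean_of_modified = statistics.mean(modified)

print(f"true mean:            {statistics.mean(incomes):.1f}")
print(f"output-noise mean:    {released_mean:.1f}")
print(f"mean of modified set: {mean_of_modified:.1f}")
```

In the first approach each new analysis needs its own round of noise, while in the second the protection is applied once and researchers can run whatever analysis they like on the modified records.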
