Paper here: http://www.cs.princeton.edu/~satyen/papers/privacy.ps
I'll be honest: I saw a lot of Greek symbols and ran away screaming, so I didn't actually get a whole lot out of this paper. This is doubly embarrassing, as several of the authors are MSR-SVC-ers I know and love (Frank, Cynthia, Kunal).
Okay, so, the gist is that you want to release a "fuzzied" database with information about people in the form of "contingency tables," which are essentially histograms of various traits (think male vs. female, age ranges [20-29, 30-39, 40-49...], etc.). You want to maintain aggregate statistics about the data set without revealing information about particular people in the data set. The problem is that you want to maintain the accuracy of the data and keep it internally consistent in the process of making it private. This work focuses on keeping the data consistent.
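To make "contingency table" concrete, here's a tiny sketch of my own (not from the paper). The records are invented, and `marginal` is just a helper name I made up. I'm using "marginal" in the usual statistics sense: the table you get by summing the full table down to a subset of its traits.

```python
from collections import Counter

# Hypothetical records, invented for illustration: (sex, age range).
records = [
    ("F", "20-29"), ("M", "20-29"), ("F", "30-39"),
    ("F", "30-39"), ("M", "40-49"), ("M", "20-29"),
]

# The full contingency table: one count per combination of traits.
table = Counter(records)

def marginal(table, axis):
    """Sum the full table down to a single trait (the one at position `axis`)."""
    out = Counter()
    for cell, count in table.items():
        out[cell[axis]] += count
    return out

by_sex = marginal(table, 0)   # counts per sex
by_age = marginal(table, 1)   # counts per age range

print(dict(table))
print(dict(by_sex))
print(dict(by_age))
```

Both histograms are sums over the same underlying table, so they necessarily add up to the same total. That's the kind of internal consistency you'd like to keep after fuzzing.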
The punchline is that instead of directly tweaking the data itself or the "marginals" (which they never define for us idiot-folk), they translate the data into the Fourier domain and tweak it there. Turns out that has nice properties, though fuck me if I even know what that means or why it's true.
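Here's my best attempt at a toy version of the idea. This is emphatically a sketch under my own assumptions, not the paper's algorithm. I'm assuming all traits are binary, so the "Fourier transform" in question is the Walsh-Hadamard transform. The noise is arbitrary Gaussian noise that I picked for illustration and comes with no privacy guarantee. The function names (`wht`, `inverse_wht`, `marginal`) and the counts are all made up.

```python
import random

def wht(v):
    """Unnormalized Walsh-Hadamard transform; len(v) must be a power of two."""
    v = list(v)
    h = 1
    while h < len(v):
        for i in range(0, len(v), 2 * h):
            for j in range(i, i + h):
                a, b = v[j], v[j + h]
                v[j], v[j + h] = a + b, a - b
        h *= 2
    return v

def inverse_wht(f):
    """Inverse transform: apply the same butterfly, then divide by the length."""
    n = len(f)
    return [c / n for c in wht(f)]

def marginal(table, attrs):
    """Sum the table down to the given attributes (bit positions of the cell index)."""
    out = {}
    for cell, count in enumerate(table):
        key = tuple((cell >> a) & 1 for a in attrs)
        out[key] = out.get(key, 0) + count
    return out

# Hypothetical table over 3 binary traits: the cell index is the bit pattern.
table = [5, 3, 2, 7, 1, 4, 6, 2]

coeffs = wht(table)                     # coeffs[0] is the total head count
rng = random.Random(0)                  # fixed seed so the run is repeatable
noisy_coeffs = [c + rng.gauss(0, 1) for c in coeffs]  # placeholder noise
noisy_table = inverse_wht(noisy_coeffs)

# Two overlapping marginals, both read off the same noisy table.
m01 = marginal(noisy_table, (0, 1))
m12 = marginal(noisy_table, (1, 2))

# Collapse each one further to the shared trait (attribute 1).
attr1_from_m01 = [m01[(0, v)] + m01[(1, v)] for v in (0, 1)]
attr1_from_m12 = [m12[(v, 0)] + m12[(v, 1)] for v in (0, 1)]

print(attr1_from_m01)
print(attr1_from_m12)
```

The two printed lists match up to floating-point error, because both marginals are computed from the same perturbed table. If you added noise to each marginal separately, they'd generally disagree about their shared trait. Note that the reconstructed cells can come out fractional or even negative, and this sketch does nothing about that.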