Inference Attacks to Social Systems

Web service providers (e.g., Google, Facebook, and Yelp) and third parties who can crawl these services acquire a large amount of data from users. These data include graphs (e.g., social relationships between users), user attributes (e.g., demographics, interests, and sexual orientation), behavioral data (e.g., product reviews, pages shared or liked on Facebook), text (e.g., blog posts, tweets), and images (e.g., photos shared on Instagram). Due to privacy concerns, some users withhold sensitive information from these service providers. For instance, users of online social networks might hide some sensitive friends or choose not to disclose private attributes, and bloggers might hide their identity when they post sensitive text.

We demonstrate that private information can be inferred from publicly available data using big data analytics techniques. In particular, through evaluations on a blog dataset with 100,000 authors and 2.4 million blog posts, we demonstrate that an author's identity can be inferred via writing-style analysis. Moreover, we crawled a large-scale dynamic Google+ dataset comprising 30 million users and 474 million social relationships, and we use it to show that hidden social relationships and user attributes can be inferred with high accuracy. Our ongoing research explores further inference attacks as well as defenses against them.


Inferring users' identity via linguistic stylometry
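
To make the idea of writing-style analysis concrete, the following is a minimal sketch and not the method used in this work. It assumes a tiny, hypothetical list of function words as style features and attributes an anonymous text to the known author with the most similar function-word profile (cosine similarity). The names FUNCTION_WORDS, style_vector, and attribute_author are illustrative only; a realistic attack on 100,000 authors would need far richer features and a trained classifier.

```python
import math
import re
from collections import Counter

# Hypothetical feature set: a handful of common English function words.
# Real stylometric systems use many more features; this list is an assumption.
FUNCTION_WORDS = ["the", "of", "and", "to", "a", "in", "that",
                  "is", "it", "i", "but", "not", "very", "really"]


def style_vector(text):
    """Return the relative frequency of each function word in the text."""
    tokens = re.findall(r"[a-z']+", text.lower())
    counts = Counter(tokens)
    total = max(len(tokens), 1)  # avoid division by zero on empty text
    return [counts[w] / total for w in FUNCTION_WORDS]


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def attribute_author(anonymous_text, known_texts):
    """Return the known author whose style vector is closest to the anonymous text."""
    target = style_vector(anonymous_text)
    return max(known_texts,
               key=lambda author: cosine(target, style_vector(known_texts[author])))


# Toy example with two made-up authors.
known = {
    "alice": "I really think that it is very nice. I really like it, "
             "but it is not very cheap.",
    "bob": "The analysis of the data in the study and the results "
           "of the experiments.",
}
anonymous = "I really believe that it is not very good, but I like it."
guess = attribute_author(anonymous, known)
```

Function words are a common choice in such sketches because they reflect habit more than topic, so the profile tends to persist across posts on different subjects.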

Measuring and modeling the interactions between social relationships and user attributes, which sheds light on inferring hidden social relationships and user attributes

Jointly inferring hidden social relationships and user attributes
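
As a rough illustration of why social relationships and attributes can inform each other, here is a minimal sketch under simplifying assumptions; it is not the algorithm used in this work. It assumes each user has at most one attribute value, guesses a hidden attribute by majority vote among friends, and scores a candidate hidden link by the number of common friends plus a bonus for a shared attribute. The function names and the toy data are hypothetical.

```python
from collections import Counter


def infer_attribute(user, friends, attributes):
    """Guess a hidden attribute as the most common value among the user's friends."""
    votes = Counter(attributes[f] for f in friends.get(user, ()) if f in attributes)
    return votes.most_common(1)[0][0] if votes else None


def link_score(u, v, friends, attributes):
    """Score a candidate link: common friends, plus 1 if both share a known attribute."""
    common = len(friends.get(u, set()) & friends.get(v, set()))
    shared = 1 if u in attributes and attributes.get(u) == attributes.get(v) else 0
    return common + shared


# Toy network: "ann" hides her attribute; all names and values are made up.
friends = {
    "ann": {"bob", "cat", "dan"},
    "bob": {"ann", "cat"},
    "cat": {"ann", "bob"},
    "dan": {"ann", "eve"},
    "eve": {"dan"},
}
attributes = {"bob": "music", "cat": "music", "dan": "sports", "eve": "sports"}

# Step 1: infer the hidden attribute from the social graph.
ann_guess = infer_attribute("ann", friends, attributes)

# Step 2: feed the inferred attribute back in when scoring candidate links.
augmented = dict(attributes, ann=ann_guess)
score_before = link_score("ann", "eve", friends, attributes)
score_after = link_score("ann", "bob", friends, augmented)
```

The two steps feed each other: an inferred attribute changes the link scores, and a newly predicted link would in turn change the votes used for attribute inference.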


  • Dawn Song (My advisor)
  • Arvind Narayanan (Professor at Princeton University)
  • Hristo Paskov (Ph.D. student at Stanford University)
  • Elaine Shi (Professor at University of Maryland)
  • Emil Stefanov (Ph.D. student at UC Berkeley)
  • Wenchang Xu (Visiting student from Tsinghua University)
  • Ling Huang (Research scientist at Intel Labs)
  • Prateek Mittal (Professor at Princeton University)
  • Vyas Sekar (Professor at CMU)
  • Ameet Talwalkar (Postdoc at UC Berkeley)
  • Lester Mackey (Professor at Stanford University)
  • John Bethencourt
  • Richard Shin (Ph.D. student at UC Berkeley)