Computer Science

Natural language processing for mobile app privacy compliance

Document Type

Conference Paper

Abstract

Many Internet services collect a flurry of data from their users. Privacy policies are intended to describe the services' privacy practices. However, due to their length and complexity, reading privacy policies is a challenge for end users, government regulators, and companies. Natural language processing holds the promise of helping address this challenge. Specifically, we focus on comparing the practices described in privacy policies to the practices performed by smartphone apps covered by those policies. Government regulators are interested in comparing apps to their privacy policies in order to detect non-compliance with laws, and companies are interested for the same reason. We frame the identification of privacy practice statements in privacy policies as a classification problem, which we address with a three-tiered approach: a privacy practice statement is classified based on a data type (e.g., location), party (i.e., first or third party), and modality (i.e., whether a practice is explicitly described as being performed or not performed). Privacy policies omit discussion of many practices. With negative F1 scores ranging from 78% to 100%, the performance results of this three-tiered classification methodology suggests an improvement over the state-of-the-art. Our NLP analysis of privacy policies is an integral part of our Mobile App Privacy System (MAPS), which we used to analyze 1,035,853 free apps on the Google Play Store. Potential compliance issues appeared to be widespread, and those involving third parties were particularly common.

Publication Title

CEUR Workshop Proceedings

Publication Date

2019

Volume

2335

First Page

24

Last Page

32

ISSN

1613-0073

APA Citation

Story, P., Zimmeck, S., Ravichander, A., Smullen, D., Wang, Z., Reidenberg, J., ... & Sadeh, N. (2019, March). Natural language processing for mobile app privacy compliance. In AAAI spring symposium on privacy-enhancing artificial intelligence and language technologies (Vol. 2, No. 4, p. 4).

Share

COinS