Computer Science
Natural language processing for mobile app privacy compliance
Document Type
Conference Paper
Abstract
Many Internet services collect a flurry of data from their users. Privacy policies are intended to describe the services' privacy practices. However, due to their length and complexity, reading privacy policies is a challenge for end users, government regulators, and companies. Natural language processing holds the promise of helping address this challenge. Specifically, we focus on comparing the practices described in privacy policies to the practices performed by smartphone apps covered by those policies. Government regulators are interested in comparing apps to their privacy policies in order to detect non-compliance with laws, and companies are interested for the same reason. We frame the identification of privacy practice statements in privacy policies as a classification problem, which we address with a three-tiered approach: a privacy practice statement is classified based on a data type (e.g., location), party (i.e., first or third party), and modality (i.e., whether a practice is explicitly described as being performed or not performed). Privacy policies omit discussion of many practices. With negative F1 scores ranging from 78% to 100%, the performance results of this three-tiered classification methodology suggest an improvement over the state of the art. Our NLP analysis of privacy policies is an integral part of our Mobile App Privacy System (MAPS), which we used to analyze 1,035,853 free apps on the Google Play Store. Potential compliance issues appeared to be widespread, and those involving third parties were particularly common.
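The three-tiered labels described in the abstract (data type, party, modality) can be illustrated with a minimal sketch. This is not the authors' implementation or the MAPS pipeline: the names `Practice`, `Party`, `Modality`, and `potential_issues` are hypothetical, the example data is invented, and the rule used here (flag any practice an app performs that the policy does not describe as performed) is an assumption about how policy labels might be compared with app behavior.

```python
from dataclasses import dataclass
from enum import Enum


class Party(Enum):
    FIRST = "first"
    THIRD = "third"


class Modality(Enum):
    PERFORMED = "performed"
    NOT_PERFORMED = "not_performed"


@dataclass(frozen=True)
class Practice:
    """A (data type, party) pair, e.g. ("location", Party.THIRD)."""
    data_type: str
    party: Party


def potential_issues(policy_labels, app_practices):
    """Return app practices that the policy does not describe as performed.

    policy_labels: dict mapping Practice -> Modality, one entry per practice
        statement found in the policy (hypothetical classifier output).
    app_practices: set of Practice observed in the app (hypothetical input).

    A practice missing from policy_labels is treated as undisclosed, since
    policies may omit discussion of a practice entirely (assumed rule).
    """
    return {
        practice
        for practice in app_practices
        if policy_labels.get(practice) is not Modality.PERFORMED
    }


# Invented example data, not results from the paper.
policy_labels = {
    Practice("location", Party.FIRST): Modality.PERFORMED,
    Practice("location", Party.THIRD): Modality.NOT_PERFORMED,
}
app_practices = {
    Practice("location", Party.FIRST),
    Practice("location", Party.THIRD),
    Practice("contacts", Party.THIRD),
}
issues = potential_issues(policy_labels, app_practices)
```

In this sketch, third-party location sharing is flagged because the policy says it is not performed, and third-party contacts sharing is flagged because the policy never mentions it; first-party location use is disclosed and is not flagged.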
Publication Title
CEUR Workshop Proceedings
Publication Date
2019
Volume
2335
First Page
24
Last Page
32
ISSN
1613-0073
Repository Citation
Story, Peter; Zimmeck, Sebastian; Ravichander, Abhilasha; Smullen, Daniel; Wang, Ziqi; Reidenberg, Joel; Russell, N. Cameron; and Sadeh, Norman, "Natural language processing for mobile app privacy compliance" (2019). Computer Science. 228.
https://commons.clarku.edu/faculty_computer_sciences/228
APA Citation
Story, P., Zimmeck, S., Ravichander, A., Smullen, D., Wang, Z., Reidenberg, J., ... & Sadeh, N. (2019, March). Natural language processing for mobile app privacy compliance. In AAAI spring symposium on privacy-enhancing artificial intelligence and language technologies (Vol. 2, No. 4, p. 4).