Android Apps and Permissions: Security and Privacy Risks

Trond Boksasp
Eivind Utnes

Master of Telematics - Communication Networks and Networked
Submission date: June 2012
Supervisor: Svein Johan Knapskog, ITEM
Co-supervisor: Pern Hui Chia, ITEM

Norwegian University of Science and Technology
Department of Telematics

Problem Description

Third-party applications drive the attractiveness of web and mobile application platforms. Many platforms (incl. Android, HTML5 web apps, Facebook) rely on granular permissions to avoid granting full privileges to third-party applications. The case of Android OS is particularly interesting. However, the permission system on Android is complex. There are more than 135 official permissions, and it has been a challenge to communicate the actual scope of each permission to both developers and users. This creates room for exploitation; malicious applications (or grayware) disguise themselves among the hundreds of thousands of normal ones.

This project will focus on large-scale data collection and analysis to measure and characterise the behaviour of bad applications. The basic ideas (adaptable to the student's interests) would be as follows:

1. Build an automated and long-term data collection process (e.g., using Bash/Python)
2. Parse and organise the information obtained into a structured database (e.g., using MySQL)
3. Analyse the data and visualize interesting patterns (e.g., using R)
4. Characterise the behaviour of bad apps (e.g., detecting anomalous permission requests)

Assignment given: 23.01.2012
Supervisor: Pern Hui Chia, Q2S
Professor: Svein Knapskog, Q2S

Preface

This master's thesis completes our two-year master's program in Telematics at the Norwegian University of Science and Technology. We would like to thank our supervisor Pern Hui Chia from Q2S at NTNU for all the valuable guidance and help during the course of this project. This project could not have been accomplished without you.
Thanks also to Professor Svein Johan Knapskog from Q2S for getting this project up and running, and for guiding us through the finishing stages. We greatly appreciate the students at Futurum for keeping our spirits up, and the students at Victoria for keeping us sane. Lastly, we would like to thank our families for believing in us even when we didn't. Your continuous support over the years has been important.

Abstract

This thesis investigates the permissions requested by Android applications, and the possibility of identifying suspicious applications based only on information presented to the user before an application is downloaded. During the course of this project, a large data set consisting of applications published on Google Play and three different third-party Android application markets was collected over a two-month period. These applications are analysed using manual pattern recognition and k-means clustering, focusing on the permissions they request.

The pattern analysis is based on a smaller data set consisting of confirmed malicious applications. The method is evaluated based on its ability to recognise malicious potential in the analysed applications. The k-means clustering analysis takes the whole data set into consideration, in an attempt to uncover suspicious patterns. This method is evaluated based on its ability to uncover distinct suspicious permission patterns and on the findings acquired after further analysis of the clustering results.

Sammendrag (Summary)

This master's thesis investigates the permissions requested by Android applications and the possibility of identifying suspicious programs based only on information presented to the user before the application is downloaded. During this project we created a data set consisting of applications from Google Play and three third-party application markets, collected over a two-month period. The applications are analysed using manual pattern recognition and k-means clustering, focusing on the permissions they request. The pattern recognition is based on a smaller data set consisting of confirmed malicious applications, and the method is evaluated on its ability to recognise malicious potential in the analysed applications. The clustering analysis takes the whole data set into consideration in order to find suspicious patterns. This method is evaluated on its ability to uncover suspicious patterns and on the findings obtained after closer analysis of the clustering results.
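
The k-means clustering of permission requests mentioned in the abstract can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the thesis: the four-entry permission list, the app names, the 0/1 encoding of requested permissions, and the simple deterministic choice of initial centroids. A real analysis would more likely use an established implementation such as the one in R or scikit-learn.

```python
from typing import List, Sequence, Tuple

# Hypothetical, tiny permission vocabulary (Android defines far more).
PERMISSIONS = ["INTERNET", "SEND_SMS", "READ_CONTACTS", "ACCESS_FINE_LOCATION"]


def to_vector(requested: Sequence[str]) -> List[float]:
    """Encode the permissions an app requests as a 0/1 vector (assumed encoding)."""
    wanted = set(requested)
    return [1.0 if p in wanted else 0.0 for p in PERMISSIONS]


def sq_dist(a: Sequence[float], b: Sequence[float]) -> float:
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points: List[List[float]], k: int,
           iterations: int = 20) -> Tuple[List[int], List[List[float]]]:
    """Plain k-means; the first k distinct points serve as initial centroids."""
    centroids: List[List[float]] = []
    for p in points:
        if p not in centroids:
            centroids.append(list(p))
        if len(centroids) == k:
            break
    if len(centroids) < k:
        raise ValueError("need at least k distinct points")

    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        labels = [min(range(k), key=lambda c: sq_dist(p, centroids[c]))
                  for p in points]
        # Update step: each centroid moves to the mean of its members.
        updated = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                updated.append([sum(col) / len(members) for col in zip(*members)])
            else:
                updated.append(centroids[c])  # keep an empty cluster in place
        if updated == centroids:
            break  # converged
        centroids = updated
    return labels, centroids


# Made-up apps and permission requests, for illustration only.
apps = [
    ("flashlight", ["INTERNET"]),
    ("sms_app_a", ["INTERNET", "SEND_SMS", "READ_CONTACTS"]),
    ("weather", ["INTERNET", "ACCESS_FINE_LOCATION"]),
    ("sms_app_b", ["SEND_SMS", "READ_CONTACTS"]),
]
vectors = [to_vector(perms) for _, perms in apps]
labels, centroids = kmeans(vectors, k=2)
for (name, _), label in zip(apps, labels):
    print(name, "-> cluster", label)
```

In this toy data the two apps requesting SEND_SMS together with READ_CONTACTS end up in one cluster and the other two apps in the other. Whether a cluster is actually suspicious still has to be judged by inspecting its permission pattern, as the abstract notes.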