Discovering Android Malware with Information Flow Monitoring



Google Play offers more than 800,000 applications (apps), and this number increases every day. Google Play users have performed more than 25 billion app downloads. These applications range from games to music, video, books, and tools. Unfortunately, each of these apps is a potential attack vector on Android. The number of malicious applications (malware) discovered during the first six months of 2013 exceeds the number discovered during the whole 2010 to 2012 period (source: TrendMicro). According to the same source, more than 700,000 malicious and risky apps were found in the wild. In this context, it is crucial to propose methods to stem the progression of Android malware.

The final visible result of this project is an online platform where users can drop Android applications and immediately get an understandable information flow digest, a report on detected malicious behaviors, and a security policy to apply to the application to prevent such unwanted behaviors.

Our contribution with respect to the aforementioned work will be to automate the complete malware detection process. The database of malicious behaviors will be built and updated automatically from the applications dropped by users. Furthermore, no fixed or manual selection of sensitive information will be necessary: a security policy adapted to the application will be generated automatically instead.
The first challenge is to detect malware based on what it does rather than what it looks like. To obtain a compact representation of "what it does", we will use a new data structure, the System Flow Graph (SFG).
The SFG of an application describes its external behavior: the files and sockets it uses, the processes it communicates with, etc. In other words, it describes the application's observable behavior from a system point of view. The graph is built from the log file produced by an information-flow monitor. SFGs have proved helpful for visualizing and understanding malware behavior. Preliminary experiments show that, given two applications infected by the same malware, the intersection of their two SFGs is the SFG of the malware. Thus, by computing such intersections across many applications, it should be possible to automatically infer the SFGs of unknown malware.
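
The intersection idea can be illustrated with a minimal sketch. It assumes an SFG can be modeled as a set of directed edges between named system objects and that two nodes match when their names are equal. The node naming scheme, the function names `build_sfg` and `common_sfg`, and the sample flows are all hypothetical; they do not reflect the monitor's actual log format or the project's implementation.

```python
from typing import FrozenSet, Iterable, Set, Tuple

# Hypothetical edge model: (source node, destination node).
# Nodes are plain strings such as "process:game" or "file:/data/payload.bin".
Edge = Tuple[str, str]


def build_sfg(flows: Iterable[Edge]) -> FrozenSet[Edge]:
    """Collapse a sequence of observed flows into a set of unique edges."""
    return frozenset(flows)


def common_sfg(sfgs: Iterable[FrozenSet[Edge]]) -> Set[Edge]:
    """Return the edges shared by every SFG (empty set if none is given)."""
    graphs = list(sfgs)
    if not graphs:
        return set()
    shared = set(graphs[0])
    for graph in graphs[1:]:
        shared &= graph  # keep only edges also present in this graph
    return shared


# Two made-up applications that carry the same made-up payload.
app_a = build_sfg([
    ("process:game", "file:/sdcard/save.dat"),
    ("process:game", "file:/data/payload.bin"),
    ("file:/data/payload.bin", "process:dropper"),
    ("process:dropper", "socket:203.0.113.5:80"),
])
app_b = build_sfg([
    ("process:player", "file:/sdcard/music.idx"),
    ("process:player", "file:/data/payload.bin"),
    ("file:/data/payload.bin", "process:dropper"),
    ("process:dropper", "socket:203.0.113.5:80"),
])

candidate = common_sfg([app_a, app_b])
```

In this toy example, `candidate` keeps only the two edges that involve the shared payload, while the edges specific to each host application drop out. Matching real SFG nodes is likely to need more than string equality, since the same behavior may appear under different process or file names.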

The second challenge is to get an accurate SFG for any unknown application. Since SFGs are built by monitoring the application at runtime, obtaining a complete SFG requires triggering every possible observable behavior at least once. It is thus necessary to know the set of all events the application is waiting for. We propose to build this set, or to approximate it, using static analysis of the application's Dalvik code.
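
As a rough sketch of how such an event set could drive the monitoring run, the snippet below fires each expected event once and reports the ones that could not be triggered. The function `trigger_all`, the `fire` callback, and the event labels are hypothetical placeholders: in practice the set would come from static analysis and `fire` would interact with a running device or emulator.

```python
from typing import Callable, Iterable, List, Set


def trigger_all(expected: Iterable[str], fire: Callable[[str], bool]) -> List[str]:
    """Fire each expected event once; return the events that failed to trigger."""
    seen: Set[str] = set()
    missed: List[str] = []
    for event in expected:
        if event in seen:
            continue  # each event only needs to be triggered once
        seen.add(event)
        if not fire(event):
            missed.append(event)
    return missed


# Stub standing in for a real device interaction (assumption for illustration).
def fake_fire(event: str) -> bool:
    return event != "sms_received"


# Hypothetical event set, as static analysis might approximate it.
expected_events = ["boot_completed", "button_click", "sms_received", "button_click"]
missed_events = trigger_all(expected_events, fake_fire)
```

A non-empty `missed_events` list signals that the resulting SFG may be incomplete, because some behaviors were never exercised during monitoring.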


For now, our main results are:

- A framework called GroddDroid dedicated to automatic malware triggering. GroddDroid received the best paper award at the 11th International Conference on Malicious and Unwanted Software.

- A dataset of reverse-engineered malware. This dataset is available for research purposes. A corresponding article is in the publication process.

- An analysis platform currently in private access.

All the results of the Kharon project are stored on the Inria Forge. Please visit