“Google Play has more than one million apps and over 50 billion app downloads, but no one reviews what gets put into Google Play—anyone can get a $25 account and upload whatever they want. Very little is known about what’s there at an aggregate level,” pointed out Jason Nieh, professor of computer science at Columbia Engineering. “Given the huge popularity of Google Play and the potential risks to millions of users, we thought it was important to take a close look at Google Play content.”
You might expect Google to block large-scale, automated scanning and downloading of its Android app market, but the researchers circumvented its defenses with a purpose-built crawler called PlayDrone. The tool randomly generates valid IMEI numbers and MAC addresses to avoid device blacklisting, spreads its requests across some 500 different Google accounts, and talks to the Google Play store through APIs that the researchers reverse-engineered and reimplemented.
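To make "valid" concrete: an IMEI is 15 decimal digits whose last digit is a Luhn check digit, and a MAC address is six bytes. The Python sketch below shows one way such identifiers could be generated. It is an illustration only, not PlayDrone's actual code, which the article does not describe; the function names are hypothetical, and the random digits ignore the manufacturer prefix (TAC) that real IMEIs carry.

```python
import random


def luhn_check_digit(payload: str) -> int:
    """Return the Luhn check digit for a string of decimal digits."""
    total = 0
    # Walk right to left; the rightmost payload digit is doubled first.
    for i, ch in enumerate(reversed(payload)):
        d = int(ch)
        if i % 2 == 0:
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return (10 - total % 10) % 10


def is_luhn_valid(number: str) -> bool:
    """Check that the last digit matches the Luhn digit of the rest."""
    return (
        len(number) > 1
        and number.isdigit()
        and luhn_check_digit(number[:-1]) == int(number[-1])
    )


def random_imei(rng: random.Random) -> str:
    """14 random digits plus a Luhn check digit (assumption: no real TAC)."""
    body = "".join(str(rng.randrange(10)) for _ in range(14))
    return body + str(luhn_check_digit(body))


def random_mac(rng: random.Random) -> str:
    """Random unicast MAC with the locally administered bit set."""
    octets = [rng.randrange(256) for _ in range(6)]
    octets[0] = (octets[0] | 0x02) & 0xFE
    return ":".join(f"{o:02x}" for o in octets)


if __name__ == "__main__":
    rng = random.Random()
    print(random_imei(rng), random_mac(rng))
```

The check digit is what separates a well-formed IMEI from an arbitrary 15-digit string, which is presumably why the crawler generates valid ones rather than purely random numbers.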
The researchers also found that:
- The top 10% of most downloaded applications account for over 96% of the total downloads.
- Popular applications tend to use more native libraries. "As an application rises in popularity, developers are perhaps more willing to spend time and money to use native libraries to optimize the user experience of the application," posited the researchers.
- 25 percent of all free apps in Google Play are clones of other apps already in the store.
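The download-concentration figure in the list above is a simple ratio: the downloads of the top-ranked slice divided by all downloads. A minimal Python sketch of that arithmetic follows; the function name and the sample numbers are made up for illustration and are not data from the study.

```python
def top_share(downloads, fraction=0.10):
    """Share of total downloads held by the top `fraction` of apps."""
    ranked = sorted(downloads, reverse=True)
    # Always count at least one app, even for very small lists.
    k = max(1, round(len(ranked) * fraction))
    total = sum(ranked)
    return sum(ranked[:k]) / total if total else 0.0


if __name__ == "__main__":
    # Invented per-app download counts: one hit and nine minor apps.
    toy = [1000, 10, 10, 10, 5, 5, 5, 3, 1, 1]
    print(f"top 10% share: {top_share(toy):.1%}")
```

A heavily skewed list like this one puts nearly all downloads in the top decile, which is the pattern the researchers report for Google Play as a whole.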
"PlayDrone can serve as a useful tool to better understand Android applications and improve the quality of application content in Google Play," the researchers noted, and added that Google has learned something frome their research and is now using their techniques to scan apps for these types of problems.
For more details about the research, download the paper.