The Most Dangerous Code in the Web Browser

June 28, 2015

Did you know that the web browser extension you installed a long time ago (say, AdBlock) can probably see all your passwords, look at any website you visit using your credentials, and trivially send all that information to an arbitrary web server? That's pretty scary, and in this blog post I will explain how security for extensions currently works. I will also outline research towards a better extension security model for browsers that protects your sensitive information.

Background

Web applications are ubiquitous, and many tasks that were traditionally handled by dedicated desktop applications are carried out in web browsers today. Furthermore, these applications increasingly handle sensitive data such as banking information, passwords or medical data. Protecting this data from malicious entities is a security challenge, and luckily the web platform, and web browsers as part of it, have evolved to meet it. For instance, the same-origin policy (or SOP for short) roughly ensures that information from one website (say, bank.com) cannot be accessed by a malicious site like evil.com, because they have different origins¹.
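To make the notion of an origin a bit more concrete, here is a minimal Python sketch of the comparison behind the SOP. It treats an origin as the triple of scheme, hostname and port, and fills in the default port when the URL omits it. The function names `origin` and `same_origin` are made up for this illustration, and real browsers handle more cases than this sketch does.

```python
from typing import Optional, Tuple
from urllib.parse import urlsplit

# Default ports for the two schemes this sketch knows about (an assumption
# made for brevity; browsers know about more schemes).
DEFAULT_PORTS = {"http": 80, "https": 443}


def origin(url: str) -> Tuple[str, Optional[str], Optional[int]]:
    """Return the (scheme, hostname, port) triple that identifies a URL's origin."""
    parts = urlsplit(url)
    port = parts.port if parts.port is not None else DEFAULT_PORTS.get(parts.scheme)
    return (parts.scheme, parts.hostname, port)


def same_origin(a: str, b: str) -> bool:
    """Two URLs share an origin only if scheme, hostname and port all match."""
    return origin(a) == origin(b)
```

With this, `same_origin("https://bank.com/login", "https://evil.com/")` is false, and so is a comparison between the `http` and `https` versions of the same site.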

Mechanisms like the SOP have worked relatively well in protecting user data in web browsers². However, users often want to extend web browser functionality by installing extensions, e.g., to block advertisements or to extend the functionality of certain websites. Unfortunately, the security story for extensions is less rosy.

Extension Security

Let's look at Google Chrome, the most popular browser and the one with the most sophisticated security model for extensions. Unlike websites, extensions need access to more sensitive APIs to implement their functionality, such as the list of open tabs, the browsing history, or the cookies. At the same time, extensions also need to interact deeply with a website's code, so that they can change its behavior or looks. Since such APIs give access to rather sensitive data, Chrome uses two main mechanisms to try and ensure the information remains private. Firstly, extensions are split into a content script and a core script. Only the core gets access to these APIs (like your browsing history), but it cannot directly interact with any web pages. In contrast, the content script can interact with web pages and, for instance, access and change what is displayed (by accessing the so-called DOM). The content script and the extension core communicate by message passing, which is meant to prevent malicious websites from getting access to the sensitive browser APIs, even in the presence of badly written or buggy extensions. This is important because the sensitive APIs are not governed by, say, the SOP. Secondly, extensions need to declare a list of permissions, and they only get access to the APIs allowed by the declared permissions.
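The split described above can be illustrated with a toy model in Python. This is not Chrome's actual extension API: the class names, the `count_tabs` message and the dictionary standing in for the browser's data are all invented for this sketch. The point is only that the content script touches the page but reaches the sensitive APIs solely through messages, while the core checks every API call against the declared permissions.

```python
class PermissionDenied(Exception):
    """Raised when the core calls an API it did not declare a permission for."""


class ExtensionCore:
    def __init__(self, declared_permissions, browser_data):
        self._permissions = frozenset(declared_permissions)
        self._browser = browser_data  # hypothetical stand-in for the browser APIs

    def call_api(self, api):
        # Second mechanism: only declared permissions unlock an API.
        if api not in self._permissions:
            raise PermissionDenied(api)
        return self._browser[api]

    def on_message(self, message):
        # The core decides which requests from the content script it honors.
        if message == {"type": "count_tabs"}:
            return len(self.call_api("tabs"))
        return None  # anything unexpected is ignored


class ContentScript:
    def __init__(self, core, page_dom):
        self._send = core.on_message  # message passing is the only channel to the core
        self.dom = page_dom           # the content script may read and change the page

    def annotate_page(self):
        count = self._send({"type": "count_tabs"})
        self.dom["badge"] = f"{count} tabs open"
```

A core created with only the `"tabs"` permission can serve the `count_tabs` message, but a call such as `core.call_api("history")` raises `PermissionDenied`.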

At this point, you might be able to guess what the problem is: The sensitive APIs are protected from malicious websites, but what about malicious extensions? The only protection of the sensitive APIs is through the permissions, which are declared by the extension. The user can either grant all requested permissions at install time, or not install the extension. We have looked at the top 500 extensions by number of users in the Chrome Web Store to get a sense of what permissions are being requested. We found that more than 71% of extensions require permission to "read and change all your data on the websites you visit". And it only gets worse if we look at the top 100 (with 82%) or the top 50 extensions, where a staggering 88% of extensions require this permission. And these extensions are widely used, too, with the most popular having more than 10 million users (how many exactly we don't know, as Google caps the reported number at 10 million), and even the extension in place 500 still has more than 70,000 users. Clearly, users are willing to give these very broad permissions, and most extensions do require powerful permissions.

Figure: A full view of what percentage of the most popular extensions require permission to "read and change all your data on the websites you visit". At the left, we consider the top 500 most popular extensions (by number of users), and we restrict it to the more popular extensions as we go to the right. For instance, all of the top 7 most popular extensions show this message on installation.

This means that most of these popular extensions can interact with web pages and, for instance, learn the password entered by the user or see the bank statement. At the same time, they can send all that data to wherever they choose, clearly putting the user's privacy at risk.

This is a highly unsatisfactory situation, and while Google is trying to remove malicious extensions from its online store³, this is inevitably an arms race that cannot guarantee the safety of all users. Thus, in the remainder of this blog post, I will outline a possible solution that lets extensions access the sensitive information they need to implement their functionality, while ensuring the user's privacy is not at risk.

Towards a Solution

The key to a solution is the insight that extensions which deal with sensitive information are perfectly safe as long as they do not disseminate this sensitive information arbitrarily, for instance by sending it to evil.com. Let's look at an example: The extension Google Mail Checker gives the user an icon with the number of unread emails in Gmail.

To implement this functionality, the extension requires permission to look at any information from google.com sites. Clearly, this is potentially dangerous to the user's privacy. In fact, there is nothing stopping the extension from leaking all emails, or stealing the user's Google password (it probably doesn't, but how do you know?)⁴.

But as we observed earlier, the user's data would be safe if the extension could access the emails, but not leak them arbitrarily. This idea is pervasive in confinement systems based on mandatory access control (MAC), which restrict not just who can access information, but also how that information can be shared further.

We can implement this idea by using labels to track where data originates from. The labels correspond to origins, and in our example the extension would be tainted with the label mail.google.com after it accessed the unread count. This label is then used to limit with whom the extension can communicate. In particular, in the example, the extension is now limited to communicating with mail.google.com, and couldn't leak the user's email to evil.com⁵. With this, even if it reads all your email or looks at your password, the extension has no way of sending the information anywhere else. However, it can still perform its main feature of displaying the unread count as an icon to the user (which can always be done, regardless of the label). And most importantly, it is possible to do this without requiring a permission at all! Accessing sensitive information does not require permission if the information cannot be leaked.
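The labeling idea can be sketched in a few lines of Python. This is only an illustration under assumptions of my own, not the design from the paper: the class `LabeledExtension` and its methods are hypothetical, and the rule for an extension tainted by several origins (it may then send to none of them) is a simplification chosen for this sketch.

```python
class LeakBlocked(Exception):
    """Raised when a tainted extension tries to talk to a foreign origin."""


class LabeledExtension:
    def __init__(self):
        self.label = set()   # origins whose data the extension has observed
        self.icon_text = ""
        self.sent = []       # (destination, payload) pairs that were allowed

    def read(self, origin, data):
        # Reading needs no permission, but it taints the extension.
        self.label.add(origin)
        return data

    def can_send_to(self, destination):
        # Untainted: may talk to anyone. Tainted by one origin: only to that
        # origin. Tainted by several origins: to no one (sketch assumption).
        return self.label <= {destination}

    def send(self, destination, payload):
        if not self.can_send_to(destination):
            raise LeakBlocked(destination)
        self.sent.append((destination, payload))

    def set_icon(self, text):
        # Showing something to the user is always allowed, whatever the label.
        self.icon_text = text
```

A mail checker built on this would call `read("mail.google.com", ...)`, update its icon with the unread count, and find that a later `send("evil.com", ...)` raises `LeakBlocked`.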

Of course, not all extensions are this simple, and so we need more sophisticated APIs to allow more complex extensions to be safe. A common problem in MAC-based confinement systems is that sometimes information that is sensitive (with regard to a particular label) is actually okay to be leaked to another origin. For instance, extensions like the Evernote Web Clipper allow users to save parts of a web page (that might contain sensitive information) to the popular note-taking app at evernote.com. Formally, this problem is known as declassification; conceptually, the label needs to be removed from a piece of data before it can be sent to the destination origin. Clearly, extensions cannot be trusted to make these declassification decisions, as that would allow them to arbitrarily leak sensitive information again. However, for extensions we can leverage user intent: Already in current extensions, the user clicks a context menu entry to share the information with the extension. Thus, if we provide a sharing API, then extensions can register such a context menu entry in a trusted UI, and every time the user shares information explicitly, this corresponds to a declassification (by the user) in our system. The user is in full control of what information gets leaked, again without having to use any permissions.
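A rough Python sketch of such a sharing API might look as follows. Everything here is hypothetical: the `TrustedSharingUI` class stands in for browser-controlled UI, `user_clicks` models an event that only a real user action could trigger, and the choice to hand the handler the plain, unlabeled text is an assumption of this sketch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Labeled:
    value: str
    label: str  # origin the selected text came from


class TrustedSharingUI:
    """Stand-in for a context menu drawn by the browser, not by the extension."""

    def __init__(self):
        self._entries = {}

    def register(self, title, handler):
        # Extensions may only register an entry; they cannot trigger it.
        self._entries[title] = handler

    def user_clicks(self, title, selection):
        # The user's explicit click is the declassification: the label is
        # dropped for this one selection, and nothing else on the page.
        self._entries[title](selection.value)


outbox = []  # what the hypothetical clipper extension sends out


def clip_to_evernote(text):
    outbox.append(("evernote.com", text))


ui = TrustedSharingUI()
ui.register("Clip to Evernote", clip_to_evernote)

# Simulate the user selecting some text and choosing the menu entry.
ui.user_clicks("Clip to Evernote", Labeled("Flight confirmed", "airline.example"))
```

Only the text the user explicitly selected ends up in `outbox`; the rest of the page stays labeled with its origin.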

Another application enabled by our approach is secure password storage in the cloud. This is not a new idea, and services like LastPass allow users to remember only a single master password and have all other passwords stored in the cloud. This functionality is great, but allows anybody with access to LastPass servers to read all of the user's password hashes. Unfortunately, it's not enough to trust LastPass not to peek at them: If somebody manages to hack into their systems, then that person might gain access to users' password hashes⁶. So, how can we help? If an extension reads the password on the accounts.google.com page when you log into your Google account, then that password is labeled with the Google origin and couldn't be sent to lastpass.com. We wouldn't want to declassify the password, as that reveals the password to LastPass and has all the aforementioned problems. Instead, we can take advantage of cryptography: In our system, we allow extensions to conceptually remove a label from a piece of data (like the password) by encrypting it through a browser API. With this, an extension can access the password, encrypt it and then hold the encrypted string without being tainted at all. This allows the extension to share the encrypted password with, say, lastpass.com. While the password is encrypted, the extension isn't tainted, but it also cannot see or use the password (e.g., to log the user in with previously saved credentials). There is a second API call that decrypts the information and allows the extension to see the password. However, the decryption now taints the extension with the corresponding label. In our example, the extension would be tainted with accounts.google.com and couldn't share the password arbitrarily any longer, but can perform its legitimate function of filling in the login form. Essentially, this mechanism allows extensions to trade taint for encryption.
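Here is a small Python sketch of how trading taint for encryption could work. It is an illustration only: the `Browser` and `Extension` classes are hypothetical, the browser encrypts the labeled field on the extension's behalf, and the cipher is a toy hash-based keystream that is not secure and merely stands in for a real authenticated cipher. The label travels inside the encrypted blob, so decryption knows which taint to apply.

```python
import hashlib
import secrets


class Extension:
    def __init__(self):
        self.label = set()  # origins whose plaintext data the extension has seen

    def can_send_to(self, destination):
        return self.label <= {destination}


class Browser:
    """Holds a key that extensions never see. Toy cipher, NOT secure."""

    def __init__(self):
        self._key = secrets.token_bytes(32)

    def _keystream(self, nonce, length):
        out = b""
        counter = 0
        while len(out) < length:
            block = self._key + nonce + counter.to_bytes(8, "big")
            out += hashlib.sha256(block).digest()
            counter += 1
        return out[:length]

    def encrypt(self, label, secret):
        # Bundle the label with the secret so it survives a round trip.
        plain = label.encode() + b"\x00" + secret.encode()
        nonce = secrets.token_bytes(16)
        stream = self._keystream(nonce, len(plain))
        return nonce + bytes(p ^ s for p, s in zip(plain, stream))

    def decrypt(self, extension, blob):
        nonce, body = blob[:16], blob[16:]
        stream = self._keystream(nonce, len(body))
        plain = bytes(c ^ s for c, s in zip(body, stream))
        label, _, secret = plain.partition(b"\x00")
        extension.label.add(label.decode())  # seeing the plaintext taints
        return secret.decode()
```

An extension holding only the encrypted blob can still send it to lastpass.com. Once it calls `decrypt`, its label contains accounts.google.com and `can_send_to("lastpass.com")` becomes false.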

Conclusions

These are just a few of the ideas from a research paper I have written with Devon Rifkin and Deian Stefan. The paper contains more technical details and has been presented at the Workshop on Hot Topics in Operating Systems. While we have not yet implemented such a secure extension system and have only outlined a solution, we do believe this to be a promising direction towards a practical solution. We hope that this encourages browser vendors to rethink extension systems and raises awareness among users who may not know about the capabilities of even simple browser extensions.

For now, the best recommendation for users is to be careful about which extensions you install, and to install them only from trustworthy sources.

Press

This blog post has been discussed on Hacker News. Furthermore, Eric Limer from Popular Mechanics has written an article about this work with the title "Reminder: Your Browser Extensions Have Absurd Access To Everything You Do Online".

¹ Technically, an origin is defined as a URI scheme, a hostname, and a port number. For simplicity, I only use the domain in my examples.
² Here we are only looking at client-side security, i.e., protecting user data when it is in the web browser. Security on the server is a separate problem, and a lot of reports about privacy breaches are about servers getting hacked.
³ For example, Google recently removed almost 200 extensions that affected a large number of users. The extensions range from injecting ads to outright stealing private information such as passwords.
⁴ This particular extension is actually better than most extensions, and only requires permission to all google.com sites (but more than just mail.google.com). For this reason, it couldn't directly leak the information to, say, evil.com. However, it could send it via the user's Gmail account, and then delete the incriminating email.
⁵ By default, only GET requests are possible, which only allow reading a website. Changing information on a website or sending an email requires a POST request, so this restriction prevents the extension from sending the user's information via Gmail (the attack outlined earlier).
⁶ In fact, LastPass did get hacked recently, though the saved passwords for other sites have not been compromised according to their analysis.

Questions or comments? Send me an email or find me on Twitter @stefan_heule.