
Browser Fingerprinting and Anonymity

I’ve written before on the subject of VPNs and anonymity and, more specifically, on the fact that there is no such thing as anonymity on a VPN. The basic rebuttal to anonymity while using a VPN centers on services that already know you regardless of what IP address you’re coming from. Services that require you to log in are an example of this. If you have supplied a valid username and password to a service, you’ve proven your identity. But there are also silent identifiers that can expose your activities online regardless of whether you’re using a VPN. The technique that collects them is called Browser Fingerprinting, and it is highly accurate, undetectable, and immune to VPN use.

What is Browser Fingerprinting?

We’re all familiar with fingerprints and the fact that they are unique enough to identify one specific individual on the planet. What you may not know is that there is no such thing as a 100% guaranteed fingerprint match to one person. In most cases it takes a minimum of eight matching points to assign a fingerprint to a person and there can be no mismatching points.

That’s human fingerprinting and it is hard work. Human fingerprints decay rapidly and are almost never complete. Browser Fingerprinting is essentially the same idea: match a set of data points provided by the web browser to an individual. However, because a computer’s fingerprint is not composed of rapidly decaying salt and skin residue, it is usually complete.



Can you give me an example of Browser Fingerprinting?

When you visit a website, your browser has to supply some information so that you can see the page. One of the most damning pieces of information is your IP address. A VPN can help with that by allowing you to use a different IP address than the one your ISP assigned. When a website collects IP addresses, that is considered “passive collection” because the information is supplied out of necessity. But websites can do a lot of “active collection” as well. This term refers to code on the website that purposely examines your browser for as much information as it can get. This includes:

A screenshot of my test at EFF showing the results of my browser fingerprinting

To gather this much data, the Panopticlick relies on a combination of JavaScript and browser request headers. It is possible to surf with JavaScript off (I use Script Block for Chrome), but the vast majority of people have it turned on.
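To make the idea concrete, here is a minimal sketch of how a site might boil a handful of collected values down to one identifier. It is not the Panopticlick's actual method. The `fingerprint` function, the attribute names, and every value below are hypothetical and chosen only for illustration.

```python
import hashlib
import json


def fingerprint(attributes: dict) -> str:
    """Return a stable hex digest for a set of browser attributes."""
    # Sort keys so the same attributes always serialize the same way.
    canonical = json.dumps(attributes, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# Hypothetical values, not taken from any real browser.
visit_from_home = {
    "user_agent": "ExampleBrowser/1.0 (ExampleOS)",
    "accept_language": "en-US,en;q=0.9",
    "timezone_offset_minutes": -240,
    "screen": "1920x1080x24",
    "fonts": ["Arial", "Courier New", "Example Sans"],
}

# Same browser reached through a VPN: the IP address differs, but the IP
# is not part of these attributes, so nothing in the dictionary changes.
visit_over_vpn = dict(visit_from_home)

print(fingerprint(visit_from_home) == fingerprint(visit_over_vpn))  # prints True
```

The point of the sketch is that the IP address never enters the hash, so switching IPs does not change the result.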

So, now we see that I am leaking a ton of information about my browser. So what? There are millions of people out there surfing the same sites, how can this possibly identify me?

The answer to that lies in the “one in x browsers have this value” column. For example, only one in 1635 browsers share my “hash of canvas fingerprint” value. Add in the fact that only one in 1455 browsers share the same hash as my WebGL fingerprint and things are starting to get pretty narrow. Throw in some information such as my user agent, the fonts I am using (whaaat??) and my time zone and it becomes pretty obvious how easy it is to compile a data set that can fairly accurately identify one single browser.
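A rough way to see how quickly this narrows is to multiply the "one in x" values together. The sketch below does that for the two figures quoted above. It assumes the two values are statistically independent, which is probably too generous (canvas and WebGL results both depend on the same graphics stack), so treat the combined figure as an upper bound rather than a measurement. The function names are my own.

```python
import math


def bits_of_information(one_in_x: float) -> float:
    """Bits of identifying information carried by a '1 in x' value."""
    return math.log2(one_in_x)


def combined_one_in(values) -> int:
    """Combined rarity, ASSUMING the values are independent of each other."""
    return math.prod(values)


canvas_one_in = 1635  # "hash of canvas fingerprint" figure from my test
webgl_one_in = 1455   # WebGL fingerprint figure from my test

combined = combined_one_in([canvas_one_in, webgl_one_in])
total_bits = bits_of_information(canvas_one_in) + bits_of_information(webgl_one_in)

print(f"Combined: one in {combined:,} browsers")
print(f"Total: {total_bits:.1f} bits of identifying information")
```

Under that independence assumption, two values alone already put the browser at roughly one in 2.4 million, before the user agent, fonts, or time zone are counted.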

But, wait. It gets worse.

It’s even possible to check what’s in your various caches, as this paper on the Technical analysis of client identification methods shows. A site can check your Silverlight cache, your Flash Local Shared Objects, your HTML5 AppCache, and much more.

When someone has the ability to compile the statistics that the Panopticlick collects, coupled with the ability to query the local storage and cache of your plugins, it becomes pretty easy to make those 8 (hundred?) points of comparison for a browser fingerprint match.

OK, I’ve been fingerprinted, so what?

This is a good point. I visited my bank website, they collected this information, so what?

The problem is that any site can collect this information from you. As you surf from site to unrelated site, each one can collect the same fingerprint data. That data is accurate enough to correlate your trips across the web and track every site you’ve been to.
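Here is a small sketch of what that correlation could look like once the data ends up in one place, for example with a tracker embedded on several sites. That sharing arrangement is my assumption for the example. The site names, IP addresses, and fingerprint strings are all made up.

```python
from collections import defaultdict

# Hypothetical visit log pooled from several unrelated sites.
visits = [
    {"site": "news.example", "ip": "203.0.113.7", "fingerprint": "a1b2"},
    {"site": "shop.example", "ip": "198.51.100.23", "fingerprint": "a1b2"},
    {"site": "forum.example", "ip": "192.0.2.44", "fingerprint": "c3d4"},
    {"site": "bank.example", "ip": "192.0.2.99", "fingerprint": "a1b2"},
]


def sites_by_fingerprint(log):
    """Group visited sites by fingerprint, ignoring the IP address entirely."""
    trail = defaultdict(list)
    for visit in log:
        trail[visit["fingerprint"]].append(visit["site"])
    return dict(trail)


for fp, sites in sites_by_fingerprint(visits).items():
    print(fp, "->", ", ".join(sites))
```

The browser with fingerprint `a1b2` shows up from three different IP addresses, yet its visits still line up into a single trail.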

A VPN cannot help with this because all a VPN does is encrypt your data and hide your IP. Plugins that block JavaScript and Flash can help, but it’s hard to imagine blocking all of these collection attempts and still being able to surf the web in any functional way.