Why Do Websites like Boing! Boing! Collect So Much Data?

This exchange between Greg Yardley of Pinch Media and Joel Johnson of Boing! Boing! highlighted a fundamental hypocrisy about data collection. It raises the question of why so many websites think they need to collect so much data about visitors while making that collection so hard for normal users to suss out.

Yardley is co-founder of Pinch Media, which makes spyware that is then baked into iPhone apps. When you use one of those apps, it gathers and transmits information about you back to Pinch Media. Johnson highlighted this, but Yardley responded that what his company does is no different than what Boing! Boing! does,

Here’s what Boing Boing is running right now, right when I loaded this page:

Google Analytics
Quantcast
Federated Media
HitTail
Doubleclick
Google Custom Search Engine
Tribal Fusion
Six Apart Advertising
Adify
Chitika
AWStats

That’s no fewer than eleven different services that started tracking information about me without my consent. Most (not all) of these services track users across every domain where their code is placed, constructing a profile that’s then used for ad targeting. Some of these services go out of their way to circumvent user attempts to safeguard their privacy. A couple, for instance, store information in the much lesser-known – and rarely deleted – Local Shared Objects that come along with Flash, and have been known to use this information to ‘recreate’ user cookies after they’ve specifically been deleted. A couple more combine the information they’ve gathered about you here with information they’ve pulled in from social networks (where you’re also tracked) to work up a complete demographic profile for targeting. Some of these probably don’t even have a direct relationship with Boing Boing, but are served by other ad networks doing backfill – you could get a different set of trackers, potentially even more invasive, the next time you reload the page.

I didn’t consent to any of the tracking Boing Boing does – there’s no terms of service or privacy policy that pops up on first entry. Even if there *was*, by the time I got here, it’d be too late. If we went by the first commenter’s standards, Boing Boing’s running eleven different pieces of spyware.

The weird thing is that Johnson’s response is extremely weak (emphasis added),

And as far as Boing Boing’s tracking and analytics goes, I can’t really argue against his general point. It’s useful for me as a writer and small businessman to have some basic stats (tracking pageviews to understand what sort of articles readers find compelling, for instance), and **I think most people understand that a baseline of metrics is par for the course on commercial sites**, but I hate the amount of tracking that comes out of the ad networks, too, and it only seems to be getting worse. There’s rarely more perfidious Javascript than that coded by an ad network programmer.

First, I think he’s totally wrong about the bolded part. Most people don’t have a clue just how much data the typical website is gathering about them. If you started talking to them about “baseline metrics” as Johnson does, their eyes would glaze over.

But even assuming that is true, so what? Saying that it is useful and that most people have come to expect it sounds like the sort of weasel words we’d see from any industry trying to cover its ass.

Johnson continues and here’s where he really goes off the rails,

But there’s one difference between web-based tracking and the sort of analytics that Pinch Media gathers on the iPhone: it’s pretty simple to figure out what stats tracking occurs between a web site and a browser on a computer, as Yardley shows; it’s much more difficult to discern—or even be aware of—tracking that occurs in a closed system like the iPhone. And it’s not FUD to point it out so users can make their own decision.

That is a complete crock of shit. It is, in fact, extremely difficult for most people to figure out what is going on when they visit a website. I know pretty much what Boing! Boing! is doing in the background because I run Adblock and NoScript and can quickly look at all of the stuff Yardley points out.

The secretary down the hall has no clue. Moreover, my experience has been that once you show people and they understand, rather than being empowered they are resigned to going along with the system because they feel they have little choice.

I can quickly right-click on the NoScript button and enable the Flash movie that I want to see but that NoScript blocked. The secretary has better things to do than spend all of her time trying to guess which script on the page is serving up necessary content and which is going to rat her out to some other server.
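
To give a sense of what even a crude version of that guessing looks like, here is a minimal sketch in Python. It is not how Adblock or NoScript work; it is just my own illustration, and the function name `third_party_hosts`, the sample page, and the `example.*` hostnames are all made up for the purpose. It scans a page's static HTML and lists the outside hosts that the page pulls scripts, images, and frames from.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class _SrcCollector(HTMLParser):
    """Collects the src attribute of tags that load external resources."""

    TAGS = {"script", "img", "iframe", "embed"}

    def __init__(self):
        super().__init__()
        self.sources = []

    def handle_starttag(self, tag, attrs):
        if tag in self.TAGS:
            for name, value in attrs:
                if name == "src" and value:
                    self.sources.append(value)


def third_party_hosts(html, page_url):
    """Return the sorted hosts referenced by the page that are not the page's own.

    Hypothetical helper: a host counts as first-party if it equals the
    page's hostname or is a subdomain of it. Real blockers are far smarter.
    """
    page_host = urlparse(page_url).hostname or ""
    collector = _SrcCollector()
    collector.feed(html)
    hosts = set()
    for src in collector.sources:
        # Resolve relative and protocol-relative URLs against the page URL.
        host = urlparse(urljoin(page_url, src)).hostname
        if not host:
            continue
        if host == page_host or host.endswith("." + page_host):
            continue
        hosts.add(host)
    return sorted(hosts)


# Made-up page for illustration; none of these hosts are real trackers.
SAMPLE_HTML = """
<html><body>
<script src="/js/site.js"></script>
<script src="http://tracker.example.net/t.js"></script>
<img src="http://pixel.example.org/1x1.gif">
<img src="http://static.example.com/logo.png">
</body></html>
"""

if __name__ == "__main__":
    print(third_party_hosts(SAMPLE_HTML, "http://example.com/post"))
```

Even this only sees what is written into the page's HTML. Anything a script injects after it loads, which is how the ad network backfill Yardley describes tends to arrive, never shows up in the list.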

And before anyone beats me to it, I do run two services here: Google Ads and WordPress.com stats. Google Ads because I’m a greedy bastard, and WordPress.com stats because I wanted basic stat tracking without the overkill that is Google Analytics. I’m not prepared to defend either one as motivated by anything other than crass self-interest.