Well, There Might be a Total Asshole Company In Here Somewhere


Seth Dillingham points out that I didn’t do my homework here. Google might be starting out with the Open Directory database, but they then apparently modify it using PageRank.

The bizarre thing is they’ve apparently truncated the listings — the Open Directory database has many more weblog tools listed than does the Google version (unless Google’s version is old and not being updated, which seems unlikely).

In that case, it is indeed odd that Radio Userland doesn’t show up there. It may not be the best or most widely used tool, but it certainly has a significant user base, and leaving it out is strange.

This certainly validates part of Winer’s point: that Google’s got a stupid way of producing its directory. It looks like they’re using PageRank to create a half-human, half-machine directory that is actually less useful than if they’d just reproduced the Open Directory data as-is.

– Brian

Dave Winer has this (unintentionally) amusing slam at Google over the lack of inclusion of his blogging tools in their directory of blogging tools,

Google’s directory of weblog tools. None of the tools I wrote made the list. Centralized directories on the Web are like buggy whips for cars. Let’s fix this bug.
Google, this makes you look like a total asshole company. Your tool is listed first, and your competitor’s tools aren’t listed at all. When will it become too embarassing to support this antiquated mode

But, of course, this is not Google’s directory; they have no responsibility at all for what gets listed here. They’re simply rebranding the Open Directory Project, whose directory is available to anyone. I guess the Open Directory folks are probably in some sort of conspiracy with Google or something like that.

As for Open Directory, it’s not a bad directory, but it runs smack into the main problem: creating a general directory of the Internet is pretty much an unmanageable task at this point. Dave’s got his own proposed solution, which doesn’t do anything that I can see to address the obvious problems with creating a directory of a network that has millions of sites and billions of pages.

I’m surprised that anybody uses these general directories like Yahoo! or Open Directory anymore. It’s a little like encountering an old card catalog for a library with a sign reading, “Warning: this catalog only indexes 5% of the actual known books in the library.” Would you actually bother to use such a tool? Then why bother with Yahoo! or Open Directory?

Is Distribution of Stories in Google News Indicative of Anything?

Meryl Yourish is unhappy that a peculiar search she did of stories at GoogleNews didn’t turn up more stories about the Palestinian terrorist cowards who shot a woman and her three kids outside of Jerusalem yesterday. Yourish seems to think that this is evidence of bias.

A better explanation is that it’s a function of the way that GoogleNews seems to group stories (and remember, this is still in beta) as well as the search she’s doing.

The search string she is using is looking for news articles similar to some article from The Statesman that used to be in the GoogleNews index (when I follow her link now, it says that story is no longer in the index).

If you do a GoogleNews search on Israel, a story mentioning the terrorist act comes up #6 in the results listing.

The “find more news articles like this” feature that Yourish is using is likely to produce the weirdest results, since Google’s using some sort of heuristic method to automatically link related news stories together. Sometimes it does this pretty well, but since these news stories rarely (if ever) link to each other, a lot of times you get some really odd results. This is also why sometimes on the front page of the GoogleNews site you’ll see some story and next to it a picture that has absolutely nothing to do with the story. A few days ago I was looking at their sports section and they had a series of related stories about basketball accompanied by a picture of a tennis player. Go figure.

It would be an interesting exercise to use Lexis-Nexis to look for differences in mainstream media coverage of the killing of Palestinians vs. the killing of Israelis. But GoogleNews is far too idiosyncratic and sometimes downright weird to support the sort of claims that Yourish is making about it.

Which is not to say that I don’t absolutely adore GoogleNews. As I’ve said before, if I want recent information fast, I usually hit GoogleNews first and only hit Lexis-Nexis if I don’t turn up anything there (and I get Lexis-Nexis access free, so cost is no issue).

Conspiracy Theories About Google

Dave Winer is apparently impressed by Daniel Brandt’s anti-Google rantings. But as this Salon.com article documents, Brandt is a nutty conspiracy theorist (just go a few links deep at his NameBase.Org) who is pissed off because *his* pages about Donald Rumsfeld and a whole host of other people don’t show up very high in Google searches.

I particularly love the brief explanation Brandt offers of why Google’s PageRank sucks,

It’s democratic in the same way that capitalism is democratic. You could have the cure for cancer on the Web and not find it in Google because ‘important’ sites don’t link to it.

But, of course, if there were a cure for cancer posted on the web, then it is likely that lots of people would link to it, much like many scientists would end up citing a paper that outlined a successful cure for cancer.
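To make that concrete, here is a rough sketch of the simplified, textbook version of PageRank. It is not whatever Google actually runs in production. The toy web below, including every page name in it, is made up for illustration, and the damping factor of 0.85 is just the commonly cited default. The point is only that a page lots of other pages link to ends up near the top, while a page nobody links to does not.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Simplified textbook PageRank computed by power iteration.

    links maps each page name to the list of pages it links to.
    This is an illustrative sketch, not Google's actual implementation.
    """
    # Collect every page that appears as a source or a target.
    pages = set(links)
    for targets in links.values():
        pages.update(targets)
    n = len(pages)

    # Start with rank spread evenly across all pages.
    rank = {page: 1.0 / n for page in pages}

    for _ in range(iterations):
        # Every page gets a small baseline share each round.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page splits its rank evenly among the pages it links to.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outgoing links spreads its rank over everyone.
                share = damping * rank[page] / n
                for other in pages:
                    new_rank[other] += share
        rank = new_rank
    return rank


# Hypothetical toy web: several sites link to "cure-page",
# nobody links to "obscure-page".
toy_web = {
    "big-portal": ["cure-page", "news-site"],
    "news-site": ["cure-page", "big-portal"],
    "blog-a": ["cure-page"],
    "blog-b": ["cure-page"],
    "cure-page": [],
    "obscure-page": ["cure-page"],
}

ranks = pagerank(toy_web)
for page, score in sorted(ranks.items(), key=lambda item: item[1], reverse=True):
    print(f"{page:15s} {score:.3f}")
```

In this made-up graph the page everyone links to comes out on top and the page nobody links to sits at the bottom, which is roughly the citation analogy above.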

What Brandt wants is for Google to be democratic in the same way that the Democratic People’s Republic of Korea is democratic.

In fact, as Salon notes, Brandt believes that if you search on “Donald Rumsfeld,” his page about Rumsfeld should be shown before Rumsfeld’s DoD biography page, even though Brandt’s page is largely useless and almost impossible to navigate. The main problem with NameBase is that it is an index of citations drawn largely from the conspiracy literature that Brandt has personally read.

Update: A good example of one of Brandt’s nutty conspiracy theories is his speculation about China’s blocking of Google, in which Brandt argues that “China may be well-advised to block the use of U.S. engines to protect their own national security” because Google may be sharing data about Chinese users with the National Security Agency, which would, in Brandt’s mind, “put the NSA at a tremendous advantage in determining where pro-U.S. sentiment may exist in China.”