Tutorial for Adding OpenSearch via Google to Your Blog/Website

DeWitt Clinton has a nice tutorial on adding an OpenSearch plug-in to your website.

OpenSearch is a search engine description standard supported by both Firefox 2 and Internet Explorer 7 that makes it easy to add a site’s search engine to the browser’s search box. For example, if you’re using Firefox 2, you can click on the down arrow next to the search box and see an option to “Add Brian.Carnell.Com”, which will give you the option of searching this site from the search bar.

I’m not necessarily sure why you’d want to do so, but it’s there if you’re as obsessed with my life as I am.

The OpenSearch setup for this site uses the internal search engine, but Clinton’s tutorial shows how to set one up using a Google search of just your site. The example is easily modified to use your own blog or web site search engine instead.
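For the curious, here is a rough sketch of what such a plug-in boils down to: a small XML description file plus a link tag in your page header so the browser can find it. This is not Clinton’s code. The domain example.com, the file name opensearch.xml, and the function names are placeholders I made up for illustration, and the Google URL simply uses the ordinary site: operator.

```python
import xml.etree.ElementTree as ET
from html import escape

# Namespace defined by the OpenSearch 1.1 specification.
OPENSEARCH_NS = "http://a9.com/-/spec/opensearch/1.1/"


def build_description(short_name: str, description: str, site: str) -> str:
    """Return an OpenSearch description document as an XML string.

    `site` is a placeholder domain; the search is a plain Google query
    restricted to that domain with the site: operator.
    """
    ET.register_namespace("", OPENSEARCH_NS)
    root = ET.Element(f"{{{OPENSEARCH_NS}}}OpenSearchDescription")
    ET.SubElement(root, f"{{{OPENSEARCH_NS}}}ShortName").text = short_name
    ET.SubElement(root, f"{{{OPENSEARCH_NS}}}Description").text = description
    # {searchTerms} is the placeholder the browser replaces with the query.
    template = "https://www.google.com/search?q={searchTerms}+site:" + site
    ET.SubElement(
        root,
        f"{{{OPENSEARCH_NS}}}Url",
        {"type": "text/html", "template": template},
    )
    return ET.tostring(root, encoding="unicode")


def autodiscovery_link(title: str, href: str) -> str:
    """Return the <link> tag that lets browsers discover the plug-in."""
    return (
        '<link rel="search" type="application/opensearchdescription+xml" '
        f'title="{escape(title, quote=True)}" href="{escape(href, quote=True)}">'
    )


if __name__ == "__main__":
    print(build_description("Example Blog", "Search example.com via Google", "example.com"))
    print(autodiscovery_link("Example Blog", "/opensearch.xml"))
```

To point the plug-in at your own site search instead of Google, swap the template for your search URL and keep the {searchTerms} placeholder where the query goes.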

Google Co-op Search

Google Co-op search went live this week and is just as awesome a search product as we’ve come to expect from Google.

Like Rollyo, Google Co-op allows the user to define a custom search engine that performs a search on a subset of websites. For example, here’s a quick custom Google search that covers Buffy: The Vampire Slayer and Angel-related web sites:





Google’s approach is light years ahead of what Rollyo offered. First, Google’s tool apparently allows an unlimited number of sites to be added, whereas Rollyo topped out at a few dozen.

Google lets you not only create a customized search, but also embed the results directly into your own website and get a cut of the ad revenue generated from the search. That’s at least 12 kinds of awesome.

To Those In Favor of Search Engine Spam, We Salute You

A number of conservative websites have been slamming Google lately — it started with Google’s foray into China, but has ranged from Google’s removing sites from its Google News service because they ran racist anti-Muslim screeds to rumors that Google is a major source of funding for MoveOn.Org.

But the one that takes the cake is Glenn Reynolds’ pimping of Dan Riehl’s anti-Google screed.

Riehl is angry that Google removed his site from its index. Riehl simply lies about that de-listing in the post that Reynolds links to (emphasis added),

After a month of hearing nothing from Google, I emailed them a post from Instapundit today. I also wrote that if they were smart they would understand that in order for their newer products to take hold, they needed early adopters – precisely the kind of people they are consistently ticking off due to little if any real customer service. So, after a month of nothing, imagine my surprise, they pick today, after my nth email with Glenn’s post included, to respond.

Hi Dan,

Thank you for your note. Your page has been blocked from our index because it does not meet the quality standards necessary to assign accurate PageRank. We cannot comment on the individual reasons your page was removed. However, certain actions such as cloaking, writing text in such a way that it can be seen by search engines but not by users, or setting up pages/links with the sole purpose of fooling search engines may result in permanent removal from our index. Please read our Webmaster Guidelines at
http://www.google.com/support/webmasters/bin/answer.py?answer=35769 for more information.

Thank you for taking the time to write.

Regards,
The Google Team

That’s a non-answer. I’ve done none of those things. I have no hidden links, couldn’t create blind text if I wanted to – and previously have offered to correct anything necessary, if only I knew what it was. I don’t and apparently never will, thanks to Google. And public relations like this is going to leave their new initiatives in the basement where they belong.

In fact, of course, Riehl allowed search engine spammers to create thousands of junk subdomains like this one, which is still up and working as of May 29, 2006.

Frankly, I don’t think Google has done enough to deal with the search engine spam problem, especially from its own Blogger service, but I’ll leave it to Dan and Glenn to argue that Google’s being heavy-handed in blocking a domain that was hosting thousands of search engine spam pages.

Sometimes Even I Wonder What Google Is Thinking

Chinua Achebe is a famous novelist. His 1958 book, Things Fall Apart, is a widely read novel describing the effects of Western colonialism.

Anyway, in 1987 Achebe wrote a novel about the failure of African politicians and intellectuals, Anthills of the Savannah, which is also apparently very good (I haven’t read it). But Google currently spits out this site as the #1 link if you search on the title.

In the early 21st century some organization or another came up with a top 100 list of African novels, and I posted the list to this website so I could keep track of it and gradually work my way through the ones that were available in English.

Anthills of the Savannah was on that list, which led someone to post a discussion group message in 2004 requesting “notes” about the novel.

And then Google indexed the damn thing and decided it was the #1 page out there about Anthills of the Savannah.

WTF.

Google Desktop 2 — Better, But Still Buggy

Desktop search is one of my obsessions, especially given how much data I tend to generate. Google’s first effort at a desktop search tool left me nonplussed, but the recently released Google Desktop 2 makes quite a few strides in the right direction even if it continues to come up short.

The good news is that Google now includes the ability to index and search a number of additional file types out-of-the-box, including PDFs as well as MP3s and graphics files that have embedded metadata. In addition, it includes a plugin architecture so developers can add the ability for the program to index additional file types.

The program is elegant and simple to use, though it lacks a lot of the power that the best Windows search product, DTSearch, offers. On the other hand, it also lacks the unwieldy interface that DTSearch uses to give users access to all that searching power. There’s definitely a tradeoff between the two products in usability vs. complexity of queries.

But the real drawback to Google Desktop 2, from my brief usage, is that, like the initial Google Desktop Search product, it chokes on very large file sets, which causes regular crashes.

Google Desktop 2 made fairly quick work of my 260,000 archived e-mails, crashing only 3-4 times while indexing them. But when its crawler turned to the My Documents folder on my primary data drive, it was a whole other story. So far, the program has crashed upwards of two dozen times and after almost a week has only reached the 46 percent complete mark. Yes, that’s more than 250,000 files so far, but it is very annoying that it cannot index all 160GB worth of PDFs and HTML files without the constant crashing.

It certainly takes DTSearch a very long time to index that huge volume of files, but DTSearch has never crashed on me while indexing. Google Desktop 2 also insists on playing nice and only indexing during “idle time.” The only problem is it never says what triggers “idle time.” I’d prefer to be able to override that setting and have an “index now” button that would just run through the indexing without me having to wonder whether this or that action is going to throttle it back down (for example, even a low-impact task like editing a text file seems to cause Google Desktop 2 to idle).

Probably for most people, the fact that Google Desktop 2 does much of what a product like DTSearch does and at zero cost will render it an ideal desktop search tool. Me, I’m torn between Google Desktop 2 and its elegant interface but crashing problems, vs. DTSearch and its rock-solid performance but uglier-than-sin interface.