Google eBookstore Fail

Google’s new e-book initiative had such potential, but after taking a look at launch it was hard not to think: this is it? This is all Google could come up with?

On launch day, the entire project was largely useless. No wish list support? On the Android app, they couldn’t be bothered to make the author’s name a hyperlink so I could quickly see what other titles by the same writer were available? Adding a book to the library meant being kicked back to the main screen rather than to the search page I’d landed on? Are you kidding me?

Google’s eBookstore effort looked like it was knocked out by a couple of college students over a weekend as part of a half-assed class project (which seems to describe a lot of the efforts coming out of Google lately). But a bigger problem was the supposed openness that Google touted for its e-book effort, which in fact turned out to be yet another closed system.

So I can read my Google ebooks on a number of platforms, including iOS and Android devices as well as the Sony Reader and the Nook. Frankly that’s not very impressive and not all that different from the other big e-book services out there. Moreover, DRM is baked into many of the commercial books sold by Google. That is not surprising, but again just what is Google offering that is any different from the 3 or 4 other big players in this market?

Here’s what would have impressed me — let me upload the non-DRMed ebooks I already own into a book locker at Google’s site and let me manage those the same way I can manage the books I get from Google. Let me upload all of my Baen books in epub format, for example, and then read those across all my devices. Let me take some of the PDFs I’ve got of books that are self-published and have a Creative Commons license and add those too.

Google, of course, won’t let you do that for the same reason it is running into problems trying to launch its music locker services — publishers would likely scream and withhold their content. I get it. Although it would be cool, Google would be left without many business partners and probably only a fraction of the 3 million ebooks it now advertises as being available.

But, at least then it would be something new and potentially revolutionary. As it is, Google Books is just another retread of every other e-book offering out there. If I were to go with a DRM-heavy service, frankly I’d go with the Amazon Kindle at this point.

Gmail Adds Nested Labels to Labs

Gmail Labs recently added a nested labels feature to Gmail. This is a feature that has been available for quite a while thanks to the Folders4Gmail GreaseMonkey script, but it is nice to see Google move to add this feature in Labs so it will work in any browser. Out of the box right now it doesn’t work quite as well as the Folders4Gmail script did, but presumably Google will tweak it over the next few weeks to bring it up to that level.

How Much Does Google Earn from Typosquatting?

New Scientist reports on a study by two Harvard researchers suggesting that Google may earn as much as $500 million annually from typosquatting — those ad-filled domains that target people who mistype the domain name of popular websites.

According to New Scientist,

[Tyler] Moore and [Benjamin] Edelman started by using common spelling mistakes to create a list of possible typo domains for the 3264 most popular .com websites, as determined by Alexa.com rankings. They estimate that each of the 3264 top sites is targeted by around 280 typo domains.

They then used software to crawl 285,000 of these 900,000-odd sites to determine what revenue the typo domains might be generating.

If the top 100,000 websites suffer the same typosquatting rate as the sites Moore and Edelman studied, up to 68 million people a day could visit a typo site, they say. They estimate that almost 60 per cent of typo sites could have adverts supplied by Google.

If the company earns as much per visitor from ads on typo sites as it reportedly does from ads alongside search results, it could potentially earn $497 million a year in revenue from typo domains, they conclude.
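For a sense of what “using common spelling mistakes” might mean in practice, here is a rough Python sketch. The three mistake types (dropped, doubled and swapped letters) and the function name typo_candidates are my own assumptions; the New Scientist piece does not say exactly how Moore and Edelman built their list.

```python
def typo_candidates(domain: str) -> set[str]:
    """Return one-edit misspellings of the name part of a domain like 'google.com'."""
    name, dot, tld = domain.rpartition(".")
    variants = set()
    for i in range(len(name)):
        # Omission: drop one character ("gogle").
        variants.add(name[:i] + name[i + 1:])
        # Duplication: double one character ("gooogle").
        variants.add(name[:i] + name[i] + name[i:])
        # Transposition: swap two adjacent characters ("googel").
        if i + 1 < len(name):
            variants.add(name[:i] + name[i + 1] + name[i] + name[i + 2:])
    # Swapping two identical letters reproduces the original name, so drop it.
    variants.discard(name)
    variants.discard("")
    return {v + dot + tld for v in variants}
```

Calling typo_candidates("google.com") yields names such as gogle.com and googel.com. As a sanity check on the quoted figures, 3264 sites times roughly 280 typo domains each is 913,920, which matches the “900,000-odd” number.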

Edelman is co-counsel on a lawsuit against Google by a firm seeking damages from Google for serving ads on a typosquatting domain, so he’s not a disinterested party here.

Google will stop serving ads on typosquatting domain names if the owner of the “legitimate” domain name complains, but it doesn’t proactively seek out such domains.

Edelman apparently wants Google to do so and notes that typosquatters tend to own hundreds or thousands of such domain names, so presumably Google could block Domain X and then also perhaps use WHOIS to find other typosquatted domain names owned by the same person. That seems like an extremely problematic solution, and one that squatters could route around fairly easily.

If, as Edelman says, it is fairly easy to identify typosquatters, why not sue them directly, perhaps in a class action representing the presumably thousands of businesses who are allegedly harmed by this practice?

Google Finally Adds Creative Commons Image Search Option

Google finally added the ability to restrict image searches to only images that are tagged with a Creative Commons license.

This feature allows you to restrict your Image Search results to images that have been tagged with licenses like Creative Commons, making it easier to discover images from across the web that you can share, use and even modify. Your search will also include works that have been tagged with other licenses, like GNU Free Documentation license, or are in the public domain.

Nice.

Folders4Gmail Greasemonkey Script

I don’t know about the rest of you, but I have a real love/hate relationship with Gmail. For the most part I love it, but Google seems to have Steve Jobs Disease in thinking that for key features there is only One True Way to implement a feature. If you don’t happen to like that way, Google’s support is happy to post friendly “you don’t know what you’re talking about” responses (see, for example, their responses to the completely screwed up way Gmail will associate completely unrelated e-mails into the same Conversation).

Fortunately, there are scripts and add-ons to deal with most of the defects, such as the Folders4Gmail Greasemonkey script, which lets the user organize labels into a hierarchical structure. Google says nobody needs this, but I beg to differ. This is extremely helpful. For example, each of my web sites sends out numerous administrative e-mails which I assign labels to. It’s extremely helpful to have a Web Sites –> websiteX hierarchy, so it’s easy to quickly drill down to this particular set of e-mails without cluttering up the labels list.
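The basic idea behind this kind of label hierarchy can be sketched in a few lines of Python. I’m assuming here that the hierarchy is encoded in the label name itself with a “/” separator (e.g. “Web Sites/websiteX”); the function name build_label_tree and the sample labels are made up for illustration, and this is not the script’s actual code.

```python
def build_label_tree(labels, separator="/"):
    """Turn flat label names like 'Web Sites/websiteX' into a nested dict."""
    tree = {}
    for label in labels:
        node = tree
        # Walk down the tree one path segment at a time, creating
        # intermediate "folders" as needed.
        for part in label.split(separator):
            node = node.setdefault(part.strip(), {})
    return tree
```

Given the labels “Web Sites/siteA”, “Web Sites/siteB” and “Personal”, this returns a dict with two top-level entries, with siteA and siteB tucked under Web Sites.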

NYT Exec Throws Fit Over Placement in Google Searches

Advertising Age has a great example of the complete and utter incompetence many newspaper executives still have when it comes to the web.

Then in January, Martin Nisenholtz, New York Times Co. senior VP-digital operations, got up at the annual Online Publishers Association summit in Florida, an event closed to the press, to blast both the algorithm and the results presentation on the screen.

Priorities
He’d just run a search for Gaza, which had been at war with Israel since Dec. 27. Google returned links to outdated BBC stories, Wikipedia entries and even an anti-Semitic YouTube video well before coverage by the Times, which had an experienced reporter covering the war from inside Gaza itself.

Search results for “Gaza” on March 20 began with two Wikipedia links, a March 19 BBC report, two video clips of unclear origin, the CIA World Factbook, a Guardian report and, most strikingly, a link to Gaza-related messages on Twitter.

The last paragraph isn’t quite accurate. What shows up is not a report from the Guardian on Gaza, but rather a portal-style page where The Guardian has *gasp* actually assembled a page listing all of its recent news stories on Gaza along with links to videos, photographs, commentary and assorted other resources on Gaza.

Now, there’s nothing stopping the New York Times from creating a similar aggregation page, except that it is too busy whining at publisher meetings and tinkering with nonsense like ACAP to think about what its readers (and, in turn, Google) might find valuable.

The reason the NYT didn’t show up on the first page of a Google search on Gaza is quite simply that it didn’t deserve to. Maybe it can borrow more money from Carlos Slim Helú to buy a clue. It might start by finding a VP of digital operations who knows what he’s doing.

Google and “Duplicate” Content

A couple months ago, Google clarified how its search engine handles duplicate content, but I still keep running across people and blogs getting all twisted in knots as to whether or not Google will penalize them because all the URLs on their site can be reached both with and without a trailing slash.

As Google made clear back in September,

But most site owners whom I hear worrying about duplicate content aren’t talking about scraping or domain farms; they’re talking about things like having multiple URLs on the same domain that point to the same content. Like www.example.com/skates.asp?color=black&brand=riedell and www.example.com/skates.asp?brand=riedell&color=black. Having this type of duplicate content on your site can potentially affect your site’s performance, but it doesn’t cause penalties. From our article on duplicate content:

Duplicate content on a site is not grounds for action on that site unless it appears that the intent of the duplicate content is to be deceptive and manipulate search engine results. If your site suffers from duplicate content issues, and you don’t follow the advice listed above, we do a good job of choosing a version of the content to show in our search results.

So unless you’re intentionally trying to deceive Google with duplicate content sites, the worst that is going to happen with duplicate content is Google might decide URL X is the canonical URL for some content on your website, when you really might prefer it use URL Y.
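If you’re curious which of your own URLs fall into this harmless category, a small Python sketch like the following will tell you whether two URLs differ only in query parameter order or a trailing slash. The function name normalize_url and the normalization rules are my own choices for illustration; this says nothing about how Google actually picks a canonical URL.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def normalize_url(url: str) -> str:
    """Reduce a URL to a comparable form: sorted query, no trailing slash."""
    parts = urlsplit(url)
    # Sort the query parameters so that ?a=1&b=2 and ?b=2&a=1 compare equal.
    query = urlencode(sorted(parse_qsl(parts.query, keep_blank_values=True)))
    # Treat /blog/ and /blog as the same path, but keep the bare root "/".
    path = parts.path.rstrip("/") or "/"
    # Host names are case-insensitive; drop any #fragment.
    return urlunsplit((parts.scheme, parts.netloc.lower(), path, query, ""))
```

Run against the two skates.asp URLs from Google’s example, normalize_url returns the same string for both, which is exactly the sort of duplication Google says it sorts out on its own.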

So enough with all the dire warnings and silly plugins designed to prevent the horrors of Google finding duplicate content.

Scroogle.Org

Google may not yet be evil, but it is certainly moving further and further down toward that end of the continuum with its extremely poor privacy practices in combination with the almost absurd amount of user data it appears to be logging and storing.

With that in mind, I suspect more services like Scroogle will arise to route around Google’s blasé attitude toward user privacy. Scroogle is basically a Google search proxy. Enter your search into Scroogle and it passes it on to Google using one of a small number of IP addresses, so yours is never logged. Scroogle then intercepts the cookie that Google returns and then displays just the actual search results.
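To make the proxy idea concrete, here is a minimal Python sketch of the header filtering such a service would need to do. This is not Scroogle’s code, which I haven’t seen; the function names and the particular list of headers to strip are assumptions on my part.

```python
# Request headers that could identify the person searching (assumed list).
IDENTIFYING_REQUEST_HEADERS = {"cookie", "referer", "x-forwarded-for"}


def outbound_headers(client_headers: dict) -> dict:
    """Headers the proxy forwards to the search engine, minus identifying ones."""
    return {
        name: value
        for name, value in client_headers.items()
        if name.lower() not in IDENTIFYING_REQUEST_HEADERS
    }


def inbound_headers(upstream_headers: dict) -> dict:
    """Headers the proxy passes back to the user, with any cookie dropped."""
    return {
        name: value
        for name, value in upstream_headers.items()
        if name.lower() != "set-cookie"
    }
```

The IP address takes care of itself: because the request to the search engine is made by the proxy, the search engine only ever sees the proxy’s address.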

Unlike Google which stores user identifiable information about the search for 18 months, Scroogle promises that a) it doesn’t store search terms at all and b) it only maintains logs for a maximum of 48 hours.

I noticed Daniel Brandt, who I’ve criticized in the past for his conspiratorial ways, is listed as one of the directors of the Scroogle effort. It’s nice to see him turn his anti-Google obsession to positive solutions.

Google Webmaster Feature Improvements

Google implemented some nice features this month for those of us who run websites to help get a better handle on who and how many people are following along with our sites.

In early February, it greatly expanded the ability to track external links to a web site you control. It is possible in Google already to get a list of pages that link to a particular page, but for some reason Google is actually only presenting a small subset of that information to the public.

Once you have verified your site with Google (which involves adding a meta tag), Google will now let you view and download a more comprehensive list of external links to any site you control. From my cursory look at the external links reported by Google to this site, there are still a lot of external links that Google either doesn’t know about or isn’t including in this more comprehensive report, but it is still much better than what you get from using the link operator in the search engine.

The other thing Google did was add a notation when Google Reader hits an RSS feed that indicates how many people are subscribed to that feed. Open up a server log and search for the Google Reader bot and you can quickly see how many people are subscribed to the feed via Google Reader. Bloglines and other web-based RSS readers have had this feature for a long time, so it’s nice to see Google Reader finally implement this as well.
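Here is a rough Python sketch for pulling that number out of a server log. It assumes the bot identifies itself as Feedfetcher-Google and puts the count in its user agent string as “N subscribers”; check that against your own logs before relying on it. The function name reader_subscribers and the sample log line in the comment are invented for illustration.

```python
import re

# Assumed user agent shape, e.g.:
#   "Feedfetcher-Google; (+http://www.google.com/feedfetcher.html; 12 subscribers; feed-id=123)"
SUBSCRIBERS = re.compile(r"Feedfetcher-Google.*?(\d+) subscribers?")


def reader_subscribers(log_lines):
    """Return the most recent subscriber count found in the log, or None."""
    count = None
    for line in log_lines:
        match = SUBSCRIBERS.search(line)
        if match:
            # Later lines overwrite earlier ones, so the last hit wins.
            count = int(match.group(1))
    return count
```

If you publish more than one feed, each will get its own hits and its own count, so you would want to group the results by requested URL rather than taking the last match.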

Tutorial for Adding OpenSearch via Google to Your Blog/Website

DeWitt Clinton has a nice tutorial on adding an OpenSearch plug-in to your website.

OpenSearch is a search engine description standard supported by both Firefox 2 and Internet Explorer 7 that makes it easy to customize the search engines offered in the browser’s search box. For example, if you’re using Firefox 2, you can left click on the down arrow next to the search box and see an option to “Add Brian.Carnell.Com”, which will give you the option of searching this site from the search bar.

I’m not necessarily sure why you’d want to do so, but it’s there if you’re as obsessed with my life as I am.

The OpenSearch setup for this site uses the internal search engine, but Clinton’s tutorial shows how to set one up using a Google search of just your site. The example is easily modified to use your own blog or web site search engine instead.
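For the curious, an OpenSearch description is just a small XML file. The Python sketch below generates one that points at a Google search restricted to a single site. It is my own illustration rather than a copy of Clinton’s tutorial: the function name opensearch_description, the example.com domain and the exact Google query URL are all assumptions.

```python
import xml.etree.ElementTree as ET

# Namespace of the OpenSearch 1.1 description format.
OPENSEARCH_NS = "http://a9.com/-/spec/opensearch/1.1/"


def opensearch_description(short_name: str, description: str, site: str) -> str:
    """Build an OpenSearch description that searches one site via Google."""
    ET.register_namespace("", OPENSEARCH_NS)
    root = ET.Element(f"{{{OPENSEARCH_NS}}}OpenSearchDescription")
    ET.SubElement(root, f"{{{OPENSEARCH_NS}}}ShortName").text = short_name
    ET.SubElement(root, f"{{{OPENSEARCH_NS}}}Description").text = description
    # The browser substitutes the user's query for {searchTerms}.
    ET.SubElement(
        root,
        f"{{{OPENSEARCH_NS}}}Url",
        {
            "type": "text/html",
            "template": f"http://www.google.com/search?q={{searchTerms}}+site:{site}",
        },
    )
    return ET.tostring(root, encoding="unicode")
```

To use your own search engine instead, swap the template for your site’s search URL with {searchTerms} where the query goes. Browsers typically find the file through a link element with rel="search" and type="application/opensearchdescription+xml" in the page head.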