Jason Kottke decides to start a meme about the WhiteHouse.gov robots.txt file, which went from 2,400 lines to 2, and Cory Doctorow can’t help but jump on the bandwagon. Of course, there must be some nefarious purpose there, or a lesson about the closed nature of the Bush administration vs. the new, open Obama administration.
Kottke tells us the difference represents “a small and nerdy measure of the huge change in the executive branch of the US government today” and Doctorow tags his post with CIVLIB to let us know this is not just some technical issue.
Which, of course, it is. You can view the entire robots.txt file here. For every /directory/ on the WhiteHouse.gov site, the Bush administration created a text-only /directory/text/ subdirectory. The robots.txt file tells Google not to index the text-only versions, so that the complete page remains the canonical one for Google. In fact, this is exactly what Google suggests doing for sites that have large amounts of duplicated content. On this site, for example, most pages have a print-only option, and the robots.txt file instructs Google not to index any URLs that contain /print/.
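For anyone who wants to see how mundane this is, here is a minimal sketch using Python's standard `urllib.robotparser`. The directory names (`/news/`, `/history/`) and the `example.gov` host are made up for illustration and are not taken from the actual WhiteHouse.gov file; the point is only that one `Disallow` line per text-only subdirectory adds up to a very long file very quickly.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules in the style described above: one Disallow line
# per text-only subdirectory. These paths are illustrative assumptions,
# not lines copied from the real WhiteHouse.gov robots.txt.
ROBOTS_TXT = """\
User-agent: *
Disallow: /news/text/
Disallow: /history/text/
"""


def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt content from a string instead of fetching it."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)

# The full page is crawlable; its text-only duplicate is not.
full_page_allowed = parser.can_fetch(
    "Googlebot", "https://example.gov/news/index.html"
)
text_copy_allowed = parser.can_fetch(
    "Googlebot", "https://example.gov/news/text/index.html"
)

print(full_page_allowed, text_copy_allowed)  # prints: True False
```

Note that the standard-library parser matches rules by path prefix only, which is why each directory needs its own line here. A site with hundreds of directories and a text-only copy of each would end up with a file thousands of lines long and nothing sinister in it.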
I wonder if this sort of nonsense is what Teresa Nielsen Hayden meant by “dumb-and-resentful” political commentators.