Mathemagenic On Blog as Personal Productivity/Knowledge Management Tool

Lilia Efimova has a nice summary of using a weblog as a personal productivity/knowledge management tool. Efimova is currently finishing up her PhD and frequently posts to her blog about ideas and information relevant to her thesis.

Communication and information sharing. Sharing information via a weblog is not a specific activity, but a by-product of writing. In most cases it’s an advantage; however it limits potential uses of blogging when access to some of the weblog posts have to be restricted. Weblog is not good for a goal-driven communication to a known few people, but it is a perfect instrument for non-intrusive sharing of ideas in cases where potential audience is not well defined.

In the comments, Dave Ferguson expands on this idea,

I agree with several of your points. Usually I’m on the same computer, so accessability isn’t that big a deal for me… but accessability for others is. I have many friends and contacts who aren’t big on blogging. It’s easy for me to say, “go to my blog and search for XYZ. I have a link in the post, so you can go to the original.”

I do that all the time with people because, well, I do it all the time myself. The weird thing is that there is another woman, one who portrays herself as an expert on weblogs and has a very successful business doing so, who pretty much says you should never write a blog essentially for yourself, one consisting largely of things you want to keep around to reference later. Instead, apparently, it’s not really a blog unless you’re writing for some specific audience, however vaguely you might define that.

Pshaw. People occasionally tell me they find this or that post here useful, but for the most part I blog about things that strike a chord in me, things I know I will forget about unless I write about them here so I can look them up later. In fact, more than once I have Googled for the answer to some specific problem or another only to find my own site come up on the first page of links, and I think to myself, “I wrote about that? When?” (Seriously, I’m not so sure about the Singularity, but I’m ready for a pill that expands human memory, like, yesterday.)

In fact, I love the name of Ferguson’s blog — Dave’s White Board.

That’s also what annoys me so much about the received wisdom from elitists that blogs are useless precisely because they are assemblages of random stuff without any real connecting thread (i.e., they do not tend to be like 500-page nonfiction books or 15-page New Yorker stories). That’s not a bug, that’s a feature.

Creating RSS Feed with Slogger

Ever since I discovered it a few months ago, I’ve been plugging the Slogger extension for Firefox. Slogger allows the user to automate saving a local copy of every web page he or she visits. Doing so takes up 100-150mb a day, at least for me, but storage is cheap so why not?

Slogger also produces a daily page listing all of the pages you’ve logged, linking both to the local and web versions. In early versions of Slogger you could control the look, feel and content of that daily page by altering the template. The latest version of Slogger takes that one step further and allows you to define multiple profiles, so you could have an HTML version of that list as well as an ASCII-text version.

Or you could do what this user’s done and create an RSS feed of all the pages Slogger has logged and then read that in your news reader.
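A minimal sketch of what such a Slogger-generated feed might look like; the titles, URLs, and local file paths here are invented for illustration, not taken from Slogger’s actual template:

```xml
<?xml version="1.0"?>
<rss version="2.0">
  <channel>
    <title>Pages logged by Slogger</title>
    <link>file:///C:/slogger/index.html</link>
    <description>Pages saved for later reading</description>
    <item>
      <title>Some article I want to read later</title>
      <link>http://example.com/article</link>
      <description>Local copy: file:///C:/slogger/2004-06-02_article.html</description>
      <pubDate>Wed, 02 Jun 2004 10:00:00 GMT</pubDate>
    </item>
  </channel>
</rss>
```

Point your newsreader at the generated file and each logged page shows up as a new item.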

Why would you want to do so? Well, not everyone is like me and wants to save every web page they visit. Some people, instead, configure Slogger so that it only logs a page if they push a button that Slogger installs on the Navigation toolbar.

So you could use Slogger to archive pages that you want to go back and read at a later date, for example, and then maintain a list of those in an RSS channel on your local newsreader.

Slogger — Your Own Personal Internet Archive

I am more than a bit jealous of Brewster Kahle’s Archive.org, and for a long time I have wanted a tool that would let me create something along the lines of a personal Internet archive. Since storage is dirt cheap, as I’ve ranted and raved about, why not just automatically save every web page I ever visit?

There are some commercial programs that come close, but not close enough, and besides, they all tend to be IE-only.

Last night, however, I decided to see if there might be any extensions for Mozilla Firefox that allow something like this, and lo and behold I ran across Ken Schutte’s excellent Slogger extension.

This is exactly what I wanted. I have configured Slogger so that every time I visit a page, it automatically saves the entire page. It also appends the page to a log list that shows the page name, the time I visited it, the original URL, and a link to the local copy. To avoid duplicate file name problems, it names each saved file with the date and time down to the millisecond.
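As a rough illustration (my own sketch, not Slogger’s actual code), a timestamp-to-the-millisecond naming scheme like that might look like:

```python
from datetime import datetime

def slogger_style_filename(title="page"):
    # A timestamp down to milliseconds keeps file names unique even
    # when several pages are saved within the same second.
    stamp = datetime.now().strftime("%Y-%m-%d_%H-%M-%S-%f")[:-3]
    return f"{stamp}_{title}.html"

print(slogger_style_filename("example"))
```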

You don’t have to configure it this way. Slogger can be set up just to keep a detailed history, or it could be configured to save pages at the press of a button rather than automatically.

There are only a couple drawbacks that I noticed in the current version of the software.

First, if you are using tabbed browsing (and if you’re using Firefox you’d be crazy not to), Slogger can only save the page in the active tab. So if you have Firefox set up to load new tabs in the background, Slogger can’t automatically capture those background tabs (in fact, when you load a new background tab it will simply make another copy of the page in the active tab). I just configured Firefox to load new tabs in the foreground — a bit of a change, but nothing I can’t adapt to.

Second, although storage is certainly cheap, making copies of every single web page visited can still chew through hard drive space very quickly. Today has been a very light web surfing day for me, but Slogger has added about 105mb worth of files. I’d imagine on a typical day I’d be looking at 300-400mb of files. That ends up at about 146 gigabytes/year. Since I only buy external drives these days, that’s an annual storage cost of about $154 at today’s hard drive prices. That’s still less than $.50/day, which as far as I’m concerned is incredibly cheap, but, as always, your mileage may vary.
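The arithmetic above, for anyone who wants to plug in their own daily numbers:

```python
# Back-of-the-envelope storage cost, using the figures from the
# paragraph above (400 MB/day high end, $154/year in drives).
mb_per_day = 400
gb_per_year = mb_per_day * 365 / 1000   # daily total scaled to a year
cost_per_day = 154 / 365                # annual drive cost per day
print(round(gb_per_year), round(cost_per_day, 2))  # 146 0.42
```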

Furl.Net

A couple weeks ago I mentioned my frustration with looking at older articles I or other people have written, only to find that the source material those blog entries were based on had gone 404. That’s also, by the way, why I don’t really understand the point of blogs that are nothing more than links: visit the archives of those sites sometime and they’re basically worthless because of that problem.

Anyway, Conversant is flexible enough that I could put together a system in a couple hours — without having to do any programming or scripting — so I can store copies of all the articles I’m referencing with the blog entries that reference them.

But what about things I run across that I want to save but that I’m not necessarily blogging about? I’ve used a variety of tools over the years to try to solve this problem, but none of them are as elegant as Furl.Net.

Currently in beta, Furl works like this: you sign up for an account (free at the moment) and add a little “Furl It” icon to your Bookmarks toolbar. Then, when you find an HTML page you want to archive, simply press the “Furl It” bookmark and up pops a dialog box where you can assign metadata to the page to be saved, such as a rating, a topic, keywords, and a description. Hit “Save” on the form and the page is added to your Furl.Net archive.

This wouldn’t be of much use if a) it didn’t work seamlessly, b) it didn’t have a nice interface to search through and sort your archive, and c) it didn’t have a way to get your archive off of Furl and onto your local hard drive.

Fortunately, the Furl.Net folks have done a wonderful job of covering all of the bases there. The process works flawlessly in my testing, the interface is wonderful, especially with all of the metadata options it provides, and there are ways to get your data out today, with additional export methods in development (today you can only get the links out via XML, but the developer promises that a full export of all data via ZIP or some other compressed file format is in the works).

Conversant and the Interconnected Weblog

Apparently there are quite a few people like me wanting to organize isolated weblog-style posts into more meaningful collections of information. One proposed solution is to use a weblog front-end for a Wiki (with “biki” and “bliki” being the proposed terms for such a convergence).

This is, of course, exactly what I’ve been working on producing over the past couple years with Conversant. It’s been an amazing process, especially watching the feedback back and forth between the developers and users of Conversant, which has helped make it an outstanding tool for the sort of personal knowledge management that the biki folks are looking for.

On this site alone, for example, the weblog posts are filtered into almost 220 different categories or subcategories in a process that just adds a few seconds to the posting process. Once a new entry is posted it is automatically added to the respective category page(s). In addition, there is an enormous amount of cross-linking going on, so articles show links to the category pages that they are relevant to, and the category pages also link to other relevant category pages. So, for example, the category page about the 2000 election links automatically to the Al Gore, George W. Bush, and Ralph Nader topical pages (which, in turn, link automatically to the 2000 election page).
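A minimal sketch, purely illustrative and not Conversant’s actual implementation, of how that kind of symmetric cross-linking between category pages can be modeled:

```python
from collections import defaultdict

# Each category page keeps a set of related category pages.
related = defaultdict(set)

def cross_link(a, b):
    # Linking is bidirectional: each page automatically lists the other.
    related[a].add(b)
    related[b].add(a)

for topic in ("Al Gore", "George W. Bush", "Ralph Nader"):
    cross_link("2000 Election", topic)

print(sorted(related["2000 Election"]))  # the three candidate pages
print(sorted(related["Al Gore"]))        # links back to the election page
```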

Every single category page also has its own RSS feed which is linked to via the little orange XML icon (and also included in the header of the page for autodiscovery). For example, here is the 2000 Election RSS feed.
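For reference, autodiscovery is just a link element in the page header; a category page’s entry might look something like this (the title and feed path here are invented for illustration):

```html
<link rel="alternate" type="application/rss+xml"
      title="2000 Election RSS feed" href="/categories/2000-election/rss.xml">
```

Newsreaders and browsers that support autodiscovery find the feed from this element without the reader having to hunt for the orange XML icon.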

At the moment across the six web sites I run, I have almost 1,400 separate categories into which posts are sorted, each with an accompanying category page and RSS feed. My animal rights site currently has 524 separate categories, and adds about one a day.

Everything I do is strictly a one-man show, but the system has the capability to create groups and grant those groups permission to edit, add to, and otherwise manage categories and category pages.

Update: the title of this essay was originally “Weblogs+Wiki=Conversant.” This was changed to avoid confusion — Conversant has the features to make an interconnected weblog built in, but it has a different featureset than a Wiki.

In Praise of dtSearch

A few years ago I plugged DtSearch.Com’s excellent indexing and searching program dtSearch Desktop. In the July 2003 issue of Wired Brian Lam picks dtSearch Desktop 6.11 as the best (if, still, expensive) program to index and then search data on your hard drive for Windows-based users. Lam illustrates just how good the program is,

This hard disk detective is the most powerful document search tool on the market. Use the Stemming search if you want to crunch all grammatical variations. Need help with typos? A Fuzzy search may come in handy. The app is also capable of doing some amazing phonetic and thesaurus-based searches. When I looked up “mucus,” Desktop 6.11 picked out a document titled “booger.”

More importantly, to my mind, dtSearch is the only program I’ve ever found that a) I could actually afford and b) could handle all of the data I threw at it. I’ve tried pretty much every program like this out there (including Enfish, which Lam lists as his “Best Buy” at only $100, but which I’ve never had anything but trouble with), and this is the only one I’ve found that won’t choke when you start to throw 30 or 40 gigabytes of data at it. DtSearch is also nice in that when it returns a list of documents, you can view them right there in its built-in browser without having to launch the originating application (and yes, I’ve seen this in other personal knowledge management software, but again, it actually works seamlessly in dtSearch).

In fact, dtSearch is the third part of my three-prong personal knowledge management solution. Between dtSearch, Conversant, and hours of using and getting a better handle on Google, it usually takes me no more than a few seconds to get my hands on exactly the information I need.

I’m in the process right now of pretty much ditching all of the paper in my life — literally every piece of paper at work and home is getting scanned, PDF-ed, and indexed (more on that project later). DtSearch runs rings around this data — type in a project I worked on last month and I’m looking instantly at all of the e-mails, memos, invoices, etc. associated with that project. Just a few keystrokes and I can drill down to my heart’s content.

The only drawback is still the price — $200 is still quite a lot of money, but it will pay for itself many times over if, like me, you have large amounts of data to manage and you always seem to need to find documents right now.

(Note: all of the above applies only to Windows.)