It’s a Digital World (Toward the Paperless Office . . . and Life)

A couple days ago, Seth Dillingham posted about his initial experience with Snapfish — a company that takes your film negatives, develops them, sends you the negatives and prints, and then posts high-res versions of the photos online that you can download for a small fee. Seth’s posting piqued my interest and soon I tracked down the web sites as well as reviews of several of Snapfish’s competitors.

The one that I’m going to give a whirl this week is ClubPhoto.Com. The downside with ClubPhoto.Com is that for what I want, you have to pay a $24.95 to $34.95 yearly fee. But after that, for $1/roll plus shipping they will develop your film, send you the negatives, and place high-res scans on their web site that you can download for free.

That is exactly the sort of service I’ve been looking for, since the only thing I ever do with my prints is scan them and then pack them away in archival safe binders in my basement.

I know I should just buy a digital camera, but I have yet to see one that can do what my Olympus 170 Zoom point-and-shoot camera can do. Excellent zoom and decent optics, no shutter lag, and it only cost me a bit over $300. Most of the digital cameras that I’ve used either have bad shutter lag (and I am the most impatient person in the world, so this just drives me bonkers), have poor optics/no zoom, or are in the stratospheric price range. Show me a digital camera that’s 4x3x2, has 3x optical zoom, no shutter lag, 4mp resolution and costs under $400 and I’ll consider changing my mind.

In response to my praise of ClubPhoto.Com, Seth wrote,

I like having prints, I have boxes and boxes full of them. How old fashioned of me! 🙂

A minor obsession of mine over the past few years has been to dramatically reduce the number of physical objects I have to keep on hand and manage. My approach with photos, and pretty much everything else where it is applicable, is simple: scan it, categorize it, and then store it for safekeeping depending on how important it is. Film negatives I keep. A lot of personal papers I’ve simply scanned and pitched.

Seth’s groupware product, Conversant, is partially responsible for this as I’ve grown addicted to the ease with which Conversant allows me to manage the things I post to my web site. Ultimately, I’d like to have every photo, document, journal, essay, etc. that I’ve ever produced available in such a system.

At the moment my laptop contains about 14 gigs worth of such materials (not including pictures), and I’d say I’ve probably got another 20,000 to 30,000 pages of materials still in analog-only form plus another 20,000 to 30,000 pictures which I’m gradually converting to electronic formats.

The one thing I have to keep around that I’d really like to get rid of is my book collection. At the moment, my book collection has reached 1,704 volumes. Maybe 10 percent of those I’d actually want to keep around, but for the rest a PDF scan of the books would more than meet my needs (actually, it’d probably drastically increase their value, just as I get a lot more out of my CD collection now that everything’s converted to MP3s and the CDs are stored in the basement).

I am looking forward to the day when someone finally ships a laptop with a 1 TB hard drive so I can have access to every song, every book, every picture, every article, essay, recorded speech and radio show at my fingertips no matter where I am.

Knowledge Logging and E-Mail

Seth Dillingham posted a response today to this post by John Robb about whether or not e-mail is an appropriate tool for what Robb calls “knowledge logging.”

Robb has a lot of excellent insights about knowledge management and I try to follow his posts pretty closely, but Robb also has a habit of ignoring or denigrating worthwhile tools that do not fit into Userland’s plans (i.e., half the time his posts are excellent, half the time they’re just Userland marketing drivel).

One of his ongoing projects is dismissing e-mail as an effective component of knowledge management, but his claims make no sense at all. According to Robb, e-mail is:

1. Too time consuming — Robb claims it takes him 3-4 hours to go through 200 e-mails where he can scan 500 weblog posts in just 20 minutes. This can’t be serious. From my experience, e-mail is much faster to go through, especially if you have an e-mail client with decent filtering.

My Animal Rights site gets 50-60 posts a day. All of those posts are sent to me via e-mail, filtered into a folder, and I can go through them all very quickly — far more quickly than I could by reading them on the web site.

And I know I’m not alone in this. Many of the people who access my web sites do so only via e-mail. They never actually visit the web version because e-mail is so much easier to deal with.

I suppose Robb might reply that they could go even quicker by using an RSS aggregator tool like Radio, but a) nobody outside of a (growing) handful of geeks knows what RSS is, and b) few people want to learn yet another application. Everybody has e-mail these days, however.

2. Not Archived and Horrible Search Features — I have about 400 megabytes of archived e-mails, so I’m not so sure what Robb is talking about here. Most e-mail lists I’m subscribed to have some external list archive as well, so if my local archive is destroyed there is always a public archive.

I use Eudora and can search my local archives very quickly. I needed to find a friend’s phone number last night, and it took just a couple minutes to find the relevant e-mail I was looking for. And the reason it took that long was the real problem with search functions, which is figuring out how to generate a request that will return the desired results.

Robb sums up by saying,

For sharing knowledge with a large group of constantly
shifting individuals; K-Logs win hands down.

I couldn’t disagree more. E-mail wins hands down for this purpose.

But beyond that, I want my knowledge management tools to be largely independent of the particular way that users want to access the information. I prefer e-mail. Others prefer their browser. Some folks might want to use Radio. Others might want to use a newsreader. Design tools such that users can get to the information however they want. Macrobyte has this philosophy exactly right in their documentation for Conversant,

Although most people see a Conversant conversation primarily through the web, it’s important to understand that Conversant is not, in and of it’s self, a web application. As much as possible, Conversant is ignorant of what Input/Output method is used to bring information in and out of the application.

. . .

The advantage of this design is that at anytime additional I/O modules may be written to provide alternate means of access a conversation without requiring any changes to the modules already in place.

That’s just beautiful, man.
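For the curious, here is a rough sketch of what that kind of design can look like. It is not Conversant's actual code; the `Conversation` class, the `register_output` method, and the two renderers are hypothetical names made up for illustration. The point is only that the core store never needs to know which output module is asking for the messages.

```python
import html
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Message:
    author: str
    subject: str
    body: str


@dataclass
class Conversation:
    """Core store. It knows nothing about how messages come in or go out."""
    messages: List[Message] = field(default_factory=list)
    # Hypothetical registry of output modules, keyed by channel name.
    outputs: Dict[str, Callable[[Message], str]] = field(default_factory=dict)

    def register_output(self, name: str, render: Callable[[Message], str]) -> None:
        # New channels plug in here without touching existing ones.
        self.outputs[name] = render

    def post(self, message: Message) -> None:
        self.messages.append(message)

    def render(self, channel: str) -> List[str]:
        render = self.outputs[channel]
        return [render(m) for m in self.messages]


def as_email(m: Message) -> str:
    """Render a message as a bare-bones e-mail."""
    return f"From: {m.author}\nSubject: {m.subject}\n\n{m.body}"


def as_html(m: Message) -> str:
    """Render a message as an HTML fragment, escaping user text."""
    return f"<h2>{html.escape(m.subject)}</h2><p>{html.escape(m.body)}</p>"


conv = Conversation()
conv.register_output("email", as_email)
conv.register_output("web", as_html)
conv.post(Message("Seth", "Prints & negatives", "I like having prints."))
```

Adding a newsreader or RSS view would be one more `register_output` call, with no changes to `Conversation` or to the renderers already in place.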

Concept Maps Are Confusing

An Associated Press story about concept mapping is making the rounds and getting a lot of attention from the usual suspects. The story discusses work at the Institute for Human and Machine Cognition designed to find a better way to browse information on the web.

IHMC associate director Alberto Canas wants to know, “Why should we organize it as pages? There’s no reason. It’s just that we’re used to it.” He thinks it would be much easier to browse concept maps that provide a graphic representation of a subject and all its related subjects.

I think Canas is on the wrong track. Concept mapping, or mind mapping, or whatever you want to call it, is an excellent way to brainstorm, or for individuals or groups to begin organizing information they already have. But in my experience, concept maps are extremely confusing and almost useless as everyday navigation tools.

One of the existing concept maps the Associated Press mentions, for example, is NASA’s concept map for its Center for Mars Exploration, which nicely illustrates everything that is wrong with concept maps.

If I’m a science teacher wanting to organize my thoughts about all of the issues surrounding a Mars expedition for a class unit, constructing a concept map like that is probably a pretty good way to lay out the various things I might want to cover. But if I’m a reader who knows little about Mars exploration, this gives me way too much information.

More importantly, the concept maps don’t do a good job of providing proper context for information. I have no idea how important any given choice on the map is, nor how relevant it is likely to be to any specific questions I have (unless I happen to view the world in precisely the same way that the person who put together that map does).

On the other hand, compare that to the front page of The Whole Mars Catalog. I’m not sure that the simple list of links is all that more helpful, but it provides essentially the same ultimate navigational tools without being completely overwhelming with arrows pointing everywhere.

I seriously doubt that concept maps will ever be widely used for web navigation.

dtSearch Slices and Dices Text — For A Price

In the 20 or so years I have been using computers to write, I have accumulated
a ton of files. From school essays to love letters to freelance writing, not
to mention the 70,000+ e-mail messages accumulated just in the past three or
four years, the personal data files on my hard drive consume about 5 gigabytes
and are growing at an alarming rate. With that much information, finding a specific
part of it can be difficult. I know somewhere on my hard drive I have a paper
I wrote in college about Ingmar Bergman, but finding it is another matter entirely.

For the past couple years, I have been looking for a way to tame this mess
and have tried probably half a dozen different programs that claimed they would
help bring my data under control. The main problem with most of these applications
is they were not really designed for the sheer volume of data I have. I evaluated
quite a few products that would have worked great if I had maybe a couple hundred
megabytes of data — but they simply choked at what I was throwing at them.

A couple months ago, though, I found the Holy Grail I had been looking for
in a program called dtSearch. The product literature for dtSearch claimed it
could quickly index and search many gigabytes of information. It promised not
only to handle a plethora of file formats (I am a bit behind on migrating all
of my data to current file formats), but also to index all 70,000 of my Eudora
e-mail messages. Since a 30-day trial version was available on the dtSearch
web site, I downloaded the program and set it to indexing my data.

I was extremely impressed. No program is going to be able to instantaneously
index several gigabytes of text, but dtSearch was extremely fast. It indexed
all of the data I threw at it in just a couple hours. More importantly, after
the initial indexing was finished, dtSearch intelligently handled re-indexing
of new or changed files. For example, I periodically go through my main e-mail
folder and move e-mail messages into archive folders by month and year. After
doing a thorough reorganization of my e-mail one night, and then asking dtSearch
to update its index, the program figured out I had moved about 13,000 messages
and updated only those e-mail messages, not wasting time to re-index the other
60,000 or so that I had not touched. Very smart.
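How dtSearch decides what to re-index is not something its literature spelled out for me, so the following is only a sketch of the general idea, working at the level of whole files. It assumes a file counts as "changed" when its size or modification time differs from the last scan; `scan` and `plan_update` are names invented for this example.

```python
import os


def scan(root):
    """Map every file path under root to a (size, mtime) fingerprint."""
    snapshot = {}
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            path = os.path.join(dirpath, name)
            st = os.stat(path)
            snapshot[path] = (st.st_size, st.st_mtime_ns)
    return snapshot


def plan_update(old, new):
    """Compare two snapshots and return (to_index, to_drop).

    to_index: files that are new or whose fingerprint changed.
    to_drop:  files that were indexed before but no longer exist.
    Everything else is left alone, which is where the time savings come from.
    """
    to_index = sorted(p for p, fp in new.items() if old.get(p) != fp)
    to_drop = sorted(p for p in old if p not in new)
    return to_index, to_drop
```

An indexer built this way would save the snapshot after each run, call `scan` again on the next update, and only touch the files that `plan_update` returns.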

On the searching end of things, dtSearch was fast and accurate. The speed was
pretty close to instant on most searches I did. When I asked dtSearch to find
e-mail messages containing a particular phrase that were from a specific group
of users, for example, the results popped up on the screen as quickly as I
could submit the search. Something I really liked about dtSearch was its
ability to highlight the hit words and phrases in the text of most files. For
example, if I perform a search on “Ingmar Bergman,” not only does dtSearch
find all of the files that contain that phrase, but it highlights all of the
instances of Ingmar Bergman in its document window and lets me quickly jump
through the document to all the places where the director is mentioned.
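Hit highlighting is easy to picture with a toy example. This is just an illustration using Python's standard `re` module, not a description of how dtSearch does it; the `highlight` function and the `[[ ]]` markers are made up for the sketch.

```python
import re


def highlight(text, phrase, left="[[", right="]]"):
    """Wrap every case-insensitive occurrence of phrase in markers.

    Returns the marked-up text plus the character offsets of each hit
    in the original text, which a viewer could use to jump between hits.
    """
    pattern = re.compile(re.escape(phrase), re.IGNORECASE)
    offsets = [m.start() for m in pattern.finditer(text)]
    marked = pattern.sub(lambda m: f"{left}{m.group(0)}{right}", text)
    return marked, offsets


sample = "Ingmar Bergman directed Persona. I wrote about ingmar bergman in college."
marked, offsets = highlight(sample, "Ingmar Bergman")
```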

The feature set of dtSearch is amazing. It lets the user adjust search “fuzziness,”
which makes it possible to find terms that might be misspelled, so if I
had accidentally typed “typogrpgal” in a document, that document would still be
marked as a hit on a search for “typographical.” That is extraordinarily useful
for someone like me who has a lot of OCRed documents that I have never had time
to clean up.
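To give a feel for what a fuzziness setting does, here is a small sketch using the similarity ratio from Python's standard `difflib` module. That is a swapped-in technique chosen for convenience, not dtSearch's own matching algorithm, and the `fuzzy_hits` function and its `cutoff` parameter are assumptions for illustration. A lower cutoff tolerates more mangled spellings.

```python
from difflib import SequenceMatcher


def fuzzy_hits(query, words, cutoff=0.7):
    """Return the words whose similarity to query is at least cutoff.

    Similarity is difflib's ratio, which runs from 0.0 (nothing in
    common) to 1.0 (identical). Comparison is case-insensitive.
    """
    q = query.lower()
    return [
        w for w in words
        if SequenceMatcher(None, q, w.lower()).ratio() >= cutoff
    ]


# The misspelling still counts as a hit at a loose setting,
# but not when the cutoff is set close to an exact match.
loose = fuzzy_hits("typographical", ["typogrpgal", "banana"], cutoff=0.7)
strict = fuzzy_hits("typographical", ["typogrpgal", "banana"], cutoff=0.95)
```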

There are only two real downsides to dtSearch. First, because it builds an
index of all the searchable documents, it eats up a lot of disk space for its
indexes. The promotional literature for the program says index size will be
about 25% of the size of the documents being searched, and that was pretty accurate
in my experience. That’s probably not as much of an issue today, with large,
cheap hard drives being commonplace, but it might be for people with older systems
and smaller hard drives.

The other issue is the price. At $199 for the single user version, this program
is definitely not for the casual user who only does occasional searching of
her documents. On the other hand, for somebody who does need to find particular
documents or e-mail quickly, dtSearch is a godsend and more than worth $199.