In the 20 or so years I have been using computers to write, I have accumulated
a ton of files. From school essays to love letters to freelance writing, not
to mention the 70,000+ e-mail messages accumulated just in the past three or
four years, the personal data files on my hard drive consume about 5 gigabytes
and are growing at an alarming rate. With that much information, finding a specific
part of it can be difficult. I know somewhere on my hard drive I have a paper
I wrote in college about Ingmar Bergman, but finding it is another matter entirely.
For the past couple of years, I have been looking for a way to tame this mess
and have tried probably half a dozen different programs that claimed they would
help bring my data under control. The main problem with most of these applications
is that they were not really designed for the sheer volume of data I have. I evaluated
quite a few products that would have worked great if I had maybe a couple hundred
megabytes of data, but they simply choked on what I was throwing at them.
A couple of months ago, though, I found the Holy Grail I had been looking for
in a program called dtSearch. The product literature for dtSearch claimed it
could quickly index and search many gigabytes of information. It promised not
only to handle a plethora of file formats (I am a bit behind on migrating all
of my data to current file formats), but also to index all 70,000 of my Eudora
e-mail messages. Since a 30-day
trial version was available on the dtSearch web site, I downloaded the program
and set it to indexing my data.
I was extremely impressed. No program is going to be able to instantaneously
index several gigabytes of text, but dtSearch was extremely fast. It indexed
all of the data I threw at it in just a couple of hours. More importantly, after
the initial indexing was finished, dtSearch intelligently handled re-indexing
of new or changed files. For example, I periodically go through my main e-mail
folder and move e-mail messages into archive folders by month and year. After
doing a thorough reorganization of my e-mail one night, and then asking dtSearch
to update its index, the program figured out I had moved about 13,000 messages
and updated only those e-mail messages, not wasting time to re-index the other
60,000 or so that I had not touched. Very smart.
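
How dtSearch decides what to re-index is not something the program spells out, but one common approach is to remember each file's size and modification time and compare that record against the disk on the next update. The sketch below is only an illustration of that general idea; the function names `snapshot` and `plan_update` are made up for this example and are not part of dtSearch.

```python
import os

def snapshot(root):
    """Map each file path under root to its (size, mtime) signature."""
    state = {}
    for folder, _dirs, files in os.walk(root):
        for name in files:
            path = os.path.join(folder, name)
            info = os.stat(path)
            state[path] = (info.st_size, info.st_mtime_ns)
    return state

def plan_update(old, new):
    """Compare two snapshots and list what an index update must touch."""
    added = sorted(p for p in new if p not in old)
    removed = sorted(p for p in old if p not in new)
    changed = sorted(p for p in new if p in old and new[p] != old[p])
    return added, removed, changed
```

Under this scheme, a message moved into an archive folder shows up as one removed path and one added path, while every untouched file is skipped entirely.
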
On the searching end of things, dtSearch was fast and accurate. The speed was
pretty close to instant on most searches I did. When I asked dtSearch to find
e-mail messages containing a particular phrase that were from a specific group
of users, for example, the results popped up on the screen as quickly as I
could submit the search.
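
Searches like this can be near-instant because the program consults a prebuilt index instead of rereading every file. A minimal sketch of the general technique, an inverted index, is shown here. It is an assumption about how indexed search works in general, not a description of dtSearch's actual index format, and it only checks that all query words appear in a document, not that they form an exact phrase. The names `build_index` and `search` are hypothetical.

```python
import re
from collections import defaultdict

def build_index(documents):
    """Map each lowercase word to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for word in re.findall(r"\w+", text.lower()):
            index[word].add(doc_id)
    return index

def search(index, query):
    """Return ids of documents containing every word of the query."""
    words = re.findall(r"\w+", query.lower())
    if not words:
        return set()
    result = set(index.get(words[0], set()))
    for word in words[1:]:
        result &= index.get(word, set())
    return result
```

The expensive work happens once in `build_index`; each later `search` is just a few dictionary lookups and set intersections.
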
Something I really liked about dtSearch was its ability to highlight the hit
words and phrases in the text of most files. For example, if I perform a search
on “Ingmar Bergman,” not only does dtSearch find all of the files
that contain that phrase, but it highlights all of the instances of Ingmar Bergman
in its document window and lets me quickly jump through the document to all
the places where the director is mentioned.
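
The idea behind hit highlighting can be sketched in a few lines: find the offsets of every occurrence of the phrase, then either mark them in the text or use the offsets to jump from one hit to the next. This is a simplified illustration, not dtSearch's implementation. The function names and the `[[` and `]]` markers are arbitrary choices for the example.

```python
import re

def find_hits(text, phrase):
    """Return (start, end) offsets of every case-insensitive occurrence."""
    pattern = re.compile(re.escape(phrase), re.IGNORECASE)
    return [m.span() for m in pattern.finditer(text)]

def highlight(text, phrase, left="[[", right="]]"):
    """Wrap each occurrence of phrase in visible markers."""
    pattern = re.compile(re.escape(phrase), re.IGNORECASE)
    return pattern.sub(lambda m: left + m.group(0) + right, text)
```

A viewer could keep the list from `find_hits` and step through it to move the cursor to the next mention of the director.
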
The feature set of dtSearch is amazing. It lets the user adjust search “fuzziness,”
which makes it possible to find terms that are misspelled in the documents, so a
document in which I had accidentally typed “typogrpgal” would still be marked as
a hit on a search for “typographical.” That is extraordinarily useful for someone
like me, who has a lot of OCRed documents that I have never had time to clean up.
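
One standard way to measure how "fuzzy" a match is, though not necessarily the one dtSearch uses, is edit distance: the number of single-character insertions, deletions, and substitutions needed to turn one word into another. The sketch below assumes that measure. The `max_edits` parameter is a stand-in for a fuzziness setting and does not correspond to dtSearch's own scale.

```python
def edit_distance(a, b):
    """Levenshtein distance: minimum single-character edits turning a into b."""
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(previous[j] + 1,          # delete ch_a
                               current[j - 1] + 1,       # insert ch_b
                               previous[j - 1] + cost))  # substitute or match
        previous = current
    return previous[-1]

def fuzzy_hits(query, words, max_edits):
    """Return the words within max_edits edits of the query."""
    query = query.lower()
    return [w for w in words if edit_distance(query, w.lower()) <= max_edits]
```

By this measure “typogrpgal” is four edits away from “typographical,” so a tolerance of four would catch it, while a stricter setting would not.
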
There are only two real downsides to dtSearch. First, because it builds an
index of all the searchable documents, it eats up a lot of disk space for its
indexes. The promotional literature for the program says index size will be
about 25% of the size of the documents being searched, and that was pretty accurate
in my experience. That’s probably not as much of an issue today, with large,
cheap hard drives being commonplace, but it might be for people with older systems
and smaller hard drives.
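
To put the 25% figure in concrete terms, here is a back-of-the-envelope calculation. It assumes the ratio holds uniformly across file types, which may not be true in practice, and the function name is made up for this example.

```python
def estimated_index_gb(data_gb, index_ratio=0.25):
    """Estimate index size as a fixed fraction of the data being indexed."""
    return data_gb * index_ratio

# About 5 GB of personal files at the advertised 25% ratio.
print(estimated_index_gb(5))
```

For the roughly 5 gigabytes of files described above, that works out to about 1.25 gigabytes of index.
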
The other issue is the price. At $199 for the single user version, this program
is definitely not for the casual user who only does occasional searching of
her documents. On the other hand, for somebody who does need to find particular
documents or e-mail quickly, dtSearch is a godsend and more than worth $199.