Sunday, January 18, 2009

Today's XKCD

Just classic.

Dvorak keyboard myth busted

Found this article by Stan Liebowitz and Stephen E. Margolis by way of Slashdot. It debunks the myth of the superiority of the Dvorak keyboard layout. Gives me the chance to feel a little smug about never bothering to learn that keyboard. ;)

Wednesday, January 14, 2009

Tagging email

Summary
To enable effective, portable categorization of my email for searching and archiving purposes, I use Thunderbird 2/3 tagging with some modifications. I either don't touch the default five tags, or I delete them through the options dialogue. This removes a level of abstraction in the tags preserved in my email messages. I use MailTweaks to enable the import/export of tags to allow for viewing tags on my archived email from a different Thunderbird installation than that under which the emails were originally tagged. The tags are saved in the messages in the X-Mozilla-Keys message header. This allows me to use scripts to process the tags into some different format if my choice of email platform changes.

Detail
For a while, now, I've been mildly obsessed with effectively finding old work emails. I don't know whether it's an issue of excessive categorization on my part, or just too much email on too many varied topics. Being able to look back to dig up now-suddenly-relevant-again information, or being able to justify my decisions and actions when a huge cross-functional project enters into "find someone to blame" mode, requires an effective searchable email archive. Limitations: corporate workstation standard is Windows, email server is Exchange.

My first solution was to sort my emails into folders, whether by project, or client, or whatever other criteria. I eventually found myself with folders nested three deep, and not necessarily able to find a relevant email because it could reasonably have been filed into more than one possible folder.

Around the time that I was getting really frustrated with this approach, I started trying to rely on search engines. The search built into Microsoft Outlook was not quite fast enough or helpful enough. Google Desktop, when I first grabbed it, couldn't search Outlook mail files. Yahoo came out with a desktop search engine that could, and did acceptably. A bit better than Outlook. And then finally Google was able to search Outlook files. But I hit one fundamental problem with searches. Unless I knew precisely the keywords I was looking for, I might not find anything. Problem is that none of the search engines are smart enough to search for synonyms. The way my brain works, after a couple of years I might have a memory of discussing some issue, but while the meaning of the words in my head matches the discussion, the precise choice of words doesn't.

So, what to do now? I can't just remember keywords, I need a list of them that I can reliably scan quickly and pick the ones I need to search on. And I need to be able to use multiple keywords on a given email, because I might need to find a given email under different contexts, and I might need multiple keywords to narrow my search. Sounds a lot like the tags that are popular on the web now.
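To make those requirements concrete, here is a minimal sketch of the idea in Python: a list of known tags to scan, several tags per message, and narrowing a search by combining tags. The message IDs, tag names, and function names are all made up for illustration; no mail client necessarily works this way internally.

```python
from collections import defaultdict

def build_tag_index(tagged_messages):
    """Map each tag to the set of message IDs that carry it."""
    index = defaultdict(set)
    for message_id, tags in tagged_messages.items():
        for tag in tags:
            index[tag].add(message_id)
    return index

def find_messages(index, *tags):
    """Return the IDs of messages that carry every one of the given tags."""
    if not tags:
        return set()
    result = set(index.get(tags[0], set()))
    for tag in tags[1:]:
        result &= index.get(tag, set())
    return result

# Hypothetical sample data: message ID -> tags applied to that message.
tagged = {
    "msg-001": ["projectx", "budget"],
    "msg-002": ["projectx", "vendor"],
    "msg-003": ["budget"],
}

index = build_tag_index(tagged)
print(sorted(index))                                # the scannable list of tags
print(find_messages(index, "projectx", "budget"))   # narrowed by two tags
```

The sorted tag list is the part that matters to me: I pick from what exists instead of guessing at words I may have used years ago.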

So, I dive into the use of tags. There is a nice tool for Outlook, Taglocity, that provides tagging for Outlook email, and even provides a tag cloud for tagging and searching your email. That problem solved. Cool.

But wait. Around that time, I also start getting concerned about access to my email should I no longer have access to Outlook, or my PST file gets a little corrupted, or what have you. 

So, what uses a more widely adopted mailbox storage format that is accessible from multiple mail clients? The format in this case is MBOX, which stores messages in their RFC 2822 format. This opens me up to a number of possibilities. The cross-platform choice for me is Thunderbird; a number of the others I looked at were based on the same platform. I happily start using Thunderbird's tagging capability for a while. I redo my email filing structure to file emails by month, just so that no folder becomes too big in size, and for ease of archiving later.

Secure in the knowledge that I had found my cross-platform, cross-application approach to being able to find and archive my email, one day I tried to bring up my email backup on my Linux box as a test. 

Sigh.

All my tags were gone. Now what? If I have to reinstall on another computer, all my tagging is gone, and I have to go back to search engines to find old mail.

Now, please pardon my ignorance at the next few steps, you Thunderbird wizards. Next approach was to use the TagTheBird addon to Thunderbird. Good, I get tags that go with the message, as the tags are added to the message header. Bad, I'm afraid the interface isn't very good. Trying to hit that tiny little pencil icon when I want to add tags is just a pain, and I'm afraid I haven't been willing to tackle the Mozilla extension learning curve with the limited time I have available. And the coup de grace, it's not supported in Thunderbird 3 yet. The reason that is an issue is that my co-workers have a penchant for emailing me 20MB documents that Thunderbird 2 chokes on.

Now, the ignorance I referred to before: because I "lost" my tags when I moved platforms, I had assumed that Thunderbird was storing all my tags in the mail summary file. I spent a happy few hours repurposing Jamie Zawinski's mork.pl to pull labels from the msf file and apply them as X-Tags to my mail messages, when I saw that in fact Thunderbird was saving the labels in the messages in the header X-Mozilla-Keys. Doh!
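For anyone who wants to check this on their own mail, here is a minimal Python sketch that reads the tags out of a single message. It assumes the header value is a whitespace-separated list of tag keys, which is what I see in my own messages; the sample message and the function name are made up for illustration.

```python
from email import message_from_string

def tags_from_message(raw_message):
    """Return the tag keys found in a message's X-Mozilla-Keys header."""
    msg = message_from_string(raw_message)
    # Assumption: keys are whitespace-separated; split() also copes with
    # any padding spaces after the last key.
    return msg.get("X-Mozilla-Keys", "").split()

# Hypothetical message, written the way it would appear in an mbox file
# (minus the leading "From " separator line).
sample = (
    "From: alice@example.com\n"
    "Subject: Budget review\n"
    "X-Mozilla-Keys: projectx budget        \n"
    "\n"
    "Body text.\n"
)

print(tags_from_message(sample))  # ['projectx', 'budget']
```

A message with no tags simply yields an empty list.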

Last remaining challenge was that the first few tags in your Thunderbird config are saved as $label1 through $label5 in the message header. Not very portable if you have changed the first few labels from the default, which I had done.
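For mail that already carries those placeholder keys, a small lookup table can translate them back into readable names. This is only a sketch: the names below are the stock Thunderbird labels as I remember them, so treat them as an assumption, and anyone who renamed their labels (as I did) would need to supply their own mapping. The function name is made up.

```python
# Assumed stock names for the five built-in labels; replace these with
# your own names if you changed them in Thunderbird's options.
DEFAULT_LABELS = {
    "$label1": "Important",
    "$label2": "Work",
    "$label3": "Personal",
    "$label4": "To Do",
    "$label5": "Later",
}

def resolve_keys(header_value, label_names=DEFAULT_LABELS):
    """Turn an X-Mozilla-Keys value into a list of readable tag names."""
    # Keys without an entry in the mapping are passed through unchanged.
    return [label_names.get(key, key) for key in header_value.split()]

print(resolve_keys("$label2 projectx"))  # ['Work', 'projectx']
```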

With a little bit of testing, I found that I could get rid of that level of abstraction by deleting the first five tags under Tools->Options->Display->Tags. Now my tags travel around with my email messages. The remaining step is visibility. With MailTweaks installed, I can export all my tags. As long as I export them to my email directory, the export file gets backed up automatically with all my email. And if, for some reason, I come across a spiffy new email client that does everything I want, I should be able to run a perl script on my mbox files to rename an X-Mozilla-Keys header to whatever header my new email client or Thunderbird plugin likes for tags.
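I have not written that conversion script yet, but here is a rough sketch of what it could look like, in Python rather than Perl and using the standard mailbox module. The target header name X-Tags and both function names are placeholders of my own; a real replacement client would dictate its own header and possibly its own value format. It writes to a new mbox file rather than modifying the original.

```python
import mailbox

def rename_header(msg, old="X-Mozilla-Keys", new="X-Tags"):
    """Move the value of header `old` to header `new`; True if changed."""
    values = msg.get_all(old, [])
    if not values:
        return False
    del msg[old]  # removes every occurrence of the old header
    for value in values:
        msg[new] = value.strip()  # drop any padding around the keys
    return True

def convert_mbox(src_path, dst_path, old="X-Mozilla-Keys", new="X-Tags"):
    """Copy src_path to dst_path, renaming the tag header on the way."""
    src = mailbox.mbox(src_path, create=False)
    dst = mailbox.mbox(dst_path)
    changed = 0
    try:
        for msg in src:
            if rename_header(msg, old, new):
                changed += 1
            dst.add(msg)
        dst.flush()
    finally:
        dst.close()
        src.close()
    return changed

# Example call (paths are hypothetical):
# convert_mbox("Archive-2009-01", "Archive-2009-01.converted")
```

Since the messages are parsed and written back out, I would compare a few converted messages against the originals before trusting it with a whole archive.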

The one thing remaining on my wish list, whether for Thunderbird, which I'm using for my work email, or Gmail, which I'm using for my personal email, is a tag cloud interface. Drop-down menus are not an interface I enjoy using frequently, and my tagging habits require heavy use of this interface.