A million words

I believe every day is significant

I enjoy remembering my life, reminiscing, reflecting, and learning about myself just as much as the next person. Almost everyone keeps a journal, and I'm no exception in that regard. However, I have recorded much more than quotes, tidbits, or random thoughts for each day. Journaling my own life in more and better ways has become one of my greatest passions.

Since mid-2007, I've been writing daily entries in a "Life Log," a giant Rich Text Format (RTF) file that lives on my Mac. I chose RTF when I started because I wanted a lowest-common-denominator format that would stand the test of time (or so I figured back then). In almost six years of writing, this file has grown to about 970,000 words.

Managing this Log has become a real pain, especially now that I want to do quantitative and qualitative analysis on it. Text is easy, but it has its limits. I've wanted to migrate my Log into a better format for a long time, but some helpful folks and my recent interest in self-tracking have turned that desire into a real project.

The Goal: Get to Day One

I've decided to adopt Day One as my new daily journal manager, for many, many reasons. It stores the entries themselves as XML, a format that has been widely adopted and is considered a standard for structuring many kinds of data. Its editor recognizes Markdown, something I initially didn't feel comfortable with but have come to embrace after writing my entries in Writedown for several months. And finally, prolific productivity guru Brett Terpstra has created a Ruby program called Slogger that automatically pulls social feeds into Day One, adding richness I would otherwise have to include by hand (and often have in the past).

Bottom line, getting into Day One will add value, longevity, portability, and automation to my journaling. However, getting my existing journal text into Day One is not quite as simple as hitting an "Import" button.

The Problem

My entries in the Life Log generally run 1,000 to 2,500 words, each written as a chronological narrative and broken into paragraphs by either a line break plus a tab or a double line break. I've kept the date format consistent throughout the file, and a " - - - " separates each discrete entry. I have also gone through phases of including extra sections at the end of an entry, such as Clothes, Weather, IM conversations, or Audio. Notwithstanding idiosyncrasies such as stretches of missed entries, the occasional month summary that I did for a while, and several year-end summaries, I've been relatively consistent in recording and structuring my thoughts.
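The structure described above lends itself to a simple split-and-normalize pass. Here is a minimal sketch in Python, assuming the Log has already been exported to plain text and that each " - - - " separator sits on a line of its own (the post doesn't say so explicitly). The function names are hypothetical, made up for illustration.

```python
import re

# Assumption: each " - - - " separator sits alone on its own line.
SEPARATOR = re.compile(r"^\s*- - -\s*$", re.MULTILINE)


def split_entries(log_text):
    """Split the plain-text log on separator lines, dropping empty chunks."""
    chunks = SEPARATOR.split(log_text)
    return [chunk.strip() for chunk in chunks if chunk.strip()]


def normalize_paragraphs(entry):
    """Rewrite 'line break + tab' paragraph breaks as blank-line breaks."""
    entry = entry.replace("\r\n", "\n")
    entry = re.sub(r"\n\t+", "\n\n", entry)
    # Collapse runs of three or more newlines into a single blank line.
    return re.sub(r"\n{3,}", "\n\n", entry)


if __name__ == "__main__":
    sample = (
        "June 1, 2007\nFirst paragraph.\n\tSecond paragraph.\n"
        " - - - \n"
        "June 2, 2007\nOnly paragraph.\n"
    )
    for entry in split_entries(sample):
        print(normalize_paragraphs(entry))
        print("=====")
```

Blank-line paragraph breaks are what Markdown expects, so normalizing both of my paragraph styles to that one form should make the later conversion simpler.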

Day One likes Markdown, so converting this personal system into Markdown is my first challenge. I have used bold, italics, all caps, and other forms of emphasis irregularly throughout the Log, so mapping each of these to an appropriate Markdown equivalent will take some care. My choice of RTF has proven a poor one: I've discovered there is no simple way to convert RTF into Markdown, because RTF includes an inordinate variety of markup, much of it with no direct translation into Markdown syntax.

Next is the problem of splitting the properly converted Log text into discrete files for importing into Day One, which is very strict in this regard. Thanks to the Day One tools page, I located a handy script that can convert properly named Markdown files into Day One entries.
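As a rough sketch of that splitting step, the Python below writes each entry to its own Markdown file named after the entry's date. Several things here are assumptions rather than facts from my Log or from the import script: that the first line of every entry is its date, that the date looks like "June 14, 2007", and that a YYYY-MM-DD.md filename is acceptable. The naming pattern the script actually expects would need to be checked and swapped in, and the function names are hypothetical.

```python
from datetime import datetime
from pathlib import Path

# Assumption: every entry starts with a date line such as "June 14, 2007".
DATE_FORMAT = "%B %d, %Y"


def entry_filename(entry):
    """Build a YYYY-MM-DD.md filename (placeholder pattern) from the date line."""
    first_line = entry.splitlines()[0].strip()
    date = datetime.strptime(first_line, DATE_FORMAT)
    return date.strftime("%Y-%m-%d") + ".md"


def write_entries(entries, out_dir):
    """Write each non-empty entry to its own file and return the paths written."""
    out_path = Path(out_dir)
    out_path.mkdir(parents=True, exist_ok=True)
    written = []
    for entry in entries:
        target = out_path / entry_filename(entry)
        # Note: two entries with the same date would overwrite each other.
        target.write_text(entry + "\n", encoding="utf-8")
        written.append(target)
    return written


if __name__ == "__main__":
    demo = ["June 1, 2007\nFirst entry.", "June 2, 2007\nSecond entry."]
    for path in write_entries(demo, "entries_out"):
        print(path)
```

An entry whose first line doesn't parse as a date raises a ValueError, which is probably what I want: the month and year-end summaries will need separate handling anyway.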

Next Steps

Today I spent a couple of hours refreshing what little Python I know. Thanks to Codecademy, I've gotten the gist of this relatively straightforward scripting language and have already written some simple code that "cleans" a plain-text subset of my Log, removing some of my own formatting inconsistencies. I've also diagrammed a rough outline for a loop-based workflow that can parse and split the text into entries.

I'll post some of this code as I get farther along. Every step is bringing me closer.

More to come!
