from a Yahoo! and XML geek


Push Button Paradise

Micah Dubinko

Mon, 11 Jul 2005

Patternalia - Plain Text (Life Pattern)

...storing data in digital format has major advantages, but a major drawback as well--digital data can be locked to particular systems or software packages. This pattern discusses a key concept in small-scale information storage, namely taking advantage of the simplest possible formats, all the way down to plain text.

#

A personal story: avoiding proprietary formats has served me well.

I was hard at work on my book (see links), on a shiny new HP notebook running factory-loaded Windows XP. I had a comfortable authoring environment, using software that saved directly into DocBook, a markup language specially suited to technical documents. I was making great progress. Then the crashing started. When you are "in the zone", when words are flowing so fast you can barely type them in as they arrive, few things shatter your concentration like a system crash followed by a lengthy reboot, a hard drive scan, and subsequent data recovery steps. In disgust, I backed up my data, reformatted the drive, and installed a copy of Linux. The general aggravations of switching to a new platform set my writing back a good month, but it was worth it. More importantly, it was possible.

Beneath the question of which operating system runs your computer lies an important but subtle battle: who owns the bits on your hard drive? Proprietary software companies would like nothing better than to have a controlling interest in anything you do with a computer. They pursue this in many ways, but all of them come down to removing freedoms from users and adding to their own. The "End User License Agreement" that comes with Microsoft Word or any other package is crafted to systematically eliminate your rights. Another popular technique is to deliberately create obscure data formats that competitors have trouble working with reliably. Thus, once a user accumulates a library of data in such a format, it is unnaturally difficult for her to do anything with it but keep buying more copies of the program for each new computer (assuming the program is still available) in order to be able to open the documents.

Later on, I abandoned the HP hardware in favor of a PowerBook. All my important data carried straight across; I had no need for special applications to read or write my own data.

Writers have a special relationship with their writing. Words are babies. So at a deep, paternal level, it pains me to see anyone blithely accepting some closed word processor as a Writing Tool. One has to marvel at how successfully these have been marketed to schools and businesses. As a consultant, I have seen the front ends of many different hiring and procurement systems, and a large percentage of them assume that you have the ability to open, save, and print files in the same proprietary format as their HR department.

I don't fault any particular group for this, not even Microsoft; everyone is just doing what seems natural. In fact, newer versions of Microsoft Office promise to use more open XML-based formats, a welcome change, even though the Extensible Markup Language doesn't solve all the problems. The solution comes from within: each creative individual deciding to store things in the simplest format possible (but no simpler). For a great many things, this means keeping them around in plain *.txt files.

If language represents thought, plain text represents crystallized thought. Plain Text (technically Unicode) is the most natural way to store the written word, whether it's a journal, notes, fiction, or in-progress Patternalia columns. Systems like Markdown (see links) allow basic formatting, such as italics, headings, hyperlinks, and lists, to be easily represented in Plain Text, including the text you are currently reading.

A few other things especially lend themselves to Plain Text:

  • Most things in list format (todo, addresses, phone numbers, article submissions and so on)
  • CVs and résumés (you will get constantly asked for a Word version, but after politely declining, the text version will be accepted)

An especially counterproductive trend is for personal information applications, whether home-grown or off-the-shelf, to use opaque database formats for storage. For one, this introduces a dependency on particular software and often a particular platform to access your data. Even if the software is good, it may only be around for 5 or 10 years. In that amount of time, computing environments change substantially; you want your data to last 20, 50, even 100 years. The simpler the format, the more likely it will be accessible in that kind of a time frame.

Therefore:

Take a careful look at all the important data on your hard drive. If any of it is in a complex or proprietary format, but could work just as well in a simpler format, downconvert it. If you use a Personal Information Manager or small database to store any data, consider how it would work without the requirement of special software, or at least seek out software to synchronize between it and a simpler format like plain text. Constantly remain on the watch for opportunities to simplify your digital life.
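
As a rough illustration of downconverting, here is a minimal Python sketch that dumps one table from a small database into a tab-separated plain-text file. It assumes the data happens to live in a SQLite file; the function name `dump_table_to_text`, the `contacts` table, and the sample row are all hypothetical, invented for the example. A real Personal Information Manager will likely need its own export route.

```python
import sqlite3


def dump_table_to_text(conn, table, out_path):
    """Write every row of `table` to a tab-separated plain-text file.

    Returns the number of data rows written (the header line is not counted).
    """
    # Table names cannot be bound as SQL parameters, so quote the identifier.
    quoted = '"' + table.replace('"', '""') + '"'
    cursor = conn.execute(f"SELECT * FROM {quoted}")
    headers = [col[0] for col in cursor.description]

    count = 0
    with open(out_path, "w", encoding="utf-8", newline="\n") as out:
        out.write("\t".join(headers) + "\n")
        for row in cursor:
            # Collapse runs of whitespace (tabs and newlines included) to
            # single spaces so that each record stays on one line.
            fields = [
                "" if value is None else " ".join(str(value).split())
                for value in row
            ]
            out.write("\t".join(fields) + "\n")
            count += 1
    return count


if __name__ == "__main__":
    # Hypothetical sample data standing in for a real application database.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE contacts (name TEXT, phone TEXT)")
    conn.execute("INSERT INTO contacts VALUES ('Ada', '555-0100')")
    conn.commit()
    print(dump_table_to_text(conn, "contacts", "contacts.txt"), "rows written")
```

The resulting file opens in any text editor on any platform. The trade-off in this sketch is that whitespace inside a field is flattened, which is usually acceptable for lists of names and numbers but would need more care for free-form notes.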

Links:

About this article series: This article series is being provided at no cost on the internet. To show your support,

  1. Become a regular reader, including any back-articles you've missed.
  2. Link to it; tell your friends to read it and link to it.
  3. Buy things from the advertisements on this page.
  4. Leave encouraging feedback.

All articles will appear underneath http://dubinko.info/blog/patternalia


This work is licensed under the Creative Commons Attribution-NonCommercial-ShareAlike 2.5 License.

posted at: 22:08 | under: patternalia | 1 comment(s)



Hi,
  Nice article. I too support simpler formats. I used to be a text format user, but now, for some reason I don't know, I use HTML with CSS. Your article reminds me of my .txt days. Although text is the simplest format, it still has its portability problems; for example, text on Unix and text on the M$ platform differ in their line-ending characters. But still, text is good. I have heard of Markdown, but haven't looked into it more deeply. Will look into it now. Keep up the good work.

-Arun
Posted by Arun G Nair at Tue Jul 12 22:01:56 2005

