Newest Post

March 5th, 2010

A Hyperlink Offering revisited

The xml-dev mailing list has been discussing XLink 1.1, which after a long quiet period popped up as a “Proposed Recommendation”, which means that a largely procedural vote is all that stands between the document and becoming a full W3C Recommendation. (The previous two revisions of the document date to 2008 and 2006, respectively.)

In 2005 I called continued development of XLink a “reanimated spectre”. But even earlier, in 2002, I wrote one of the rare fiction pieces on xml.com, A Hyperlink Offering, which, using the format of a Carrollian dialog between Tortoise and Achilles, explained a few of the problems with the XLink specification. It ended with this:

What if the W3C pushed for Working Groups to use a future XLink, just not XLink 1.0?

Indeed, this version has minor improvements. In particular, “simple” links are simpler now–you can drop an xlink:href attribute where you please and it’s now legit. The spec used to REQUIRE additional xlink:type="simple" attributes all over the place. But it’s still awkward to use for multi-ended links, and now even farther away from the mainstream hyperlinking aspects of HTML5, which, for all of its faults, embodies the grossly predominant description of linking on the web.
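
To make the simple-link change concrete, here is a minimal Python sketch (not from the spec) that scans a document for simple links under both readings: the stricter one, where xlink:type="simple" has to be present, and the relaxed one, where a bare xlink:href is enough. The sample document, the example.com URLs, and the function name are made up for illustration; only the XLink namespace URI and attribute names are real.

```python
import xml.etree.ElementTree as ET

XLINK_NS = "http://www.w3.org/1999/xlink"
HREF = f"{{{XLINK_NS}}}href"
TYPE = f"{{{XLINK_NS}}}type"

# Hypothetical sample document, invented for this sketch.
SAMPLE = """
<doc xmlns:xlink="http://www.w3.org/1999/xlink">
  <ref xlink:type="simple" xlink:href="http://example.com/a">old style</ref>
  <ref xlink:href="http://example.com/b">new style</ref>
  <note>no link here</note>
</doc>
"""

def simple_links(xml_text, require_type=False):
    """Return the hrefs of elements treated as simple links.

    require_type=True models the older rule (xlink:type="simple" is mandatory);
    require_type=False models the relaxed rule (a bare xlink:href counts).
    """
    root = ET.fromstring(xml_text)
    hrefs = []
    for element in root.iter():
        href = element.get(HREF)
        if href is None:
            continue  # no xlink:href, so not a simple link either way
        link_type = element.get(TYPE)
        if link_type == "simple" or (link_type is None and not require_type):
            hrefs.append(href)
    return hrefs
```

Calling simple_links(SAMPLE) picks up both ref elements, while simple_links(SAMPLE, require_type=True) only finds the one that spells out the type attribute.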

So in many ways, my longstanding disappointment with XLink is that it only ever became a tiny sliver of what it could have been. Dashed visions of Xanadu dance through my head. -m

March 1st, 2010

Newsweek should never have been free

Andrew Zolli argues in Newsweek that online content should never have been free. I’m probably not the first one to make this profound observation–but if it were not for the free online edition of Newsweek (and link aggregator sites like Digg) I wouldn’t have read a single word of Newsweek in years, nor would I be linking to it as my previous sentence does… Maybe Newsweek is OK with that. -m

February 22nd, 2010

Mark Logic User Conference 2010

Are you coming? Link. It starts on May 4 (Star Wars day!) at the InterContinental Hotel in San Francisco. Guest speakers include Chris Anderson, Editor-in-Chief of Wired, and Michelle Manafy, Editor-in-Chief of EContent magazine.

Early bird registration ends Feb 28. -m

February 16th, 2010

There is no honor in namespaces

As heard from my friend and Mark Logic contractor Ryan Grimm. -m

February 13th, 2010

Geek Thoughts: Dora the Spamadora

Dora: Oh no! Lawrence Fawusu, 52, Operational Manager of the International Commercial Bank Ghana Limited is in trouble! He needs to move the sum of US$22, 000.000 (TWENTY TWO MILLION UNITED STATES DOLLAR) outside the country, but doesn’t know where to turn.

Dora: Who do we call when we don’t know the way to go? That’s right, the map! (He’s the map, he’s the map, he’s the map!)

Map: Dora and Mr. Fawusu need to 1) get your bank account info, 2) transfer funds, and 3) profit!

Dora: Say it with me: Bank account, transfer funds, profit!

Dora: We need YOUR help to complete the transaction.

(clicking sound)

We did it, yay, lo hicimos, etc.

More collected Geek Thoughts at http://geekthoughts.info.

February 8th, 2010

Economics 101 question

Let’s say you have a box that (completely legally) spits out 1 dollar per day. I’m using “box” in an abstract sense here: maybe it’s an investment or a business opportunity. How much would you pay for this box? In other words, what’s its fair market value?

What if it spit out one dollar per hour? Would you pay exactly 24x as much for it then? Or one per week–would you pay 1/7th as much?

What if it’s hard to measure how much money comes out of it–maybe sometimes it emits a dollar, but sometimes you have to put one in. Then what? -m

February 2nd, 2010

Larry Masinter on overspecification

Some thoughts worth considering on the state of HTML development today. -m

January 31st, 2010

Writing tools to avoid: WhiteSmoke

I was lured in by a slick promotion, and decided to give a new writing tool a try. WhiteSmoke seems like it’s primarily aimed at folks for whom English is not a first language, but quotes like “Innovative technology for native and non-native English speakers” make it seem like it could help. When I wrote an article for xml.com that summarized recent mailing list activity, I liked to compile readability statistics on the messages. Maybe this would be similar.

I had some question of whether this would work on a Mac or not, but the FAQ assures one that “Mac users are able to use WhiteSmoke’s online interface (also known as the Online Editor), which contains all grammar, enrichment and spelling features” and (in curiously clumsy language) “Should you be running Safari MacOS x10.3 and encounter any problems, please use FireFox.”

Sounds good.

The spell checker is decent, probably about as good as the one in WordPress.

The thesaurus is pretty good. Clicking on almost any word will prompt a drop-down list of synonyms. This sometimes makes selecting text troublesome. The list itself is often too small, and entries toward the bottom are obscured.

The grammar checker is OK, but I couldn’t point to anything it does that Word couldn’t have handled (though it has been a long time since I have run Word).

The user interface is terrible. Any errors are shown in slightly bolder text either red or green with nothing distinguishing in the background. I’m not too good with colors, so it’s hard for me to say. The text is very difficult to scan. It has an AutoCorrect mode, which can fix some mistakes without interaction, but just as often breaks your text. For example, it changed the previous paragraph from “Sounds good.” to “remark: Incomplete Sentence good.” The changed text is bold, but only until the next scan, when it becomes indistinguishable from actual text.

At the XML level, it produces horrible output, with stacks upon stacks of nested spans, with duplicate IDs. Some of this may be from the necessary back-and-forth between the web interface and whatever your actual editor is. View source on this posting to see what I mean.

It gets worse. The online interface is limited to 10,000 characters at a time (not words, characters). To compare, this short posting contains slightly more than 3,000 characters. I did some experimentation and found the actual limit is somewhat less than the stated 10K; somewhere north of 7K characters, it will show a spinner forever and never finish checking. Clicking the browser “back” button from the forever-spinner screen takes you back to a blank page–all your text is gone. For someone working on, say, a 60,000 word (360,000+ character) project, it would have to be diced up into maybe 50 small pieces, each individually checked, each introducing the prospect of adding rather than fixing problems. Making even a single pass through all the text would require a senseless amount of tedious cut-and-paste work. It’s essentially unusable.
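
For the curious, the “maybe 50 pieces” figure is easy to check. This is a rough Python sketch, assuming the roughly 7,000-character practical limit observed above; the function names are made up, and the splitter is just one plausible way to dice text on word boundaries, not anything WhiteSmoke provides.

```python
import math

def chunks_needed(total_chars, limit):
    """Lower bound on the number of pieces if every piece is filled to the limit."""
    return math.ceil(total_chars / limit)

def split_text(text, limit):
    """Greedily pack whole words into pieces of at most `limit` characters.

    Runs of whitespace are collapsed to single spaces. A single word longer
    than the limit is kept whole rather than cut in the middle.
    """
    pieces, current = [], ""
    for word in text.split():
        candidate = word if not current else current + " " + word
        if len(candidate) <= limit:
            current = candidate
        else:
            if current:
                pieces.append(current)
            current = word
    if current:
        pieces.append(current)
    return pieces
```

With those assumptions, chunks_needed(360_000, 7_000) comes out to 52 pieces, and even at the advertised 10,000-character limit it would still be 36.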

To make sure I know what I’m talking about, I composed this posting in WhiteSmoke, which very well may be the last time I use it. -m

January 20th, 2010

XForms: binding to an optional element

I asked this on the XSLTForms list, but it’s worth casting a wider net.

Say I have an XML instance like this:
<root>
<format>xml</format>
</root>

Possible values for the format element are “xml”, “text”, “binary”, or a default setting, indicated by a complete absence of the <format> element. I’d like this to attach to a select1 with four choices:

<xf:select1 ref="format">…

* xml
* text
* binary
* default

Where the first three set the value of the <format> element, and the fourth omits the element altogether. In XForms 1.0, the difficulty comes from binding to a nonexistent element, which causes it to become non-relevant. Are there features from 1.1, 1.2, or extensions that make this case easier? Anyone have an example?
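
To pin down the behavior being asked for, here is a small Python model of the mapping between the four choices and the instance data. This is not XForms and does not answer the question about 1.1 features; it only spells out what any solution has to do. The function names are made up for illustration.

```python
import xml.etree.ElementTree as ET

VALUES = ("xml", "text", "binary")

def apply_choice(root, choice):
    """Update the instance so it reflects the selected choice.

    "default" removes the <format> element entirely; any other valid
    choice creates the element if needed and sets its text.
    """
    existing = root.find("format")
    if choice == "default":
        if existing is not None:
            root.remove(existing)
    elif choice in VALUES:
        if existing is None:
            existing = ET.SubElement(root, "format")
        existing.text = choice
    else:
        raise ValueError(f"unknown choice: {choice!r}")
    return root

def current_choice(root):
    """Read the selection back: a missing <format> element means "default"."""
    existing = root.find("format")
    return "default" if existing is None else existing.text
```

The awkward part in a form is the round trip: after choosing “default” the element is gone, so the control needs something other than the element itself to stay bound to.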

-m

January 17th, 2010

Geek Thoughts: engineer’s curse

May all in your life be an optimization problem to solve.

More collected Geek Thoughts at http://geekthoughts.info.

January 15th, 2010

Economic indicators: recruiting picking up again

I got a personal email pitch from recruiters at both Facebook and Google, oddly enough both messages within a 3-minute window on a Monday morning. Hiring is on the uptick again, it seems. My team is still looking for the right front end engineer–someone who knows the JavaScript language in depth, how to use semantic HTML and CSS, AND all about browser quirks. Email me. -m

January 6th, 2010

MZFinance.NoPasswordToken_message

MZFinance.NoPasswordToken_message is the apparently Google-unique error message my iPhone gave me today whilst purchasing a free app. Always one happy to leave a new mark on the web, I’m recording it here. If you’ve seen it too (and you come here within the 30-day window) please post a comment on your experience. -m

January 3rd, 2010

Geek Thoughts: the ultimate real-time strategy game

Games like Farmville and the iPhone knock-off iFarm throw in a unique twist in the realm of strategy gaming: crops that get planted mature in “real time”. If a crop takes 24 hours to grow, then you need to literally wait the full 24 hours. Great for making an app “sticky” and getting users to repeatedly log in. Side fact: Farmville sells more virtual tractors in a day than real tractors are sold in the US in a year.

Game producers keep upping the ante in terms of real-time strategy games interacting with the real world. Take the latest for instance, a free iPhone app called Lose It!. Everything in this game runs in real-time–a game day is always a full 24 hours. Instead of conventional points, it uses “calories”, which are gained by the actual foods you physically eat, and subtracted via actual exercise. The app includes a massive database of food items and exercises to help you keep an accurate record, apparently on the honor system. The goal: to set a calorie target for each day and come in under it. A secondary scoring system is based on your own weight, though you will need an accurate scale (not included with the app) to measure it.

So far I’ve done pretty well at the game. I’ve averaged better than 1000 calories under my goal for the last several weeks, and have done well on the weight number too. And it’s pretty interesting to have a log of everything I’ve eaten. What will they think of next?

More collected Geek Thoughts at http://geekthoughts.info.

January 1st, 2010

Chocolate

Chocolate is reasonably healthy in small quantities, as long as you get the good stuff without too much sugar or dairy added. I’ve been tasting around for a few months now. I claim zero credentials in chocolate tasting, so what I say here might seem amusing to experts. Scores are purely subjective: how much did I enjoy it?

Quetzalcoatl, E. Guittard

72% cacao mass. Ingredients: Cacao beans, pure cane sugar, soya lecithin, vanilla beans.

More sweet and less bitter than most chocolates of similar cacao content. Even brief contact with the skin melts a little leaving a faint stain that doesn’t wipe off. Very smooth gently roasted flavor. Earthy notes, with a faint aftertaste of apricots or stone fruit. Lightly bitter finish. 9.0/10

Nocturne, E. Guittard

91% cacao. Ingredients: cacao beans, cocoa butter, pure cane sugar, soya lecithin, vanilla beans.

Very firm chocolate. Biting into it, your teeth only go in a bit then the piece snaps off with a loud pop. Mouthfeel like melting butter. Flavor steadily builds in intensity until it’s a focused earthy bitterness on the center of your tongue. Subtle raspberry notes. 8.5/10

Extra Dark, Scharffen Berger

82% cacao. Ingredients: Cacao beans, sugar, whole vanilla beans

Dark, intense, lightly sour flavor. Wine-like notes. As smooth in texture as soy lecithin alternatives. Spicy hints like coriander or szechuan pepper. Lingering finish much like that of a hoppy beer, felt in the back of the throat. 7.5/10

Bittersweet, Scharffen Berger

70% cacao. Ingredients: Cacao beans, cocoa butter, non-gmo soy lecithin, whole vanilla beans.

Good flavor from the moment it touches your tongue. Creamy, texture is like milk chocolate. Buttery mouthfeel. Flavor is intense and bitter, sweet and sour, more approachable than stronger alternatives. Clean chocolate flavor. 8/10

Midnight Reverie, Ghirardelli

86% cacao. Ingredients: Bittersweet chocolate (unsweetened chocolate, cocoa butter, sugar, milk fat, soy lecithin), vanilla, natural flavor.

Slightly waxy, less intense flavor, though it gradually builds as it melts in your mouth. Trace of acidity, like dried cherries. Coffee-like bitterness. Fairly sweet toward the end, though the finish has hints of licorice with lingering bitterness. 7/10

Twilight Delight, Ghirardelli

72% cacao. Ingredients: Unsweetened chocolate, sugar, cocoa butter, vanilla, soy lecithin.

Immediate sweetness which fades into a classic dark chocolate flavor. Roasty notes, hints of raisin and dark fruit. Rapid melter. Extra silky texture. When you snap a piece off, tiny flecks tend to land on your skin and immediately melt. Long finish reminds you for many minutes that you ate chocolate. 8/10

NOIR, Côte d’Or (Kraft)

70% cacao. Ingredients: Chocolate, sugar, cocoa butter, cocoa, milkfat, soy lecithin, natural flavor.

Lesser flavor at first, gradually builds as it melts. Fruity banana melon notes. Unlike some of the more intense chocolates, this one could be consumed freely, not saved for special occasions. Has a waxy feel, a bit like biting into a crayon. 7/10

Stone Ground Organic Chocolate, 70% dark, Taza

70% cacao. Ingredients: cacao beans (Dominican Republic), cane sugar, cocoa butter, whole vanilla beans.

Slightly gritty texture, feels like little bits of sugar in there (like the bottom of a glass of chocolate milk made with powder). Warm, earthy aroma, full balanced sweetness. Earthy, muddy (in a good way) flavor, mouth filling. Long finish that fades into dark fruit notes. 9.5/10

Stone Ground Organic Chocolate, 80% dark, Taza

80% cacao. Ingredients: cacao beans (Dominican Republic), cane sugar, cocoa butter, whole vanilla beans.

Lightly gritty texture, occasional bits of crunchy sugar in there. More intense bitterness, and wine-like notes in aroma and flavor. Hints of sour cherries, against lingering bitterness. Long tapered finish with pleasant aftertaste. Great chocolate experience. 9/10

Cacao Reserve by Hershey’s

65% cacao. Ingredients: Semi-sweet chocolate (chocolate; sugar; cocoa; milk fat; cocoa butter; soy lecithin; natural vanilla flavor), cacao nibs, milk.

Surprisingly dark for a milk chocolate. Lots of little crunchy nibs within. Lots of flavor, mouth filling. Candy sweetness and hints of roasted nut. Kind of like a Mr. Goodbar (except actually good). Occasionally you bite into a (less-roasted?) nib with some vegetal character. Long coffee-like finish. Melts instantly against your finger. 9/10

Premium Extra Dark Chocolate, See’s Candies

56% cacao. Ingredients: Chocolate, sugar, cocoa butter, milkfat (butter), soy lecithin, vanilla, vanillin (an artificial flavor).

Strong caramel and milk chocolate notes. Tastes sweet against the tongue before even biting into it. More like a candy bar than a piece of real chocolate. I could see how it might be addictive, though. 5/10

Chocolat Noir de Domaine, Palmira, Valrhona (2006)

64% cacao. Ingredients: Cocoa beans from Venezuela, sugar, cocoa butter, soya lecithin, vanilla.

When a friend of mine heard I was tasting chocolates, he said “you’ve GOT to try this one”. I have no idea whether chocolate gets better with age, but this one said Best Before 11-2007. It has a velvety, slippery feeling to it–hard to pick up with your fingers. Lots of pleasing aromas coming off the block–honey, toast. Biting into it a piece snaps off with a loud pop, and fills your mouth with smoky, earthy flavors. Some notes like a dry red wine. Leaves an aftertaste like brazil nuts. 9/10

Le Noir Extra Amer, Valrhona

85% cacao. Ingredients: cocoa beans, sugar, cocoa butter, soya lecithin, natural vanilla beans.

Dark, intense, sharp flavor up front, reminiscent of sour dark fruit. Very meltable, leaving marks on your fingers. Taste fades into a creamy buttery note with lingering bitterness. 9/10

Dark Chocolate, Valor

70% cacao. Ingredients: chocolate processed with alkali, sugar, cocoa butter, soy lecithin.

Has a bitter undertone, like stout beer, but alongside a creaminess like milk chocolate. Caramel and coffee notes. Lingering aftertaste. 8/10

New Moon, Dagoba

74% cacao. Ingredients: Organic dark chocolate (cacao beans, cane sugar, cacao butter, non-gmo soy lecithin), milk (less than 0.1%).

Fast melting on to fingers. Has a deep penetrating flavor with hints of blueberry & concord grape notes. After a few moments this fades into a classic bitter chocolate taste. Slightly muddy mouthfeel, but somehow not in a bad way. It seems odd, but this does not pair well with water. Taking a sip immediately after leaves a dry minerally aftertaste. 8.5/10.

“Bat” Intense Dark Chocolate with Cacao Nibs, Endangered Species Chocolate

72% cacao. Ingredients: bittersweet chocolate (chocolate liquor, unbleached water-filtered beet sugar, cocoa butter, soy lecithin, vanilla), cocoa nibs.

Rich deep chocolate flavor with pleasingly crunchy bits. Roasty notes like dark toast and coffee. Nutty. Sweet and sour like tart orange. 9/10

December 25th, 2009

The Physics of Santa

Hands down, the stupidest Science Friday segment evar. I want my 11 minutes back. -m

December 21st, 2009

Failure as the secret to success

Excellent article in Wired, perhaps a good explanation of my career. :-)

Dunbar observed that the skeptical (and sometimes heated) questions asked during a group session frequently triggered breakthroughs, as the scientists were forced to reconsider data they’d previously ignored.

Which sounds like a fairly typical spec review at Mark Logic. Hint: we’re hiring–email me.

-m

December 18th, 2009

Mark Logic Careers

Check out the updated careers page, including a quote from YT. If you’re looking for an amazing place to work, get in touch with me. In particular I’m looking for top-notch JavaScript/FE/UI people. -m

December 18th, 2009

Taste Your Beer

I just ordered a beer appreciation kit from tasteyourbeer.com. I’m all for less swilling, more appreciating. This one includes little vials of 13 different kinds of hops to compare. Train your palate, but be warned: once you start down this road, forever will it dominate your soul. You’ll be picking out different flavors in everything you eat or drink, and some things you don’t (like toothpaste). -m

December 17th, 2009

Steorn Orbo on display: analysis

So Steorn’s Orbo technology is on display in Dublin. They have multiple live video streams, but the #3 view at this hour shows “Offline” and “The channel owner has prohibited viewing from this web page”. Public viewing hours run from 10a to 7p six days a week.

What is it? There’s a detailed exploded diagram (PDF) of the display model on their site. It shows a rotor assembly with three main rings. The bottom two each have 8 magnets mounted in pairs at 90 degree intervals, and the skinnier top ring has only four magnets in the same alignment. The orientation of the magnets isn’t shown. The rotor assembly spins inside a frame with four pairs of toroidal coils which line up with the bottom two rings. A separate pair of “pick-up coils” align with the top ring. No wiring diagram is included. Based on the term “pick-up coil” the top assembly looks like a generator. Spinning magnets past a pick-up coil would produce AC, so the rectifiers shown below turn the AC into bumpy DC. Meanwhile, energy flows into the system through the remaining coils.

Oh, and the “Battery D-Size”. They claim it is only being used for temporary storage, to smooth out the flow of energy, and that the device is producing three times the energy it is taking in. It goes without saying that choosing to include a battery in a display model is a terrible choice for someone trying to convince a skeptical public that the device is more than a funny-looking motor.

From pictures I’ve seen, the battery doesn’t look off-the-shelf. It’s probably a high-density lithium-ion unit with capacity similar to a D cell, maybe 12 Amp hours @ 3 volts. That’s 36 watt hours. The rotor assembly itself uses low-resistance bearings and has an overall smooth shape for low wind resistance, so the amount of energy needed to keep it turning is probably quite small. Let’s say 50 mW. Given those figures, the device could run continuously for 30 days without needing to generate one scrap of energy, even discounting the possibility of clandestine midnight battery changes and the like.
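
Spelled out, the back-of-the-envelope arithmetic above looks like this. It is a minimal Python sketch using only the guessed figures from the paragraph (12 Ah, 3 V, 50 mW); the function name is made up, and it ignores self-discharge and conversion losses.

```python
def runtime_days(capacity_ah, volts, load_watts):
    """Days a battery could carry a constant load with no other energy source."""
    energy_wh = capacity_ah * volts   # amp-hours times volts gives watt-hours
    hours = energy_wh / load_watts    # watt-hours divided by watts gives hours
    return hours / 24

# The guesses from the paragraph above: 12 Ah at 3 V, and a 50 mW load.
estimate = runtime_days(capacity_ah=12, volts=3, load_watts=0.050)
print(f"about {estimate:.0f} days")
```

The result scales linearly in each input, so halving the capacity or doubling the assumed load halves the runtime.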

The way the system is set up, it’s difficult to establish a reliable measurement. What if you hooked up a meter to the circuit with the battery in it–which way would current be flowing?

Turns out that’s a difficult question. The current flow varies over time, which introduces all kinds of measurement difficulties. A few paragraphs back, we did a simple DC power computation with volts times amps. In AC circuits, it’s more complicated–the addition of inductance or capacitance to a circuit adds an element of temporary energy storage which causes voltage and current to fall out of sync, so a simple scalar calculation isn’t possible. You get into the realm of imaginary numbers and a mathematical construct called a phasor, which you draw as a simple 2d diagram. For non-sinusoidal currents, including that bumpy DC from a rectifier, the math gets even more hairy.

Is it really this hard? Yep. I wrote earlier comparing the situation to the three-body problem, most often applied to gravitational systems, but EM analysis is even harder. Take into account magnetic fields, the interchange of electric current and magnetic fields, back-emf, and Lorentz forces operating at right angles instead of attraction along a straight line. It all gets fearsomely complicated, well past the “vector calculus just for fun” level I’m at these days.
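
For the clean sinusoidal case, the phasor bookkeeping can be sketched in a few lines of Python. The numbers here (10 V RMS, 2 A RMS, current lagging by 60 degrees) are hypothetical and have nothing to do with the actual device, and this standard textbook calculation does not cover the non-sinusoidal case.

```python
import cmath
import math

def ac_power(v_rms, i_rms, lag_degrees):
    """Return (real, reactive, apparent) power for sinusoidal steady state.

    Voltage is taken as the phase reference; the current lags it by
    lag_degrees. Complex power is S = V * conj(I).
    """
    v = cmath.rect(v_rms, 0.0)
    i = cmath.rect(i_rms, -math.radians(lag_degrees))
    s = v * i.conjugate()
    return s.real, s.imag, abs(s)

real, reactive, apparent = ac_power(10.0, 2.0, 60.0)
print(f"volts times amps: {apparent:.1f} VA, power actually delivered: {real:.1f} W")
```

With a 60 degree lag, the naive volts-times-amps product overstates the delivered power by a factor of two, which is why a meter reading alone settles very little.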

What they’ve now publicly shown seems to be enough for a competent person to duplicate, figuring out which of the few possible permutations of electrical connections make sense. If the claims turn out to be true, expect to see independent validations springing up. But it’s not easy, so expect this interesting situation to continue to unfold over (at least) several months. If I were Steorn and wanted to speed the process up, I would ditch the battery and take steps to make it easier, not harder, to validate the demonstration.

Disclaimer: I am a member of “the 300”, and have access to the SKDB. I haven’t paid anything to nor been paid from Steorn. Information in this posting comes from only public sources. -m

Update: The official video here clearly shows the battery with the markings NiMH and 10000, which is the capacity in mAh. So my capacity estimate above was a little high, though still a good guess if it was a lithium ion cell.

But the Nickel Metal Hydride battery raises other issues: NiMH batteries are very sensitive to overcharging, which would happen if the device is dumping continuous energy into it. At best this means that the battery’s capacity will get quickly diminished, reducing effectiveness as an energy buffer (and at worst, it means boom).

December 11th, 2009

Tinderbox 5 is out

At first glance, this seems to be the Snow Leopard of Tinderbox releases–lots of behind-the-scenes technology updates and largely the same core features. If you’re looking for a way to get more organized, it’s worth a look. Link. -m

December 10th, 2009

500th Post

Celebrating 500 posts since I went to WordPress in May 2006. Prior to that, an additional 730 posts as I floated through a typical evolution of blogging platforms:

  • Easy start: blogger (299 posts in 24 months)
  • Succumbing to the desire to roll your own (259 posts in 12 months)
  • Realizing that rolling your own is too difficult: Pyblosxom (172 posts in 12 months)
  • Moving to a mature platform you don’t need to worry about much: WordPress (500 posts in 42+ months)

-m

November 30th, 2009

The best thing you can do…

The best thing a user can do to advance the Web is to help move people off IE 6

– Ryan Servatius, senior product manager for Internet Explorer.

Source. -m

November 29th, 2009

The Model Endpoint Template (MET) organizational pattern for XRX apps

One of the lead bullets describing why XForms is cool always mentions that it is based on a Model View Controller framework. When building a full XRX app, though, MVC might not be the best choice to organize things overall. Why not?

Consider a typical XRX app, like MarkLogic Application Builder. (You can download your own copy of MarkLogic, including Application Builder, under the community license at the developer site.) For each page, the cycle goes like this:

  1. The browser requests a particular page, say the one that lets you configure sorting options in the app you’re building
  2. The page loads, including client-side XForms via JavaScript
  3. XForms requests the project state as XML from a designated endpoint; this becomes the XForms Instance Data
  4. Stuff happens on the page that changes the client-side state
  5. Just before leaving the page, XML representing the updated state is HTTP PUT back to the endpoint

The benefit of this approach is that you are dealing with XML all the way through, no impedance mismatches like you might find on an app that awkwardly transitions from (say) relational data to Java objects to urlencoded name/value pairs embedded in HTML syntax.

So why not do this in straight MVC? Honestly, MVC isn’t a bad choice, but it can get unwieldy. If an endpoint consists of separate model+view+controller files, and each individual page consists of separate model+view+controller files, it adds up to a lot of stuff to keep track of. In truly huge apps, this much attention to organization might be worth it, but most apps aren’t that big. Thus the MET pattern.

Model: It still makes sense to keep the code that deals with particular models (closely aligned with Schemas) as a separate thing. All of Application Builder, for example, has only one model.

Endpoint: The job of an endpoint is to GET and PUT (and possibly POST and DELETE) XML, or other equivalent resource bundles depending on how many media types you want to deal with. It combines an aspect of controllers by being activated by a particular URL and views by providing the data in a consistent format.
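
As a rough illustration of that controller-plus-view blend, here is a toy endpoint in Python. In a real XRX app this would be XQuery running on the server; the class name, the in-memory state, and the sample project XML are all invented for this sketch.

```python
import xml.etree.ElementTree as ET

class Endpoint:
    """Toy endpoint: dispatches on the HTTP method (the controller part)
    and always speaks XML in one consistent shape (the view part)."""

    def __init__(self, initial_xml):
        self._state = ET.fromstring(initial_xml)

    def handle(self, method, body=None):
        """Return a (status, body) pair for a request."""
        if method == "GET":
            return 200, ET.tostring(self._state, encoding="unicode")
        if method == "PUT":
            if body is None:
                return 400, "<error>missing body</error>"
            try:
                self._state = ET.fromstring(body)
            except ET.ParseError:
                return 400, "<error>body is not well-formed XML</error>"
            return 204, ""
        return 405, "<error>method not allowed</error>"

# One pass through the page cycle: fetch state, change it, send it back.
endpoint = Endpoint("<project><sort>date</sort></project>")
status, xml_text = endpoint.handle("GET")
instance = ET.fromstring(xml_text)       # stands in for the XForms instance data
instance.find("sort").text = "title"     # "stuff happens on the page"
endpoint.handle("PUT", ET.tostring(instance, encoding="unicode"))
```

Everything the endpoint does fits in one short file, which is the point: there is not enough going on to justify splitting it three ways.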

Template: Since XForms documents already contain MVC mechanics, it’s not a high-payoff situation to further use MVC to construct the XForms and XHTML wrapper themselves. The important stuff happens within XForms, and then you need a way to provide, for example, consistent headers, footers, and other pieces across multiple pages. For this, an ordinary templating mechanism suffices. I can imagine dynamic assembly scenarios where this wouldn’t be the case, but again, many apps don’t need this kind of flexibility, and the complexity that comes along with it.

What about separation of concerns? Oh yeah, what about it? :-) Technically both Endpoints and Templates violate classical SOC. In an XRX app, this typically doesn’t lead to the kinds of spaghetti situations that it might otherwise. Endpoints are self contained, and can focus on doing just one thing well; with limited scope comes limited ability to get into trouble. For those times when you need to dig into the XQuery code of an endpoint, it’s actually helpful to see both the controller and view pieces laid out in one file.

As for Templates, simplicity wins. With the specifics of models and endpoints peeled away, the remaining challenge in developing individual pages is getting the XForms right, and again, it’s helpful to minimize the number of files one XForms page is split across. YAGNI applies to what’s left, at least in the stuff I’ve built.

So, I’ve been careful in the title to call this an “organizational pattern”, not a “design pattern” or an (ugh) “architectural pattern”. Nothing too profound here. I’d be happy to start seeing XRX apps laid out with directory names like “models”, “endpoints”, and “templates”.

What do you think? Comments welcome.

-m

November 27th, 2009

Gandhi

Richard Attenborough’s epic biopic is available to watch instantly on Netflix, but only until November 30. Recommended viewing for the weekend. -m

November 22nd, 2009

How Xanadu Works: technical overview

One particular conversation I’ve overheard several times, often in the context of web and standards development, has always intrigued me. It goes something like this:

You know, Ted Nelson’s hypertext system from the 60’s had unbreakable, two-way links. It was elegant. But then came along Tim Berners-Lee and HTML, with its crappy, one-way, breakable links, and it took over the world.

The general moral of the story is usually about avoiding over-thinking problems and striving for simplicity. This has been rolling around in the back of my mind ever since the first time I heard the story. Is it an accurate assessment of reality? And how exactly did Nelson’s system, called Xanadu (R), manage the trick of unbreakable super-links? Even if the web ended up going in a different direction, there still might be lessons to learn for the current generation of people building things that run (and run on) the web.

Nelson’s book Literary Machines describes the system in some detail, but it’s hard to come by in the usual channels like Amazon, or even local bookstores. One place does have it, and for a reasonable price too: Eastgate Systems. [Disclosure: I bought mine from there for full price. I'm not getting anything for writing this post on my blog.] The book has a versioning notation, with 93.1 being the most recent, describing the “1993 design” of the software.

Pause for a moment and think about the history here. 1993 is 16 years ago as I write this, about the same span of time as between Vannevar Bush’s groundbreaking 1945 article As We May Think (reprinted in full in Literary Machines) and Nelson’s initial work in 1960 on what would become the Xanadu project. As far as software projects go, this one has some serious history.

So how does it work? The basic concepts, in no particular order, are:

  • A heavier-weight publishing process: Other than inaccessible “privashed” (as opposed to “pub”lished) documents, once published, documents are forever, and can’t be deleted except in extraordinary circumstances and with some kind of waiting period.
  • All documents have a specific owner, are royalty-bearing, and work through a micropayment system. Anyone can quote, transclude, or modify any amount of anything, with the payments sorting themselves out accordingly.
  • Software called a “front end” (today we’d call it a “browser”) works on behalf of the user to navigate the network and render documents.
  • Published documents can be updated at will, in which case unchanged pieces can remain unchanged, with inserted and deleted sections in between. Thus, across the history of a document, there are implicit links forward and backward in time through all the various editions and alternatives.
  • In general, links can jump to a new location in the docuverse or transclude part of a remote document into another, and many more configurations, including multi-ended links, and are granular to the character level, as well as attached to particular characters.
  • Document and network addressing are accomplished through a clever numbering system (somewhat reminiscent of organic versioning, but in a way infinitely extensible on multiple axes). These addresses, called tumblers, represent a Node+User+Document+Subdocument, and a minor variant to the syntax can express ranges between two points therein.
  • The system uses its own protocol called FEBE (Front End Back End), which contains several verbs, including (on page 4/61): RETRIEVEV (like HTTP GET), DELETEVSPAN, MAKELINK, FINDNUMOFLINKSTOTHREE, FINDLINKSFROMTOTHREE, and FINDDOCSCONTAINING. [Note that "three" in this context is an unusual notation for a link type.] Maybe 10 more verbs are defined in total.
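
To give a feel for the tumbler addressing described in the list above, here is a simplified Python sketch. It assumes a convention not spelled out in this post: that a tumbler is a dotted sequence of integers in which a zero acts as the divider between the four fields. The sample address is made up, and real tumblers may address fewer than four fields, which this sketch does not handle.

```python
from typing import NamedTuple, Tuple

class Tumbler(NamedTuple):
    node: Tuple[int, ...]
    user: Tuple[int, ...]
    document: Tuple[int, ...]
    subdocument: Tuple[int, ...]

def parse_tumbler(text):
    """Split a dotted address into its four fields at the zero dividers.

    Assumption for this sketch: a 0 digit separates fields, and all four
    fields are present and non-empty.
    """
    fields, current = [], []
    for part in text.split("."):
        digit = int(part)
        if digit == 0:
            fields.append(tuple(current))
            current = []
        else:
            current.append(digit)
    fields.append(tuple(current))
    if len(fields) != 4 or any(not field for field in fields):
        raise ValueError(f"expected four non-empty fields: {text!r}")
    return Tumbler(*fields)

# Hypothetical address: node 1.1, user 3, document 7.2, subdocument 1.42
address = parse_tumbler("1.1.0.3.0.7.2.0.1.42")
```

Because each field is itself a sequence, any of them can grow another level (a sub-node, a sub-account, a new version) without renumbering anything that already exists.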

A few common themes emerge. One is the grandiose scope: This really is intended as a system to encompass all of literature past, present, and future, and to thereby create a culture of intellect and reshape civilization. “We think that anyone who actually understands the problems will recognize ours approach as the unique solution.” (italics from original, 1993 preface)

Another theme is simple solutions to incredibly difficult problems. So the basic solution to unbreakable links is to never change documents. Sometimes these solutions work brilliantly, sometimes they fall short, and many times they end up somewhere in between. In terms of sheer vision, nobody else has come close to inspiring as many people working on the web. Descriptions of what today we’d call a browser would sound familiar, if a bit abstract, even to casual users of Firefox or IE.

Nothing like REST seems to have occurred to Nelson or his associates. It’s unclear how widely deployed Xanadu prototypes ever were, or how many nodes were ever online at any point. The set of verbs in the FEBE protocol reads like what a competent engineer would come up with. The benefits of REST, in particular of minimizing verbs and maximizing nouns, are non-obvious without a significant amount of web-scale experience.

Likewise Creative Commons seems like something the designers never contemplated.  “Ancient documents, no longer having a current owner, are considered to be owned by the system–or preferably by some high-minded literary body that oversees their royalties.” (page 2/29) While this sounds eerily like the Google Books settlement, this misses the implications of truly free-as-in-beer content, but equally misses the power of free-as-in-freedom documents. In terms of social impact there’s a huge difference between something that costs $0 and $0.000001.

In this system anyone can include any amount of any published document into their own without special permission. In a world where people writing Harry Potter Lexicons are getting sued by the copyright industry, it’s hard to imagine this coming to pass without kicking and screaming, but it is a nice world to think about. Anyway, in Xanadu per-byte royalties work themselves out according to the proportion of original vs. transcluded bytes.
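
As a back-of-the-envelope illustration of how that proportional split might work, here is a small Python sketch. Everything in it is my assumption rather than anything specified in the book: the function name split_royalty, the flat per-byte rate, and the use of whole integer units to dodge floating-point rounding.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def split_royalty(spans: List[Tuple[str, int]], rate_per_byte: int) -> Dict[str, int]:
    """Divide the royalty for one delivered document among its owners.

    spans lists (owner, byte_count) pairs in document order. An owner
    may appear more than once if their text is transcluded in pieces.
    rate_per_byte is a hypothetical flat rate in some tiny integer unit.
    """
    totals: Dict[str, int] = defaultdict(int)
    for owner, byte_count in spans:
        if byte_count < 0:
            raise ValueError("byte counts cannot be negative")
        totals[owner] += byte_count * rate_per_byte
    return dict(totals)


# A 1000-byte document: 800 bytes original to alice, 200 transcluded from bob.
document = [("alice", 500), ("bob", 200), ("alice", 300)]
print(split_royalty(document, rate_per_byte=2))
```

The reader pays the same amount either way; the only question the system has to answer is whose bytes were delivered.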

Where is Google in this picture? “Two system directories, maintained by the system itself, are anticipated: author and title, no more” (page 2/49). For additional directories or search engines, it’s not clear how that would work: is a search results page a published or privashed document? Does every possible older version of every result page stick around in the system? (If not, links to and from it might break.) It’s part of a bigger question about how to represent and handle dynamic documents in the system.

On privacy: “The network will not, may not monitor what is written in private documents.” (page 2/59) A whole section in chapter 3 deals with these kinds of issues, as does Computer Lib, another of Nelson’s works.

He was early to recognize the framing problem: how in a tangle of interlinked documents, to make sense of what’s there, to discern between useful and extraneous chunks. Nelson admits to no general solution, but points at some promising directions, one of which is link typing–the more information there is on individual links, the more handles there are to make sense of the tangle. Some tentative link types include title, author, supersession, correction, comment, counterpart, translation, heading, paragraph, quote, footnote, jump-link, modal jump-link, suggested threading, expansion, citation, alternative version, comment, certification, and mail.

At several points, Nelson mentions algorithmic work that makes the system possible. Page 1/36 states “Our enfilade data structures and methods effectively refute Donald Knuth’s list of desirable features that he says you can’t have all at once (in his book Fundamental Algorithms: Sorting and Searching)”. I’m curious if anyone knows more about this, or if Knuth ever got to know enough details to verify that claim, or revise his.

So was the opening anecdote a valid description of reality? I have to say no, it’s not that simple. Nelson rightly calls the web a shallow imitation of his grand ideas, but those ideas are–in some ways literally–from a different world. It’s not a question of “if only things had unfolded a bit differently…”. To put it even more strongly, a system with that kind of scope cannot be designed all at once; in order to be embraced by the real world, it has to be developed with a feedback loop to the real world. This in no way diminishes the value and influence of big ideas or the place that Roarkian stick-to-your-gunnedness has in our world, industry, and society. We may have gotten ourselves into a mess with the architecture of the present web, but even so, Nelson’s vision will keep us aspiring toward something better.

I intend to return to this posting and update it for accuracy as my understanding improves. Some additional topics to maybe address are: a more detailed linking example (page 2/45), comparing XLink to Xanadu, comparing URIs and tumblers, and mentioning the bizarre (and yet oddly familiar if you’ve ever been inside a FedEx Kinkos) notion of “SilverStands”.

For more on Nelson, there is the epic writeup in Wired. YouTube has some good stuff too.

Comments are welcome. -m

Xanadu is a registered trademark, here used for specific identifying purpose.

November 18th, 2009

MarkLogic and XSLT

MarkLogic fans should check out Norm Walsh’s posting about his talk at the NY User Group. If you follow the right Twitter feeds, this is probably not too much of a surprise, but now the cat is officially disjoint with the volume inside the bag. Disclaimer: be sure to read the disclaimer there. -m

November 14th, 2009

Geek Thoughts: if this keeps going

If Moore’s law applies to flash (and flash-like) memory storage, and it certainly seems like it does, in another decade we will all be carrying around a terabyte on our phones.

What happens then?

More collected Geek Thoughts at http://geekthoughts.info.

November 8th, 2009

High Temperature Superconductors

If this site is accurate, it’s now possible to have superconducting material at household freezer temperatures: 254 K, or a tiny bit below 0 °F. From power lines to maglevs to supercolliders to energy storage, the potential applications boggle the mind. -m

Note: I’m having trouble finding independent verification of this, other than what appears to be re-hashes of the superconductor.org article. If you have any additional proof or refutation, please post it in the comments.

November 5th, 2009

Metadata FTW

Link credit goes to Joho.

This looks pretty significant. The AZ Supreme Court ruled that document metadata must be disclosed under existing public records law. This may start a chain reaction with other states following suit. With the movement toward open data including data.gov and the Federal Register, this fits in well. Quite often metadata including creation date and author and the like make for much better searching and faceting. -m

November 4th, 2009

Geek Thoughts: unlikely tail

Tractors are to dogs as rocking chairs are to cats.

More collected Geek Thoughts at http://geekthoughts.info.