Archive for October, 2009

Saturday, October 24th, 2009

Are Windows 7 reviewers logic challenged?

At the risk of sounding fanboy, are Windows 7 reviewers logic challenged? Not to pick on anyone in particular, but here’s the most recent one I bumped into–I’ve seen similar qualities in other reviews. Under the reasons to get it:

1. Your computer can probably run it. Unlike Vista, which proved a giant slop-feeding resource hog compared to XP, Windows 7’s system requirements haven’t changed much at all since Vista,

So if Vista was a “giant slop-feeding resource hog”, and the Windows 7 requirements haven’t changed much relative to that…how is this a plus again?

2. It costs less than Vista did. Microsoft really seems to have learned its lesson with Vista pricing, which was way too high at first. Although Windows 7 is hardly cheap…

Similar to #1. The argument amounts to ‘it’s not as ridiculous as Vista’. Yay.

3. You’re not stuck with whatever version you choose first. There are a lot of versions of Windows 7 , all with different combinations of features. If you buy Home Premium and decide at some future point that you really need Ultimate—who doesn’t need BitLocker at some point?—you don’t have to drop $319.99 on top of the $199.99 you already spent the first time.

Remember the version chart? If for some reason you choose “Professional” over “Ultimate”, saving a cool $20 at retail price, you can always go back and upgrade for a modest $129.99. Remember, this is from the list of reasons to choose Windows.

5. You don’t have to give up Windows XP. Yes, exiting any long-term relationship can be difficult, but sometimes it has to be done.

A reason to upgrade is that you don’t have to give up the thing you are probably upgrading from?

7. Comedic value. Even if Windows 7 can’t be hailed for anything else, it inspired an enlightening and truly hilarious column from Editor-in-Chief Lance Ulanoff…

Comedic value? Seriously? The comedic value in Windows 7 reviews seems to be entirely unintentional… -m

(Posted from 30k feet. Hooray for Virgin America)

Wednesday, October 21st, 2009

Application Builder behind-the-scenes

I’ll be speaking next Tuesday (Oct 27) at the Northern Virginia MarkLogic User Group (NOVAMUG). Here’s what I’ll be talking about.

Application Builder consists of two main parts: Search API to enable Google-style search string processing, and the actual UI wizard that steps users through building a complete search app. It uses a number of technologies that have not (at least not up until now!) been widely associated with MarkLogic. Why some technologies that seem like a perfect fit for XML apps are less used in the Mark Logic ecosystem is anyone’s guess, but one thing App Builder can contribute to the environment is some fresh DNA. Maybe your apps can benefit from these as well.

XForms and XRX. Clicking through the screens of App Builder is really a fancy way of editing XML. Upon first arriving on a page, the client makes a GET request to an “Application XML Endpoint” (axe.xqy) to get the current state of the project, which is rendered in the user interface. Interacting with the page edits the in-memory XML. Afterwards, the updated state is PUT back to the same endpoint upon clicking ‘Save’ or prior to navigating away. This is a classic XRX architecture. MarkLogic ships with a copy of the XSLTForms engine, which makes use of client-side XSLT to transform XForms Markup into divs, spans, classes, and JavaScript that can be processed entirely in the browser. Thus XForms works on all supported browsers all the way back to IE6. The apps built by the App Builder don’t use any XForms (yet!) but as App Builder itself demonstrates, it is a great platform for application development.
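To make the round trip concrete, here is a minimal sketch in Python of the GET, edit, PUT cycle described above. It is not App Builder's code: `FakeEndpoint`, `edit_session`, and the `<project>` document are hypothetical stand-ins, and the "endpoint" is just an object in memory rather than a real HTTP service.

```python
import xml.etree.ElementTree as ET


class FakeEndpoint:
    """Hypothetical stand-in for a server-side XML endpoint holding one document."""

    def __init__(self, xml_text):
        self._stored = xml_text

    def get(self):
        # Return the current state of the project as XML text.
        return self._stored

    def put(self, xml_text):
        # Parsing first means malformed XML raises before anything is stored.
        ET.fromstring(xml_text)
        self._stored = xml_text


def edit_session(endpoint, field, value):
    # 1. GET the current state when the page loads.
    state = ET.fromstring(endpoint.get())
    # 2. Interacting with the page edits the in-memory XML.
    node = state.find(field)
    if node is None:
        node = ET.SubElement(state, field)
    node.text = value
    # 3. PUT the updated state back to the same endpoint on save.
    endpoint.put(ET.tostring(state, encoding="unicode"))


endpoint = FakeEndpoint("<project><title>Untitled</title></project>")
edit_session(endpoint, "title", "My Search App")
print(endpoint.get())
```

The point of the pattern is that the XML document is the only state that travels: the UI is a view over it, and saving is a single write of the whole thing.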

To be honest, many XForms apps have fallen short in the polished-UI department. Not so with App Builder, IMHO. An early, and in hindsight somewhat misdirected, thrust of XForms advocacy pushed the angle of building apps with zero script needed. But one advantage of using a JavaScript implementation of XForms is that it frees you to use script as needed. So in many places, large amounts of UI, all mapped to XML, can be hidden away with CSS and selectively revealed (or mapped to yet other HTML form controls) in small, self-contained overlays triggered via script. While it doesn’t fulfill the unrealistic promise of completely eliminating script, it’s a useful technique, one I predict we’ll see more of in the future.

Relax NG. XML Schema has its roots deep in the XML infrastructure. The type system of XQuery and XSLT 2.0 is based on it. Even XForms has ties to it. But for all its wide reach, XML Schema 1.0 has some maddening limitations, and “takes some getting used to” before one can sight read it. The appendices of many recent W3C specifications use the highly readable Relax NG compact syntax to describe content models in a way that is equally human- and machine-readable.

What are these limitations I speak of? XML Schema 1.1 goes a long way toward resolving these, but isn’t yet widely in use. Take this example, the content model of the <options> element from Search API:

start = Options | Response

# Root element
OptionsType = (
 AdditionalQuery? &
 Annotation* &
 ConcurrencyLevel? &
 Constraint* &
 Debug? &
 DefaultSuggestionSource? &
 Forest* &
 Grammar? &
 Operator* &
 PageLength? &
 QualityWeight? &
 ReturnConstraints? &
 ReturnFacets? &
 ReturnMetrics? &
 ReturnQtext? &
 ReturnQuery? &
 ReturnResults? &
 ReturnSimilar? &
 SearchOption* &
 SearchableExpression? &
 SortOrder* &
 SuggestionSource* &
 Term? &

The start line indicates that, within this namespace, there are two possible root elements, either <options> or <response> (not shown here). An instance with a root of, say, search:annotation is by definition not valid. Try representing that in XML Schema.

The definition of OptionsType allows a wide variety of child elements, some zero or more times, others optional (zero or one occurrence), with no ordering restrictions at all between anything. XML Schema can’t represent this either. James Clark’s trang tool converts Relax NG into XML Schema, and has to approximate this as an xsd:choice with maxOccurs=”unbounded”, so the elements that can only occur once are not schema-enforced. The Relax NG description of the content model, besides being more readable, therefore contains more information than the closest XML Schema. So particularly for XML vocabularies that are optimized for human use, Relax NG is a good choice for schema development.
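The information lost in that approximation can be shown with a small Python sketch. This is an illustration only, not how trang or any validator is implemented: `MODEL` is a hypothetical, shortened content model, and the element names in it are guesses based on the pattern names in the listing above.

```python
from collections import Counter

# Hypothetical, shortened content model: child name -> maximum occurrences.
# 1 stands for a "?" particle, None for a "*" particle (no upper limit).
MODEL = {
    "annotation": None,
    "constraint": None,
    "debug": 1,
    "page-length": 1,
    "term": 1,
}


def valid_interleave(children):
    """Interleave semantics: any order, but each child's cardinality is enforced."""
    for name, count in Counter(children).items():
        if name not in MODEL:
            return False
        limit = MODEL[name]
        if limit is not None and count > limit:
            return False
    return True


def valid_choice_unbounded(children):
    """The unbounded-choice approximation: any known child, any number of times."""
    return all(name in MODEL for name in children)


# Two page-length children: rejected by the interleave model,
# but accepted by the unbounded-choice approximation.
doc = ["constraint", "page-length", "constraint", "page-length"]
print(valid_interleave(doc), valid_choice_unbounded(doc))
```

Both checks ignore ordering, which is the part the two schema languages agree on here; the difference is only whether the "at most once" limits survive.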

Out of line validation. So if XML Schema doesn’t fully describe the <options> node, how can authors be sure they have constructed one correctly? Take a step back: even if XML Schema could fully represent the content model, for performance reasons you wouldn’t want to repeatedly validate the node on every query. The options node tends to change infrequently, mainly during a development cycle. Both of these problems can be solved with out-of-line validation: a separate function call search:check-options().

Inside this function you’ll find a validate expression that will make as much use of the Schema as it can, but also much more. The full power of XQuery can be leveraged against the proposed <options> node to check for errors or inconsistencies, and provide helpful feedback to the developer. Since it happens out-of-line, these checks can take substantially longer than actually handling the query based on them. The code can go as in-depth as it needs to without performance worries. This is a useful technique in many situations. One potential shortfall is that people might forget to call your validation function, but in practice this hasn’t been too much trouble.
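As a rough illustration of the out-of-line idea, here is a Python sketch of a separate checking function that a developer calls while editing the options, not on every query. It is not the logic of search:check-options(): the function name `check_options`, the two checks, the `name` attribute on `constraint`, and the namespace-free element names are all assumptions made for the example.

```python
import xml.etree.ElementTree as ET


def check_options(options_xml):
    """Return a list of human-readable problems; an empty list means none found."""
    problems = []
    root = ET.fromstring(options_xml)

    # A cardinality rule that an unbounded-choice schema cannot enforce.
    lengths = root.findall("page-length")
    if len(lengths) > 1:
        problems.append("page-length may appear at most once")

    # A value check with a message aimed at the developer.
    for node in lengths:
        text = (node.text or "").strip()
        if not text.isdecimal() or int(text) == 0:
            problems.append(f"page-length must be a positive integer, got {text!r}")

    # A cross-element consistency check (hypothetical rule for this sketch).
    seen = set()
    for constraint in root.findall("constraint"):
        name = constraint.get("name")
        if name in seen:
            problems.append(f"duplicate constraint name {name!r}")
        seen.add(name)

    return problems


bad = (
    "<options><page-length>0</page-length>"
    "<constraint name='a'/><constraint name='a'/></options>"
)
for problem in check_options(bad):
    print(problem)
```

Because nothing on the query path calls this function, the checks can be as slow and as thorough as they need to be.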

Higher-order functions. The predecessor to Search API had a problem: it was so popular that users would modify it to suit their unique requirements, which led to dozens of minor variations floating around in the wild. Different users have different needs and expectations for the library, and making a one-size-fits-all solution is perhaps not possible. One way to relieve this kind of feature pressure is to provide enough extension hotspots to allow all the kinds of tweaks that users will want, preserving the mainline code. This involves prediction, which is difficult (especially about the future). But a good design makes this possible.

Look inside the built app and you will find a number of function pointers, implemented as a new datatype xdmp:function. XQuery 1.1 will have a more robust mechanism for this, but it might be a while before this is widespread. By modifying one file, changing a pointer to different code, nearly every aspect of the application can be adjusted.

Similarly, a few hotspots in the Search API can be customized, to hook in different kinds of parsers or snippet-generators. This powerful technique can take your own apps to the next level.
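The hotspot idea carries over to any language with first-class functions. Below is a minimal Python sketch, not the Search API's actual mechanism: `HOTSPOTS`, `search`, and both snippet functions are hypothetical names, and plain Python callables stand in for xdmp:function values.

```python
def default_snippet(text, term):
    """Return a short window of text around the first hit."""
    i = text.lower().find(term.lower())
    if i < 0:
        return text[:20]
    start = max(0, i - 10)
    return text[start:i + len(term) + 10]


def shouting_snippet(text, term):
    """A user-supplied replacement that reuses the default and changes the output."""
    return default_snippet(text, term).upper()


# The single place a user edits to point the application at different code.
HOTSPOTS = {"snippet": default_snippet}


def search(docs, term, hotspots=None):
    # The mainline code looks up the function to call and never changes itself.
    snippet = (hotspots or HOTSPOTS)["snippet"]
    return [snippet(doc, term) for doc in docs if term.lower() in doc.lower()]


docs = ["XForms 1.1 is now a full W3C Recommendation", "nothing relevant here"]
print(search(docs, "w3c"))
print(search(docs, "w3c", {"snippet": shouting_snippet}))
```

Swapping one entry in the table changes behavior without forking `search`, which is the kind of variation pressure the hotspots are meant to absorb.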


Wednesday, October 21st, 2009

XForms 1.1 is out

XForms 1.1 is now a full W3C Recommendation. Compared to version 1.0, which went live a bit more than 6 years ago, version 1.1 offers lots of road-tested tools that make development easier and more powerful, including new datatypes and XPath functions, a significantly more powerful submission subsystem, and a more flexible event model.

And XSLTForms already supports almost all of the new goodies. -m

Wednesday, October 14th, 2009

Phyllis Dubinko 1926-2009

In loving memory.

Monday, October 12th, 2009

Speaking at Northern Virginia Mark Logic User Group Oct 27

Come learn more about Mark Logic and get a behind-the-scenes look at the new Application Builder. I’ll be speaking at the NOVA MUG (Northern Virginia Mark Logic User Group) on October 27. This turns out to be pretty close to the big Semantic Web conference, so I’ll stick my head in there too. Stop by and look me up!

Details at the developer site.


Sunday, October 11th, 2009

Geek Thoughts: Netflix as a productivity tool

If you live close enough to a Netflix mailing hub, it’s possible to get on the maximal schedule:

  • Enjoy a DVD over the weekend
  • Mail it back on Monday
  • Tuesday, Netflix gets it, ships a new one
  • Which you get (and watch) on Wednesday
  • Return in Thursday mail
  • Friday, Netflix gets it, ships a new one
  • repeat

This can scale up to multiple discs at a time, but at a time management level, it starts to suck. In particular, you get very little done Wednesday evenings. If you miss either mailing deadline, you fall back to 1 DVD for that week.

A better system is to reward yourself with some movie time after meeting a milestone. That way, as long as your task remains uncompleted, you’re racking up a $15 (or whatever) a month penalty for your own sloth. It seems like most people I know who have Netflix subscriptions tend to slip into a slow pattern anyway–in the mailing room I see the same mailer sitting there for weeks at a time–so why not harness human nature for motivation purposes?

More collected Geek Thoughts at

Wednesday, October 7th, 2009

US Federal Register in XML

Fed Thread is a front end for the newly XMLified Federal Register. Why is this a big deal? It’s a daily publication of the goings-on of the US government. It’s a primary source for all kinds of things that normally only get rehashed through news organizations. And it is bulky–nobody can read through it on a regular basis. A yearly subscription (printed) would cost nearly $1000 and fill over 80,000 pages.

Having it in XML enables all kinds of searching, syndication, and annotation via flexible front ends like this one. Yay for transparency. -m

Sunday, October 4th, 2009

Geek Thoughts: editing

First draft: get it on the paper (or screen). No editing. No criticism. Crap is fine, just get it down. Leave markers in trouble spots, but don’t stop.

First revision: Quick pass over everything. Get the obvious flaws fixed. Wordsmithing, checking for horrible words, passive voice, adverbly writing, etc.  Skip over the hard stuff. About half of the markers get fixed here.

Second revision: Careful read over everything. Cross checking notes. About half of the remaining markers get fixed here.

Third revision: No excuses. It ends here, today. No opportunistic skipping around. Once you start fixing the chapter, you finish it.

Final polish: Wordsmithing, checking for horrible words, passive voice, adverbly writing, etc.

More collected Geek Thoughts at