Archive for the 'announcement' Category

Tuesday, May 11th, 2010

XProc is ready

Brief note: The W3C XProc specification, edited by my partner-in-crime Norm Walsh, has advanced to Recommendation status. Now go use it. -m

Thursday, April 29th, 2010


The new MarkLogic developer site is up, cleaner, better organized, and more social. Even cooler, it’s an XSLT-heavy application running on a pre-release version of MarkLogic. The new blog gives some of the details of the new site and transition.

So, if you’re already a MarkLogic developer, this is a great resource. And if you’re not, the site itself shows how fast and simple it is to put together an XSLT- and XQuery-powered app. -m

Monday, February 22nd, 2010

Mark Logic User Conference 2010

Are you coming? Link. It starts on May 4 (Star Wars day!) at the InterContinental Hotel in San Francisco. Guest speakers include Chris Anderson, Editor-in-Chief of Wired and Michelle Manafy, Editor-in-Chief of EContent magazine.

Early bird registration ends Feb 28. -m

Friday, January 15th, 2010

Economic indicators: recruiting picking up again

I got a personal email pitch from recruiters at both Facebook and Google, oddly enough both messages within a 3-minute window on a Monday morning. Hiring is on the uptick again, it seems. My team is still looking for the right front end engineer–someone who knows the JavaScript language in depth, how to use semantic HTML and CSS, AND all about browser quirks. Email me. -m

Sunday, January 3rd, 2010

Geek Thoughts: the ultimate real-time strategy game

Games like Farmville and the iPhone knock-off iFarm throw in a unique twist in the realm of strategy gaming: crops that get planted mature in “real time”. If a crop takes 24 hours to grow, then you need to literally wait the full 24 hours. Great for making an app “sticky” and getting users to repeatedly log in. Side fact: Farmville sells more virtual tractors in a day than real tractors are sold in the US in a year.

Game producers keep upping the ante in terms of real-time strategy games interacting with the real world. Take the latest for instance, a free iPhone app called Lose It!. Everything in this game runs in real-time–a game day is always a full 24 hours. Instead of conventional points, it uses “calories”, which are gained by the actual foods you physically eat, and subtracted via actual exercise. The app includes a massive database of food items and exercises to help you keep an accurate record, apparently on the honor system. The goal: to set a calorie target for each day and come in under it. A secondary scoring system is based on your own weight, though you will need an accurate scale (not included with the app) to measure it.

So far I’ve done pretty well at the game. I’ve averaged better than 1000 calories under my goal for the last several weeks, and have done well on the weight number too. And it’s pretty interesting to have a log of everything I’ve eaten. What will they think of next?

More collected Geek Thoughts at

Friday, December 11th, 2009

Tinderbox 5 is out

At first glance, this seems to be the Snow Leopard of Tinderbox releases–lots of behind-the-scenes technology updates and largely the same core features. If you’re looking for a way to get more organized, it’s worth a look. Link. -m

Thursday, December 10th, 2009

500th Post

Celebrating 500 posts since I went to WordPress in May 2006. Prior to that, an additional 730 posts as I floated through a typical evolution of blogging platforms:

  • Easy start: blogger (299 posts in 24 months)
  • Succumbing to the desire to roll your own (259 posts in 12 months)
  • Realizing that rolling your own is too difficult: Pyblosxom (172 posts in 12 months)
  • Moving to a mature platform you don’t need to worry about much: WordPress (500 posts in 42+ months)


Friday, November 27th, 2009


Richard Attenborough’s epic biopic is available to watch instantly on Netflix, but only until November 30. Recommended viewing for the weekend. -m

Sunday, November 8th, 2009

High Temperature Superconductors

If this site is accurate, it’s now possible to have superconducting material at household freezer temperatures: 254 K, or a tiny bit below 0°F. From power lines to maglevs to supercolliders to energy storage, the potential applications boggle the mind. -m

Note: I’m having trouble finding independent verification of this, other than what appears to be re-hashes of the article. If you have any additional proof or refutation, please post it in the comments.

Thursday, November 5th, 2009

Metadata FTW

Link credit goes to Joho.

This looks pretty significant. The AZ Supreme Court ruled that document metadata must be disclosed under existing public records law. This may start a chain reaction with other states following suit. With the movement toward open data including and the Federal Register, this fits in well. Quite often metadata including creation date and author and the like make for much better searching and faceting. -m

Wednesday, October 21st, 2009

Application Builder behind-the-scenes

I’ll be speaking next Tuesday (Oct 27) at the Northern Virginia MarkLogic User Group (NOVAMUG). Here’s what I’ll be talking about.

Application Builder consists of two main parts: Search API to enable Google-style search string processing, and the actual UI wizard that steps users through building a complete search app. It uses a number of technologies that have not (at least not up until now!) been widely associated with MarkLogic. Why some technologies that seem like a perfect fit for XML apps are less used in the Mark Logic ecosystem is anyone’s guess, but one thing App Builder can contribute to the environment is some fresh DNA. Maybe your apps can benefit from these as well.

XForms and XRX. Clicking through the screens of App Builder is really a fancy way of editing XML. Upon first arriving on a page, the client makes a GET request to an “Application XML Endpoint” (axe.xqy) to get the current state of the project, which is rendered in the user interface. Interacting with the page edits the in-memory XML. Afterwards, the updated state is PUT back to the same endpoint upon clicking ‘Save’ or prior to navigating away. This is a classic XRX architecture. MarkLogic ships with a copy of the XSLTForms engine, which makes use of client-side XSLT to transform XForms markup into divs, spans, classes, and JavaScript that can be processed entirely in the browser. Thus XForms works on all supported browsers all the way back to IE6. The apps built by the App Builder don’t use any XForms (yet!), but as App Builder itself demonstrates, XForms is a great platform for application development.
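The GET, edit, PUT round trip described above can be sketched in a few lines. This is only an illustration in Python, not App Builder's actual code: the FakeEndpoint class is a made-up in-memory stand-in for axe.xqy, and the project and title element names are invented for the example.

```python
import xml.etree.ElementTree as ET


class FakeEndpoint:
    """Hypothetical in-memory stand-in for a REST endpoint holding one XML document."""

    def __init__(self, initial_xml: str) -> None:
        self._stored = initial_xml

    def get(self) -> str:
        # GET: return the current project state as serialized XML.
        return self._stored

    def put(self, body: str) -> None:
        # PUT: replace the stored state with the client's edited copy.
        self._stored = body


def edit_title(endpoint: FakeEndpoint, new_title: str) -> None:
    # 1. Fetch the current state and parse it into an in-memory tree.
    doc = ET.fromstring(endpoint.get())
    # 2. "Interacting with the page" amounts to editing that tree.
    doc.find("title").text = new_title
    # 3. On save, serialize the whole tree and PUT it back.
    endpoint.put(ET.tostring(doc, encoding="unicode"))


endpoint = FakeEndpoint("<project><title>Untitled</title></project>")
edit_title(endpoint, "My search app")
print(endpoint.get())
```

The point of the pattern is that the same XML document is the wire format, the client-side model, and the stored state, so there is no mapping layer in between.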

To be honest, many XForms apps have fallen short in the polished-UI department. Not so with App Builder, IMHO. An early, and in hindsight somewhat misdirected, thrust of XForms advocacy pushed the angle of building apps with zero script needed. But one advantage of using a JavaScript implementation of XForms is that it frees you to use script as needed. So in many places, large amounts of UI, all mapped to XML, can be hidden away with CSS and selectively revealed (or mapped to yet other HTML form controls) in small, self-contained overlays triggered via script. While it doesn’t fulfill the unrealistic promise of completely eliminating script, it’s a useful technique, one I predict we’ll see more of in the future.

Relax NG. XML Schema has its roots deep in the XML infrastructure. The type system of XQuery and XSLT 2.0 is based on it. Even XForms has ties to it. But for all its wide reach, XML Schema 1.0 has some maddening limitations, and “takes some getting used to” before one can sight-read it. By contrast, the appendices of many recent W3C specifications use the highly readable Relax NG compact syntax to describe content models in a way that is equally human- and machine-readable.

What are these limitations I speak of? XML Schema 1.1 goes a long way toward resolving these, but isn’t yet widely in use. Take this example, the content model of the <options> element from Search API:

start = Options | Response

# Root element
OptionsType = (
 AdditionalQuery? &
 Annotation* &
 ConcurrencyLevel? &
 Constraint* &
 Debug? &
 DefaultSuggestionSource? &
 Forest* &
 Grammar? &
 Operator* &
 PageLength? &
 QualityWeight? &
 ReturnConstraints? &
 ReturnFacets? &
 ReturnMetrics? &
 ReturnQtext? &
 ReturnQuery? &
 ReturnResults? &
 ReturnSimilar? &
 SearchOption* &
 SearchableExpression? &
 SortOrder* &
 SuggestionSource* &
 Term? &

The start line indicates that, within this namespace, there are two possible root elements, either <options> or <response> (not shown here). An instance with a root of, say, search:annotation is by definition not valid. Try representing that in XML Schema.

The definition of OptionsType allows a wide variety of child elements, some zeroOrMore times, others optional (zero or one occurrence), with no ordering restrictions at all between anything. XML Schema can’t represent this either. James Clark’s trang tool converts Relax NG into XML Schema, and has to approximate this as an xsd:choice with maxOccurs=”unbounded”, thus the elements that can only occur once are not schema-enforced. Thus the Relax NG description of the content model, besides being more readable, actually contains more information than the closest XML Schema. So particularly for XML vocabularies that are optimized for human use, Relax NG is a good choice for schema development.
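To make the lost information concrete, here is a rough Python sketch that compares the two rules on the same document. It is not how Relax NG or trang work internally. The element names (constraint, debug, grammar, operator) are assumptions loosely modeled on the Constraint*, Debug?, Grammar?, and Operator* patterns in the listing, and namespaces are ignored for brevity.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Maximum occurrences per child element; None means "zero or more".
# Hypothetical subset modeled on Constraint*, Debug?, Grammar?, Operator*.
MAX_OCCURS = {"constraint": None, "debug": 1, "grammar": 1, "operator": None}


def interleave_errors(options: ET.Element) -> list[str]:
    """Accept children in any order, but enforce each child's own limit."""
    errors = []
    counts = Counter(child.tag for child in options)
    for tag, seen in counts.items():
        if tag not in MAX_OCCURS:
            errors.append(f"unexpected element <{tag}>")
        elif MAX_OCCURS[tag] is not None and seen > MAX_OCCURS[tag]:
            errors.append(
                f"<{tag}> occurs {seen} times, at most {MAX_OCCURS[tag]} allowed"
            )
    return errors


def choice_unbounded_errors(options: ET.Element) -> list[str]:
    """The repeated-choice approximation: only element names are checked."""
    return [
        f"unexpected element <{child.tag}>"
        for child in options
        if child.tag not in MAX_OCCURS
    ]


doc = ET.fromstring("<options><debug/><constraint/><debug/></options>")
print(interleave_errors(doc))        # flags the repeated <debug>
print(choice_unbounded_errors(doc))  # accepts the same document
```

The second function lets the duplicate through, which is exactly the gap left by an unbounded choice.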

Out of line validation. So if XML Schema doesn’t fully describe the <options> node, how can authors be sure they have constructed one correctly? Take a step back: even if XML Schema could fully represent the content model, for performance reasons you wouldn’t want to repeatedly validate the node on every query. The options node tends to change infrequently, mainly during a development cycle. Both of these problems can be solved with out-of-line validation: a separate function call search:check-options().

Inside this function you’ll find a validate expression that will make as much use of the Schema as it can, but also much more. The full power of XQuery can be leveraged against the proposed <options> node to check for errors or inconsistencies, and provide helpful feedback to the developer. Since it happens out-of-line, these checks can take substantially longer than actually handling the query based on them. The code can go as in-depth as it needs to without performance worries. This is a useful technique in many situations. One potential shortfall is that people might forget to call your validation function, but in practice this hasn’t been too much trouble.
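The shape of the technique, independent of XQuery, looks roughly like the following Python sketch. To be clear, this is not search:check-options(): the option names (page_length, debug) and both functions are hypothetical, chosen only to show a slow, thorough checker living apart from a fast path that trusts its input.

```python
def check_options(options: dict) -> list[str]:
    """Slow, thorough checks meant to be run during development only."""
    problems = []
    page_length = options.get("page_length", 10)
    if not isinstance(page_length, int) or page_length < 1:
        problems.append("page_length must be a positive integer")
    known = {"page_length", "debug"}  # hypothetical set of recognized options
    for key in sorted(set(options) - known):
        problems.append(f"unknown option: {key}")
    return problems


def run_search(query: str, options: dict) -> str:
    """Hot path: trusts the options and does no validation of its own."""
    return f"searching for {query!r} with page_length={options.get('page_length', 10)}"


# During development: check once, read the feedback, fix the options.
for problem in check_options({"page_length": 0, "pagelenth": 25}):
    print("options problem:", problem)

# In production: just run the query.
print(run_search("iPod", {"page_length": 20}))
```

Returning a list of messages rather than raising on the first error lets the developer see every problem in one pass.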

Higher-order functions. The predecessor to Search API had a problem: it was so popular that users would modify it to suit their unique requirements, which led to dozens of minor variations floating around in the wild. Different users have different needs and expectations for the library, and making a one-size-fits-all solution is perhaps not possible. One way to relieve this kind of feature pressure is to provide enough extension hotspots to allow all the kinds of tweaks that users will want, preserving the mainline code. This involves prediction, which is difficult (especially about the future). But a good design makes this possible.

Look inside the built app and you will find a number of function pointers, implemented as a new datatype xdmp:function. XQuery 1.1 will have a more robust mechanism for this, but it might be a while before this is widespread. By modifying one file, changing a pointer to different code, nearly every aspect of the application can be adjusted.

Similarly, a few hotspots in the Search API can be customized, to hook in different kinds of parsers or snippet-generators. This powerful technique can take your own apps to the next level.
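The hotspot idea translates to any language with first-class functions. Here is a minimal Python sketch of it, not the generated app's code: the HOOKS table, the snippet functions, and render_result are all names invented for the illustration.

```python
from typing import Callable


def default_snippet(text: str, term: str) -> str:
    """Mainline behavior: up to 20 characters of context around the match."""
    pos = text.lower().find(term.lower())
    if pos < 0:
        return text[:40]
    return text[max(0, pos - 20): pos + len(term) + 20]


def shouting_snippet(text: str, term: str) -> str:
    """A user-supplied replacement that upper-cases the default snippet."""
    return default_snippet(text, term).upper()


# The one "config file": a table of function references the app calls through.
HOOKS: dict[str, Callable[[str, str], str]] = {"snippet": default_snippet}


def render_result(text: str, term: str) -> str:
    # Mainline code never names a concrete implementation, only the hook.
    return HOOKS["snippet"](text, term)


print(render_result("XProc is ready", "xproc"))
HOOKS["snippet"] = shouting_snippet  # customize without touching render_result
print(render_result("XProc is ready", "xproc"))
```

Swapping one entry in the table changes behavior everywhere the hook is called, while the mainline code stays identical across installations.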


Wednesday, October 21st, 2009

XForms 1.1 is out

XForms 1.1 is now a full W3C Recommendation. Compared to version 1.0, which went live a bit more than 6 years ago, version 1.1 offers lots of road-tested tools that make development easier and more powerful, including new datatypes and XPath functions, a significantly more powerful submission subsystem, and a more flexible event model.

And XSLTForms already supports almost all of the new goodies. -m

Wednesday, October 14th, 2009

Phyllis Dubinko 1926-2009

In loving memory.

Monday, October 12th, 2009

Speaking at Northern Virginia Mark Logic User Group Oct 27

Come learn more about Mark Logic and get a behind-the-scenes look at the new Application Builder. I’ll be speaking at the NOVA MUG (Northern Virginia Mark Logic User Group) on October 27. This turns out to be pretty close to the big Semantic Web conference, so I’ll stick my head in there too. Stop by and look me up!

Details at the developer site.


Wednesday, October 7th, 2009

US Federal Register in XML

Fed Thread is a front end for the newly XMLified Federal Register. Why is this a big deal? It’s a daily publication of the goings-on of the US government. It’s a primary source for all kinds of things that normally only get rehashed through news organizations. And it is bulky–nobody can read through it on a regular basis. A yearly subscription (printed) would cost nearly $1000 and fill over 80,000 pages.

Having it in XML enables all kinds of searching, syndication, and annotation via flexible front ends like this one. Yay for transparency. -m

Tuesday, September 22nd, 2009

XForms Developer Zone

Another XForms site launched this week. This one seems pretty close to what I would like XForms Institute to become, if I had an extra 10 hours per week. -m

Wednesday, September 16th, 2009

Billion triples challenge

I had been asking around earlier for large RDF datasets. Here’s one. Looks like a great contest to build an app around this, but unfortunately, the deadline looks like it’s soonish (1 Oct).

What is it?

The major part of the dataset was crawled during February/March 2009 based on datasets provided by Falcon-S, Sindice, Swoogle, SWSE, and Watson using the MultiCrawler/SWSE framework. To ensure wide coverage, we also included a (bounded) breadth-first crawl of depth 50 starting from

The downloaded content was parsed using the Redland toolkit with rdfxml, rss-tag-soup, rdfa parsers. We rewrote blank node identifiers to include the data source in order to provide unique blank nodes for each data source, and appended the data source to the output file. The data is encoded in NQuads format and split into chunks of 10m statements each.
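As a rough idea of what the blank-node rewriting and chunking steps involve, here is a small Python sketch. It is my own guess at the approach, not the challenge organizers' code: the naming scheme for rewritten labels and the simplified term regex are assumptions, and a real N-Quads parser handles more cases than this.

```python
import re
from itertools import islice

# One N-Quads term: an IRI, a blank node, or a literal with optional
# datatype or language tag. Simplified; blank-node labels are assumed
# to be plain word characters.
TERM = re.compile(r'<[^>]*>|_:\w+|"(?:[^"\\]|\\.)*"(?:\^\^<[^>]*>|@[A-Za-z-]+)?')


def scope_blank_nodes(line: str, source_id: str) -> str:
    """Prefix every blank-node label on the line with a per-source id."""
    def fix(match: re.Match) -> str:
        term = match.group(0)
        # Only whole blank-node terms are rewritten; text inside a
        # literal is matched as part of the literal and left alone.
        return f"_:{source_id}_{term[2:]}" if term.startswith("_:") else term
    return TERM.sub(fix, line)


def chunks(statements, size):
    """Yield lists of at most `size` statements each."""
    it = iter(statements)
    while batch := list(islice(it, size)):
        yield batch


line = '_:b1 <http://example.org/p> "see _:b1" <http://example.org/g> .'
print(scope_blank_nodes(line, "src1"))
print(list(chunks(range(5), 2)))
```

Without the rewrite, two sources that both happen to use the label b1 would be merged into one node when the files are combined.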

The page includes some fairly detailed statistics on the data breakdown. Cool. -m

Thursday, August 27th, 2009

Steorn and the three-body problem

As part of the 300 program, Steorn recently released specific details about their technology, which was pretty much the whole point of the 300. The general reaction has been vaguely positive and appreciative (like this posting), though there is a huge self-selection bias in play.

Their key operating principle is clever and unlike anything I’ve seen in my armchair studies of supposed magnetic motion machines. But it’s complicated, in a way that is like the EM equivalent of the three-body problem. In other words, their description is neither obviously wrong nor right. Any time you have moving magnetic fields and pulsating electromagnet currents, hard-to-predict interactions tend to happen. There’s also a host of measurement difficulties, including properly accounting for power factors and complex number phasors for power input/output in inductive circuits.

There’s still a vast disconnect between the jury announcement and failed public demonstration and everything else still going on. It’s fascinating to watch. :-)


Friday, August 7th, 2009

Balisage bound

I’m heading off to beautiful downtown Montréal this weekend for Balisage, my first appearance at this particular conference. If you’re heading there too, look me up. -m

Monday, August 3rd, 2009

Geek Thoughts: I hate cars

I hate moving at high speed with multiple large chunks of metal in close formation.

I hate the sound of traffic. The smell.

I hate it when people jump in a car to drive somewhere a block away.

I hate driving. I hate parking. I hate SUVs.

Also, getting a root canal leaves me in a foul mood.

More collected Geek Thoughts at

Friday, July 24th, 2009

Java-style namespaces for markup

I’m noodling around with requirements and exploring existing work toward a solution for “decentralized extensibility” on xml-dev, particularly for HTML. The notion of “Java-style” syntax, with reverse DNS names and all, has come up many times in the context of these kinds of discussions, but AFAICT has never been fully fleshed out. This is ongoing, slowly, in available time–which has been a post or two per week. (In case there is any doubt, this is a spare-time effort not connected with my employer.)

Check it out and add your knowledge to the thread. -m

Tuesday, July 7th, 2009

Demo Jam at Balisage 2009

Come join me at the Demo Jam at Balisage this year. August 11 at 6:30 pm. There will be lots of cool demos, judged by audience participation. I’d love to see you there. -m

Saturday, June 27th, 2009

Steorn: the jury has spoken

The Steorn 300 program is underway, and yes, I am one of the 300 looking at their information which is coming out in once-a-week bursts in the form of educational modules. So far, nothing interesting. Some basic physics lessons, and somewhat more interesting forum activity.

But all signs seem to be pointing in the wrong direction for a miraculous breakthrough. A jury of members selected by Steorn recently unanimously stated:

The unanimous verdict of the Jury is that Steorn’s attempts to demonstrate the claim have not shown the production of energy. The jury is therefore ceasing work.

Even this announcement raises more questions, and at this point in the game, more questions is not a good thing. Of the 22 original jury members, apparently only 16 were left at the end. Those 16 were unanimous, but what did the other 6 think? Were they booted as dissenters? Also allegedly the jury was never presented with actual hardware, which seems completely crazy and counterproductive from the standpoint of the company that convened the jury.

This kind of story has unfolded many times before, and it doesn’t end well. I’ve spent many hours debunking energy claims and perpetual motion devices. But hey, the company says they are proceeding with plans to commercialize the technology by the end of 2009. No matter what happens, it will be interesting to watch. -m

Previously: How Orbo works, When the experimenter wants to believe, and The downside of free energy.

Thursday, June 25th, 2009

MarkLogic Server 4.1, App Services released

I’m thrilled to announce that MarkLogic 4.1, and with it my project App Services, is here. Top-of-the-post props go out to Colleen, David, and Ryan, who made it happen.

You might already know that MarkLogic Server is a super-powerful database slash search engine powering projects like MarkMail. (But did you know there’s a free-as-in-beer edition?) The next step is to make it easier to use and build your own apps on top of the server.

The first big piece is the Search API, which lets you do “Google-style” searches over your content like this:

search:search("MP3 OR iPod AND color:black -Zune")

The built-in grammar includes AND, OR, parens for grouping, - for negation, quotations for phrases, and easy ways to define facets like date:today or author:"Bill Shakespeare" or GPA:3.95. By passing in additional options, you can redefine the grammar and control all aspects of the search and how the results are returned. Numerous grass-roots efforts at doing something like this had begun to spring up, so the time was right to come out with an officially-sanctioned API. For those developers who haven’t seen the light yet and don’t fancy XQuery, an API like this is a huge benefit.
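For a feel of what processing such a string involves, here is a toy tokenizer in Python. It is emphatically not the Search API's grammar or implementation: the token categories and the regex are my own simplification, and it only classifies tokens without building a query tree or applying operator precedence.

```python
import re

TOKEN = re.compile(r'''
    (?P<neg>-)?                                   # optional leading minus = negation
    (?:(?P<name>[A-Za-z]\w*):)?                   # optional constraint name, e.g. color:
    (?:"(?P<phrase>[^"]*)"|(?P<word>[^\s()"]+))   # quoted phrase or bare word
    | (?P<paren>[()])                             # grouping parenthesis
''', re.VERBOSE)


def tokenize(query: str) -> list[tuple]:
    """Return (kind, constraint_name, text, negated) tuples for a query string."""
    tokens = []
    for m in TOKEN.finditer(query):
        if m.group("paren"):
            tokens.append(("paren", None, m.group("paren"), False))
            continue
        quoted = m.group("phrase") is not None
        text = m.group("phrase") if quoted else m.group("word")
        name = m.group("name")
        negated = m.group("neg") is not None
        if not quoted and name is None and not negated and text in ("AND", "OR"):
            kind = "operator"
        elif name is not None:
            kind = "constraint"
        elif quoted:
            kind = "phrase"
        else:
            kind = "term"
        tokens.append((kind, name, text, negated))
    return tokens


for token in tokenize('MP3 OR iPod AND color:black -Zune'):
    print(token)
```

A real implementation would go on to parse these tokens into a tree and map each constraint name to an index or facet definition supplied in the options.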

The next piece builds on the Search API to offer a graphical App Builder tool that produces a simplified MarkMail-type app around your content. It looks like this:

App Builder screen shot, Search page

The App Builder itself is based on XForms via the excellent XSLTForms library and REST, making it a full-blown XRX application.

Lots more info, videos, screencasts, articles, and more are coming soon.

You can start playing with this now by visiting the download page. Under the Community License, you can put 10 gigs of content into it for noncommercial production free-as-in-beer.

Enjoy! I’ll be catching my breath for the next two months*. -m

* Not really

Tuesday, June 23rd, 2009

The Science of a Good Beer

When I get time, I want to watch all of this program on from Dave McLean in SF who talks about how to make beer, why it tastes like it does, and why some people prefer various styles of beer.

It’s a good follow-up to the NHC reception I made it to last week, with a 3 course dinner (each made with and served with a different beer), a lecture by the highly entertaining Brewing Scientist Charlie Bamforth, and a tasting panel of 20 different additives as palate training.

Even if you’re busy, take some time to appreciate the things you might otherwise enjoy without thinking. -m

Sunday, June 14th, 2009

Selling my house

I’m sticking around Sunnyvale, but am selling my house. It’s a smaller “starter home” place good for a small family. It’s close to Yahoo!, Google, Ebay, Cisco, and lots of other South Bay companies. It’s in a great neighborhood with lots of parks, restaurants (Giovanni’s Pizza just down the street is fantastic), and a nearby movie theater. If you know anyone moving into the area and looking for a place, here’s a chance to short-circuit a lot of the hassle and get straight into a well-cared-for place from a reputable seller.

I’m hesitant to post my address and pictures of my house, etc. here. Email me if you want to see more. -m

Thursday, June 4th, 2009

Displaced Yahoo Placement Service

I was shocked today to find out that one of my old friends from the Yahoo Search days was let go in the last round. He’s simply brilliant and would have been one of the last people I would have expected that the managers-in-purple could do without.

At the same time, I’m getting hounded by recruiters–five so far just this week.

So let me put these two forces against each other and see if they cancel out. To any former Yahoos: get in touch with me and I’ll do what I can to hook you up with a cool opportunity. This offer is good for June and July–after that I can’t reasonably say I’ll have time for matchmaking. Send me your CV via email and I’ll get started. No promises on results, but I’ll do what I can. :-)


Wednesday, June 3rd, 2009

See you at Balisage

Balisage, formerly Extreme Markup, is the kind of conference I’ve always wanted to attend.

Historically my employers have been not quite involved enough in the deep kinds of topics at this conference (or too cash-strapped, but let’s not go there) to justify spending a week on the road. So I’m glad that’s no longer the case: Mark Logic is sponsoring the conference this year. I’m looking forward to the show, and since I’m not speaking, I might be able to relax a little and soak in some of the knowledge.

See you there! -m

Saturday, May 30th, 2009

XForms Institute moved to SVN

About a week ago I moved XForms Institute over to Subversion. Now the entire site is under version control, with a local copy I can edit. Publishing is as easy as logging in and running the command ‘svn up’. Honestly, I should have done this long ago. And any future sites I work on will use this approach too–it’s fantastic.

If you notice any glitches, let me know. -m

Thursday, May 21st, 2009

One year at Mark Logic

Another anniversary this week, one year at Mark Logic. Much of it in stealth mode, but more details of what I’ve been up to are forthcoming. -m