Working at MarkLogic has forced me to recalibrate my expectations around XML-related performance issues. Not to brag or anything, but it’s screaming fast. Conventional wisdom of avoiding // in paths doesn’t apply, since that’s the sort of thing the indexes are made to do, and that’s just the start. Single milliseconds are now a noteworthy…
Category: xpath
Wendell Piez, Mulberry Technologies. Assertion-based schema language. A way to test XML documents. Rule-based validation language. Cool report generator. Good for capturing edge cases. Same architecture as XSLT. (Schematron specifies, does not perform) <schema xmlns="http://purl.oclc.org/dsdl/schematron"> <title>Check sections 12/07</title> <pattern id="section-check"> <rule context="section"> <assert test="title">This section has no title</assert> <report test="p">This section has paragraphs</report> … Demo….
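To make the assert/report distinction in those notes concrete, here is a minimal Python sketch that mimics the one rule shown above using only the standard library. It is not a Schematron processor: the sample document and the `check_sections` helper are made up for illustration. The one thing it is meant to show is that an `assert` message fires when its test is false, while a `report` message fires when its test is true.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document, invented for this sketch.
DOC = """<doc>
  <section><title>Intro</title><p>Some text.</p></section>
  <section><p>No title here.</p></section>
</doc>"""

def check_sections(root):
    """Mimic a single rule with context="section" (illustration only)."""
    messages = []
    for section in root.iter("section"):       # the rule context
        if section.find("title") is None:      # assert: message when the test fails
            messages.append("This section has no title")
        if section.find("p") is not None:      # report: message when the test succeeds
            messages.append("This section has paragraphs")
    return messages

messages = check_sections(ET.fromstring(DOC))
for message in messages:
    print(message)
```

The second section trips the assert because it lacks a title, and both sections trigger the report because each contains a paragraph.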
Priscilla Walmsley, Datypic. “I feel like crying every time I have to go back to 1.0.” Normally this is a full-day course. Familiarity with XSLT 1.0 assumed here. Venn diagram… Much of what people think of as “XQuery” is actually XPath 2.0. XPath differences: root node -> “document node”. Namespace nodes, axis are deprecated. More…
I was asked offline for more details about what I have in mind around XiX. Take a simple piece of XML, like this: <root><a>3</a><b>4</b><total/></root>. An XForms Model can be applied, in an out-of-line fashion, to that instance. This is done through a bind element, with XPath to identify the nodes in question, plus other “model…
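As a rough illustration of the out-of-line idea, the Python sketch below applies a single bind to the instance shown above. Everything beyond that instance is an assumption on my part: the bind is imagined as something like `<bind nodeset="total" calculate="../a + ../b"/>`, and the `binds` list and `recalculate` helper are hypothetical stand-ins, not part of any XForms implementation.

```python
import xml.etree.ElementTree as ET

# The instance data from the text above.
instance = ET.fromstring("<root><a>3</a><b>4</b><total/></root>")

# Hypothetical binds: (path selecting target nodes, function computing the value).
# This stands in for <bind nodeset="total" calculate="../a + ../b"/>.
binds = [
    ("total", lambda root: float(root.findtext("a")) + float(root.findtext("b"))),
]

def recalculate(root, binds):
    """Apply each bind's calculation to the nodes it selects (sketch only)."""
    for path, calculate in binds:
        for node in root.findall(path):
            # "g" formatting drops the trailing ".0" for whole numbers.
            node.text = format(calculate(root), "g")
    return root

recalculate(instance, binds)
print(ET.tostring(instance, encoding="unicode"))
```

The instance document itself never mentions the calculation. The rule lives in the model and is attached to the data by path, which is what makes the approach out-of-line.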
This article made my day. Very similar approach to what I did in WebPath, but even cleaner. Great explanation and performance numbers. -m P.S. Thanks to Crock for pointing this out.
I registered ‘xfv’ on Google App Engine. Too bad there don’t appear to be any significant XML libraries supported. I have XPath covered by my pure-Python WebPath, but what about Relax NG? Anyone know of anything in pure Python? -m
In the new-to-me department, here’s a library and description of useful XQuery functions from my friend Priscilla Walmsley. XSLT 2, also. -m P.S. Mark my words, more news is coming…
The WebPath bug reports continue to roll in. For one, queries against *.wikipedia.* don’t seem to work. You get something back, but it has no resemblance to the page you were looking for. The problem comes from the W3C tidy service that I use, specifically that the (understandably overworked and understaffed) admins at the Wikimedia…
It’s been an exhausting past couple of weeks, but life goes on. WebPath made front page at next.yahoo. I’m starting to get feedback from developers who are actually using it, filing bugs, suggesting features, and it’s gratifying. The community is still building up. Won’t you join too? -m
I’ve taken this opportunity to ditch CVS on all my existing Sourceforge projects (pyxmlwiki, xfv) while setting up my newest project. Here’s the browsable subversion source. Have at it. Where should you start with this code? Step zero, if you haven’t already, is to look through my XML 2007 slides on my site. First thing…
WebPath, my experimental XPath 2.0 engine in Python, is now an open source project with a liberal BSD license. I originally developed this during a Yahoo! Hack Day, and now I get to announce it during another Hack Day. Seems appropriate. The focus of WebPath was rapid development and providing an experimental platform. There remains…
Thanks to all the folks who showed interest in this little XPath puzzler published here a few weeks ago. Some asked to see the dataset, but I’m not able to release it at this time (but ask me again in 3 months). Turns out it was a combination of two bugs, one mine, one somebody…
Here are the slides from my presentation at XML 2007, dealing with an implementation of XPath 2.0 in Python. I hope to have even more news in this area soon. WebPath (html) WebPath (OpenDocument, 4.7 megs) Did you notice that OpenOffice has a nice slide export that generates both graphically-accurate slides and highly indexable and accessible text…
While I’ve got your attention, here’s an XPath (1.0) puzzler. I have an RDFa dataset compiled from various and sundry sources. It’s all wrapped up in a single XML file. I run this XPath to see how many meta elements are present: //meta and it returns a node-set of size 762. Now, I want to…
Well, my plans for a series of postings about details of implementing XPath 2.0 fell rather short, so let’s skip straight to the good stuff. An article by Mike Kay giving the details of the Saxon architecture. On the surface it’s about performance, but it also has an excellent section on internals. Worth a look….
In researching for an XPath 2.0 implementation, I ran across this curious document from the W3C. Despite being labeled a Working Draft (as opposed to a Note), it appears to be a one-shot document with no future hope for updates or enhancements. In short, it outlines several options for the first stage or two of…
Depending on who’s asking and who’s answering, W3C technologies take 5 to 10 years to get a strong foothold. Well, we’re now in the home stretch for the 5th anniversary of XForms Essentials, which was published in 2003. In past conferences, XForms coverage has been maybe a low-key tutorial, a few day sessions, and hallway…
As widely reported by now, the final schedule for XML 2007 this December in Boston is up. All I have to add is the suggestion of careful attention to the Tuesday program at 4:00. :) If you can’t wait, some technical details are forthcoming in this space. That is all. -m
It’s a common need to parse space-separated attribute values from XPath/XSLT 1.0, usually @class or @rel. One common (but incorrect) technique is a simple equality test, as in {@class="vcard"}. This is wrong, since the attribute can contain the token alongside other values, like “foo vcard” or “vcard foo” or ” foo vcard bar “….
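The excerpt cuts off before the fix, so what follows is not necessarily the approach the post goes on to describe. One widely used XPath 1.0 idiom for this is `contains(concat(' ', normalize-space(@class), ' '), ' vcard ')`. The Python sketch below mirrors that expression so the difference from the plain equality test is easy to see. The helper names `normalize_space` and `has_token` are mine, chosen for illustration.

```python
import re

# XPath's normalize-space() treats space, tab, CR and LF as whitespace.
_XML_WHITESPACE = re.compile(r"[ \t\r\n]+")

def normalize_space(value):
    """Collapse whitespace runs to one space and trim both ends."""
    return _XML_WHITESPACE.sub(" ", value).strip(" ")

def has_token(value, token):
    """Mirror contains(concat(' ', normalize-space(value), ' '), concat(' ', token, ' '))."""
    return f" {token} " in f" {normalize_space(value)} "

samples = ["vcard", "foo vcard", "vcard foo", " foo vcard bar ", "vcardx", "foo"]
naive = [value == "vcard" for value in samples]          # the equality test
robust = [has_token(value, "vcard") for value in samples]

for value, n, r in zip(samples, naive, robust):
    print(repr(value), "equality:", n, "token match:", r)
```

Padding both the value and the token with spaces is what keeps “vcardx” from matching, and normalizing first takes care of stray leading, trailing, or repeated whitespace.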