xpath | MicahLogic Queryopticon

commercialism

Recalibrating expectations of XML performance

By : mdubinko April 2, 2010

0 Comment

Working at MarkLogic has forced me to recalibrate my expectations around XML-related performance issues. Not to brag or anything, but it’s screaming fast. Conventional wisdom of avoiding // in paths doesn’t apply, since that’s the sort of thing the indexes are made to do, and that’s just the start. Single milliseconds are now a noteworthy…

languages

XML 2008 liveblog: Introduction to Schematron

By : mdubinko December 9, 2008

0 Comment

Wendell Piez, Mulberry Technologies. Assertion-based schema language. A way to test XML documents. Rule-based validation language. Cool report generator. Good for capturing edge cases. Same architecture as XSLT. (Schematron specifies, does not perform) <schema xmlns="http://purl.oclc.org/dsdl/schematron"> <title>Check sections 12/07</title> <pattern id="section-check"> <rule context="section"> <assert test="title">This section has no title</assert> <report test="p">This section has paragraphs</report> … Demo….
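To make the assert/report distinction in the excerpt concrete, here is a minimal Python sketch that mimics the one rule shown, using only the standard library. It is not a Schematron processor: the sample document and the `check_sections` helper are assumptions made up for illustration. The point is that an assert fires its message when the test is false, while a report fires when the test is true.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document, not taken from the talk.
SAMPLE = """
<doc>
  <section><title>Intro</title><p>Some text.</p></section>
  <section><p>No title here.</p></section>
  <section><title>Empty</title></section>
</doc>
"""

def check_sections(root):
    """Mimic the section-check pattern: one rule, one assert, one report."""
    messages = []
    # rule context="section": visit every section element in the document
    for index, section in enumerate(root.iter("section"), start=1):
        # assert test="title": the message fires when the test is FALSE
        if section.find("title") is None:
            messages.append((index, "assert", "This section has no title"))
        # report test="p": the message fires when the test is TRUE
        if section.find("p") is not None:
            messages.append((index, "report", "This section has paragraphs"))
    return messages

results = check_sections(ET.fromstring(SAMPLE))
for index, kind, text in results:
    print(f"section {index}: [{kind}] {text}")
```

A real Schematron implementation would evaluate arbitrary XPath tests; the two hard-coded checks here only cover the `title` and `p` tests from the excerpt.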

xml

XML 2008 liveblog: Exploring the New Features of XSLT 2.0

By : mdubinko December 9, 2008

0 Comment

Priscilla Walmsley, Datypic. “I feel like crying every time I have to go back to 1.0.” Normally this is a full-day course. Familiarity with XSLT 1.0 assumed here. Venn diagram… Much of what people think of as “XQuery” is actually XPath 2.0. XPath differences: root node -> “document node”. Namespace nodes and the namespace axis are deprecated. More…

XForms

XiX: Details about XForms in XQuery

By : mdubinko November 4, 2008

2 Comment

I was asked offline for more details about what I have in mind around XiX. Take a simple piece of XML, like this: <root><a>3</a><b>4</b><total/></root>. An XForms Model can be applied, in an out-of-line fashion, to that instance. This is done through a bind element, with XPath to identify the nodes in question, plus other “model…
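As a rough illustration of applying a model to that instance out of line, the Python sketch below keeps the XML untouched and stores the rules separately as (path, calculate) pairs. The `binds` list, `sum_a_b`, and `recalculate` are hypothetical names invented here, and the choice of a sum for `total` is an assumption; in XForms terms it would correspond loosely to a bind such as `<bind nodeset="total" calculate="../a + ../b"/>`.

```python
import xml.etree.ElementTree as ET

# The instance from the post, with no model information inside it.
instance = ET.fromstring("<root><a>3</a><b>4</b><total/></root>")

def sum_a_b(root):
    # Assumed calculation: total is the sum of a and b
    return int(root.findtext("a")) + int(root.findtext("b"))

# Hypothetical out-of-line binds: (path selecting target nodes, calculate function)
binds = [("total", sum_a_b)]

def recalculate(root, binds):
    """Apply each bind's calculation to every node its path selects."""
    for path, calculate in binds:
        for node in root.findall(path):
            node.text = str(calculate(root))
    return root

recalculate(instance, binds)
print(ET.tostring(instance, encoding="unicode"))
```

Keeping the binds in a separate structure is the design point: the same instance document could be paired with different models without being edited.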

python

Top Down Operator Precedence in Python

By : mdubinko July 15, 2008

0 Comment

This article made my day. Very similar approach to what I did in WebPath, but even cleaner. Great explanation and performance numbers. -m P.S. Thanks to Crock for pointing this out.

announcement

XForms Validator on Google App Engine?

By : mdubinko May 28, 2008

0 Comment

I registered ‘xfv’ on Google App Engine. Too bad there don’t appear to be any significant XML libraries supported. I have XPath covered by my pure-Python WebPath, but what about Relax NG? Anyone know of anything in pure Python? -m

xpath

FunctX XQuery library

By : mdubinko May 9, 2008

0 Comment

In the new-to-me department, here’s a library and description of useful XQuery functions from my friend Priscilla Walmsley. XSLT 2, also. -m P.S. Mark my words, more news is coming…

annoyance

WebPath and Wikipedia

By : mdubinko March 3, 2008

2 Comment

The WebPath bug reports continue to roll in. For one, queries against *.wikipedia.* don’t seem to work. You get something back, but it has no resemblance to the page you were looking for. The problem comes from the W3C tidy service that I use, specifically that the (understandably overworked and understaffed) admins at the Wikimedia…

announcement

WebPath on next.yahoo

By : mdubinko February 13, 2008

0 Comment

It’s been an exhausting past couple of weeks, but life goes on. WebPath made front page at next.yahoo. I’m starting to get feedback from developers who are actually using it, filing bugs, suggesting features, and it’s gratifying. The community is still building up. Won’t you join too? -m

announcement

WebPath: Python XPath 2 engine now up on Sourceforge

By : mdubinko January 25, 2008

2 Comment

I’ve taken this opportunity to ditch CVS on all my existing Sourceforge projects (pyxmlwiki, xfv) while setting up my newest project. Here’s the browsable subversion source. Have at it. Where should you start with this code? Step zero, if you haven’t already, is to look through my XML 2007 slides on my site. First thing…

announcement

WebPath wants to be free (BSD licensed, specifically)

By : mdubinko January 24, 2008

2 Comment

WebPath, my experimental XPath 2.0 engine in Python, is now an open source project with a liberal BSD license. I originally developed this during a Yahoo! Hack Day, and now I get to announce it during another Hack Day. Seems appropriate. The focus of WebPath was rapid development and providing an experimental platform. There remains…

annoyance

XPath puzzler: solution

By : mdubinko December 31, 2007

0 Comment

Thanks to all the folks who showed interest in this little XPath puzzler published here a few weeks ago. Some asked to see the dataset, but I’m not able to release it at this time (but ask me again in 3 months). Turns out it was a combination of two bugs, one mine, one somebody…

announcement

Slides from XML 2007: WebPath: Querying the Web as XML

By : mdubinko December 16, 2007

0 Comment

Here are the slides from my presentation at XML 2007, dealing with an implementation of XPath 2.0 in Python. I hope to have even more news in this area soon. WebPath (html) WebPath (OpenDocument, 4.7 megs) Did you notice that OpenOffice has nice slide export, that generates both graphically-accurate slides and highly indexable and accessible text…

everythingismiscellaneous

XPath puzzler

By : mdubinko December 15, 2007

5 Comment

While I’ve got your attention, here’s an XPath (1.0) puzzler. I have an RDFa dataset compiled from various and sundry sources. It’s all wrapped up in a single XML file. I run this XPath to see how many meta elements are present: //meta and it returns a node-set of size 762. Now, I want to…
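For readers who want to reproduce that kind of count outside an XPath engine, here is a small Python sketch using the standard library. The sample data is a hypothetical stand-in, since the real dataset is not given, and `count_elements` is a name invented here. One thing to keep in mind is that ElementTree matches on the full tag name, so an element in a namespace appears as `{namespace-uri}meta` and is not matched by plain `meta`.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in data; the dataset from the post is not available.
SAMPLE = """
<data>
  <head><meta name="a"/><meta name="b"/></head>
  <body><div><meta name="c"/></div></body>
</data>
"""

def count_elements(root, name):
    # Element.iter walks the root and all of its descendants,
    # which covers the same elements as //name for a whole document
    return sum(1 for _ in root.iter(name))

root = ET.fromstring(SAMPLE)
print(count_elements(root, "meta"))
```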

software

XPath 2.0 implementation details

By : mdubinko November 29, 2007

1 Comment

Well, my plans for a series of postings about details of implementing XPath 2.0 fell rather short, so let’s skip straight to the good stuff. An article by Mike Kay giving the details of the Saxon architecture. On the surface it’s about performance, but it also has an excellent section on internals. Worth a look….

software

Building a tokenizer for XPath or XQuery

By : mdubinko October 20, 2007

1 Comment

In researching for an XPath 2.0 implementation, I ran across this curious document from the W3C. Despite being labeled a Working Draft (as opposed to a Note), it appears to be a one-shot document with no future hope for updates or enhancements. In short, it outlines several options for the first stage or two of…

browsers

XForms evening at XML 2007

By : mdubinko October 15, 2007

0 Comment

Depending on who’s asking and who’s answering, W3C technologies take 5 to 10 years to get a strong foothold. Well, we’re now in the home stretch for the 5th anniversary of XForms Essentials, which was published in 2003. In past conferences, XForms coverage has been maybe a low-key tutorial, a few day sessions, and hallway…

announcement

XML 2007 Schedule

By : mdubinko October 8, 2007

0 Comment

As widely reported by now, the final schedule for XML 2007 this December in Boston is up. All I have to add is the suggestion of careful attention to the Tuesday program at 4:00. :) If you can’t wait, some technical details are forthcoming in this space. That is all. -m

browsers

simple parsing of space-separated attributes in XPath/XSLT

By : mdubinko October 1, 2007

1 Comment

It’s a common need to parse space-separated attribute values from XPath/XSLT 1.0, usually @class or @rel. One common (but incorrect) technique is a simple equality test, as in {@class="vcard"}. This is wrong, since the value can still match and still have other literal values, like "foo vcard" or "vcard foo" or " foo vcard bar "….
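The excerpt is cut off before the fix, so what follows is not necessarily the technique the full post recommends. A widely used XPath 1.0 idiom for this problem is `contains(concat(' ', normalize-space(@class), ' '), ' vcard ')`, which pads both the value and the token with spaces so only whole tokens match. The Python sketch below mirrors that logic next to the naive equality test; the function names are made up for illustration.

```python
import re

def normalize_space(value):
    # Same effect as XPath normalize-space(): collapse runs of XML
    # whitespace to one space, then trim the ends
    return re.sub(r"[ \t\r\n]+", " ", value).strip(" ")

def naive_match(value, token):
    # The incorrect technique: plain string equality
    return value == token

def token_match(value, token):
    # Pad both sides with spaces so only whole tokens match
    padded = " " + normalize_space(value) + " "
    return (" " + token + " ") in padded

for value in ["vcard", "foo vcard", "vcard foo", " foo vcard bar ", "vcards"]:
    print(repr(value), naive_match(value, "vcard"), token_match(value, "vcard"))
```

The padding is what keeps a value like "vcards" from matching, which a bare substring test on the raw attribute would get wrong.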

Category: xpath