simple parsing of space-seprated attributes in XPath/XSLT

It’s a common need to parse space-separated attribute values from XPath/XSLT 1.0, usually @class or @rel. One common (but incorrect) technique is simple equality test, as in {@class=”vcard”}. This is wrong, since the value can still match and still have other literal values, like “foo vcard” or “vcard foo” or ” foo vcard bar “.

The proper way is to look at individual tokens in the attribute value. On first glance, this might require a call to EXSLT or some complex tokenization routine, but there’s a simpler way. I first discovered this on the microformats wiki, and only cleaned up the technique a tiny bit.

The solution involves three XPath 1.0 functions, contains(), concat() to join together string fragments, and normalize-space() to strip off leading and trailing spaces and convert any other sequences of whitespace into a single space.

In english, you

  • normalize the class attribute value, then
  • concatenate spaces front and back, then
  • test whether the resulting string contains your searched-for value with spaces concatenated front and back (e.g. ” vcard “

Or {contains(concat(‘ ‘,normalize-space(@class),’ ‘),’ vcard ‘)} A moment’s thought shows that this works well on all the different examples shown above, and is perhaps even less involved than resorting to extension functions that return nodes that require further processing/looping. It would be interesting to compare performance as well…

So next time you need to match class or rel values, give it a shot. Let me know how it works for you, or if you have any further improvements. -m

One Response to “simple parsing of space-seprated attributes in XPath/XSLT”

  1. Anthony B. Coates

    This is the technique that I have been using for quite a while now, and it’s always worked well.
    Cheers, Tony.

MicahLogic is Stephen Fry proof thanks to caching by WP Super Cache