Archive for the 'software' Category

Thursday, May 29th, 2008

XRX

Bumped into XRX today. XForms + REST + XQuery. I like the sound of this, and XForms on the client just got a whole bunch easier…

I’m seeing multiple signs that the confluence of XForms and XQuery has legs. (And REST just plain makes sense in any situation.) -m

Wednesday, May 28th, 2008

XForms Validator on Google App Engine?

I registered ‘xfv’ on Google App Engine. Too bad there don’t appear to be any significant XML libraries supported. I have XPath covered by my pure-Python WebPath, but what about Relax NG? Anyone know of anything in pure Python? -m

Wednesday, April 30th, 2008

Quote of the day

“Rails is a lot of fun, and lets me do cool new things – but it’s hard to eat it.”

Simon St. Laurent

-m

Monday, April 28th, 2008

SearchMonkey in private beta

I haven’t mentioned it yet, but SearchMonkey (now an official name, not just a project name) is in external limited beta. Keep an eye on ysearchblog; lots more technical content is on the way. -m

Monday, March 3rd, 2008

WebPath and Wikipedia

The WebPath bug reports continue to roll in. For one, queries against *.wikipedia.* don’t seem to work. You get something back, but it has no resemblance to the page you were looking for. The problem comes from the W3C tidy service that I use, specifically that the (understandably overworked and understaffed) admins at the Wikimedia Foundation seem to have blocked it. It seems like more than a simple IP or user-agent-based block. I’ve emailed them about it but haven’t heard back yet.

So, this highlights the limitation of having a single-source converter in the Platonic Web module of WebPath. So I turn to my readers: do you know of any other tidy servers? Or converters of a non-tidy origin? For any of these to work, they need to return clean XML corresponding to the original page (as opposed to, say, returning something with big headers/footers or ampersand-encoded). This seems like an outstanding need for the open source community.

Please comment below with ideas. Thanks! -m

UPDATE: heard back from the Wikipedia admins, and although professional and helpful-as-can-be-expected, they won’t be changing anything on their end. Still looking for more open source options.

Wednesday, February 13th, 2008

WebPath on next.yahoo

It’s been an exhausting past couple of weeks, but life goes on. WebPath made front page at next.yahoo. I’m starting to get feedback from developers who are actually using it, filing bugs, suggesting features, and it’s gratifying. The community is still building up. Won’t you join too? -m

Thursday, January 24th, 2008

WebPath wants to be free (BSD licensed, specifically)

WebPath, my experimental XPath 2.0 engine in Python, is now an open source project with a liberal BSD license. I originally developed this during a Yahoo! Hack Day, and now I get to announce it during another Hack Day. Seems appropriate.

The focus of WebPath was rapid development and providing an experimental platform. There remains a ton of work left to do on it…watch this space for continued discussion. I’d like to call out special thanks to the Yahoo! management for supporting me on this, and to Douglas Crockford for turning me on to Top Down Operator Precedence parsers. Have a look at the code. You might be pleasantly surprised at how small and simple a basic XPath 2 engine can be. So, who’s up for some XPath hacking?

Code download. (Coming to SourceForge with CVS, etc., in however many days it takes them to approve a new project) I hope this inspires more developers to work on similar projects, or better yet, on this one! -m

Monday, January 7th, 2008

Yahoo! introduces mobile XForms

Admittedly, their marketing folks wouldn’t describe it that way, but essentially that’s what was announced today. (documentation in PDF format, closely related to what-used-to-be Konfabulator tech; here’s the interesting part in HTML) The press release talks about reaching “billions” of mobile consumers; even if you don’t put too much emphasis on press releases (you shouldn’t), it’s still talking about serious use of and commitment to XForms technology.

Shameless plug: Isn’t it time to refresh your memory, or even find out for the first time about XForms? There is this excellent book available in printed format from Amazon, as well as online for free under an open content license. If you guys express enough interest, good things might even happen, like a refresh to the content. Let’s make it happen.

From a consumer standpoint, this feels like a welcome play against Android, too. Yahoo! looks like it’s placing a bet on working with more devices while making development easier at the same time. I’ll bet an Android port will be available, at least in beta, before the end of the year.

Disclaimer: I have been out of Yahoo! mobile for several months now, and can’t claim any credit for or inside knowledge of these developments. -m

P. S. Don’t forget the book.

Monday, December 24th, 2007

OLPC is here

I’m taking some time off from work to relax a bit. And just in time for that, my OLPC arrived. Check out the photoset on Flickr. It’s an impressive little machine, and I’m very happy to have got this instead of a Kindle. :)

-m

Friday, December 21st, 2007

XML 2007 buzz: XForms 1.1

One whole evening of the program was devoted to XForms, focused around the new 1.1 Candidate Recommendation. I admit that some of the early 1.1 drafts gave me pause, but these guys did a good job cleaning up some of the dim corners and adding the right features in the right places. This is worth a careful look. -m

Sunday, December 16th, 2007

Slides from XML 2007: WebPath: Querying the Web as XML

Here are the slides from my presentation at XML 2007, dealing with an implementation of XPath 2.0 in Python. I hope to have even more news in this area soon.

WebPath (html)

WebPath (OpenDocument, 4.7 megs)

Did you notice that OpenOffice has a nice slide export that generates both graphically accurate slides and highly indexable and accessible text versions? -m

Thursday, November 29th, 2007

XPath 2.0 implementation details

Well, my plans for a series of postings about details of implementing XPath 2.0 fell rather short, so let’s skip straight to the good stuff.

Here’s an article by Mike Kay giving the details of the Saxon architecture. On the surface it’s about performance, but it also has an excellent section on internals. Worth a look. This has been quite influential for me, and maybe it will be for you too. -m

Monday, November 5th, 2007

A better name for CURIEs (?)

“Compact Clark Notation”. (Inspired by reading this) -m

Thursday, October 25th, 2007

Will Leopard run OK on an 800 MHz G4?

I see from the system requirements that Leopard requires an “867” MHz processor. Is this a hard limit, or just an advisory? My first Mac–the one I wrote the book on–is a lowly 800 MHz box. Is it worth trying to upgrade it? -m

Saturday, October 20th, 2007

Building a tokenizer for XPath or XQuery

In researching for an XPath 2.0 implementation, I ran across this curious document from the W3C. Despite being labeled a Working Draft (as opposed to a Note), it appears to be a one-shot document with no future hope for updates or enhancements.

It outlines several options for the first stage or two of an XPath 2.0 or XQuery implementation. (Despite the title, it covers more than just a tokenizer: it also discusses a parser and a possible intermediate stage.) Tokenizing and parsing XPath are significantly more difficult than for most other languages, because keywords aren’t reserved and the same characters can be names or operators depending on context, so things like this are perfectly legitimate (if useless):

if(if) then then else else- +-++-**-* instance
of element(*)* * * **---++div- div -div
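
To make the ambiguity concrete, here is a minimal Python sketch of context-dependent tokenizing. It is not taken from the W3C document or from any particular engine. It borrows the disambiguation rule from the XPath 1.0 spec (a `*` or an operator name like `div` is an operator only when the preceding token could end an operand) and applies it to a deliberately tiny token set. The names `classify`, `_TOKEN` and friends are made up for illustration, and hyphens in names (part of what makes the example above so nasty) are left out entirely.

```python
import re

# Hypothetical, heavily simplified token pattern: '::', single-character
# punctuation and operators, '*', or a name without hyphens or dots.
# Characters that match nothing (such as whitespace) are silently skipped.
_TOKEN = re.compile(r"::|[@()\[\],/|+=-]|\*|[A-Za-z_][A-Za-z0-9_]*")

_OPERATOR_NAMES = {"and", "or", "mod", "div"}
_SYMBOL_OPERATORS = {"/", "|", "+", "-", "="}
# Tokens after which an operand (not an operator) must follow.
_OPERAND_EXPECTED_AFTER = {"@", "::", "(", "[", ","}


def classify(expr):
    """Return (text, kind) pairs, deciding '*' and 'div' by context."""
    tokens = []
    prev_kind, prev_text = None, None
    for match in _TOKEN.finditer(expr):
        text = match.group()
        # An operand is expected at the start, after an operator,
        # or after one of the opening punctuation tokens.
        operand_expected = (
            prev_kind is None
            or prev_kind == "operator"
            or prev_text in _OPERAND_EXPECTED_AFTER
        )
        if text == "*":
            kind = "name-test" if operand_expected else "operator"
        elif text[0].isalpha() or text[0] == "_":
            if not operand_expected and text in _OPERATOR_NAMES:
                kind = "operator"
            else:
                kind = "name"
        elif text in _SYMBOL_OPERATORS:
            kind = "operator"
        else:
            kind = "punct"
        tokens.append((text, kind))
        prev_kind, prev_text = kind, text
    return tokens


star_kinds = [kind for _, kind in classify("* * *")]
div_kinds = [kind for _, kind in classify("div div div")]
```

With this rule, `* * *` reads as a wildcard multiplied by a wildcard, and `div div div` as an element named div divided by another element named div. A real XPath 2.0 tokenizer has far more cases to handle (keywords like `if` and `instance of`, comments, hyphenated names), which is where the lexical states in the W3C document come in.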

The document tries to standardize on some terminology for various approaches toward dealing with XPath. The remaining bulk of the document sketches out some lexical states that would be useful for one particular implementation approach. I guess the vibrant, thriving throngs of XPath 2.0 developers didn’t see the need for this kind of assistance.

In short, I didn’t find it terribly useful. Maybe some readers have, though. Feel free to comment below. Subsequent articles here will describe how I approached the problem. Stay sharp! -m

Monday, October 15th, 2007

XForms evening at XML 2007

Depending on who’s asking and who’s answering, W3C technologies take 5 to 10 years to get a strong foothold. Well, we’re now in the home stretch for the 5th anniversary of XForms Essentials, which was published in 2003. In past conferences, XForms coverage has been maybe a low-key tutorial, a few day sessions, and hallway conversation. I’m pleased to see it reach new heights this year.

XForms evening is on Monday December 3 at the XML 2007 conference, and runs from 7:30 until 9:00 plus however long ERH takes on his keynote. :) The scheduled talks are shorter and punchier, and feature a lot of familiar faces, and a few new ones (at least to me). I’m looking forward to it–see you there! -m

Thursday, October 11th, 2007

Hacking Facebook

I didn’t get to do much for Yahoo Hack Day, but I did get to help a coworker a teeny bit with an implementation of Y! Search for social web sites, including Facebook. There could be some interesting repercussions from that, so I won’t say more now. But what did surprise me is how many Yahoos are active on Facebook.

Myself–I’m still a Facebook curmudgeon. But mostly I simply haven’t had the time to check it out, or figure out the value proposition of accepting an invitation. -m

Monday, October 8th, 2007

XML 2007 Schedule

As widely reported by now, the final schedule for XML 2007 this December in Boston is up. All I have to add is the suggestion of careful attention to the Tuesday program at 4:00. :) If you can’t wait, some technical details are forthcoming in this space. That is all. -m

Monday, October 1st, 2007

simple parsing of space-separated attributes in XPath/XSLT

It’s a common need to parse space-separated attribute values from XPath/XSLT 1.0, usually @class or @rel. One common (but incorrect) technique is a simple equality test, as in {@class="vcard"}. This is wrong, since the attribute can contain the token you want alongside other values, like "foo vcard" or "vcard foo" or " foo vcard bar ", and the equality test misses all of those.

The proper way is to look at individual tokens in the attribute value. On first glance, this might require a call to EXSLT or some complex tokenization routine, but there’s a simpler way. I first discovered this on the microformats wiki, and only cleaned up the technique a tiny bit.

The solution involves three XPath 1.0 functions: contains() to test for a substring, concat() to join together string fragments, and normalize-space() to strip off leading and trailing spaces and convert any other sequences of whitespace into a single space.

In English, you

  • normalize the class attribute value, then
  • concatenate spaces front and back, then
  • test whether the resulting string contains your searched-for value with spaces concatenated front and back (e.g. " vcard ")

Or {contains(concat(' ',normalize-space(@class),' '),' vcard ')}. A moment’s thought shows that this works well on all the different examples shown above, and is perhaps even less involved than resorting to extension functions that return nodes that require further processing/looping. It would be interesting to compare performance as well…
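
If it helps to see the same logic outside of XPath, here is a small Python sketch that mirrors the three steps on plain strings. The function name `has_token` is made up for illustration, and Python’s `split()` treats a few more characters as whitespace than normalize-space() does, so take it as an approximation of the XPath expression rather than an exact equivalent.

```python
def has_token(attr_value, token):
    """Mirror contains(concat(' ', normalize-space(@attr), ' '), ' token ')."""
    # Roughly normalize-space(): trim the ends and collapse whitespace runs.
    normalized = " ".join(attr_value.split())
    # Pad both strings with spaces so only whole tokens can match.
    return f" {token} " in f" {normalized} "


examples = ["vcard", "foo vcard", "vcard foo", " foo vcard bar "]
matches = [has_token(value, "vcard") for value in examples]
near_miss = has_token("vcards", "vcard")
```

All four examples from the post match, while a value like "vcards" does not, because the padded search string needs a space on both sides of the token.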

So next time you need to match class or rel values, give it a shot. Let me know how it works for you, or if you have any further improvements. -m

Friday, September 21st, 2007

Come see me at XML 2007

Watch this space for details. I’ll be speaking about something related to Python and XPath 2.0. Watch this blog for tidbits on the subject. :) -m

Saturday, September 8th, 2007

Steven Pemberton and Michael(tm) Smith on (X)HTML, XForms, mobile, etc.

Video from XTech, worth a look. -m

Wednesday, August 8th, 2007

New W3C Validator

Go check it out. It even has a Tidy option to clean up the markup. But they missed an important feature: it should include an option to run Tidy on the markup first and then validate. This is becoming the de facto bar for web page validity anyway… -m

Monday, July 23rd, 2007

Prototypical inheritance in Python

Based on Doug Crockford’s chapter in Beautiful Code, I wanted to take a crack at implementing Top Down Operator Precedence in Python. After all, Python and JavaScript are quite similar, right?

Not really. As you can imagine, Doug’s code makes great use of JavaScript’s strengths, in this case the ability to assign new methods to any object. For an initial version, I wanted to make the Python version behave the same way, as opposed to a deeper redesign that would be more pythonic. (That would come later.)

My initial approach was a __getattr__ method that consisted simply of return getattr(self.prototype, name). When reattaching a new method to an instance, I needed an extra wrapper, done through a wrap method which consisted of return new.instancemethod(method, self, self.__class__). It would be used like this: obj.method = obj.wrap(some_func).

This caused a subtle problem that took me a while to track down. In JavaScript, any function can reference the built-in this variable, which works whether the function is bound to some specific object or not. (Even global functions are bound to the global object.) But Python doesn’t have such a keyword. The language prefers the explicit, and uses an explicitly passed parameter, called by convention self. The call to wrap a specific function also had the effect of binding the self parameter to that particular object, even if it later became a prototype for some other object. This manifested itself as all kinds of broken behavior. For example, the original code has a global scope object, and every time a new scope was entered, the global pointed to a newer object that kept a reference to the rest of the scope chain. But in the object’s methods, self pointed to something different than the global. Messy.
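
The binding problem can be reproduced in a few lines. This is a minimal sketch, not the original code: the class name `Proto` and the helper `who_am_i` are made up, and since the `new` module no longer exists in Python 3, `types.MethodType(func, self)` stands in for `new.instancemethod(method, self, self.__class__)`.

```python
import types


class Proto:
    """Hypothetical object that delegates missing attributes to a prototype."""

    def __init__(self, prototype=None):
        self.prototype = prototype

    def __getattr__(self, name):
        # Only called when normal lookup fails, so delegate to the prototype.
        prototype = self.__dict__.get("prototype")
        if prototype is None:
            raise AttributeError(name)
        return getattr(prototype, name)

    def wrap(self, func):
        # Python 3 stand-in for new.instancemethod(func, self, self.__class__).
        # This binds self to *this* object, permanently.
        return types.MethodType(func, self)


def who_am_i(self):
    return self.label


parent = Proto()
parent.label = "parent"
parent.who_am_i = parent.wrap(who_am_i)

child = Proto(prototype=parent)
child.label = "child"

# Lookup falls through to parent, whose method is already bound to parent.
result = child.who_am_i()
```

Here `result` is "parent", because the method found through the prototype chain still carries the `self` it was wrapped with. The JavaScript equivalent would report "child", since `this` is decided at call time.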

Before I get into solutions, I’d like to see what readers say. How would you go about implementing prototypical inheritance in Python? And what is a more pythonic way to accomplish the same thing? Comment below. Thanks! -m

Monday, July 16th, 2007

Beautiful Code

If it’s been quiet on this front it’s because I’ve been engrossed in my continuing education. Andy Oram sent me a copy of Beautiful Code, a thoroughly enjoyable work from O’Reilly. If you like stretching your brain by reading code-intense essays from top-tier coders, I recommend this volume. In particular, I’ve been digging into Douglas Crockford’s Top Down Operator Precedence chapter.

Other than that, some interesting BJCP classes, but I’m keeping that non-tech stuff over on meadblog. -m

Thursday, May 3rd, 2007

The billion-dollar sand trap?

I thought this article was interesting in overall tone and a specific quote:

Modifying the software for each phone’s display is a matter of brute-force labor. There’s no intellectual way around it. Yahoo! is one of the few companies that’s been able to pull this off, but only because they have an army of Ph.D. hackers working for them.

Thanks! The primary design for the content adaptor was done by one non-Ph.D.–me–with plenty of help from the resident “phone whisperer” and a talented team of fellow non-Ph.Ds. It’s not a matter of “brute force” at all. The only way to solve the problem with finite resources is to understand developers, understand the problem space, and be smart about drawing a line between the two (and being flexible enough to handle the inevitable unknown).

One thing is certain: the industry is changing fast. A mobile app working great today will look dodgy in a year, and be obsolete in two years. It’s not clear if this will stabilize at some point, or keep shifting.

But I’m curious about what the rest of you think. Is mobile the next big thing, or a huge sand trap? Comment below. -m

Friday, February 16th, 2007

J2ME Disappointment

Sun Java Wireless Toolkit 2.5 is out of beta. Can anyone explain to me the logic of making a Java toolkit that’s Windows-only? Sheesh. -m

Tuesday, February 13th, 2007

Windows Live Search for Mobile

Spotted under the headline Windows Live Search for Mobile Goes Final, Still Great (like they were expecting it to suddenly plummet in quality?) on Gizmodo. It’s a 114k jar file that runs on my SLVR, where Yahoo! Go isn’t available yet, so points for that. Search suggestions show as you type, hugely useful with klunky 9-key entry. They use an interesting UI to hold search results, densely packed (6 down the screen) with a status bar on top, and each search result marquee-scrolling back and forth as needed. A detail page can zap you into map mode or set up a call.

My standard test search–a little offbeat but still plausible–for mead near Sunnyvale produced disappointing results. The meadery within walking distance didn’t show, and of the top 6, two were duplicates. Scrolling down to the 10th result, though, did show an interesting, useful result, albeit 60.15 miles away: Knowne World Meads. I wanted to visit the web site, but here lies another problem: there’s no web integration. None of the search results include a URL or clickable link.

For all the hassle, I’ll stick with Opera Mini and my favorite search engine, thank you. -m

Tuesday, February 13th, 2007

changes the architecture of the house, not just the color of the paint

ERH’s comments on XForms, as part of his predictions for 2007. Worth a read. -m

Thursday, February 8th, 2007

The internet is a series of pipes

Check it out. -m

Tuesday, January 30th, 2007

Yahoo! Keitai

A few more tidbits on the Softbank Mobile turnaround, for which I helped architect the mobile platform.

SoftBank phones have a “Y!”-button which links to Yahoo!-keitai. Yahoo-Keitai! offers a list of official sites, new services (e.g. a new communicator service), and also access to free mobile internet sites through the YAHOO directory, as well as access to YAHOO services, such as YAHOO-auctions.

-m
