Archive for the 'software' Category

Monday, December 24th, 2007

OLPC is here

I’m taking some time off from work to relax a bit. And just in time for that, my OLPC arrived. Check out the photoset on Flickr. It’s an impressive little machine, and I’m very happy to have got this instead of a Kindle. :)

-m

Friday, December 21st, 2007

XML 2007 buzz: XForms 1.1

One whole evening of the program was devoted to XForms, focused around the new 1.1 Candidate Recommendation. I admit that some of the early 1.1 drafts gave me pause, but these guys did a good job cleaning up some of the dim corners and adding the right features in the right places. This is worth a careful look. -m

Sunday, December 16th, 2007

Slides from XML 2007: WebPath: Querying the Web as XML

Here are the slides from my presentation at XML 2007, dealing with an implementation of XPath 2.0 in Python. I hope to have even more news in this area soon.

WebPath (html)

WebPath (OpenDocument, 4.7 megs)

Did you notice that OpenOffice has a nice slide export that generates both graphically accurate slides and highly indexable, accessible text versions? -m

Thursday, November 29th, 2007

XPath 2.0 implementation details

Well, my plans for a series of postings about details of implementing XPath 2.0 fell rather short, so let’s skip straight to the good stuff.

An article by Mike Kay giving the details of the Saxon architecture. On the surface it’s about performance, but it also has an excellent section on internals. Worth a look. This has been quite influential for me, and maybe it will be for you too. -m

Monday, November 5th, 2007

A better name for CURIEs (?)

“Compact Clark Notation”. (Inspired by reading this) -m

Thursday, October 25th, 2007

Will Leopard run OK on an 800 MHz G4?

I see from the system requirements that Leopard requires an “867” MHz processor. Is this a hard limit, or just an advisory? My first Mac–the one I wrote the book on–is a lowly 800 MHz box. Is it worth trying to upgrade it? -m

Saturday, October 20th, 2007

Building a tokenizer for XPath or XQuery

In researching for an XPath 2.0 implementation, I ran across this curious document from the W3C. Despite being labeled a Working Draft (as opposed to a Note), it appears to be a one-shot document with no future hope for updates or enhancements.

In short, it outlines several options for the first stage or two of an XPath 2.0 or XQuery implementation. (Despite the title, it covers more than just a tokenizer: it also discusses a parser and a possible intermediate stage.) Tokenizing and parsing XPath are significantly more difficult than for other languages, because things like this are perfectly legitimate (if useless):

if(if) then then else else- +-++-**-* instance
of element(*)* * * **---++div- div -div
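To make the difficulty concrete, here is a minimal sketch in Python of the kind of left-context rule a tokenizer needs. It borrows the disambiguation rule from the XPath 1.0 spec (a `*` or a name like `div` is an operator only when the preceding token is not `@`, `::`, `(`, `[`, `,` or another operator) and handles only a tiny subset of the grammar. The names `tokenize`, `TOKEN_RE` and friends are mine, purely for illustration; this is not the approach the W3C document describes.

```python
import re

# Simplified token pattern: names, '::', '//', and single-character symbols.
# Real XPath names may contain '-' and '.', which this sketch ignores.
TOKEN_RE = re.compile(r"\s*([A-Za-z_][A-Za-z0-9_]*|::|//|[*+\-()/@\[\],|=])")

OPERATOR_NAMES = {"and", "or", "mod", "div"}
# After these tokens (or an operator, or at the start) an operand is expected.
OPERAND_OPENERS = {"@", "::", "(", "[", ","}
SYMBOL_OPERATORS = {"+", "-", "/", "//", "|", "="}


def tokenize(expr):
    """Return (kind, text) pairs, deciding each token's kind from its left context."""
    tokens = []
    expr = expr.strip()
    pos = 0
    while pos < len(expr):
        match = TOKEN_RE.match(expr, pos)
        if match is None:
            raise ValueError(f"unexpected input at offset {pos}")
        text = match.group(1)
        pos = match.end()

        prev = tokens[-1] if tokens else None
        expects_operand = (
            prev is None or prev[0] == "operator" or prev[1] in OPERAND_OPENERS
        )

        if text == "*":
            # Same character, two meanings: name test or multiplication.
            kind = "wildcard" if expects_operand else "operator"
        elif text in SYMBOL_OPERATORS:
            kind = "operator"
        elif text[0].isalpha() or text[0] == "_":
            # 'div', 'and', etc. are only operators in operator position.
            is_operator = not expects_operand and text in OPERATOR_NAMES
            kind = "operator" if is_operator else "name"
        else:
            kind = "punct"
        tokens.append((kind, text))
    return tokens
```

With this, `tokenize("div div div")` classifies the outer two tokens as names and the middle one as an operator, which is exactly the sort of thing a context-free lexer can't do.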

The document tries to standardize on some terminology for various approaches toward dealing with XPath. The remaining bulk of the document sketches out some lexical states that would be useful for one particular implementation approach. I guess the vibrant, thriving throngs of XPath 2.0 developers didn’t see the need for this kind of assistance.

In short, I didn’t find it terribly useful. Maybe some readers have, though. Feel free to comment below. Subsequent articles here will describe how I approached the problem. Stay sharp! -m

Monday, October 15th, 2007

XForms evening at XML 2007

Depending on who’s asking and who’s answering, W3C technologies take 5 to 10 years to get a strong foothold. Well, we’re now in the home stretch for the 5th anniversary of XForms Essentials, which was published in 2003. In past conferences, XForms coverage has been maybe a low-key tutorial, a few day sessions, and hallway conversation. I’m pleased to see it reach new heights this year.

XForms evening is on Monday December 3 at the XML 2007 conference, and runs from 7:30 until 9:00 plus however long ERH takes on his keynote. :) The scheduled talks are shorter and punchier, and feature a lot of familiar faces, and a few new ones (at least to me). I’m looking forward to it–see you there! -m

Thursday, October 11th, 2007

Hacking Facebook

I didn’t get to do much for Yahoo Hack Day, but I did get to help a coworker a teeny bit with an implementation of Y! Search for social web sites, including Facebook. There could be some interesting repercussions from that, so I won’t say more now. But what did surprise me is how many Yahoos are active on Facebook.

Myself–I’m still a Facebook curmudgeon. But mostly I simply haven’t had the time to check it out, or figure out the value proposition of accepting an invitation. -m

Monday, October 8th, 2007

XML 2007 Schedule

As widely reported by now, the final schedule for XML 2007 this December in Boston is up. All I have to add is the suggestion of careful attention to the Tuesday program at 4:00. :) If you can’t wait, some technical details are forthcoming in this space. That is all. -m

Monday, October 1st, 2007

simple parsing of space-separated attributes in XPath/XSLT

It’s a common need to parse space-separated attribute values from XPath/XSLT 1.0, usually @class or @rel. One common (but incorrect) technique is a simple equality test, as in {@class="vcard"}. This is wrong, since the attribute can carry the token you want alongside other values, like “foo vcard” or “vcard foo” or “ foo vcard bar ”, and none of those is equal to “vcard”.

The proper way is to look at individual tokens in the attribute value. On first glance, this might require a call to EXSLT or some complex tokenization routine, but there’s a simpler way. I first discovered this on the microformats wiki, and only cleaned up the technique a tiny bit.

The solution involves three XPath 1.0 functions: contains() to test for a substring, concat() to join together string fragments, and normalize-space() to strip off leading and trailing spaces and convert any other sequences of whitespace into a single space.

In English, you

  • normalize the class attribute value, then
  • concatenate spaces front and back, then
  • test whether the resulting string contains your searched-for value with spaces concatenated front and back (e.g. " vcard ")

Or {contains(concat(' ',normalize-space(@class),' '),' vcard ')} A moment’s thought shows that this works well on all the different examples shown above, and is perhaps even less involved than resorting to extension functions that return nodes that require further processing/looping. It would be interesting to compare performance as well…
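If the XPath is hard to read, the same logic can be mirrored in a few lines of plain Python. This is only an illustration of the technique, not something you'd run inside an XSLT processor; the function name `has_token` is made up for the example, and Python's `split()` treats a slightly wider set of characters as whitespace than normalize-space() does.

```python
def has_token(attr_value, token):
    """Mirror of contains(concat(' ', normalize-space(@attr), ' '), ' token ')."""
    # normalize-space(): trim the ends and collapse whitespace runs to one space.
    normalized = " ".join(attr_value.split())
    # Pad both strings with spaces so only whole tokens can match.
    return f" {token} " in f" {normalized} "
```

Padding both sides is what keeps a value like “vcardx” from matching a search for “vcard”.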

So next time you need to match class or rel values, give it a shot. Let me know how it works for you, or if you have any further improvements. -m

Friday, September 21st, 2007

Come see me at XML 2007

Watch this space for details. I’ll be speaking about something related to Python and XPath 2.0. Watch this blog for tidbits on the subject. :) -m

Saturday, September 8th, 2007

Steven Pemberton and Michael(tm) Smith on (X)HTML, XForms, mobile, etc.

Video from XTech, worth a look. -m

Wednesday, August 8th, 2007

New W3C Validator

Go check it out. It even has a Tidy option to clean up the markup. But they missed an important feature: it should include an option to run Tidy on the markup first, then validate. This is becoming the de facto bar for web page validity anyway… -m

Monday, July 23rd, 2007

Prototypical inheritance in Python

Based on Doug Crockford’s chapter in Beautiful Code, I wanted to take a crack at implementing Top Down Operator Precedence in Python. After all, Python and JavaScript are quite similar, right?

Not really. As you can imagine, Doug’s code makes great use of JavaScript’s strengths, in this case the ability to assign new methods to any object. For an initial version, I wanted to make the Python version behave the same way, as opposed to a deeper redesign that would be more pythonic. (That would come later.)

My initial approach was a __getattr__ method that consisted simply of return getattr(self.prototype, name). When reattaching a new method to an instance, I needed an extra wrapper, done through a wrap method which consisted of return new.instancemethod(method, self, self.__class__). It would be used like this: obj.method = obj.wrap(some_func).

This caused a subtle problem that took me a while to track down. In JavaScript, any function can reference the built-in this variable, which works whether the function is bound to some specific object or not. (Even global functions are bound to the global object.) But Python doesn’t have such a keyword. The language prefers the explicit, and uses an explicitly passed parameter, called by convention self. The call to wrap a specific function also had the effect of binding the self parameter to that particular object, even if it later became a prototype for some other object. This manifested itself as all kinds of broken behavior. For example, the original code has a global scope object, and every time a new scope was entered, the global pointed to a newer object that kept a reference to the rest of the scope chain. But in the object’s methods, self pointed to something different than the global. Messy.
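Here is a stripped-down reproduction of the binding problem, as a sketch rather than my actual code. It assumes Python 3, where types.MethodType stands in for the old new.instancemethod; the class name `Proto` and the `whoami` function are invented for the example.

```python
import types


class Proto:
    """Object that falls back to a prototype for attributes it lacks."""

    def __init__(self, prototype=None):
        self.prototype = prototype

    def __getattr__(self, name):
        # Only called when normal lookup fails; delegate up the prototype chain.
        prototype = self.__dict__.get("prototype")
        if prototype is None:
            raise AttributeError(name)
        return getattr(prototype, name)

    def wrap(self, func):
        # Binds func to *this* object, permanently.
        return types.MethodType(func, self)


def whoami(self):
    return self.name


parent = Proto()
parent.name = "parent"
parent.whoami = parent.wrap(whoami)

child = Proto(parent)
child.name = "child"

# The method is found through the prototype, but self is still the parent.
result = child.whoami()
```

In JavaScript the equivalent call would see this as the child; here `result` comes back as the parent's name, because the bound method carries its original self along with it.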

Before I get into solutions, I’d like to see what readers say. How would you go about implementing prototypical inheritance in Python? And what is a more pythonic way to accomplish the same thing? Comment below. Thanks! -m

Monday, July 16th, 2007

Beautiful Code

If it’s been quiet on this front it’s because I’ve been engrossed in my continuing education. Andy Oram sent me a copy of Beautiful Code, a thoroughly enjoyable work from O’Reilly. If you like stretching your brain by reading code-intense essays from top-tier coders, I recommend this volume. In particular, I’ve been digging into Douglas Crockford’s Top Down Operator Precedence chapter.

Other than that, some interesting BJCP classes, but I’m keeping that non-tech stuff over on meadblog. -m

Thursday, May 3rd, 2007

The billion-dollar sand trap?

I thought this article was interesting in overall tone and a specific quote:

Modifying the software for each phone’s display is a matter of brute-force labor. There’s no intellectual way around it. Yahoo! is one of the few companies that’s been able to pull this off, but only because they have an army of Ph.D. hackers working for them.

Thanks! The primary design for the content adaptor was done by one non-Ph.D.–me–with plenty of help from the resident “phone whisperer” and a talented team of fellow non-Ph.D.s. It’s not a matter of “brute force” at all. The only way to solve the problem with finite resources is to understand developers, understand the problem space, and be smart about drawing a line between the two (and being flexible enough to handle the inevitable unknown).

One thing is certain: the industry is changing fast. A mobile app working great today will look dodgy in a year, and be obsolete in two years. It’s not clear if this will stabilize at some point, or keep shifting.

But I’m curious about what the rest of you think. Is mobile the next big thing, or a huge sand trap? Comment below. -m

Friday, February 16th, 2007

J2ME Disappointment

Sun Java Wireless Toolkit 2.5 is out of beta. Can anyone explain to me the logic of making a Java toolkit that’s Windows-only? Sheesh. -m

Tuesday, February 13th, 2007

Windows Live Search for Mobile

Spotted under the headline Windows Live Search for Mobile Goes Final, Still Great (like they were expecting it to suddenly plummet in quality?) on Gizmodo. It’s a 114k jar file that runs on my SLVR, where Yahoo! Go isn’t available yet, so points for that. Search suggestions show as you type, hugely useful on a klunky 9-key entry situation. They use an interesting UI to hold search results, densely packed–6 down the screen–with a status bar on top, and each search result marquee-scrolling back-and-forth as needed. A detail page can zap you into map mode or set up a call.

My standard test search–a little offbeat but still plausible–for mead near Sunnyvale produced disappointing results. The meadery within walking distance didn’t show, and of the top 6, two were duplicates. Scrolling down to the 10th result, though, did show an interesting, useful result, albeit 60.15 miles away: Knowne World Meads. I wanted to visit the web site, but here lies another problem: there’s no web integration. None of the search results include a URL or clickable link.

For all the hassle, I’ll stick with Opera Mini and my favorite search engine, thank you. -m

Tuesday, February 13th, 2007

changes the architecture of the house, not just the color of the paint

ERH’s comments on XForms, as part of his predictions for 2007. Worth a read. -m

Thursday, February 8th, 2007

The internet is a series of pipes

Check it out. -m

Tuesday, January 30th, 2007

Yahoo! Keitai

A few more tidbits on the Softbank Mobile turnaround, for which I helped architect the mobile platform.

SoftBank phones have a “Y!”-button which links to Yahoo!-keitai. Yahoo-Keitai! offers a list of official sites, new services (e.g. a new communicator service), and also access to free mobile internet sites through the YAHOO directory, as well as access to YAHOO services, such as YAHOO-auctions.

-m

Friday, January 26th, 2007

Opera Mini turns One

Congrats to Opera Mini on its first anniversary. I just installed it on my new SLVR, and the download is an astounding 98k. Why can’t more software be this lean? And yes, Y! search came as the default. -m

Friday, January 26th, 2007

UBL Swinger

An easy-to-use UBL Editor. Has anyone tried it? -m

Wednesday, January 24th, 2007

Histogram of top 10 words used in the 2007 State of the Union address:

I’ve always had a thing for text analysis.

  • the 352
  • and 250
  • to 225
  • of 188
  • in 118
  • a 108
  • we 100
  • is 76
  • our 75
  • that 72

Source. -m
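For the curious, a count like the one above takes only a few lines of Python. This is a minimal sketch, not the method used by the source; the function name `top_words` and the tokenizing regex (lowercase letters and apostrophes) are my assumptions, and a different notion of “word” would shift the numbers a bit.

```python
import re
from collections import Counter


def top_words(text, n=10):
    """Return the n most frequent words in text as (word, count) pairs."""
    # Lowercase first so "The" and "the" are counted together.
    words = re.findall(r"[a-z']+", text.lower())
    return Counter(words).most_common(n)
```

Feed it the full text of the speech and print the pairs to get a list in the same shape as the one above.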

Tuesday, January 23rd, 2007

Does XPath 2.0 exist outside of Java?

So, about a year ago, I wanted to use XPath 2.0 on a project. Turns out no non-toy, non-alpha versions existed except in Java land (where Saxon is quite good). Has the situation changed at all? Anything on the horizon? Libxml2? Anybody?? -m

Tuesday, January 23rd, 2007

My .02 on Wikipedia and nofollow

The nofollow setting on an outbound link should be a user-editable option, subject to the same community process that all other content on Wikipedia already is. (Site guidelines, dispute resolution, restricted editing on certain articles for unregistered users, etc.) By default, links would get nofollow, but over time, they could be ‘blessed’, perhaps after a certain amount of time or human review. Wasn’t this how nofollow was supposed to work in the first place?

The community process works. Why maneuver around it? -m

Monday, January 8th, 2007

Yahoo! + Opera = Crazy Delicious

(Press release) Starting today, Y! is the exclusive search partner for Opera Mini across more than 100 countries. The release also names “oneSearch”, going live later in Q1–definitely something to keep an eye on. -m

Sunday, November 26th, 2006

Micah visiting UC Berkeley

This Wednesday, I’m visiting Berkeley to speak with visiting professor Erik Wilde and his School of Information students. It’s an open-ended discussion, but will almost certainly center on XForms, the intentional web, and related information flow technologies. If you’re in Berkeley this Wednesday, drop me a line. -m

Sunday, October 1st, 2006

Yahoo! + SoftBank: watch this space

Today Softbank Mobile launched a new mobile service, delivering tons of Yahoo! Japan content, powered by Yahoo! US technology, to Softbank Mobile phones. This is notable for a few reasons:

  • In the past, content of this caliber has been inside paid walled gardens in Japan. Opening this up could be the tipping point for a shake-up in one of the most amazing mobile markets.
  • This is the first time a carrier has worked so closely with a content provider. If this works out (and leading signs are very good), it could be a model for the rest of the world.
  • I’ve seen some of the new hardware from SoftBank Mobile. The phones are great and–through tight Y! integration–go a long way toward solving longstanding UI problems related to the mobile web.
  • Number portability is coming to Japan, I believe beginning on October 24. Once this gets momentum, user bases could shift rapidly. Today is the ideal time to be playing a strong card.
  • Apple rumors continue to swirl around SoftBank. I’m giddy at the thought of iPods accessing the web through my code. :-)

So, watch this space. More good things are coming. -m