For instance, The Business Value of Windows Vista. Seriously, Vista for “speed and security”? Or mobile? The comments on this post alone are worth the click. -m
Archive for the 'commercialism' Category
Thursday, June 5th, 2008
Microformat search done right
From the Yahoo! Developer blog, new search keywords you can use to home in on indexed microformats.
For example, to see every hAtom-bearing page that mentions ‘dubinko’ use the query [searchmonkeyid:com.yahoo.uf.hatom dubinko]. Works similarly for hCard, hCalendar, hReview, and XFN. I’m sure more are coming soon too. -m
Monday, June 2nd, 2008
Apple Mobile Me? (But watch out for falling SCO)
Rumor is that the .Mac service is being renamed to “Mobile Me”. Great. In its present state, it’s always been the kind of thing that’s completely useless to me, even aside from the annoying name.
But watch out: everyone’s favorite gang of bankrupt litigious weasels, the SCO group, in a desperate effort to prove that they have a broader business plan than making up claims about owning open source software, already have a mobile-related product called “Me, Inc.”. On the plus side, these guys are so deep into their bankruptcy proceedings that they probably don’t have the mettle to go up against Apple at this point. But neither do they have much to lose for trying… -m
Wednesday, May 28th, 2008
XForms Validator on Google App Engine?
I registered ‘xfv’ on Google App Engine. Too bad there don’t appear to be any significant XML libraries supported. I have XPath covered by my pure-python WebPath, but what about Relax NG? Anyone know of anything in pure python? -m
Wednesday, May 28th, 2008
OK already, XQuery has FLWORs, I get it
A very short rant on the state of XQuery tutorial materials on the web (not naming any names or linking any links).
I get it. Thank you for your fanatical emphasis on FLWOR constructs, but there is much more to it than that.
A few introductory sources don’t fall into this trap, though. Mike Kay’s stuff, for one. Priscilla Walmsley’s O’Reilly book, for another. I’m pretty much finished reading it, so I’ll review it here soon. -m
Thursday, May 22nd, 2008
XForms Ubiquity
I just found out about a nice little XForms engine called Ubiquity. (Having dinner with Mark Birbeck, TV Raman, and Leigh Klotz certainly helps one find out about such things) :-)
It’s a JavaScript implementation done right. Open source under the Apache 2.0 license. Seems like a nice fit with, oh maybe MarkLogic Server? -m
Wednesday, May 21st, 2008
XQuery Annoyances…
If you are used to XSLT 1.0 and XForms, you see { $book/bk:title } and think nothing of it. XSLT 1.0 calls the curly-brace construct an Attribute Value Template, which is pretty descriptive of where it’s used. Always in an attribute, always converted into a string, even if you are actually pointing to an element.
In XQuery, though, the curly-brace construct can be used in many different places. Depending on the context, the above code might well insert a bk:title element into your output. The proper thing to do, of course, is { $book/bk:title/text() }. Many XSLT and XForms authors would omit the extra text() selector as superfluous, but in XQuery it matters.
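The same distinction can be sketched outside XQuery. The following is only an analogy in Python's standard ElementTree, not XQuery itself: the namespace URI, the sample document, and the helper names `wrap_node` and `wrap_text` are all made up for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical namespace URI standing in for whatever the bk: prefix is bound to.
BK = "http://example.com/ns/book"

book = ET.fromstring(
    f'<bk:book xmlns:bk="{BK}"><bk:title>Example Title</bk:title></bk:book>'
)
title = book.find(f"{{{BK}}}title")


def wrap_node(node):
    """Analogous to <p>{ $book/bk:title }</p>: the element itself lands in the output."""
    p = ET.Element("p")
    p.append(node)
    return ET.tostring(p, encoding="unicode")


def wrap_text(node):
    """Analogous to <p>{ $book/bk:title/text() }</p>: only the text lands in the output."""
    p = ET.Element("p")
    p.text = node.text
    return ET.tostring(p, encoding="unicode")


print(wrap_node(title))  # a <p> containing a namespaced title element
print(wrap_text(title))  # a <p> containing just the string
```

The first call carries the whole title element (namespace and all) into the result, while the second carries only its character content, which is usually what an XSLT author expects.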
What’s worse, depending on your browser, you might not see any output on the page within a <bk:title> element (or a title element of any namespace). Caveat browser! -m
Tuesday, May 20th, 2008
The two-line CV
In my about page, I’ve written my CV in two lines. Why don’t you try it, then link back to here?
I’ve been known to use this as an interview question, and it’s quite a bit harder than it looks. A clever candidate will turn the paper sideways giving themselves more room to write “two lines”, but that’s not the point. This exercise forces one to really think about their qualifications, skills, and experience; one’s “unique selling proposition”.
Writing short, as opposed to rambling on, is notoriously difficult. Someone who can do that with their own CV is off to a good start in my book. -m
P. S. Mark Logic is looking for some high-caliber XML and web folks. Contact me offline if you know anyone looking…
Sunday, May 18th, 2008
Mark Logic
You probably noticed the byline on my recent Yahoo! developer network posting. It, and a few more posts still in the pipe, list me as a “SearchMonkey Team Alumnus”. So yeah, it’s official, I’ve hung up my exclamation point and moved on to something else.
Specifically, Mark Logic, where a group of impressively talented people reside, recently including Norm Walsh. My first day there is tomorrow, so I don’t fully know what I’ll be working on, though it does involve the core server, and taking it from its current state of awesome raw bare-metal power into something more akin to an application development platform.
Mark Logic strikes me like this: think back 10 years or so to all the hype and introductory articles around this new thing called XML–how it would enable whole new kinds of applications through the miraculous abilities of “markup” and perform realtime structured search over the results. It turns out that all these dreams were missing one critical piece, a way to do all the fancy indexing and repository management needed to make that happen. And the MarkLogic Server, to a very good approximation, IS that piece.
So what do I think of SearchMonkey at this point? No change, really. Good riddance to the ten-blue-links result pages. It’s breaking new ground in search, and Google will have a hard time stomaching an equally radical (and potentially revenue-impacting) change. SearchMonkey is really good news for the lowercase semantic web, including microformats and RDFa. It’s doing all the right things for the right reasons. The project will do fine without me. :-)
I had a good run at Yahoo! and I’m proud to have accomplished all I did there. Onward. -m
Saturday, May 17th, 2008
Are microformats right for your site?
Yeah, more than ever before. See my article on Yahoo! developer net. The stuff I talk about here is currently live in the indexer. -m
Wednesday, May 14th, 2008
Reminder: SearchMonkey developer launch party Thursday
Reminder: Thursday evening at Yahoo! Sunnyvale headquarters is the launch party for the developer-facing side of SearchMonkey. In case you haven’t been paying attention, SearchMonkey is a new platform that lets developers craft their own awesomized search results. If you’re interested in SEO or general lowercase semantic web tools, you’ll love it. Meet me there. Upcoming link. Party starts at 5:30. -m
Update: The developer tool is live. Rasmus has a nice walkthrough.
Friday, May 2nd, 2008
SearchMonkey dev party
If you have webdev skillz, you might be interested in the SearchMonkey launch party on May 15. Good food, good drink, good coding. Space is limited, but I have a few invites to share. Comment here or contact me offline if interested. -m
Tuesday, April 29th, 2008
How to negotiate
Tips from Leo Reilly in How to Outnegotiate Anyone (Even a Car Dealer!).
- Be patient. If you insist on having something today, know what you want and be prepared to pay for it.
- Never disclose your deadline.
- Cultivate a positive relationship with the other party.
- Don’t make the other side look stupid (for a prolonged period of time).
- The best negotiators talk only 40% of the time.
- The most intimidating thing you can do to someone trying to intimidate you is to not be intimidated.
- Never be the one to make the first offer.
The most critical aspect of negotiation is the opening offer. Four opening gambits are possible:
- Lowballing. Offering substantially less to create psychological downward pressure on the price.
- “Up against the wall.” Forcing the other side to make more concessions than you do.
- Anchoring. Having both sides make equal concessions.
- “The Kiss.” Like anchoring, except allowing the other side to take one final (often minor) concession.
If you want to find out what Leo says about how to buy a car, in 5 minutes, below dealer cost, you’ll have to pick up the book though. :-) -m
Monday, April 28th, 2008
SearchMonkey in private beta
I haven’t mentioned it yet, but SearchMonkey (now an official name, not just a project name) is in external limited beta. Keep an eye on ysearchblog, lots more technical content is on the way. -m
Sunday, April 27th, 2008
Is there an inverse to the Innovator’s Dilemma?
Roughly speaking, the innovator’s dilemma happens when a product progressively gets more and more advanced features, to the point that it misses out (by listening to customers) on an entire new opportunity. At that point, a simpler, competing product can come into play and make large gains.
But what happens when a company is generally aware of the Innovator’s Dilemma and tries to compensate? It seems like second order effects might come into their own. A product widely known for being (and remaining) minimalist is exposed to attacks from deliberate enhancements and related complexification of competitive products. As the market gets more mature, the steadfastly-simple market leader gets left behind. In a sense, it’s a role reversal from what Clayton Christensen describes. But can it work out the same in the end? Please comment. -m
Saturday, April 26th, 2008
Deadlines and connections

I’m not involved in the corporate wrangling about Microsoft and Yahoo! talks. Which leaves me relatively free to comment on it. [Disclosure: I am, not too surprisingly, a Yahoo! shareholder.]
Lots of things have been happening lately. A deadline of, well, today. Talks of Google adsense trials. And all kinds of merger speculation involving Rupert Murdoch in some fashion, or else AOL.
But I haven’t seen anyone point out this connection: Google owns 5% of AOL, having invested a billion bucks and taken over search there a couple of years ago. So if Yahoo! and AOL merged, there would already be a Google advertising connection in place. Running pre-trials now is just due diligence on something that might happen anyway.
Having both an in-house advertising network and an outsourced one has some advantages too, namely in the form of “knobs” that can be adjusted to tune margins as conditions warrant. And maintaining the in-house system keeps Google honest and makes sure that relatively good deals can be negotiated in the future.
Lots of pundits talk about regulatory scrutiny, but honestly, it’s been years since any antitrust machinery in this country has been effective. And the recent spectrum auctions showcased Google’s skill at turning regulatory tables in their favor. If it came down to it, the smart people on both sides of the table shouldn’t have a problem crafting an agreement in a way that meets muster, even in the stricter EU.
Summary: based solely on public reports, it seems like the AOL connection might be a credible threat to Microsoft’s appetite. The ball is firmly in Steve’s court now. We’ll see what he does.
Tuesday, April 22nd, 2008
Shame on you, J.K. Rowling

Harry Potter author J.K. Rowling, herself rowling in gazillions of dollars, is along with her publisher suing Steven Vander Ark, a poor librarian who produced a lexicon of the Harry Potter universe.
Rowling says it’s not about the money, it’s about control. Poppycock. If that were the case, she would have objected to the web site. Instead, Rowling is quoted as saying:
This is such a great site that I have been known to sneak into an Internet cafe while out writing and check a fact rather than go into a bookshop and buy a copy of Harry Potter (which is embarrassing).
A lexicon is a collection of existing (fictional) facts, not something that is going to wrest creative control of the franchise away from the author. This work makes the Harry Potter universe more valuable, not less. Even if legally this is a gray area, it’s a boneheaded move to sue one of your greatest fans for providing a valuable and useful reference.
What troubles the bean counters so much is that the printed lexicon costs, well, actual money, $24.95 to be exact. As an author it troubles me to see how out of touch copyright law is, and how badly the scent of a few dollars can make an otherwise reasonable person behave. -m
Friday, April 11th, 2008
Google App Engine dwarfed
Thanks to chromatic for the link. Largely hidden, the largest app clusters of this particular platform can:
Control over a million computers and can deliver over a hundred billion advertisements per day.
However, “don’t be evil” is not a part of this particular platform’s strategy… -m
Thursday, March 20th, 2008
Geeking out
I have here a pre-release copy of Cory Doctorow’s novel Little Brother.
With permission.
In plain text.
Being read with the UNIX command less.
On an XO laptop.
And so far it’s awesome. -m
Thursday, March 13th, 2008
The (lowercase) semantic web goes mainstream
So today Yahoo! announced a major facet of what I’ve been working on lately: making the web more meaningful. Lots of fantastic coverage, including TechCrunch and ReadWriteWeb (and others, please link in the comments), and supportive responses and blog posts across the board. It’s been a while since I’ve felt this good about being a Yahoo.
So what exactly is it?
A few months ago I went through the pages on this very blog and added hAtom markup. As a result of this change…well, nothing happened. I had a good experience learning about exactly what is involved in retrofitting an existing site with microformats, but I didn’t get any tangible benefit. With the “SearchMonkey” platform, any site using microformats, or RDFa or eRDF, is exposed to developers who can enhance search results. An enhanced result won’t directly make my site rank higher in search, but it will most certainly make it prone to more clicks, and ultimately more readership, more inlinks, and better organic ranking.
How about some questions and answers:
Q: Is this Tim Berners-Lee’s vision of the Semantic Web finally getting fulfilled?
A: No.
Q: Does this presuppose everybody rushing to change their sites to include microformats, RDF, etc?
A: No. After all, there is a developer platform. Naturally, developers will have an easier time with sites that use official and community standards for structuring data, but there is no obligation for any site to make changes in order to participate and benefit.
Q: Why would a site want to expose all its precious data in an easily-extractable way?
A: Because within a healthy ecosystem it results in a measurable increase in traffic and customer satisfaction. Data on the public web is already extractable, given enough eyeballs. An openness strategy pays off (of which SearchMonkey is an existence proof).
Q: What about metacrap? We can never trust sites to provide honest metadata.
A: The system does have significant spam deterrents built in, of which I won’t say more. But perhaps more importantly, the plugin nature of the platform uses the power of the community to shape itself. A spammy plugin won’t get installed by users. A site that mixes in fraudulent RDFa metadata with real content will get exposed as fraudulent, and users will abandon ship.
Q: Didn’t ask.com prove that having a better user interface doesn’t help gain search market share?
A: Perhaps. But this isn’t about user interface–it’s about data (which enables a much better interface.)
Q: Won’t (Google|Microsoft|some startup) just immediately clone this idea and take advantage of all the new metadata out there?
A: I’m sure these guys will have some kind of response, and it’s true that a rising tide lifts all boats. But I don’t see anyone else cloning this exactly. The way it’s implemented has a distinctly Yahoo! appeal to it. Nobody has cloned Yahoo! Answers yet, either. In some ways, this is a return to roots, since Yahoo! started off as a human-guided directory. SearchMonkey is similar, except a much broader group of people can now participate. And there are some specific human, technical and financial reasons why as well, but I suggest inviting me out for beers if you want specifics. :-)
Disclaimer: as always, I’m not speaking for my employer. See the standard disclaimer. -m
Update: more Q and A
Q: How is SearchMonkey related to the recently announced Yahoo! Microsearch?
A: In brief, Microsearch is a research project (and a very cool one) with far-reaching goals, while SearchMonkey is targeted as imminently shipping software. I frequently talk to and compare notes with Peter Mika, the lead researcher for Microsearch.
Thursday, March 6th, 2008
microformat search at Yahoo!
Somehow I missed this posting and the underlying news that a Y Research project has a nice public demo of semantic search, driven by RDF, RDFa, and microformats. Still a rough sketch of a full solution, with multiple-second access times. But I particularly like the query for renaissance faire. -m
Tuesday, February 26th, 2008
Yahoo! Announces Open Search Platform
As spotted on TechCrunch, full article. This is a game-changer folks. Check out the comments attached to the article. -m
Friday, February 22nd, 2008
Hands-on Kindle

Amazon hosted a networking event tonight. They had me at free beer and a chance to look at a Kindle. Now that I’ve actually played with one, I can comment on some of its features for better or worse.
It’s heavier and more solid than it looks. With the little padded cover, it could pass for a physical book in most situations, and it would probably survive a drop to the floor just fine.
The screen does look great, even in the sub-optimal lighting conditions of a bar. I had to compare with the XO when I got home, and with the backlight off, I think the resolutions are very nearly the same. However, the XO (without backlight) is fairly hard to read at indoor lighting levels, though in full sunshine it’s great. I don’t know how easy it would be to read the Kindle in full sunlight…
Page turning is annoyingly slow, and annoyingly easy to do by accident. The annoying part is that after pressing the button, nothing seems to happen for a second, then the page blacks out, waits another second, then displays the new content. I understand the technical limitations of the black flash (and the corresponding benefits–essentially zero power consumption to hold an image). But it feels like if it started working as soon as the button was pressed, it could cut the overall page change time in half. Keyboard entry felt slow and lagged as well.
Overall, the device didn’t feel usable to me. I somehow stumbled my way into Wikipedia and got to see the browser in action. I would love to see a touch-screen version.
Did seeing one change my mind about buying one? Nope. Still waiting. I’d buy this one at half its current price, an updated model for maybe more. -m
Wednesday, February 13th, 2008
WebPath on next.yahoo
It’s been an exhausting past couple of weeks, but life goes on. WebPath made front page at next.yahoo. I’m starting to get feedback from developers who are actually using it, filing bugs, suggesting features, and it’s gratifying. The community is still building up. Won’t you join too? -m
Thursday, January 24th, 2008
WebPath wants to be free (BSD licensed, specifically)

WebPath, my experimental XPath 2.0 engine in Python, is now an open source project with a liberal BSD license. I originally developed this during a Yahoo! Hack Day, and now I get to announce it during another Hack Day. Seems appropriate.
The focus of WebPath was rapid development and providing an experimental platform. There remains tons of potential work left to do on it…watch this space for continued discussion. I’d like to call out special thanks to the Yahoo! management for supporting me on this, and to Douglas Crockford for turning me on to Top Down Operator Precedence parsers. Have a look at the code. You might be pleasantly surprised at how small and simple a basic XPath 2 engine can be. So, who’s up for some XPath hacking?
Code download. (Coming to SourceForge with CVS, etc., in however many days it takes them to approve a new project) I hope this inspires more developers to work on similar projects, or better yet, on this one! -m
Wednesday, January 23rd, 2008
Machine tags
Take a look at this URL, and the page behind it. This is a list of all the Flickr photos with the tag “xmlns:dc=http://purl.org/dc/elements/1.1/”. Although these have been around for a while, I hadn’t been aware of this kind of tagging until recently.
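For the curious, here is a rough sketch of how such a tag could be pulled apart in Python. It assumes only the namespace:predicate=value shape visible in the tag above; the `MachineTag` type and `parse_machine_tag` function are hypothetical names, not part of any Flickr API.

```python
from typing import NamedTuple, Optional


class MachineTag(NamedTuple):
    namespace: str
    predicate: str
    value: str


def parse_machine_tag(tag: str) -> Optional[MachineTag]:
    """Split 'namespace:predicate=value'; return None for an ordinary tag."""
    # Split on the first '=' only, so values containing '=' or ':' survive intact.
    head, equals, value = tag.partition("=")
    if not equals:
        return None
    namespace, colon, predicate = head.partition(":")
    if not colon or not namespace or not predicate:
        return None
    return MachineTag(namespace, predicate, value)


declaration = parse_machine_tag("xmlns:dc=http://purl.org/dc/elements/1.1/")
print(declaration)
print(parse_machine_tag("sunset"))  # an ordinary tag has no machine-tag parts
```

Seen this way, the namespace declaration is itself just another machine tag, with "xmlns" in the namespace slot and "dc" in the predicate slot.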
Why “xml” in the namespace declaration? This doesn’t have much to do with XML. How many tags are there in the world that start with “dc:” and are not referring to Dublin Core? At least the tag declaring the namespace provides a good hook for finding things with machine tags. It’s only a small step up to RDFa from here, which is good! -m
Monday, January 7th, 2008
Yahoo! introduces mobile XForms
Admittedly, their marketing folks wouldn’t describe it that way, but essentially that’s what was announced today. (documentation in PDF format, closely related to what-used-to-be Konfabulator tech; here’s the interesting part in HTML) The press release talks about reaching “billions” of mobile consumers; even if you don’t put too much emphasis on press releases (you shouldn’t) it’s still talking about serious use of and commitment to XForms technology.
Shameless plug: Isn’t it time to refresh your memory, or even find out for the first time about XForms? There is this excellent book available in printed format from Amazon, as well as online for free under an open content license. If you guys express enough interest, good things might even happen, like a refresh to the content. Let’s make it happen.
From a consumer standpoint, this feels like a welcome play against Android, too. Yahoo! looks like it’s placing a bet on working with more devices while making development easier at the same time. I’ll bet an Android port will be available, at least in beta, before the end of the year.
Disclaimer: I have been out of Yahoo! mobile for several months now, and can’t claim any credit for or inside knowledge of these developments. -m
P. S. Don’t forget the book.
Friday, January 4th, 2008
Mac quick tip
I discovered this by accident, but my life has been measurably better since.
You probably already know that you can switch apps quickly with Cmd+tab. But if you reach your pinky up a bit more and hit Cmd+~ you can rotate through the windows of the current app. This turns out to be most useful when, say, your email compose window gets behind the main email client window.
What is the equivalent keystroke on Linux? -m
Monday, December 31st, 2007
XPath puzzler: solution
Thanks to all the folks who showed interest in this little XPath puzzler published here a few weeks ago. Some asked to see the dataset, but I’m not able to release it at this time (but ask me again in 3 months).
Turns out it was a combination of two bugs, one mine, one somebody else’s. Careful observers noted that I wasn’t using any namespace prefixes in the XPath, and since I did specify that it was XPath 1.0, that technically rules out XHTML as the source language. As with nearly all XML I work with these days, the first thing I do is strip off the namespaces to make it easier to work with. Bug #1 was that in a few cases, the namespaces didn’t get stripped.
Bug #2 was in the XPath engine itself. Which one? Uh, whatever one ships with the “XPath” plugin for JEdit. It’s hard to tell directly, but I think it might be an older version of Xalan-J. In the case of the expression //meta, it properly located only those elements in no namespace. But in the case of //meta/@property, it was including all the nodes that would have been selected by //*[local-name(.)='meta']/@property. Hence, a larger number of returned nodes.
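Since I can't share the dataset, here is a small stand-in that shows the two behaviors side by side, using Python's standard ElementTree rather than the JEdit plugin. The sample document and its property values are invented for illustration; only the XHTML namespace URI is real.

```python
import xml.etree.ElementTree as ET

XHTML = "http://www.w3.org/1999/xhtml"

# Invented sample: one meta element in no namespace, two in the XHTML namespace.
doc = ET.fromstring(
    f'<root xmlns:h="{XHTML}">'
    '<meta property="plain"/>'
    '<h:meta property="namespaced-1"/>'
    '<h:meta property="namespaced-2"/>'
    '</root>'
)


def local_name(tag):
    """Drop the '{namespace-uri}' prefix that ElementTree puts on qualified tags."""
    return tag.rsplit("}", 1)[-1]


# Like //meta/@property done correctly: only meta elements in no namespace.
no_namespace = [m.get("property") for m in doc.findall(".//meta")]

# Like //*[local-name(.)='meta']/@property: meta elements in any namespace.
any_namespace = [
    e.get("property")
    for e in doc.iter()
    if local_name(e.tag) == "meta" and e.get("property") is not None
]

print(no_namespace)
print(any_namespace)
```

The second list is longer than the first, which is the same kind of mismatch I was seeing between //meta and //meta/@property.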
Confusing? You bet! -m
P.S. WebPath would not have this problem, since in the default mode it matches local-names only to begin with.