Archive for May, 2009

Saturday, May 30th, 2009

Geek Thoughts: funny headline

Google Android Will Be on 18-20 Phones by End of 2009

source. Let’s see, Larry, Sergey and Eric Schmidt, there’s three phones…

More collected Geek Thoughts at

Saturday, May 30th, 2009

XForms Institute moved to SVN

About a week ago I moved XForms Institute over to Subversion. Now the entire site is under version control, with a local copy I can edit. Publishing is as easy as logging in and running the command ‘svn up’. Honestly, I should have done this long ago. And any future sites I work on will use this approach too–it’s fantastic.

If you notice any glitches, let me know. -m

Friday, May 22nd, 2009

More on the GOOG book settlement

From Brewster Kahle. Good read, so to speak. -m

Thursday, May 21st, 2009

One year at Mark Logic

Another anniversary this week, one year at Mark Logic. Much of it in stealth mode, but more details of what I’ve been up to are forthcoming. -m

Monday, May 18th, 2009

Geek Thoughts: writing tip

Take something that’s done, no matter how crappy, and submit it. Right now, I’ll wait…

Next time you have something ready, by comparison it will be ever so much better, and you’ll have no excuse to avoid submitting that as well.

Friday, May 15th, 2009

A nugget from _A Canticle for Leibowitz_

This brilliant bit is almost a throwaway paragraph on page 304, near the end.

[Two men in a satirical dialog] managed only to demonstrate that the mathematical limit of an infinite sequence of “doubting the certainty with which something doubted is known to be unknowable when the ‘something doubted’ is still a preceding statement ‘unknowability’ of something doubted,” that the limit of this process at infinity can only be equivalent to a statement of absolute certainty, even though phrased as an infinite series of negations of certainty.

It’s not like the whole book is like this…far from it. But it is chock full of little gems.


Friday, May 15th, 2009

Geek Thoughts: plan to Save The World

My plan is to start unsubstantiated rumors about the SUV Flu, and by association cause people to distance themselves from said implicated devices. You can help.

Despite denials from the CDC (you know they’re in the pockets of the auto industry), SUV flu is serious and spreading fast. You might already have it and not realize it. The SUV flu spreads primarily through close contact with gas-guzzling vehicles, such as so-called Sport Utility Vehicles. California has been hit the hardest, with sources reporting that in a small hamlet outside of Oxnard over 3000 drivers have been seen staggering away from their parked vehicles, and further reports indicate that up to 80 top epidemiologists nationwide are stranded and unable to commute to work.

Transmission occurs primarily via exhaust emission and requires close contact between source and recipient because contaminants do not remain suspended in the air and generally rise directly to the ozone layer. Contact with contaminated surfaces (including bucket seats and 4-wheel-drive shifters) is another possible source of transmission.

The estimated incubation period is unknown and could range from 1-7 days, but more likely 3 years or 36,000 miles.

Patients with uncomplicated disease due to confirmed (or unconfirmed) SUV flu virus infection have experienced inflated ego, increased road rage, chronic lack of consideration for others, decreased awareness of nearby traffic, fatigue, vomiting, or diarrhea. In West Palm Beach, 95% of patients with SUV flu met the case definition of opprobrism.

Anyone showing signs–however faint–of possible SUV flu should pull over, immediately self-diagnose, and proclaim the results on Twitter, Facebook, MySpace, or a nearby blog. If you are somehow still disease-free, carefully avoid contamination vectors mentioned above. Please help spread the warning about this dangerous disease, using the hashtag #suvflu.

Be careful out there.

Tuesday, May 12th, 2009

Google Rich Snippets powered by RDFa

The new feature called rich snippets shows that SearchMonkey has caught the eye of the 800-pound gorilla. Many of the same microformats and RDF vocabularies are supported. It seems increasingly inevitable that RDFa will catch on, no matter what the HTML5 group thinks. -m

Sunday, May 10th, 2009

Yahoo!: One year gone

As of today, I have been out of Yahoo! for a full year. And what a year it’s been… I guess that means I’m now free to recruit…any good XML people still wearing purple? -m

Friday, May 8th, 2009

HTML: The Markup Language marks a new beginning

If you haven’t already, check out HTML: The Markup Language. Besides being a cool new recursive acronym for HTML, it is a reasonably sane document. Also worth a look: Differences between HTML4 and HTML5. Many of the ideas from XHTML 2 (of which I was an editor at one point) are there.

I think it’s time for the W3C to show some tough love and force the two (X)HTML Working Groups together.

A while ago, I argued that the existence of both Flickr and Yahoo! Photos was an effective two-pronged strategy. Look how that worked out–Y! Photos is permanently shuttered. While there were benefits, including a broader potential reach, in aggregate they didn’t outweigh the immense cost of running two parallel efforts. Same here. -m

Monday, May 4th, 2009

When the experimenter wants to believe

The universe is deeply, fundamentally weird. At the quantum level, all kinds of non-intuitive effects are the building blocks of, well, everything. So what if not just observing, but believing in a particular outcome could influence the actual outcome of an experiment?

Something like that could explain a lot: many of the claims of perpetual motion machines, cold fusion à la Fleischmann and Pons, the placebo effect, Steorn Orbo technology (previous discussion), and numerous similar endeavors. Who’s to say that some aspect of what we call consciousness doesn’t involve some kind of probability manipulation?

The conventional scientific method would be at a loss to deal with such a situation. True Believers would proclaim miraculous results from their experiments, but Skeptics would be unable to reproduce the results. Strong skeptics would set up million-dollar rewards to prove crackpottish claims under “controlled conditions”, and nobody would ever collect.

Such a conceit is the basis for a story I’m working on. The first drafts were written 18 months ago, as part of NaNoWriMo 2007. I may be ready for some early reviewers by the summer. Interested? -m

Sunday, May 3rd, 2009

Playing with Wolfram Alpha

I’ve been experimenting with the preview version of Wolfram Alpha. It’s not like any current search engine because it’s not a search engine at all. Others have already written more eloquent things about it.

Its key feature is that it doesn’t just find information, it infers it on the fly. Take for example the query

next solar eclipse in Sunnyvale

AFAIK, nobody has ever written a regular web page describing this important (to me) topic. Try it in Yahoo! or Google and see for yourself. There are a few potentially interesting links based on the abstracts, but they turn out to be spammy. Wolfram Alpha figures out that I’m talking about the combination of a concept (“solar eclipse”) and a place (“Sunnyvale, CA”, but with an offer to switch to Sunnyvale, TX) and combines the two. The result is a simple answer–4:52 pm PDT | Sunday, May 20, 2012 (3.049 years from now). Hey, that’s sooner than I thought! Besides the date, there are many related facts and a cool map.

This is in contrast to SearchMonkey, which I helped create, in two main areas:

  1. Wolfram Alpha uses metadata to produce the result, then renders it through a set of pre-arranged renderers. The response is facts, not web pages.
  2. SearchMonkey focuses on sites providing their own metadata, while Wolfram Alpha focuses on hand-curation.

Search engines have been striving to do a better job at fact-queries. Wolfram Alpha shows that an approach disjoint from finding web pages in an index can be hugely useful.

The engineers working on this have a sense of humor too. The query


returns a page that includes the text “power required to operate the flux capacitor in the DeLorean DMC-12 time machine” as well as a useful comparison (~ 0.1 x the power of space shuttle at launch).

Yahoo! and Google do various kinds of internal “query rewriting”, but usually don’t let you know other than in the broadest terms (“did you mean …”). Wolfram Alpha shows a diagram of what it understood the query to be. The diagrams make it evident that something like the RDF model is in use, but without peeking under the hood, it’s hard to say anything definitive.

One thing I wonder about is whether Wolfram Alpha creates the dynamic (as was a major goal of SearchMonkey) of giving web authors a reason to put more metadata in their sites–a killer app, if you will. It’s not clear at this early date how much web crawling or site metadata extraction (say, RDFa) plays into the curation process.

In any case Wolfram Alpha is something to watch. It’s set to launch publicly this month. -m