Cory Doctorow’s Little Brother is now shipping from Amazon and other stores. I reviewed a pre-release copy of it and liked it. But the best part is–like Cory’s other books–it’s downloadable right now, for free, under an open content license. I can attest that this is an effective strategy for getting your name and your work out into the wild. If you really like it, then please purchase it in a convenient portable package, also known as a printed book. :-) -m
Archive for the 'announcement' Category
Friday, May 2nd, 2008
SearchMonkey dev party
If you have webdev skillz, you might be interested in the SearchMonkey launch party on May 15. Good food, good drink, good coding. Space is limited, but I have a few invites to share. Comment here or contact me offline if interested. -m
Thursday, May 1st, 2008
Micahpedia
Today happens to mark the 6th anniversary of my blog. To celebrate going into year seven I’m refocusing it, including a new name: Micahpedia.
Blogging is an important skill, a subset of the overall skill of managing your online persona, so it’s worth devoting some attention to. The ego-burst doesn’t hurt either. My concrete goal is to get in the top 10 search results for the query [Micah], though I face some stiff competition including the prophet.
From an SEO perspective, “Push Button Paradise” wasn’t the greatest choice of name. It suffers from the common SEO mistake of being an excessively clever and/or cute reflection of what I happened to be working on at the moment, namely XForms. If you see the old name standalone, or in a blogroll, or in an RSS reader, you still don’t have much of an idea what it’s about or who’s behind it. True, I get pretty good ranking on the exact phrase, but nobody searches for that…
I will continue SEO tweaks on this site as time goes on and welcome any advice from any of my 7 readers.
In short, Micahpedia is about what I’m reading, writing, thinking about, and working on. I have plenty to say about these things. :-) The best is yet to come. -m
Monday, April 28th, 2008
SearchMonkey in private beta
I haven’t mentioned it yet, but SearchMonkey (now an official name, not just a project name) is in external limited beta. Keep an eye on ysearchblog; lots more technical content is on the way. -m
Thursday, March 13th, 2008
The (lowercase) semantic web goes mainstream
So today Yahoo! announced a major facet of what I’ve been working on lately: making the web more meaningful. Lots of fantastic coverage, including TechCrunch and ReadWriteWeb (and others, please link in the comments), and supportive responses and blog posts across the board. It’s been a while since I’ve felt this good about being a Yahoo.
So what exactly is it?
A few months ago I went through the pages on this very blog and added hAtom markup. As a result of this change…well, nothing happened. I had a good experience learning about exactly what is involved in retrofitting an existing site with microformats, but I didn’t get any tangible benefit. With the “SearchMonkey” platform, any site using microformats, or RDFa or eRDF, is exposed to developers who can enhance search results. An enhanced result won’t directly make my site rank higher in search, but it will most certainly make it prone to more clicks, and ultimately more readership, more inlinks, and better organic ranking.
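To give a feel for what hAtom markup makes available to a developer, here is a minimal sketch in Python. It is not SearchMonkey code and says nothing about how that platform works internally; it only pulls out the text of elements carrying the hAtom `entry-title` class, using the standard library. The sample markup, the `HAtomTitles` class, and the `extract_titles` helper are all made up for illustration.

```python
from html.parser import HTMLParser

# Elements that never get a closing tag; they must not affect nesting depth.
VOID = {"br", "img", "hr", "input", "meta", "link"}


class HAtomTitles(HTMLParser):
    """Collect the text of elements whose class list includes 'entry-title'."""

    def __init__(self):
        super().__init__()
        self.titles = []    # finished titles, in document order
        self._depth = 0     # > 0 while inside an entry-title element
        self._buffer = []   # text fragments of the title being read

    def handle_starttag(self, tag, attrs):
        if tag in VOID:
            return
        classes = (dict(attrs).get("class") or "").split()
        if self._depth:
            self._depth += 1            # nested tag inside a title
        elif "entry-title" in classes:
            self._depth = 1             # a new title starts here
            self._buffer = []

    def handle_endtag(self, tag):
        if tag in VOID or not self._depth:
            return
        self._depth -= 1
        if self._depth == 0:            # the title element just closed
            self.titles.append("".join(self._buffer).strip())

    def handle_data(self, data):
        if self._depth:
            self._buffer.append(data)


def extract_titles(html):
    """Return the entry titles found in an hAtom-marked-up page."""
    parser = HAtomTitles()
    parser.feed(html)
    return parser.titles


# Hypothetical markup, shaped like a blog page with hAtom class names.
SAMPLE = """
<div class="hentry">
  <h2 class="entry-title"><a href="/p/1" rel="bookmark">Micahpedia</a></h2>
  <div class="entry-content">Today happens to mark...</div>
</div>
<div class="hentry">
  <h2 class="entry-title">SearchMonkey in private beta</h2>
</div>
"""

print(extract_titles(SAMPLE))
```

The point is only that once the class names are in the page, getting structured data back out takes a few dozen lines rather than a site-specific scraper.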
How about some questions and answers:
Q: Is this Tim Berners-Lee’s vision of the Semantic Web finally getting fulfilled?
A: No.
Q: Does this presuppose everybody rushing to change their sites to include microformats, RDF, etc?
A: No. After all, there is a developer platform. Naturally, developers will have an easier time with sites that use official and community standards for structuring data, but there is no obligation for any site to make changes in order to participate and benefit.
Q: Why would a site want to expose all its precious data in an easily-extractable way?
A: Because within a healthy ecosystem it results in a measurable increase in traffic and customer satisfaction. Data on the public web is already extractable, given enough eyeballs. An openness strategy pays off (of which SearchMonkey is an existence proof).
Q: What about metacrap? We can never trust sites to provide honest metadata.
A: The system does have significant spam deterrents built in, of which I won’t say more. But perhaps more importantly, the plugin nature of the platform uses the power of the community to shape itself. A spammy plugin won’t get installed by users. A site that mixes in fraudulent RDFa metadata with real content will get exposed as fraudulent, and users will abandon ship.
Q: Didn’t ask.com prove that having a better user interface doesn’t help gain search market share?
A: Perhaps. But this isn’t about user interface–it’s about data (which enables a much better interface.)
Q: Won’t (Google|Microsoft|some startup) just immediately clone this idea and take advantage of all the new metadata out there?
A: I’m sure these guys will have some kind of response, and it’s true that a rising tide lifts all boats. But I don’t see anyone else cloning this exactly. The way it’s implemented has a distinctly Yahoo! appeal to it. Nobody has cloned Yahoo! Answers yet, either. In some ways, this is a return to roots, since Yahoo! started off as a human-guided directory. SearchMonkey is similar, except a much broader group of people can now participate. And there are some specific human, technical and financial reasons why as well, but I suggest inviting me out for beers if you want specifics. :-)
Disclaimer: as always, I’m not speaking for my employer. See the standard disclaimer. -m
Update: more Q and A
Q: How is SearchMonkey related to the recently announced Yahoo! Microsearch?
A: In brief, Microsearch is a research project (and a very cool one) with far-reaching goals, while SearchMonkey is targeted as imminently shipping software. I frequently talk to and compare notes with Peter Mika, the lead researcher for Microsearch.
Thursday, March 6th, 2008
microformat search at Yahoo!
Somehow I missed this posting and the underlying news that a Y Research project has a nice public demo of semantic search, driven by RDF, RDFa, and microformats. Still a rough sketch of a full solution, with multiple-second access times. But I particularly like the query for renaissance faire. -m
Tuesday, February 26th, 2008
Yahoo! Announces Open Search Platform
As spotted on TechCrunch, full article. This is a game-changer, folks. Check out the comments attached to the article. -m
Wednesday, February 13th, 2008
WebPath on next.yahoo
It’s been an exhausting past couple of weeks, but life goes on. WebPath made front page at next.yahoo. I’m starting to get feedback from developers who are actually using it, filing bugs, suggesting features, and it’s gratifying. The community is still building up. Won’t you join too? -m
Friday, January 25th, 2008
WebPath: Python XPath 2 engine now up on Sourceforge
I’ve taken this opportunity to ditch CVS on all my existing Sourceforge projects (pyxmlwiki, xfv) while setting up my newest project. Here’s the browsable Subversion source. Have at it.
Where should you start with this code? Step zero, if you haven’t already, is to look through my XML 2007 slides on my site. First thing is to grab a copy of PLY, which is a dependency. Then with all these files in your current directory, run python with no parameters. At the interpreter prompt type import demo then demo.demo1(), demo.demo2(), and so on. This will give you a feel for how the system works. Look at the source of demo.py to see how it works at the high level.
To actually get into the code, I suggest opening webpath.py and scrolling down to the end, where a large series of unit tests begins. Tracing through these will be (I hope!) instructive on how the various details of the engine are put together.
There are many missing pieces (a few intentionally so). So have a look around the code and start thinking about what you could do with it. One thing I would love to have happen soon is getting rid of minidom, replacing it with something more robust.
If you want developer access on Sourceforge, drop me a note with your sf username. -m
Thursday, January 24th, 2008
WebPath wants to be free (BSD licensed, specifically)

WebPath, my experimental XPath 2.0 engine in Python, is now an open source project with a liberal BSD license. I originally developed this during a Yahoo! Hack Day, and now I get to announce it during another Hack Day. Seems appropriate.
The focus of WebPath was rapid development and providing an experimental platform. There remains tons of potential work left to do on it…watch this space for continued discussion. I’d like to call out special thanks to the Yahoo! management for supporting me on this, and to Douglas Crockford for turning me on to Top Down Operator Precedence parsers. Have a look at the code. You might be pleasantly surprised at how small and simple a basic XPath 2 engine can be. So, who’s up for some XPath hacking?
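For anyone who hasn’t run into Top Down Operator Precedence parsing, here is a rough sketch of the technique in Python. To be clear, this is not WebPath’s code and it doesn’t parse XPath; it is a toy that evaluates integer arithmetic, and the names (`Parser`, `evaluate`, the binding-power numbers) are invented for this example. The idea it shows is that each operator gets a binding power, and one short loop decides how far to the right an expression extends.

```python
import operator
import re

# Binding power of each infix operator: higher binds tighter.
BINDING_POWER = {"+": 10, "-": 10, "*": 20}
APPLY = {"+": operator.add, "-": operator.sub, "*": operator.mul}
UNARY_POWER = 30  # unary minus binds tighter than any infix operator


class Parser:
    def __init__(self, text):
        # Toy tokenizer: integers, three operators, and parentheses.
        # Any other character is silently skipped.
        self.tokens = re.findall(r"\d+|[-+*()]", text)
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def advance(self):
        token = self.peek()
        self.pos += 1
        return token

    def expression(self, rbp=0):
        # Prefix position ("nud"): what a token means at the start.
        token = self.advance()
        if token is None:
            raise SyntaxError("unexpected end of input")
        if token.isdigit():
            left = int(token)
        elif token == "-":
            left = -self.expression(UNARY_POWER)
        elif token == "(":
            left = self.expression(0)
            if self.advance() != ")":
                raise SyntaxError("expected ')'")
        else:
            raise SyntaxError(f"unexpected token {token!r}")

        # Infix position ("led"): keep absorbing operators that bind
        # more tightly than the context we were called from.
        while BINDING_POWER.get(self.peek(), 0) > rbp:
            op = self.advance()
            right = self.expression(BINDING_POWER[op])
            left = APPLY[op](left, right)
        return left


def evaluate(text):
    parser = Parser(text)
    result = parser.expression()
    if parser.peek() is not None:
        raise SyntaxError(f"unexpected token {parser.peek()!r}")
    return result


print(evaluate("1 + 2 * 3"))
print(evaluate("(1 + 2) * 3"))
```

A real engine would build a tree instead of computing a number on the spot, but the control flow stays about this small, which is a big part of the appeal.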
Code download. (Coming to SourceForge with CVS, etc., in however many days it takes them to approve a new project) I hope this inspires more developers to work on similar projects, or better yet, on this one! -m
Tuesday, January 1st, 2008
My new year resolution
Holding steady at 1280 x 854 but due for an upgrade soon.
Seriously, if you find yourself setting various goals just because something on the calendar changed, you probably don’t have the long-term motivation needed to see it through, which is why so many new years’ resolutions lie in broken heaps by mid February. If you think something is worth doing (like this for example), then forget the calendar and do it.
-m
Sunday, December 16th, 2007
Slides from XML 2007: WebPath: Querying the Web as XML
Here’s the slides from my presentation at XML 2007, dealing with an implementation of XPath 2.0 in Python. I hope to have even more news in this area soon.
WebPath (html)
WebPath (OpenDocument, 4.7 megs)
Did you notice that OpenOffice has a nice slide export that generates both graphically-accurate slides and highly indexable and accessible text versions? -m
Thursday, November 22nd, 2007
Reducing my online profile
Due to some unauthorized activities on my webspace, I’m trimming my online profile, notably the Brain Attic sites. These were my home base for consulting, which I haven’t been doing for 2+ years. Less surface area exposed means less exposure to the bad guys. This site and XForms Institute are staying up for now, as should be the email address you are currently using. There will be a few broken links that will take some time to eradicate.
If you notice anything amiss, any unseemly references to ‘viagra’ in my pages etc., email me at “mdubinko” in the reversed “com.yahoo” domain. -m
Update: whoops, looks like I cut a little too deep. Turns out that all my @dubinko.info mail was routing through one of the domains I chopped. For several hours overnight email sent to me was bouncing. If you ran into that, please re-send. -m
Tuesday, October 16th, 2007
Come to mead appreciation and brewing class
If you’re in the South Bay and like mead, you need to check this out. -m
Monday, October 8th, 2007
XML 2007 Schedule
As widely reported by now, the final schedule for XML 2007 this December in Boston is up. All I have to add is the suggestion of careful attention to the Tuesday program at 4:00. :) If you can’t wait, some technical details are forthcoming in this space. That is all. -m
Friday, October 5th, 2007
Playing with microformats
I’ll be doing some experimenting around here over maybe the next week or two. Specifically, setting up hAtom within these pages. Watch for falling debris and report any unusual observations. -m
Friday, September 21st, 2007
Come see me at XML 2007
Watch this space for details. I’ll be speaking about something related to Python and XPath 2.0. Watch this blog for tidbits on the subject. :) -m
Wednesday, August 29th, 2007
2 years at Yahoo!
Today is my 2nd anniversary at Yahoo!. Looking back, it’s been a great time. Since I don’t know how long ago, I’ve fantasized about being involved in research. Check. Since sitting across from the mobile guys for 5 years in W3C meetings, I’ve fantasized about working in mobile. Check. And since I wrote Web search, without the web (demo), I’ve fantasized about working on web-scale search.
Check.
What will the next two years bring? I don’t know, but I’m certain they will be even better than the previous two. -m
Wednesday, August 22nd, 2007
What I’m reading
- Everyday Life in Ancient Mesopotamia, Jean Bottéro
- Mesopotamia: The Invention of the City, Gwendolyn Leick
- The Structure of Scientific Revolutions, Thomas S. Kuhn
Yeah, they’re related. -m
Friday, June 22nd, 2007
Get yer Go
Still more mobile news. Yahoo! Go is shipping. No alpha, beta, gamma, etc.–the real deal. Give it a whirl. If your phone, like mine, can’t handle the awesomeness, you can visit the slick web-only version at m.yahoo.com. -m
Tuesday, June 12th, 2007
The Writing Show: First-Chapter-of-a-Novel Contest
Once again, I am a judge for this year’s First-Chapter-of-a-Novel Contest hosted by The Writing Show. We’re looking for unpublished, less-than 4000 word entries.
Final deadline for submissions is June 15, so there’s just enough time left to put together your masterpiece and get it in. This year, there’s some serious prizes, and the top award will be chosen by popular crime fiction author C.J. Box. Go have a look at the rules for all the details.
There is a small entry fee, but every valid submission will get a 750+ word critique. For aspiring authors, you win either way. Now get to work! -m
Sunday, June 3rd, 2007
Search On
The approximately seven readers of this blog have probably already heard this, but just in case: I have a new role at Yahoo!–working on next generation search.
Lots of details are still falling into place. For now I describe it as: “Imagining, specifying, prototyping, developing, and evangelizing next-generation web search experiences leveraging the full and unique capabilities available within Yahoo!”
In many ways, this is a logical stepping stone after oneSearch, and I’ll be dealing with lowercase semantic web issues more now. Expect the focus of this blog to shift accordingly (though I’m still interested in mobile and will make note of important happenings.)
Search On! -m
Thursday, April 26th, 2007
Email meltdown in progress
Maybe it’s a coincidence, but just after installing Thunderbird 2, deleting emails started taking 5 seconds, then 15, then 30, then a full minute. Then it quit working altogether. Also 14,000 old mailing list messages materialized in my Junk folder. My inbox has hundreds of unread, and drastic measures might be needed to get things working again… -m
Monday, April 2nd, 2007
My simple comment moderation policy
I don’t remember ever spelling this out, so:
- Any posting that adds to the discussion shall be accepted
- Any posting by a spammer/robot/pay-per-post flunkie shall be rejected
- Any posting that would offend my grandma shall be rejected
- Any posting that takes too long for me to categorize per above MAY be rejected
These aren’t hard-and-fast rules. It’s getting increasingly difficult to discern postings that come from actual personalities. As a general rule, you should include a link back to your personal site, which should present itself in a way that makes it obvious that it’s not put together by some toxic SEO-gaming, advert-farming, internet-poisoning aggregation program.
But that’s just good advice no matter which blogs you comment on. -m
Monday, March 19th, 2007
Yahoo oneSearch launches on mobile web
Today Yahoo! launched oneSearch on their other front page, m.yahoo.com. OneSearch has been available for a while, but only from within Yahoo! Go. Now it’s available to millions of mobile devices equipped with a data connection and XHTML browser.
The basic premise behind oneSearch is to replace the tri-modal search box, where you have to say whether you are searching the web, local, or images, with a single all-knowing search box. Available context information, such as your zip code, is used to guide the search. Internally, the application is smart about figuring out what kind of things you might be looking for. For example, someone searching for “pizza” in a mobile context is probably more interested in a list of restaurants (with reviews) than in a list of hyperlinks. Behind the simplicity of a single search box, there is a great deal of work going on to make your life easier.
Ever since Yahoo! Go betas (and gammas) started coming out, folks have been asking me how else they could get access to this application. Now it’s easy.
Not too long ago, the front page relaunched simultaneously in 19 countries. The new design was simple, and based on a new platform called Sushi, as mentioned in published sources. OneSearch shows off the power of this approach, even though this launch didn’t cover 19 countries…yet. (Getting access to local data for movies, restaurants, sporting events, and so on is no small feat.)
As I said before, this is only a small part of an overall strategy that has been years in the making. Much more to come. Watch this space. -m
Monday, January 8th, 2007
Yahoo! + Opera = Crazy Delicious
(Press release) Starting today, Y! is the exclusive search partner for Opera Mini across more than 100 countries. The release also names “oneSearch”, going live later in Q1–definitely something to keep an eye on. -m
Wednesday, January 3rd, 2007
XForms for UBL
Here’s a great new project on Sourceforge: XForms for UBL. In my book, I started in on something like this. Here is a more complete, more up-to-date, fleshed out solution. -m
Sunday, December 31st, 2006
What I did on winter break
For the last several years, I’ve taken some time off around the end of the year to work on a special project. In 2004 I ported some of Rick Jelliffe’s code from Java to Python. In 2005, I made an editing pass over a novel I wrote the previous November during NaNoWriMo. This year was a little different. I:
- Worked. Enough stuff is going on with the day job that I couldn’t take a full week off.
- Got sick.
- Caught up on some homebrewing. Reorganized my brewery.
- Wrote. More stuff coming soon on xml.com.
- Pimped a babyswing. (photoset)
- Started reviewing WWW2007 and XTech papers.
- Started taking a video MIT class on differential equations. (If you have OS X 10.4+, fire up Grapher and try y’=cos xy, y(0)={-5,-4.5…5})
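If you don’t have Grapher handy, the same family of curves can be roughed out numerically. The sketch below integrates y’ = cos(xy) with a classical fourth-order Runge-Kutta step in plain Python. Some assumptions on my part: the starting values are read as running from -5 to 5 in steps of 0.5, the integration stops at x = 5, and the step count of 1000 is arbitrary. The function names are made up for this example.

```python
import math


def f(x, y):
    """Right-hand side of the equation y' = cos(x * y)."""
    return math.cos(x * y)


def rk4(f, x0, y0, x_end, steps=1000):
    """Integrate y' = f(x, y) from x0 to x_end; return y(x_end)."""
    h = (x_end - x0) / steps
    x, y = x0, y0
    for _ in range(steps):
        k1 = f(x, y)
        k2 = f(x + h / 2, y + h * k1 / 2)
        k3 = f(x + h / 2, y + h * k2 / 2)
        k4 = f(x + h, y + h * k3)
        y += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6
        x += h
    return y


# Assumed reading of the starting values: -5, -4.5, ..., 4.5, 5.
initial_values = [-5 + 0.5 * i for i in range(21)]

# Where each curve ends up at x = 5 (an arbitrary stopping point).
curves = {y0: rk4(f, 0.0, y0, 5.0) for y0 in initial_values}

for y0, y_end in curves.items():
    print(f"y(0) = {y0:5.1f}  ->  y(5) = {y_end:8.4f}")
```

Since the slope is a cosine, no curve can climb or fall faster than 1, so each one ends within 5 units of where it started.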
-m