I wanted to say something snarky about Microsoft’s new slogan, but the comments on the linked article did a pretty good job already. Ahh snark, the unthinking-man’s eloquence. -m
July 26th, 2010
July 21st, 2010
Meade Classe August 7
Join me for another Meade Classe at the Los Altos MoreFlavor brew shop.
Saturday, Aug. 7, 2010 2:00 – 4:00 pm
MoreFlavor 991 N. San Antonio Road
Los Altos, CA 94022
We will taste some meads, focusing on sensory evaluation, then walk through the steps of brewing up a batch. As usual, seating is limited, so email me to reserve a spot. A $10 donation helps the brew shop recover the costs of the honey, yeast, and light snacks, and helps make sure these events can continue.
On a personal note, I’ll be traveling back from a conference in Montreal the day before, so I might be a little jet lagged. Could get interesting. :-) -m
July 15th, 2010
VP XIV bound
Thrilled, THRILLED to announce that I’ve been accepted to the 2010 Viable Paradise workshop. I sent in the first 8000 words of a manuscript that about half of the 7 readers of this blog have looked at. You know, the one that is Science Fiction–literally, fiction about science. So I’ll be spending some time in early October at Martha’s Vineyard studying at the feet of published authors and honing my craft.
Class size is limited, so I’ve been actively psyching myself down for the last month, not getting my hopes up too high. Then when the acceptance came, I had computer down time, and nearly exploded from holding the news in for 3 days. :-)
Ahhhh. I should say more, but I believe I may still be in shock. -m
P. S. OK, how about 25 great opening lines.
July 12th, 2010
Hard drive failing
My personal machine is ailing. It freezes up for 30 seconds at a time–even iTunes stops playing. Firefox crashes before it’s done launching. I’m scared to reboot for fear the machine won’t come back up. SMARTReporter lists an operating age of 15k hours and a suspicious Power-Off_Retract_Count of over 25 billion–whatever this represents, it’s happened an impressive average of several hundred times per second for the life of the drive. Also a Load_Cycle_Count of over 866k (where typical lifespan is 300-600k). New HDD ordered.
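For the curious, here is the arithmetic behind "several hundred times per second" as a quick Python sketch. It uses only the rounded figures quoted above (15k hours, 25 billion, 866k), so treat the results as ballpark numbers, not a real SMART analysis.

```python
# Back-of-the-envelope check of the rounded SMART figures quoted in the post.
POWER_ON_HOURS = 15_000           # "operating age of 15k hours"
RETRACT_COUNT = 25_000_000_000    # "over 25 billion"
LOAD_CYCLES = 866_000             # "over 866k"

seconds_powered_on = POWER_ON_HOURS * 3600

# Average rate of each counter over the powered-on life of the drive.
retracts_per_second = RETRACT_COUNT / seconds_powered_on
load_cycles_per_hour = LOAD_CYCLES / POWER_ON_HOURS

print(f"retracts per second:  {retracts_per_second:.0f}")
print(f"load cycles per hour: {load_cycles_per_hour:.1f}")
```

A retract rate in the hundreds per second is not physically plausible for a drive head, which is a decent hint that the raw counter is being misreported or misread.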
Changing to a new hard drive is the digital equivalent of moving. Disruptive to your routine, and you’re living out of boxes for months afterward. Not much fun, but also an opportunity to reorganize a few things, to start fresh. -m
July 7th, 2010
Grokking Selenium
As the world of web apps gets more framework-y, I need to get up to speed on contemporary automation testing tools. One of the most popular ones right now is the open source Selenium project. From the look of it, that project is going through an awkward adolescent phase. For example:
- Selenium IDE lets you record tests in a number of languages, but only HTML ones can be played back. For someone using only Selenium IDE, it’s a confusing array of choices for no apparent reason.
- Selenium RC has bindings for lots of different languages but not for the HTML tests that are most useful in Selenium IDE. (Why not include the ability to simply play through an entire recorded script in one call, instead of fine grained commands like selenium.key_press(input_id, 110), etc.?)
- The list of projects prominently mentions Selenium Core (a JavaScript implementation), but when you click through to the documentation, it’s not mentioned. Elsewhere on the site it’s spoken of in deprecating terms.
- If you look at the developer wiki, all the recent attention is on Web Drivers, a new architecture for remote-controlling browsers, but those aren’t mentioned in the docs (yet) either.
So yeah, right now it’s awkward and confusing. The underlying architecture of the project is undergoing a tectonic shift, something that would never see public light of day in a proprietary project. In the end it will come out leaner and meaner. What the project needs in the short term is more help from fresh outsiders who can visualize the desirable end state and help the ramped and productive developers on the project get there.
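To make the "play an entire recorded script in one call" wish concrete, here is a minimal Python sketch. It assumes the recorded HTML test is a table whose command rows have three cells (command, target, value), and it deliberately uses a made-up stand-in driver: `RecordingDriver`, `run_step`, and `play` are hypothetical names for illustration, not part of Selenium RC.

```python
from html.parser import HTMLParser


class SeleneseParser(HTMLParser):
    """Collect (command, target, value) rows from an HTML test table."""

    def __init__(self):
        super().__init__()
        self.steps = []
        self._row = None    # cells of the <tr> being read, or None
        self._cell = None   # text chunks of the <td> being read, or None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag == "td" and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag == "td" and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            if len(self._row) == 3:   # skips the single-cell title row
                self.steps.append(tuple(self._row))
            self._row = None


class RecordingDriver:
    """Hypothetical stand-in for a browser driver; it only logs each step."""

    def __init__(self):
        self.log = []

    def run_step(self, command, target, value):
        self.log.append((command, target, value))


def play(html_test, driver):
    """Run every step of a recorded HTML test against `driver` in one call."""
    parser = SeleneseParser()
    parser.feed(html_test)
    parser.close()
    for command, target, value in parser.steps:
        driver.run_step(command, target, value)
    return len(parser.steps)


SAMPLE_TEST = """
<table>
  <thead><tr><td colspan="3">search smoke test</td></tr></thead>
  <tbody>
    <tr><td>open</td><td>/</td><td></td></tr>
    <tr><td>type</td><td>q</td><td>mead</td></tr>
    <tr><td>clickAndWait</td><td>btnG</td><td></td></tr>
  </tbody>
</table>
"""

driver = RecordingDriver()
steps_run = play(SAMPLE_TEST, driver)
print(steps_run, driver.log)
```

A real version would have to translate each recorded command into whatever the remote-control API actually exposes, which is presumably where most of the work lies.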
By the way, if this kind of problem seems interesting to you, let me know. We’re hiring. If you have any tips for getting up to speed in Selenium, comment below.
-m
June 26th, 2010
Steve Martin mead joke
Steve Martin leaves an awesome list of demands for venue staff when he’s on tour, including
BEVERAGE SERVICE must include a thoughtful assortment of meads and bendy straws.
IMPORTANT NOTE: Bendy straws must be strong enough to be able to be used as blowguns. ADDITIONAL IMPORTANT NOTE: Local paramedic aid may be required.
Read the rest, it’s great. -m
June 20th, 2010
RDBMS Alternatives
For anyone trying to get up to speed on the technology side of non-traditional databases, including NoSQL concepts and not-your-father’s-XML, this webinar looks like a good start. Tuesday, June 29, 2pm EDT, 11am PDT. -m
June 19th, 2010
The primary virtue of the meadmaker is patience
I have a batch of chocolate mead that’s been brewing since 2007. Mead bulk ages well, but this is a new personal record. Today, I started siphoning it into the bottling bucket when I noticed that it wasn’t completely clear. I use a mineral called sparkalloid which causes any haze/protein/particulate to settle to the bottom, and there it was–a boundary layer between clear and sparkling.
So back into the carboy it goes, with a fresh dose of fining agents. What’s a few more months at this point? -m
June 12th, 2010
Command Lines on the frontier of user interface
This came from a comment on the prior post, and it’s worth a shout of its own. Don Norman on the importance of command lines, including the ubiquitous search box, in modern UI. -m
June 9th, 2010
“Google syntax” for semantic queries?
Thought experiment: are there any commonly-expressed semantic queries–the kind of queries you’d run over a triple store, or perhaps a SearchMonkey-annotated web site–expressible in common type-in-a-searchbox query grammar?
As a refresher, here’s some things that Google and other search engines can handle. The square brackets represent the search box into which the queries are typed, not part of the queries themselves.
[term]
[term -butnotthis]
[term1 OR term2]
["phrase term"]
[term1 OR term2 -"but not this" site:dubinko.info filetype:html]
So what kind of semantic queries would be usefully expressed in a similar way, avoiding SPARQL and the like? For example, maybe [by:"Micah Dubinko"] could map to a document containing a triple like <this document> <dc:author> “Micah Dubinko”. What other kinds of graph queries are interesting, common, and simple to express like this? Comments welcome.
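As a starting point for the thought experiment, here is a rough Python sketch of how a search-box query might be split into triple patterns and leftover full-text terms. The `by` to `dc:author` mapping is the one suggested above; the `title` entry, the `?doc` variable, and the function name are assumptions made up for illustration, and the tokenizer is a simple regex rather than any real search engine's grammar.

```python
import re

# optional "-", optional "field:", then either a quoted phrase or a bare word
TOKEN = re.compile(r'(-?)(?:(\w+):)?(?:"([^"]*)"|(\S+))')

FIELD_TO_PREDICATE = {
    "by": "dc:author",     # the mapping suggested in the post
    "title": "dc:title",   # hypothetical extra, for illustration only
}


def parse_query(query):
    """Split a search-box query into triple patterns, text terms, and exclusions."""
    patterns, text_terms, excluded = [], [], []
    for negated, field, phrase, word in TOKEN.findall(query):
        value = phrase or word
        if negated:
            # Negation is only tracked by value here; any field is dropped.
            excluded.append(value)
        elif field in FIELD_TO_PREDICATE:
            patterns.append(("?doc", FIELD_TO_PREDICATE[field], value))
        elif field:
            # Unknown operators (site:, filetype:, ...) pass through untouched.
            text_terms.append(f"{field}:{value}")
        else:
            text_terms.append(value)
    return patterns, text_terms, excluded


patterns, text_terms, excluded = parse_query('by:"Micah Dubinko" mead -beer')
print(patterns)
print(text_terms)
print(excluded)
```

The interesting design question is what belongs in the mapping table: each entry is effectively a promise that some short operator has one obvious meaning as a graph edge.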
-m
June 4th, 2010
The Swinger
I’m enjoying the results of this Python project from Music Hack Day way too much. It analyzes an audio clip to detect the beats, then uses time stretching and compression techniques (that don’t alter the pitch) to rearrange each measure into a “swung” groove. Fantastic. I wish they’d take more requests! -m
Try this one on for size: Just What I Needed by The Cars:
June 3rd, 2010
Reverse Engineering Corexit 9500
If you dig a bit, there’s all kinds of interesting background material about the terrible disaster ongoing in the Gulf of Mexico. For example, a map of the thousands of rigs and tens of thousands of miles of pipelines. Some of the best infographics are from BP itself. And for when you can no longer stand the overwhelming sense of disaster, a fake Twitter feed.
But this really caught my eye, from Nalco, the manufacturer of the oil dispersant Corexit 9500, which is being used in the Gulf in unprecedented quantities and at unprecedented depths. Here’s how they cleverly describe the ingredients of their product, an ingredient list they protect as a trade secret:
- One ingredient is used as a wetting agent in dry gelatin, beverage mixtures, and fruit juice drinks.
- A second ingredient is used in a brand-name dry skin cream and also in a body shampoo.
- A third ingredient is found in a popular brand of baby bath liquid.
- A fourth ingredient is found extensively in cosmetics and is also used as a surface-active agent and emulsifier for agents used in food contact.
- A fifth ingredient is used by a major supplier of brand name household cleaning products for “soap scum” removal.
- A sixth ingredient is used in hand creams and lotions, odorless paints and stain blockers.
That is one impressive bit of verbal agility, my compliments to their staff writer(s). It would be a fun exercise some day to see what kinds of toxic sludge could be described in similar terms. But let’s see if we can figure out the exact ingredient list: here’s the MSDS for the substance. According to it, Propylene Glycol is clearly one of the ingredients, as are “Distillates, petroleum, hydrotreated light” and “Organic sulfonic acid salt”. “Wetting agent” and “surface-active” are both code words for a surfactant. A little knowledge of chemistry along with household product label reading might go a long way… Got insight? Add a comment here to describe what you find.
-m
6/10 Update: Nalco released the full ingredient list and cheat sheet:
| CAS # | Name | Common Day-to-Day Use Examples |
| --- | --- | --- |
| 1338-43-8 | Sorbitan, mono-(9Z)-9-octadecenoate | Skin cream, body shampoo, emulsifier in juice |
| 9005-65-6 | Sorbitan, mono-(9Z)-9-octadecenoate, poly(oxy-1,2-ethanediyl) derivs. | Baby bath, mouth wash, face lotion, emulsifier in food |
| 9005-70-3 | Sorbitan, tri-(9Z)-9-octadecenoate, poly(oxy-1,2-ethanediyl) derivs | Body/Face lotion, tanning lotions |
| 577-11-7 | * Butanedioic acid, 2-sulfo-, 1,4-bis(2-ethylhexyl) ester, sodium salt (1:1) | Wetting agent in cosmetic products, gelatin, beverages |
| 29911-28-2 | Propanol, 1-(2-butoxy-1-methylethoxy) | Household cleaning products |
| 64742-47-8 | Distillates (petroleum), hydrotreated light | Air freshener, cleaner |
| 111-76-2 | ** Ethanol, 2-butoxy | Cleaners |
The * footnote indicates, essentially, “contains propylene glycol”.
The ** footnote indicates that this chemical is found only in Corexit 9527, not the one most commonly used in the Deepwater Horizon cleanup.
May 30th, 2010
Balisage contest: solving the wikiml problem
I wish I could say I had something to do with the planning of this: part of Balisage 2010 is a contest to “encourage markup experts to review and to research the current state of wiki markup languages and to generate a proposal that serves to de-babelize the current state of affairs for the long haul.” To enter, you must propose a set of concrete steps (organizational, social, and/or technological) that will enable wiki content interchange, a real WYSIWYG editor, and/or wiki syntax standardization.
This pushes all of my buttons. It’s got structured documents, Web, parser geekery, writing, engineering, and standards. There’s a bunch of open source prior art, including PyXMLWiki, which I adapted from some fantastic earlier work from Rick Jelliffe.
Sadly, MarkLogic employees aren’t eligible to enter. Get your write-up done by July 15 and sent to balisage-2010-contest at marklogic dot com. The winner will be announced at Balisage and will take home some serious prize winnings, and also will be strongly encouraged (but not required) to give a brief summary (~10 minutes) of their winning entry.
Can’t wait to see what comes out of this. -m
May 23rd, 2010
Martin Gardner will be missed
I first ran into Martin’s work in back issues of Scientific American. He stopped writing his Mathematical Games column in 1981, but my mentor Virgil Matheson had all the older issues and had a free hand in lending them out, albeit one-at-a-time. From my mentor, I also got the best math book I’ve ever read, Calculus Made Easy by Silvanus P. Thompson. A newer edition of the book came out in 1998–and who came on board as the 2nd author to revise and modernize the text? Yep, that’s who. -m
May 14th, 2010
Geek Thoughts: verbing facebook
Facebook (v): to deliberately create an impenetrable computer user interface for purposes of manipulating users.
More collected Geek Thoughts at http://geekthoughts.info.
May 11th, 2010
XProc is ready
Brief note: The W3C XProc specification, edited by my partner-in-crime Norm Walsh, has advanced to Recommendation status. Now go use it. -m
May 5th, 2010
Geek Thoughts: no-fly lists and CAP Theorem
According to this article, a recent terror suspect almost got on a plane despite being recently added to the no-fly list. Why is it so difficult to administer a no-fly list? The CAP Theorem has answers. (Disclaimer: as always, this blog is apolitical–this isn’t about whether no-fly lists are a good idea or not, only a matter of technical interest.)
Without stretching the imagination too much, one can think of a no-fly list as a distributed database. The list apparently changes frequently, and it needs to be accessible from thousands of airport gates and reservation desks. Thus the CAP Theorem applies. In a nutshell, that theorem states that of Consistency, Availability, and Partition-tolerance, you can only pick, at most, two. Hit the link above for a much better, more complete description.
If there were one centralized list, the system would be Consistent and Available, but every time a name needed to be checked it would require an immediate network round-trip–should the connection to that central list go down, no further checks would be possible–no Partition tolerance.
Of course, the airline could set a policy that if said network connection goes down, no passengers at all would be able to get on planes. This would be a case of lack of Availability.
Or, the complete list could be periodically copied to each location that needs it. This provides good Availability and Partition tolerance, but fails Consistency, since it’s possible to miss out on late-breaking updates. Apparently, something like this is what happened.
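Here is a toy Python sketch of that trade-off. Everything in it (the class names, the `sync` step, the `reachable` flag) is invented for illustration and says nothing about how any real no-fly system works; it only shows a local copy staying available while going stale, and a central check staying current while becoming unreachable.

```python
class CentralList:
    """The single authoritative list (hypothetical stand-in)."""

    def __init__(self):
        self.names = set()
        self.reachable = True  # flip to False to simulate a network partition

    def add(self, name):
        self.names.add(name)

    def snapshot(self):
        if not self.reachable:
            raise ConnectionError("central list unreachable")
        return set(self.names)


class Gate:
    """One airport gate that can check names centrally or against a local copy."""

    def __init__(self, central):
        self.central = central
        self.local_copy = set()

    def sync(self):
        # Periodic bulk copy: the only moment the local copy is known to be current.
        self.local_copy = self.central.snapshot()

    def is_listed_central(self, name):
        # Always current, but raises ConnectionError during a partition.
        return name in self.central.snapshot()

    def is_listed_local(self, name):
        # Always answers, but may be stale.
        return name in self.local_copy


central = CentralList()
central.add("Early Entry")
gate = Gate(central)
gate.sync()

central.add("Late Entry")                               # added after the last sync
stale_miss = not gate.is_listed_local("Late Entry")     # the local copy misses it
fresh_hit = gate.is_listed_central("Late Entry")        # the central check catches it

central.reachable = False                               # the network link goes down
try:
    gate.is_listed_central("Late Entry")
    central_answered = True
except ConnectionError:
    central_answered = False
local_answered = gate.is_listed_local("Early Entry")    # the local copy still answers

print(stale_miss, fresh_hit, central_answered, local_answered)
```

Shortening the sync interval narrows the stale window but never closes it, which is the whole point of the theorem.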
More collected Geek Thoughts at http://geekthoughts.info.
April 29th, 2010
DMC = developer.marklogic.com
The new MarkLogic developer site is up, cleaner, better organized, and more social. Even cooler, it’s an XSLT-heavy application running on a pre-release version of MarkLogic. The new blog gives some of the details of the new site and transition.
So, if you’re already a MarkLogic developer, this is a great resource. And if you’re not, the site itself shows how fast and simple it is to put together an XSLT and XQuery-powered app. -m
April 25th, 2010
Bananas Foster in a glass
The 60th anniversary of the creation of Bananas Foster is around the corner, and the project I started this weekend should be ready just in time. I’m keeping the recipe under wraps for now, but it involves ripe bananas, a particularly buttery variety of honey, brown sugar, homemade caramel, vanilla, and cinnamon. This should turn out to be a 3 gallon batch when all is said and done. It smells amazing in the primary. -m
April 19th, 2010
The Rick Wakeman clause?
Phrase seen in this article about whether video games are art, and Roger Ebert’s opinions thereon.
“Video games by their nature require player choices, which is the opposite of the strategy of serious film and literature…”
Hmm, Mr. Ebert doesn’t seem to be up on the concept of hypertext, which has manifold connections with cinema. See for instance the scholarly paper Cinematic Paradigms for Hypertext. In fact, making a hypertext or branching narrative requires even greater amounts of authorial skill.
But I’m still curious, what is the Rick Wakeman clause? From where did that term originate? -m
April 18th, 2010
The challenge of an XProc GUI
I’ve been thinking lately about what a sleek UI for creating XProc would look like. There’s plenty of big-picture inspiration to go around, from Yahoo Pipes to Mac OS X Automator, but neither of these is as XML-focused as something working with XProc would be.
XML, or to be really specific, XML Namespaces, comes with its own set of challenges. Making an interface that’s usable is no small task, particularly when your target audience includes the 99.9% of people that don’t completely understand namespaces. Take for example a simple step, like p:delete.
In brief, that step takes an XSLTMatchPattern, following the same rules as @match in XSLT, which ends up selecting various nodes from the document, then returns a document without any of those nodes. An XSLTMatchPattern has a few limitations, but it is a very general-purpose selection mechanism. In particular, it could reference an arbitrary number of XML Namespace prefix mappings. Behind each prefix in a short string like a:b lies a much longer namespace URI.
What would an intuitive user interface look like to allow entry of these kinds of expressions? How can a user keep track of unbound prefixes and attach them properly? A data-driven approach could help, say offering a menu of existing element, attribute, or namespace names taken from a pool of existing content. But by itself this falls short in two ways: 1) it doesn’t cover richer selectors, like xhtml:p[@class = "invalid"], and 2) it doesn’t help in the general case, when the nodes you’re manipulating might have come from the pipeline, not your original content. (Imagine one step in the pipeline translates your XML to XHTML, followed by a delete step that cleans out some unwanted nodes.)
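One small piece of this is at least easy to prototype: spotting which prefixes in a typed-in pattern have no binding yet, so the UI can prompt for them. The Python sketch below is a regex heuristic, not a real XPath or XSLT pattern parser, and the function name and sample bindings are made up for illustration.

```python
import re

# Quoted strings are blanked out first so a colon inside a literal is ignored.
STRING_LITERAL = re.compile(r'"[^"]*"|\'[^\']*\'')

# A name followed by a single colon; "axis::" is excluded by the lookahead.
PREFIX = re.compile(r'(?<![\w.-])([A-Za-z_][\w.-]*):(?!:)')


def unbound_prefixes(pattern, bindings):
    """Return the prefixes used in `pattern` that are missing from `bindings`."""
    without_strings = STRING_LITERAL.sub('""', pattern)
    used = set(PREFIX.findall(without_strings))
    return sorted(used - set(bindings))


bindings = {"xhtml": "http://www.w3.org/1999/xhtml"}

print(unbound_prefixes('xhtml:p[@class = "invalid"]', {}))
print(unbound_prefixes('xhtml:p[@class = "invalid"]', bindings))
print(unbound_prefixes('a:b/child::c[@xlink:href = "x:y"]', {"a": "urn:example:a"}))
```

A real tool would want a proper parser here, since a heuristic like this will eventually meet an expression it misreads, but it is enough to drive a "you haven't told me what xlink: means" prompt.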
So yeah, this seems like a Really Hard Problem, but one that’s worth taking a crack at. If this sounds like the kind of thing you’d enjoy working on, my team is hiring–drop me a note.
-m
April 2nd, 2010
Recalibrating expectations of XML performance
Working at MarkLogic has forced me to recalibrate my expectations around XML-related performance issues. Not to brag or anything, but it’s screaming fast. Conventional wisdom of avoiding // in paths doesn’t apply, since that’s the sort of thing the indexes are made to do, and that’s just the start. Single milliseconds are now a noteworthy amount of time for something showing up in the profiler.
This is what XML was supposed to be like. Now that XML has fallen off the hype cycle, we’re getting some serious work done. -m
March 18th, 2010
Kindle for Mac scores low on usability
Here’s my first experience with Amazon’s new Kindle client for Mac: After digging up my password and logging in, I was presented with a bunch of books. I picked the last one I’d been reading. It downloaded slowly, without a progress bar, then dumped me on some page in the middle. Apparently my farthest-read location, but I honestly don’t remember.
A cute little graphic on the screen said I could use my scroll wheel. I’m on a laptop, so I tried the two-finger drag–the equivalent gesture sans mouse… and flipped some dozens of pages in half a second. Now hopelessly lost, I searched for a ‘back’ button to no avail. Perversely, there is a prominent ‘back’ button, but disabled. Mocking me.
This feels rushed. I wonder what could be pushing Amazon to release something so unfinished? -m
March 5th, 2010
A Hyperlink Offering revisited
The xml-dev mailing list has been discussing XLink 1.1, which after a long quiet period popped up as a “Proposed Recommendation”, which means that a largely procedural vote is all that stands between the document becoming a full W3C Recommendation. (The previous two revisions of the document date to 2008 and 2006, respectively.)
In 2005 I called continued development of XLink a “reanimated spectre”. But even earlier, in 2002, I wrote one of the rare fiction pieces on xml.com, A Hyperlink Offering, which, using the format of a Carrollian dialog between Tortoise and Achilles, explained a few of the problems with the XLink specification. It ended with this:
What if the W3C pushed for Working Groups to use a future XLink, just not XLink 1.0?
Indeed, this version has minor improvements. In particular, “simple” links are simpler now–you can drop an xlink:href attribute where you please and it’s now legit. The spec used to REQUIRE additional xlink:type="simple" attributes all over the place. But it’s still awkward to use for multi-ended links, and now even farther away from the mainstream hyperlinking aspects of HTML5, which for all of its faults, embodies the grossly predominant description of linking on the web.
So in many ways, my longstanding disappointment with XLink is that it only ever became a tiny sliver of what it could have been. Dashed visions of Xanadu dance through my head. -m
March 1st, 2010
Newsweek should never have been free
Andrew Zolli argues in Newsweek that online content should never have been free. I’m probably not the first one to make this profound observation–but if it were not for the free online edition of Newsweek (and link aggregator sites like Digg) I wouldn’t have read a single word of Newsweek in years, nor would I be linking to it as my previous sentence does… Maybe Newsweek is OK with that. -m
February 22nd, 2010
February 16th, 2010
There is no honor in namespaces
As heard from my friend and Mark Logic contractor Ryan Grimm. -m
February 13th, 2010
Geek Thoughts: Dora the Spamadora
Dora: Oh no! Lawrence Fawusu, 52, Operational Manager of the International Commercial Bank Ghana Limited is in trouble! He needs to move the sum of US$22, 000.000 (TWENTY TWO MILLION UNITED STATES DOLLAR) outside the country, but doesn’t know where to turn.
Dora: Who do we call when we don’t know the way to go? That’s right, the map! (He’s the map, he’s the map, he’s the map!)
Map: Dora and Mr. Fawusu need to 1) get your bank account info, 2) transfer funds, and 3) profit!
Dora: Say it with me: Bank account, transfer funds, profit!
Dora: We need YOUR help to complete the transaction.
(clicking sound)
We did it, yay, lo hicimos, etc.
More collected Geek Thoughts at http://geekthoughts.info.
February 8th, 2010
Economics 101 question
Let’s say you have a box that (completely legally) spits out 1 dollar per day. I’m using “box” in an abstract sense here: maybe it’s an investment or a business opportunity. How much would you pay for this box? In other words, what’s its fair market value?
What if it spit out one dollar per hour? Would you pay exactly 24x as much for it then? Or one per week–would you pay 1/7th as much?
What if it’s hard to measure how much money comes out of it–maybe sometimes it emits a dollar, but sometimes you have to put one in. Then what? -m
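One textbook way to frame the first two questions is as a perpetuity: a fixed payment forever, valued as payment divided by the discount rate per period. The Python sketch below takes that framing as an assumption, along with a 5% annual rate picked purely for illustration and the simple convention that the per-period rate is the annual rate divided by the number of periods. It is a sketch of one model, not an answer to what the box is "really" worth.

```python
def perpetuity_value(payment, periods_per_year, annual_rate):
    """Present value of `payment` received every period, forever.

    Uses PV = payment / rate_per_period, with the simple convention
    rate_per_period = annual_rate / periods_per_year.
    """
    rate_per_period = annual_rate / periods_per_year
    return payment / rate_per_period


ASSUMED_RATE = 0.05  # assumption: 5% per year, chosen only for illustration

daily = perpetuity_value(1.0, 365, ASSUMED_RATE)
hourly = perpetuity_value(1.0, 365 * 24, ASSUMED_RATE)
weekly = perpetuity_value(1.0, 365 / 7, ASSUMED_RATE)

print(f"daily box:  ${daily:,.0f}")
print(f"hourly box: ${hourly:,.0f}  ({hourly / daily:.1f}x the daily box)")
print(f"weekly box: ${weekly:,.0f}  ({weekly / daily:.3f}x the daily box)")
```

Under these assumptions the value scales linearly with the payout rate, so the hourly box is worth 24 times the daily one and the weekly box one seventh. The third question, where the output is uncertain and sometimes negative, is exactly where this simple model stops helping.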
February 2nd, 2010
Larry Masinter on overspecification
Some thoughts worth considering on the state of HTML development today. -m