<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>MicahLogic</title>
	<atom:link href="http://dubinko.info/blog/feed/" rel="self" type="application/rss+xml" />
	<link>http://dubinko.info/blog</link>
	<description>From an XML geek, a reader, a writer, a connector, a man of the people (says keep hope alive)</description>
	<lastBuildDate>Tue, 21 May 2013 06:28:42 +0000</lastBuildDate>
	<language>en-US</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.5.1</generator>
		<item>
		<title>Five years at MarkLogic</title>
		<link>http://dubinko.info/blog/2013/05/20/five-years-at-marklogic/</link>
		<comments>http://dubinko.info/blog/2013/05/20/five-years-at-marklogic/#comments</comments>
		<pubDate>Tue, 21 May 2013 06:28:42 +0000</pubDate>
		<dc:creator>mdubinko</dc:creator>
				<category><![CDATA[Mark Logic]]></category>

		<guid isPermaLink="false">http://dubinko.info/blog/?p=1027</guid>
		<description><![CDATA[This past weekend marked my five-year anniversary at MarkLogic. It&#8217;s been a fun ride, and I&#8217;m proud of how much I&#8217;ve accomplished. It was the technology that originally caught my interest: I saw the MarkMail demo at an XML conference, and one thing led to another. The company was looking to expand the product beyond [...]]]></description>
				<content:encoded><![CDATA[<p>This past weekend marked my five-year anniversary at MarkLogic. It&#8217;s been a fun ride, and I&#8217;m proud of how much I&#8217;ve accomplished.</p>
<p>It was the technology that originally caught my interest: I saw the <a title="MarkMail" href="http://markmail.org/">MarkMail</a> demo at an XML conference, and one thing led to another. The company was looking to expand the product beyond the core database&#8211;they had plans for something called a &#8220;utility layer&#8221;, though in reality it was neither a utility nor a separate layer. It started with <a href="http://developer.marklogic.com/learn/2009-07-search-api-walkthrough">Search API</a>, though the very first piece of code I wrote was an <a href="https://github.com/marklogic/commons/tree/master/rdfa">RDFa parser</a>.</p>
<p>But what&#8217;s really held my interest for these years is a truly unmatched set of peers. This place is brimming with brilliant minds, and that keeps me smiling every day on my way in to work.</p>
<p>Which leads my thoughts back to semantics again. This push in a new direction has a lot of echoes with the events that originally brought me on board. This is going to be huge, and will move the company in a new direction. Stay tuned. -m</p>
]]></content:encoded>
			<wfw:commentRss>http://dubinko.info/blog/2013/05/20/five-years-at-marklogic/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Semantics!</title>
		<link>http://dubinko.info/blog/2013/04/13/semantics/</link>
		<comments>http://dubinko.info/blog/2013/04/13/semantics/#comments</comments>
		<pubDate>Sat, 13 Apr 2013 19:35:01 +0000</pubDate>
		<dc:creator>mdubinko</dc:creator>
				<category><![CDATA[announcement]]></category>
		<category><![CDATA[intentional web]]></category>
		<category><![CDATA[Mark Logic]]></category>
		<category><![CDATA[standards]]></category>
		<category><![CDATA[demo]]></category>
		<category><![CDATA[keynote]]></category>
		<category><![CDATA[marklogic]]></category>
		<category><![CDATA[mlw]]></category>
		<category><![CDATA[mlw13]]></category>
		<category><![CDATA[rdf]]></category>
		<category><![CDATA[semantics]]></category>

		<guid isPermaLink="false">http://dubinko.info/blog/?p=1024</guid>
		<description><![CDATA[This week marked the MarkLogic World conference and with it some exciting news. Without formally &#8220;announcing&#8221; a new release, the company showed off a great deal of semantic technology in-progress. Part of that came from me, on stage during the Wednesday technical keynote. I&#8217;ve been at MarkLogic five years next month, and the first piece [...]]]></description>
				<content:encoded><![CDATA[<p>This week marked the MarkLogic World conference and with it some exciting news. Without formally &#8220;announcing&#8221; a new release, the company showed off a great deal of semantic technology in-progress. Part of that came from me, on stage during the Wednesday technical keynote. I&#8217;ve been at MarkLogic five years next month, and the first piece of code I wrote there was an <a title="RDFa on github" href="https://github.com/marklogic/commons/tree/master/rdfa">RDFa parser</a>. This has been a long time coming.</p>
<p>It was an amazing experience. I was responsible for sifting through the huge amounts of public data&#8211;both in RDF formats and on public web pages&#8211;and writing the semantic code to pull everything together, culminating in those ten minutes on stage.</p>
<p>Picture this: just behind the big stage and the projected screens was a hive of impressive activity. I counted 8 A/V people backstage, plus 4 more at the back of the auditorium. The conference has reached a level of production values that wouldn&#8217;t be vastly different if it were a stadium affair. So in back there&#8217;s a curtained-off &#8220;green room&#8221; with some higher-grade snacks (think PowerBars and Red Bull) with a flatscreen that shows the stage. From back there you can&#8217;t see the projected slides or demos, but if you step just outside, you&#8217;re at the reverse side of the screen, larger-than-life. The narrow walkway leads to the &#8220;chute&#8221;, right up the steps onto the main stage. As David Gorbet went through the opening moments of his talk in fine form, I did some stretches and did everything I could think of to prepare myself.</p>
<p>Then he called me up and the music blasted out from the speakers. I had been playing through my mind all the nightmare scenarios&#8211;tripping on the stairs and falling on my face as I came onstage (etc.)&#8211;but none of that happened. I&#8217;ve done public speaking many times before so I had an idea what to expect, though on a stage like that the lights are so bright that it&#8217;s hard to see beyond about the third row. So despite the 300-400 people in the room, it didn&#8217;t even feel much different than addressing an intimate group of peers. It was fun. On with the demos:</p>
<p>The first showed our internal MarkMail cluster with a simple &#8216;infobox&#8217; of the sort that all the search engines are doing these days. This was an icebreaker to talk about semantics and how it works&#8211;in this case locate the concept of Hadoop in the database, and from there find all the related labels, abstracts, people, projects, releases, and so on. During the construction of the demo, we uncovered some real world facts about the author of the top-ranked message for the query, including a book he wrote. The net effect was that these additional facts made the results a lot more useful by providing a broader context for them.</p>
<p>The second demo showed improved recall&#8211;that is, finding things that would otherwise slip under the radar. The existing [from:IBM] query in MarkMail does a good job finding people that happen to have the letters i-b-m in their email address. The semantic query [affiliation:IBM] in contrast knows about the concept of IBM, the concept of people, and the relationship of is-affiliated-with (technically foaf:affiliation) to run a query that more closely models how a person would ask the question: &#8220;people that work for IBM&#8221; as opposed to &#8220;people that have i-b-m in their email address&#8221;. Thus the results included folks posting from gmail accounts and other personal addresses, and the result set jumped from about 277k messages to 280k messages.</p>
<p>At this point, I paused to talk about the architecture underlying the technology. It turns out that a system that already supports shared-nothing scale out, full ACID transactions, multiple HA/DR options, and a robust security model is a good starting point for building semantic capabilities. (I got so excited at this point that I forgot to use the clicker for a few beats and had to quickly catch up the slides.) SPARQL code on the screen.</p>
<p>Then the third demo, a classic semantic app with a twist. Pulling together triples from several different public vocabularies, we answered the question of &#8220;find a Hadoop expert&#8221; with each row of the results representing not a document, as in MarkMail results, but an actual person. We showed location data (which was actually randomized to avoid privacy concerns) and aggregate cost-of-living data for each city. When we added in a search term, we drew histograms of MarkMail message traffic over time and skipped over the result that had no messages. The audience was entranced.</p>
<p>This is exciting work. I had several folks come up to me afterwards with words to the effect of they hadn&#8217;t realized it before, but boy do they ever need semantics. I can&#8217;t think of a better barometer for a technical keynote. So back to work I go. There&#8217;s a lot to do.</p>
<p>Thanking by name is dangerous, because inevitably people get left out, but I would like to shout out to David Gorbet who ran the keynote, John Snelson who&#8217;s a co-conspirator in the development effort, Eric Bloch who helped with the MarkMail code more than anyone will ever know, Denis Shehan who was instrumental in wrangling the cloud and data, and Stephen Buxton who patiently and repeatedly offered feedback that helped sharpen the message.</p>
<p>I&#8217;ll post a pointer to the video when it&#8217;s available. -m</p>
]]></content:encoded>
			<wfw:commentRss>http://dubinko.info/blog/2013/04/13/semantics/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Introducing node-node:node.node</title>
		<link>http://dubinko.info/blog/2013/03/31/introducing-node-nodenode-node/</link>
		<comments>http://dubinko.info/blog/2013/03/31/introducing-node-nodenode-node/#comments</comments>
		<pubDate>Mon, 01 Apr 2013 02:38:15 +0000</pubDate>
		<dc:creator>mdubinko</dc:creator>
				<category><![CDATA[annoyance]]></category>
		<category><![CDATA[everythingismiscellaneous]]></category>
		<category><![CDATA[intentional web]]></category>
		<category><![CDATA[software]]></category>
		<category><![CDATA[web20]]></category>
		<category><![CDATA[html5]]></category>
		<category><![CDATA[js]]></category>
		<category><![CDATA[naming]]></category>
		<category><![CDATA[node]]></category>
		<category><![CDATA[semanticweb]]></category>

		<guid isPermaLink="false">http://dubinko.info/blog/?p=1019</guid>
		<description><![CDATA[Naming is hard to do well, almost as hard as designing good software in the first place. Take for instance the term &#8216;node&#8217; which depending on the context can mean A fundamental unit of the DOM (Document Object Model) used in creating rich HTML5 applications. A basic unit of the Semantic Web&#8211;a thing you can say stuff [...]]]></description>
				<content:encoded><![CDATA[<p>Naming is hard to do well, almost as hard as designing good software in the first place. Take for instance the term &#8216;node&#8217; which depending on the context can mean</p>
<ol>
<li><span style="line-height: 13px;">A fundamental unit of the DOM (Document Object Model) used in creating rich HTML5 applications.</span></li>
<li>A basic unit of the Semantic Web&#8211;a thing you can say stuff about. Some nodes are even unlabeled, and hence &#8216;blank nodes&#8217;.</li>
<li>In operations, a node means, roughly, a machine on the network. E.g. &#8220;sixteen-node cluster&#8221;</li>
<li>A <a href="http://en.wikipedia.org/wiki/Node.js">software library</a> for event-driven, asynchronous development with JavaScript.</li>
</ol>
<p>I find myself at the forefront of a growing chorus of software architects and API designers that are fed up with this overloading of a perfectly good term. So I&#8217;m happy today to announce node-node:node.node.</p>
<p>The system is still in pre-alpha, but it solves all of the most pressing problems that software developers routinely run into. In this framework, every node represents a node, for the ultimate in scalable distributed document storage. In addition, every node additionally serves as a node, which provides just enough context to make open-world assumption metadata assertions at node-node-level granularity. Using the power of Node, every node modeled as a node has instant access to other node-node:nodes. The network really is the computer. You may never write a program the old way again. Follow my progress on <a href="http://sourceforge.net/projects/aprilfools">Sourceforge</a>, the latest and most cutting-edge social code-sharing site. -m</p>
]]></content:encoded>
			<wfw:commentRss>http://dubinko.info/blog/2013/03/31/introducing-node-nodenode-node/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>WFH</title>
		<link>http://dubinko.info/blog/2013/03/01/wfh/</link>
		<comments>http://dubinko.info/blog/2013/03/01/wfh/#comments</comments>
		<pubDate>Fri, 01 Mar 2013 07:48:40 +0000</pubDate>
		<dc:creator>mdubinko</dc:creator>
				<category><![CDATA[annoyance]]></category>
		<category><![CDATA[trends]]></category>
		<category><![CDATA[yahoo]]></category>
		<category><![CDATA[marissa]]></category>
		<category><![CDATA[slackers]]></category>
		<category><![CDATA[wfh]]></category>

		<guid isPermaLink="false">http://dubinko.info/blog/?p=1014</guid>
		<description><![CDATA[The valley is buzzing about Marissa&#8217;s edict putting the kibosh on Yahoos working from home. I don&#8217;t have any first-hand information, but apparently this applies somewhat even to one-day-a-week telecommuters. Some are saying Marissa&#8217;s making a mistake, but I don&#8217;t think so. She&#8217;s too smart for that. There&#8217;s no better way to get extra hours [...]]]></description>
				<content:encoded><![CDATA[<p>The valley is buzzing about Marissa&#8217;s edict putting the kibosh on Yahoos working from home. I don&#8217;t have any first-hand information, but apparently this applies somewhat even to one-day-a-week telecommuters. Some are saying Marissa&#8217;s making a mistake, but I don&#8217;t think so. She&#8217;s too smart for that. There&#8217;s no better way to get extra hours of work out of a motivated A-lister than letting them skip the commute, and I work regularly with several full-time telecommuters. It works out just fine.</p>
<p>This is a sign that Y is still infested with slackers. From what I&#8217;ve seen, a B-or-C-lister will ruthlessly take advantage of a WFH policy. If that dries up, they&#8217;ll move on.</p>
<p>If I&#8217;m right, the policy will indeed go into effect at Yahoo starting this summer, and after a respectable amount of time has passed (and the slackers leave) it will loosen up again. And Yahoo will be much stronger for it. Agree? -m</p>
]]></content:encoded>
			<wfw:commentRss>http://dubinko.info/blog/2013/03/01/wfh/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Nerve-wracking</title>
		<link>http://dubinko.info/blog/2013/02/18/nerve-wracking/</link>
		<comments>http://dubinko.info/blog/2013/02/18/nerve-wracking/#comments</comments>
		<pubDate>Mon, 18 Feb 2013 23:47:36 +0000</pubDate>
		<dc:creator>mdubinko</dc:creator>
				<category><![CDATA[ASL]]></category>
		<category><![CDATA[asl]]></category>
		<category><![CDATA[comfortzone]]></category>
		<category><![CDATA[deaf]]></category>
		<category><![CDATA[hearing]]></category>
		<category><![CDATA[presentation]]></category>
		<category><![CDATA[public speaking]]></category>
		<category><![CDATA[signing]]></category>

		<guid isPermaLink="false">http://dubinko.info/blog/?p=1011</guid>
		<description><![CDATA[So I did it. I stood up on a platform in front of a room of native signers, and delivered a (pre-prepared) five-minute presentation without making a sound. In front of cameras, with my ugly face beamed out to multiple large screens. That was stressful, though less so than many different public speaking engagements [...]]]></description>
				<content:encoded><![CDATA[<p>So I did it.</p>
<p>I stood up on a platform in front of a room of native signers, and delivered a (pre-prepared) five-minute presentation without making a sound. In front of cameras, with my ugly face beamed out to multiple large screens.</p>
<p>That was stressful, though less so than many different public speaking engagements I&#8217;ve participated in. It was a different <em>kind</em> of stress. I&#8217;m sure I made all kinds of mistakes of which I wasn&#8217;t even aware. ASL books, videos, and web sites tend to focus on particular signs, and vocabulary is one important part of learning the language&#8211;but not the only part. A huge amount of the communication comes through facial expression, body shifting and language, and other &#8220;non-manual markers.&#8221; I&#8217;m learning, if slowly.</p>
<p>It&#8217;s also helping me in everyday situations, among hearing folks. I&#8217;m better able to express myself and I&#8217;ve picked up some new gestures (like non-dominant-hand indexing&#8230;more on that later), and I tend to, even if in the back of my mind, think about how you&#8217;d express such-and-such an idea in ASL, and having thought it through more, better express it in writing or speech.</p>
<p>It&#8217;s also helping to finally tame my inner introvert. When a fundamental part of communication involves displaying play-by-play emotions on your face (and indeed, entire body) it changes you. Better than acting lessons.</p>
<p>What have you done lately to push yourself out of your comfort zone? -m</p>
]]></content:encoded>
			<wfw:commentRss>http://dubinko.info/blog/2013/02/18/nerve-wracking/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>New Year&#8217;s Resolution</title>
		<link>http://dubinko.info/blog/2012/12/31/new-years-resolution/</link>
		<comments>http://dubinko.info/blog/2012/12/31/new-years-resolution/#comments</comments>
		<pubDate>Tue, 01 Jan 2013 03:24:12 +0000</pubDate>
		<dc:creator>mdubinko</dc:creator>
				<category><![CDATA[announcement]]></category>
		<category><![CDATA[annoyance]]></category>
		<category><![CDATA[2013]]></category>
		<category><![CDATA[resolution]]></category>

		<guid isPermaLink="false">http://dubinko.info/blog/?p=1008</guid>
		<description><![CDATA[Holding steady at 1440 x 900. Relevant. -m]]></description>
				<content:encoded><![CDATA[<p>Holding steady at 1440 x 900.</p>
<p><a href="http://www.quickmeme.com/meme/3se7jl/">Relevant</a>. -m</p>
]]></content:encoded>
			<wfw:commentRss>http://dubinko.info/blog/2012/12/31/new-years-resolution/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Fluency</title>
		<link>http://dubinko.info/blog/2012/12/25/fluency/</link>
		<comments>http://dubinko.info/blog/2012/12/25/fluency/#comments</comments>
		<pubDate>Wed, 26 Dec 2012 05:57:47 +0000</pubDate>
		<dc:creator>mdubinko</dc:creator>
				<category><![CDATA[ASL]]></category>
		<category><![CDATA[acquisition]]></category>
		<category><![CDATA[asl]]></category>
		<category><![CDATA[communication]]></category>
		<category><![CDATA[fluency]]></category>
		<category><![CDATA[language]]></category>

		<guid isPermaLink="false">http://dubinko.info/blog/?p=1006</guid>
		<description><![CDATA[My journey into ASL continues. I&#8217;ve been reading Oliver Sacks&#8217; _Seeing Voices_ and Harlan Lane, Robert Hoffmeister, and Ben Bahan&#8217;s _A Journey into the DEAF-WORLD_. In short, learning a language in your thirties is a whole different ballgame than learning as a toddler. There are a few different brain plasticity cliffs you drop off, especially [...]]]></description>
				<content:encoded><![CDATA[<p>My journey into ASL continues. I&#8217;ve been reading Oliver Sacks&#8217; _Seeing Voices_ and Harlan Lane, Robert Hoffmeister, and Ben Bahan&#8217;s _A Journey into the DEAF-WORLD_. In short, learning a language in your thirties is a whole different ballgame than learning as a toddler. There are a few different brain plasticity cliffs you drop off, especially at around age 6 and again at age 12.</p>
<p>And I&#8217;m completely OK with this. I don&#8217;t expect to ever get confused for a native signer, which is fine with me. I do expect, however, to become a better communicator&#8211;to develop sufficient skill to be clearly understood in ASL. I prefer to think of it like someone with a suave British accent in America. You&#8217;d never mistake them for a native and yet they are a joy to converse with. In the right circumstances, they can even grab your attention more so than someone with a native accent.</p>
<p>This can only do good things for my spoken communication skills as well. It&#8217;s a lot like acting classes in some respects, which is a marked departure from my normally taciturn personality. This is encouraging me to quit holding everything inside quite so much, with encouraging results. If you see me walking a little taller, speaking a bit more emphatically, or better conveying emotion to get my point across, now you know what&#8217;s behind that. -m</p>
]]></content:encoded>
			<wfw:commentRss>http://dubinko.info/blog/2012/12/25/fluency/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Mistakes</title>
		<link>http://dubinko.info/blog/2012/12/08/mistakes/</link>
		<comments>http://dubinko.info/blog/2012/12/08/mistakes/#comments</comments>
		<pubDate>Sun, 09 Dec 2012 04:39:27 +0000</pubDate>
		<dc:creator>mdubinko</dc:creator>
				<category><![CDATA[ASL]]></category>
		<category><![CDATA[asl]]></category>
		<category><![CDATA[learning]]></category>
		<category><![CDATA[mistakes]]></category>

		<guid isPermaLink="false">http://dubinko.info/blog/?p=999</guid>
		<description><![CDATA[I&#8217;ve been learning a new language lately: American Sign Language aka ASL. Along with the language, I&#8217;ve picked up lots of new friends as part of a thriving culture. A big part of learning is through mistakes, and a big part of said culture is helpful bluntness. The combination of these can be a little [...]]]></description>
				<content:encoded><![CDATA[<p>I&#8217;ve been learning a new language lately: American Sign Language aka ASL. Along with the language, I&#8217;ve picked up lots of new friends as part of a thriving culture. A big part of learning is through mistakes, and a big part of said culture is helpful bluntness. The combination of these can be a little rough on your ego sometimes.</p>
<p>Sometimes I notice that, when I&#8217;m corrected&#8211;say I make a sign incorrectly and my conversational partner demonstrates the correct way to do it&#8211;I often can&#8217;t tell any difference between what I was supposed to do and what my hands actually did. This kind of fundamental error in cognition seems to happen all the time with me. My helpful friends tell me that&#8217;s a good sign. (no pun intended)</p>
<p>A less-bruising kind of error is the &#8220;oops&#8221; kind&#8211;the instant you commit the error, you know you messed up. This, however, can sometimes throw you off if you get self-conscious about it. A third kind of error is when you know exactly what to do, but your physiology holds you back&#8211;for instance the ASL sign for either &#8216;6&#8217; or &#8216;W&#8217; (made the way most hearing people show a &#8216;3&#8217; on their fingers; thumb holding down the pinky) is difficult for me to make without slowing way down. And to think, only 13 years ago I was playing keyboards in a little garage band. Guess I need some stretches. It&#8217;s good to loosen up.</p>
<p>In ASL, though, there&#8217;s a weird kind of middle ground. Sometimes people who don&#8217;t know Spanish kind of &#8216;fake it&#8217; &#8212; &#8220;Yo no speako español&#8221; and the like, which has always come across to me as vaguely offensive. Being overly terrified of making a mistake is itself a fourth kind of mistake. ASL is remarkably flexible; even though it&#8217;s a complete language, it has aspects based on pantomime and sometimes &#8220;classifiers&#8221;, where your hands and fingers can stand in for people, vehicles, or many other things of particular shapes/sizes. I watch some very well-made ASL productions that have equally well-made English paragraphs alongside, and the ASL version uses all of these techniques and more. No word-for-word correspondence here: every time, I&#8217;m surprised by the versatility of the language. My theory is that for an earnest student, it&#8217;d be a lot harder to accidentally come across as offensive or mocking the language in ASL compared to spoken languages. And thus, I&#8217;m probably committing the fourth kind of error too much.</p>
<p>It&#8217;s good to loosen up. -m</p>
]]></content:encoded>
			<wfw:commentRss>http://dubinko.info/blog/2012/12/08/mistakes/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Hedgehogs and Foxes</title>
		<link>http://dubinko.info/blog/2012/11/20/hedgehogs-and-foxes/</link>
		<comments>http://dubinko.info/blog/2012/11/20/hedgehogs-and-foxes/#comments</comments>
		<pubDate>Wed, 21 Nov 2012 06:46:59 +0000</pubDate>
		<dc:creator>mdubinko</dc:creator>
				<category><![CDATA[commercialism]]></category>
		<category><![CDATA[geekthoughts]]></category>
		<category><![CDATA[trends]]></category>
		<category><![CDATA[analysis]]></category>
		<category><![CDATA[foxes]]></category>
		<category><![CDATA[hedgehogs]]></category>
		<category><![CDATA[listening]]></category>
		<category><![CDATA[natesilver]]></category>
		<category><![CDATA[punditry]]></category>
		<category><![CDATA[tolstoy]]></category>
		<category><![CDATA[understanding]]></category>

		<guid isPermaLink="false">http://dubinko.info/blog/?p=996</guid>
		<description><![CDATA[In Nate Silver&#8217;s new book, he mentions a classification system for experts, originally from Berkeley professor Philip Tetlock, along a spectrum of Fox &#60;&#8212;&#62; Hedgehog. (The nomenclature comes from an essay about Tolstoy.) Hedgehogs are type A personalities who believe in Big Ideas. They are ideologues and go &#8220;all-in&#8221; on whatever they&#8217;re espousing. A great [...]]]></description>
				<content:encoded><![CDATA[<p>In <a title="The signal and the noise: why so many predictions fail -- but some don't" href="http://www.amazon.com/dp/159420411X">Nate Silver&#8217;s new book</a>, he mentions a classification system for experts, originally from Berkeley professor Philip Tetlock, along a spectrum of Fox &lt;&#8212;&gt; Hedgehog. (The nomenclature comes from an <a href="http://en.wikipedia.org/wiki/The_Hedgehog_and_the_Fox">essay about Tolstoy</a>.)</p>
<p>Hedgehogs are type A personalities who believe in Big Ideas. They are ideologues and go &#8220;all-in&#8221; on whatever they&#8217;re espousing. A great many pundits fall into this category.</p>
<p>Foxes are scrappy creatures who believe in a plethora of little ideas and in taking different approaches toward a problem, and are more tolerant of nuance, uncertainty, complexity, and dissent.</p>
<p>There are a lot of social situations (broadly construed) where hedgehogs seem to have the upper hand. Talking heads on TV are a huge example, but so are many fixtures in the tech world, Malcolm Gladwell, say. Most of the places I&#8217;ve worked at have at least a subtle hedgehog-bias toward hiring, promotions, and career development.</p>
<p>To some degree, I think this stems from a lack of self-awareness. Brash pundits come across better on the big screen; they grab your attention and take a bold stand for something&#8211;who wouldn&#8217;t like that? But if you take pause and think about what they&#8217;re saying or (horror) go back and measure their predictions after-the-fact, they don&#8217;t look nearly so good. Foxes are better at getting things right.</p>
<p>It seems like we&#8217;ve just been through a phase of more-obnoxious-than-usual punditry, and I found this spectrum a useful way to look at things. How about you? Are you paying more attention to hedgehogs when you probably should be listening to the foxes?</p>
<p>-m</p>
]]></content:encoded>
			<wfw:commentRss>http://dubinko.info/blog/2012/11/20/hedgehogs-and-foxes/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Virgil Matheson: mentor</title>
		<link>http://dubinko.info/blog/2012/09/28/virgil-matheson-mentor/</link>
		<comments>http://dubinko.info/blog/2012/09/28/virgil-matheson-mentor/#comments</comments>
		<pubDate>Sat, 29 Sep 2012 01:38:57 +0000</pubDate>
		<dc:creator>mdubinko</dc:creator>
				<category><![CDATA[aswemaythink]]></category>
		<category><![CDATA[stuff]]></category>

		<guid isPermaLink="false">http://dubinko.info/blog/?p=990</guid>
		<description><![CDATA[I&#8217;ve mentioned Virgil Matheson in these pages a few times, but never made a full accounting. When I had my O&#8217;Reilly book published, I submitted a simple dedication in the manuscript: for Virgil But for whatever reason, it didn&#8217;t make it into the printed edition. This post is a small step toward letting the world [...]]]></description>
				<content:encoded><![CDATA[<p>I&#8217;ve mentioned Virgil Matheson in these pages a few times, but never made a full accounting. When I had my <a href="http://www.xformsinstitute.com">O&#8217;Reilly book</a> published, I submitted a simple dedication in the manuscript:</p>
<blockquote><p><em>for Virgil</em></p></blockquote>
<p>But for whatever reason, it didn&#8217;t make it into the printed edition. This post is a small step toward letting the world know about someone important to me.</p>
<p>We first met in 1985 or thereabouts. One day while riding my bike through a back-alley, I stopped to look at an equipment rack set outside a spare garage. Virgil came out to give a get-off-my-lawn kind of speech, and somehow we ended up talking about electronics.  This led to discussions about crystal radios, and in a subsequent visit, we built one, he explaining the principles of operation. Virgil, it turns out, was a retired teacher at the North Dakota State School of Science, where he taught AC theory and thermodynamics. I was going through some rough times, and Virgil ended up being a much-needed role model.</p>
<p>Around that time, I had attempted to build a Heathkit radio set, but couldn&#8217;t quite get it working. I brought it to Virgil, and we traced through the schematic diagrams, eventually getting it working. Along the way, Virgil introduced me to all kinds of electronic test equipment, including oscilloscopes and galvanometers that he had hand-wound in his younger days.</p>
<p>The next year, I needed a science project, and I had become fixated on Tesla Coils. Virgil had worked at Westinghouse (but not in overlap with the good N. Tesla) and found this project right up his alley. We used his wood lathe to turn a base for the coil, and a standard lathe to wind a primary and two perfectly-spaced secondary coils on PVC pipe, after which we sprayed them down with insulating paint. We built a high-voltage power supply out of a car battery, ignition coil, and relay-type regulator from the junkyard. The thing would turn out serious spark on the primary side, and at one point, I accidentally made contact with it, knocking me clear off the metal bench I was sitting on. We used a spark gap and high-voltage capacitors from old equipment to make a resonator, and got the coil working. It could light a fluorescent tube from my full arm-span away. It was a smash hit at the science fair, too.</p>
<p>For one so knowledgeable about the foundations of technology, he was awfully curmudgeonly about it. He bemoaned the day students started showing up in his class with hand-calculators instead of slide rules. He would never answer the phone (but would speak on it, if you could get his brother to pick up).</p>
<p>We kept meeting on and off, and we would have epic discussions/debates about technology, thermodynamics, perpetual motion machines, higher mathematics, theology, building test equipment, and logic puzzles. He taught me, in short, how to think.</p>
<p>A non-exhaustive list of things he taught me:</p>
<ul>
<li>How to build a crystal radio set</li>
<li>How to troubleshoot a conventional radio (hint&#8211;check for signal at the volume control&#8211;that will narrow down the problem to either the front-end or back-end)</li>
<li>How to compute resonant LC circuits</li>
<li>How to use a slide rule</li>
<li>How to pick locks</li>
<li>How to compute power factor and plot phasor diagrams for AC circuits</li>
<li>The value of good tools and how to care for them</li>
<li>How to build a Tesla Coil</li>
<li>How to debate</li>
<li>Respect for high voltage</li>
<li>The joy of back-issues of Scientific American</li>
<li>The trouble with Pascal&#8217;s wager</li>
<li>How to debunk perpetual motion claims</li>
<li>How (and why) to use a planimeter</li>
</ul>
<p>On a recent vacation, I went to see Virgil again&#8211;now in his 90s. He&#8217;s still vigorous and feisty, though his memory is starting to slip a little. It was difficult to come to terms with the possibility that, given the frequency with which I make it to that part of the country, it may be the last time I see him. Since this is posted online, he&#8217;ll probably never see it. But if he could speak to each one of you, I think he&#8217;d offer advice something like this:</p>
<p>Cherish the people in your life. Treat every meeting as if it might be the one that sets you on a new course&#8211;one that you&#8217;ll look back at years later in wonder. Don&#8217;t worry what others think of you, and never stop learning.</p>
<p>Thank you, Virgil, for all you&#8217;ve given me. -m</p>
]]></content:encoded>
			<wfw:commentRss>http://dubinko.info/blog/2012/09/28/virgil-matheson-mentor/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>MarkLogic 6 is here</title>
		<link>http://dubinko.info/blog/2012/09/17/marklogic-6-is-here/</link>
		<comments>http://dubinko.info/blog/2012/09/17/marklogic-6-is-here/#comments</comments>
		<pubDate>Mon, 17 Sep 2012 19:55:14 +0000</pubDate>
		<dc:creator>mdubinko</dc:creator>
				<category><![CDATA[announcement]]></category>
		<category><![CDATA[Mark Logic]]></category>
		<category><![CDATA[xml]]></category>
		<category><![CDATA[XQuery]]></category>
		<category><![CDATA[java]]></category>
		<category><![CDATA[marklogic]]></category>
		<category><![CDATA[release]]></category>
		<category><![CDATA[ryandew]]></category>
		<category><![CDATA[shipping]]></category>
		<category><![CDATA[visualization]]></category>
		<category><![CDATA[widgets]]></category>
		<category><![CDATA[xquery]]></category>

		<guid isPermaLink="false">http://dubinko.info/blog/?p=987</guid>
		<description><![CDATA[MarkLogic 6 launched today, and it&#8217;s full of new and updated goodies. I spent some time designing the new Application Builder including the new Visualization Widgets. If you&#8217;ve used Application Builder in the past, you&#8217;ll be pleasantly surprised at the changes. It&#8217;s leaner and faster under the hood. I&#8217;d love to hear what people think [...]]]></description>
				<content:encoded><![CDATA[<p>MarkLogic 6 <a title="MarkLogic 6" href="http://www.marklogic.com/products-and-services/marklogic-6/">launched</a> today, and it&#8217;s full of <a href="http://www.marklogic.com/products-and-services/whats-new-in-marklogic-6/">new</a> and updated goodies. I spent some time designing the new Application Builder including the new Visualization Widgets. If you&#8217;ve used Application Builder in the past, you&#8217;ll be pleasantly surprised at the changes. It&#8217;s leaner and faster under the hood. I&#8217;d love to hear what people think of the new architecture, and how they&#8217;re using it in new and awesome ways.</p>
<p>If I had to pick out a common theme for the release, it&#8217;s all about expanding the appeal of the server to reach new audiences. The Java API makes working with the server feel like a native extension to the language, and the REST API makes it easy to extend the same to other languages.</p>
<p>XQuery support is stronger than ever. I liked Ryan Dew&#8217;s take on some of the smaller, but still useful <a href="http://maxdewpoint.blogspot.cz/2012/09/marklogic-60-released.html">features</a>.</p>
<p>This wouldn&#8217;t be complete without thanking my teammates who really made this possible. I had the great pleasure of working with some top-notch front-end people recently, and it&#8217;s been a great experience. -m</p>
]]></content:encoded>
			<wfw:commentRss>http://dubinko.info/blog/2012/09/17/marklogic-6-is-here/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Super simple tokenizer in XQuery</title>
		<link>http://dubinko.info/blog/2012/08/23/super-simple-tokenizer-in-xquery/</link>
		<comments>http://dubinko.info/blog/2012/08/23/super-simple-tokenizer-in-xquery/#comments</comments>
		<pubDate>Fri, 24 Aug 2012 01:10:56 +0000</pubDate>
		<dc:creator>mdubinko</dc:creator>
				<category><![CDATA[languages]]></category>
		<category><![CDATA[Mark Logic]]></category>
		<category><![CDATA[software]]></category>
		<category><![CDATA[xml]]></category>
		<category><![CDATA[XQuery]]></category>
		<category><![CDATA[lexer]]></category>
		<category><![CDATA[marklogic]]></category>
		<category><![CDATA[parser]]></category>
		<category><![CDATA[recursion]]></category>
		<category><![CDATA[searchapi]]></category>
		<category><![CDATA[sparql]]></category>
		<category><![CDATA[tokenizer]]></category>
		<category><![CDATA[xquery]]></category>

		<guid isPermaLink="false">http://dubinko.info/blog/?p=982</guid>
		<description><![CDATA[A lexer might seem like one of the boringest pieces of code to write, but every language brings its own little wrinkles to the problem. Elegant solutions are more work, but also more rewarding. There is, of course, a large body of work on table-driven approaches, several of them listed here (and bigger list), though [...]]]></description>
				<content:encoded><![CDATA[<p>A lexer might seem like one of the boringest pieces of code to write, but every language brings its own little wrinkles to the problem. Elegant solutions are more work, but also more rewarding.</p>
<p>There is, of course, a large body of work on table-driven approaches, several of them listed <a href="http://en.wikipedia.org/wiki/Lexical_analysis">here</a> (and <a href="http://en.wikipedia.org/wiki/List_of_parser_generators">bigger list</a>), though XQuery seems to have been largely left out of the fun.</p>
<p>In MarkLogic Search API, we implemented a recursive tokenizer. Since a search string can contain quoted pieces which need to be carefully maintained, first we split (in the <a href="http://www.xqueryfunctions.com/xq/fn_tokenize.html">fn:tokenize</a>-sense, discarding matched delimiters) on the quote character, then iterate through the pieces. Odd-numbered pieces are chunks of tokens outside of any quoting, and even-numbered pieces are a single quoted string, to be preserved as-is. We recurse through the odd chunks, further breaking them down into individual tokens, as well as normalizing whitespace and a few other cleanup operations. This code is aggressively optimized, and it removes any searches for tokens known to not appear in the overall string. It also preserves the character offset positions of each token relative to the starting string, which gets used downstream, so this makes for some of the most complicated code in the Search API. But it&#8217;s blazingly fast.</p>
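<p>A minimal sketch of the split-on-quotes idea, assuming a made-up local:split-quoted function and plain tok elements. This is not the Search API code: it skips the offset tracking and the optimizations described above, and it assumes the quotes in the input are balanced.</p>
<pre>(: Hypothetical helper, for illustration only. :)
declare function local:split-quoted($in as xs:string) as element(tok)* {
    for $piece at $i in fn:tokenize($in, '"')
    return
        if ($i mod 2 eq 0)
        (: even-numbered pieces sat between quotes: keep them whole :)
        then &lt;tok type="quoted"&gt;{$piece}&lt;/tok&gt;
        (: odd-numbered pieces are unquoted: normalize whitespace, then split :)
        else
            for $word in fn:tokenize(fn:normalize-space($piece), " ")
            return &lt;tok type="word"&gt;{$word}&lt;/tok&gt;
};

local:split-quoted('cat "hot dog" bird')</pre>
<p>For that input the sketch yields three tokens: the word cat, the quoted string hot dog kept as one piece, and the word bird.</p>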
<p>When prototyping, it&#8217;s nice to have something simpler and more straightforward. So I came up with an approach using fn:analyze-string. This function, which grew out of the xsl:analyze-string instruction in XSLT 2.0 and arrived as a function in XQuery 3.0, takes a string and a regular expression, and returns all of the target string, neatly divided into match and non-match portions. This is great, but difficult to apply across the entire string. For example, potential matches can have different meanings depending on where they fall (again, quoted strings are an example). But if every regex starts with ^ which anchors the match to the front of the string, the problem simplifies to peeling off a single token from the front of the string. Keep doing this until there&#8217;s no string left.</p>
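<p>A minimal sketch of a single peel, assuming a processor that provides fn:analyze-string. The namespace of the result elements can differ between processors, so this sketch selects them with a wildcard (*:match) rather than a bound prefix; the sample input is made up.</p>
<pre>let $in := "dc:title rest"
let $split := fn:analyze-string($in, "^[a-zA-Z][a-zA-Z0-9]*:")
return (
    (: the anchored regex can match at most once, at the very front :)
    string($split/*:match),
    (: everything after the peeled token :)
    string($split/*:non-match)
)</pre>
<p>This returns the two strings "dc:" and "title rest": one token peeled, and the remainder ready for the next round.</p>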
<p>This is a particularly nice approach when parsing a grammar that&#8217;s formally defined in EBNF. You can pretty much take the list of terminal expressions, port them to XQuery-style regexes, add a ^ in front of each, and roll.</p>
<p>Take SPARQL for example. It&#8217;s a reasonably rich grammar. The <a href="http://www.w3.org/TR/sparql11-query/#rGroupGraphPattern">W3C draft spec</a> has 35 productions for terminals. I sketched out some of the terminal rules (note these are simplified):</p>
<pre>declare variable $spq:WS     := "^\s+";
declare variable $spq:QNAME  := "^[a-zA-Z][a-zA-Z0-9]*:[a-zA-Z][a-zA-Z0-9]*";
declare variable $spq:PREFIX := "^[a-zA-Z][a-zA-Z0-9]*:";
declare variable $spq:NAME   := "^[a-zA-Z][a-zA-Z0-9]*";
declare variable $spq:IRI    := "^&lt;[^&gt;]+&gt;";
...</pre>
<p>Then we go through the input string and see which of these expressions matches. When one does, we call analyze-string, add the matched portion as a token, and recurse on the non-matched portion. Note that we need to try longer matches first, so the rule for &#8216;prefix:qname&#8217; comes before the rule for &#8216;prefix:&#8217;, which comes before the rule for a bare name.</p>
<pre>declare function spq:tokenize-recurse($in as xs:string, $tl as json:array) {
    if ($in eq "")
    then ()
    else spq:tokenize-recurse(
        switch(true())
        case matches($in, $spq:WS)     return spq:discard-tok($in, $spq:WS)
        case matches($in, $spq:QNAME)  return spq:peel($in, $spq:QNAME, $tl, "qname")
        case matches($in, $spq:PREFIX) return spq:peel($in, $spq:PREFIX, $tl, "prefix", 0, 1)
        case matches($in, $spq:NAME)   return spq:peel($in, $spq:NAME, $tl, "name")
        ...</pre>
<p>Here, we&#8217;re co-opting a json:array mutable object as a convenient way to store tokens as we peel them off. There&#8217;s not actually any JSON involved here. The actual peeling looks like this:</p>
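<pre>(: Peel one token off the front of $in, push it onto $toklist, and return the rest.
   $triml and $trimr are how many delimiter characters to trim from the left and right of the match. :)
declare function spq:peel(
    $in as xs:string,
    $regex as xs:string,
    $toklist as json:array,
    $type as xs:string,
    $triml, $trimr) {
    (: $regex is anchored with ^, so there is at most one match, at the front :)
    let $split := analyze-string($in, $regex)
    let $match := string($split/str:match)
    let $match := if ($triml gt 0) then substring($match, $triml + 1) else $match
    let $match := if ($trimr gt 0) then substring($match, 1, string-length($match) - $trimr) else $match
    (: json:array is mutable, so the push is a side effect; $_ is a throwaway binding :)
    let $_ := json:array-push($toklist, &lt;searchdev:tok type="{$type}"&gt;{$match}&lt;/searchdev:tok&gt;)
    (: whatever follows the match is what the caller recurses on :)
    let $result := string($split/str:non-match)
    return $result
};</pre>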
<p>Some productions, like an &lt;iri&gt; inside angle brackets, contain fixed delimiters which get trimmed off. Some productions, like whitespace, get thrown away. And that&#8217;s it. As it stands, it&#8217;s pretty close to a table-driven approach. It&#8217;s also more flexible than the recursive approach above&#8211;even for things like escaped quotes inside a string, if you can write a regex for it, you can lex it.</p>
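<p>Putting the pieces together, here is a minimal self-contained sketch of the whole loop in plain XQuery, assuming a processor that provides fn:analyze-string. It is not the code above: the names local:TYPES, local:REGEXES, and local:tokenize are made up for illustration, tokens are returned as a sequence instead of being pushed onto a json:array, delimiters are not trimmed, and the result elements are selected with a wildcard (*:match) because their namespace can differ between processors.</p>
<pre>(: Hypothetical rule table as two parallel sequences. Order matters: longer matches first. :)
declare variable $local:TYPES   := ("ws", "qname", "prefix", "name", "iri");
declare variable $local:REGEXES := (
    "^\s+",
    "^[a-zA-Z][a-zA-Z0-9]*:[a-zA-Z][a-zA-Z0-9]*",
    "^[a-zA-Z][a-zA-Z0-9]*:",
    "^[a-zA-Z][a-zA-Z0-9]*",
    "^&lt;[^&gt;]+&gt;"
);

declare function local:tokenize($in as xs:string) as element(tok)* {
    if ($in eq "") then ()
    else
        (: position of the first rule whose anchored regex matches the front of $in :)
        let $i := (for $r at $pos in $local:REGEXES where fn:matches($in, $r) return $pos)[1]
        return
            if (fn:empty($i))
            then fn:error(xs:QName("local:NOMATCH"), fn:concat("no rule matches at: ", $in))
            else
                let $split := fn:analyze-string($in, $local:REGEXES[$i])
                let $match := string($split/*:match)
                let $rest  := string($split/*:non-match)
                return (
                    (: whitespace is discarded, everything else becomes a token :)
                    if ($local:TYPES[$i] eq "ws") then ()
                    else &lt;tok type="{$local:TYPES[$i]}"&gt;{$match}&lt;/tok&gt;,
                    local:tokenize($rest)
                )
};

local:tokenize("dc:title dc: name &lt;http://example.org/x&gt;")</pre>
<p>For that input the sketch produces four tokens in order: a qname (dc:title), a prefix (dc:), a name (name), and an iri, with the whitespace dropped. Keeping the rules in data rather than in a switch makes the table-driven flavor explicit, at the cost of the per-rule trimming that spq:peel supports.</p>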
<h2>Performance</h2>
<p>But is it fast? Short answer is that I don&#8217;t know. A full performance analysis would take some time. But a few quick inspections show that it&#8217;s not terrible, and certainly good enough for prototype work. I have no evidence for this, but I also suspect that it&#8217;s amenable to server-side optimization&#8211;inside the regular expression matching code, paths that involve start-anchored matches should be easy to identify and in many cases avoid work farther down the string. There&#8217;s plenty of room on the XQuery side for optimization as well.</p>
<p>If you&#8217;ve experimented with different lexing techniques, or are interested in more details of this approach, drop me a line in the comments. -m</p>
]]></content:encoded>
			<wfw:commentRss>http://dubinko.info/blog/2012/08/23/super-simple-tokenizer-in-xquery/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Balisage Bound</title>
		<link>http://dubinko.info/blog/2012/08/07/balisage-bound-3/</link>
		<comments>http://dubinko.info/blog/2012/08/07/balisage-bound-3/#comments</comments>
		<pubDate>Tue, 07 Aug 2012 14:24:32 +0000</pubDate>
		<dc:creator>mdubinko</dc:creator>
				<category><![CDATA[announcement]]></category>
		<category><![CDATA[annoyance]]></category>
		<category><![CDATA[xml]]></category>
		<category><![CDATA[balisage]]></category>
		<category><![CDATA[delay]]></category>
		<category><![CDATA[exploring]]></category>
		<category><![CDATA[flight]]></category>
		<category><![CDATA[montreal]]></category>
		<category><![CDATA[travel]]></category>

		<guid isPermaLink="false">http://dubinko.info/blog/?p=977</guid>
		<description><![CDATA[I&#8217;m en route to Balisage 2012, though beset by multiple delays. The first leg of my flight was more than two hours delayed, which made the 90 minute transfer window&#8230;problematic. My rebooked flight, the next day (today, that is) is also delayed. Then through customs. Maybe all I&#8217;ll get out of Tuesday is Demo Jam. [...]]]></description>
				<content:encoded><![CDATA[<p>I&#8217;m en route to Balisage 2012, though beset by multiple delays. The first leg of my flight was more than two hours delayed, which made the 90 minute transfer window&#8230;problematic. My rebooked flight, the next day (today, that is) is also delayed. Then through customs. Maybe all I&#8217;ll get out of Tuesday is <a href="http://www.balisage.net/2012/DemoJam.html">Demo Jam</a>. But I will make it.</p>
<p>I&#8217;m speaking on Thursday about exploring large XML datasets. Looking forward to it!</p>
<p>-m</p>
]]></content:encoded>
			<wfw:commentRss>http://dubinko.info/blog/2012/08/07/balisage-bound-3/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Relax NG vs XML Schema: ten year anniversary</title>
		<link>http://dubinko.info/blog/2012/06/04/relaxng-vs-xml/</link>
		<comments>http://dubinko.info/blog/2012/06/04/relaxng-vs-xml/#comments</comments>
		<pubDate>Mon, 04 Jun 2012 15:00:08 +0000</pubDate>
		<dc:creator>mdubinko</dc:creator>
				<category><![CDATA[stuff]]></category>
		<category><![CDATA[trends]]></category>
		<category><![CDATA[xml]]></category>
		<category><![CDATA[anniversary]]></category>
		<category><![CDATA[jamesclark]]></category>
		<category><![CDATA[relaxng]]></category>
		<category><![CDATA[xmlschema]]></category>
		<category><![CDATA[xsd]]></category>

		<guid isPermaLink="false">http://dubinko.info/blog/?p=973</guid>
		<description><![CDATA[Today is the 10-year anniversary of this epic message from James Clark on the relative merits of Relax NG vs. XML Schema, and whether the latter should receive preferential treatment. Still relevant today&#8211;the discussion is still going, although an increasing number of human-readable web specifications have adopted Relax NG in some form. -m]]></description>
				<content:encoded><![CDATA[<p>Today is the 10-year anniversary of this <a title="RELAX NG and W3C XML Schema" href="http://www.imc.org/ietf-xml-use/mail-archive/msg00217.html">epic message</a> from James Clark on the relative merits of Relax NG vs. XML Schema, and whether the latter should receive preferential treatment. Still relevant today&#8211;the discussion is still going, although an increasing number of human-readable web specifications have adopted Relax NG in some form. -m</p>
]]></content:encoded>
			<wfw:commentRss>http://dubinko.info/blog/2012/06/04/relaxng-vs-xml/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>MarkLogic World 2012</title>
		<link>http://dubinko.info/blog/2012/04/26/marklogic-world-2012/</link>
		<comments>http://dubinko.info/blog/2012/04/26/marklogic-world-2012/#comments</comments>
		<pubDate>Fri, 27 Apr 2012 05:25:44 +0000</pubDate>
		<dc:creator>mdubinko</dc:creator>
				<category><![CDATA[announcement]]></category>
		<category><![CDATA[Mark Logic]]></category>
		<category><![CDATA[trends]]></category>
		<category><![CDATA[conference]]></category>
		<category><![CDATA[marklogic]]></category>
		<category><![CDATA[MLW12]]></category>
		<category><![CDATA[tag]]></category>
		<category><![CDATA[twitter]]></category>
		<category><![CDATA[visualization]]></category>

		<guid isPermaLink="false">http://dubinko.info/blog/?p=970</guid>
		<description><![CDATA[I&#8217;m getting ready to leave for MarkLogic World, May 1-3 in Washington, DC, and it&#8217;s shaping up to be one fabulous conference. I&#8217;ve always enjoyed the vibe at these events&#8211;it has a, well, cool-in-a-data-geeky-way thing going on (like the XML conference in the early 2000s where I got to have lunch with James Clark, but that&#8217;s [...]]]></description>
				<content:encoded><![CDATA[<p>I&#8217;m getting ready to leave for <a href="http://marklogicworld.com/">MarkLogic World</a>, May 1-3 in Washington, DC, and it&#8217;s shaping up to be one fabulous conference. I&#8217;ve always enjoyed the vibe at these events&#8211;it has a, well, <em>cool</em>-in-a-data-geeky-way thing going on (like the XML conference in the early 2000s where I got to have lunch with James Clark, but that&#8217;s a different story). Lots of people with big data problems will be here, and I always enjoy talking to these kinds of people.</p>
<p>I&#8217;m speaking on Wednesday at 3:30 with Product Manager extraordinaire Justin Makeig about big data visualization. If you&#8217;ll be at the conference, come look me up. And if you won&#8217;t, well, forgive me if I need a few extra days to get back to any email you send this way.</p>
<p>Follow me on Twitter and look for the #MLW12 tag for live coverage.</p>
<p>-m</p>
]]></content:encoded>
			<wfw:commentRss>http://dubinko.info/blog/2012/04/26/marklogic-world-2012/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>
