<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>MicahLogic &#187; metadata</title>
	<atom:link href="http://dubinko.info/blog/tags/metadata/feed/" rel="self" type="application/rss+xml" />
	<link>http://dubinko.info/blog</link>
	<description>From an XML geek, a reader, a writer, a connector, a man of the people (says keep hope alive)</description>
	<lastBuildDate>Thu, 02 Feb 2012 06:43:33 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.3.1</generator>
		<item>
		<title>Explosive growth of RDFa</title>
		<link>http://dubinko.info/blog/2011/01/26/explosive-growth-of-rdfa/</link>
		<comments>http://dubinko.info/blog/2011/01/26/explosive-growth-of-rdfa/#comments</comments>
		<pubDate>Thu, 27 Jan 2011 02:19:49 +0000</pubDate>
		<dc:creator>mdubinko</dc:creator>
				<category><![CDATA[intentional web]]></category>
		<category><![CDATA[metadata]]></category>
		<category><![CDATA[microformats]]></category>
		<category><![CDATA[data]]></category>
		<category><![CDATA[erdf]]></category>
		<category><![CDATA[hatom]]></category>
		<category><![CDATA[hreview]]></category>
		<category><![CDATA[rdf]]></category>
		<category><![CDATA[rdfa]]></category>
		<category><![CDATA[statistics]]></category>
		<category><![CDATA[web]]></category>

		<guid isPermaLink="false">http://dubinko.info/blog/?p=917</guid>
		<description><![CDATA[Some great data from my one-time colleague Peter Mika. Based on data culled from 12 billion web pages, RDFa is on 3.5 percent of them, even after discounting &#8220;trivial&#8221; uses of it. Just look at how much that dark blue bar shot up since the last measurement, some 18 months earlier. Also of note: eRDF [...]]]></description>
			<content:encoded><![CDATA[<p>Some great <a href="https://tripletalk.wordpress.com/2011/01/25/rdfa-deployment-across-the-web/">data</a> from my one-time colleague Peter Mika. Based on data culled from 12 billion web pages, RDFa is on 3.5 percent of them, even after discounting &#8220;trivial&#8221; uses of it. Just look at how much that dark blue bar shot up since the last measurement, some 18 months earlier.</p>
<p>Also of note: eRDF has dropped off the map. hAtom and hReview are continuing their climb.</p>
<p>-m</p>
]]></content:encoded>
			<wfw:commentRss>http://dubinko.info/blog/2011/01/26/explosive-growth-of-rdfa/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>FCC opens its databases</title>
		<link>http://dubinko.info/blog/2010/09/09/fcc-opens-its-databases/</link>
		<comments>http://dubinko.info/blog/2010/09/09/fcc-opens-its-databases/#comments</comments>
		<pubDate>Thu, 09 Sep 2010 16:41:06 +0000</pubDate>
		<dc:creator>mdubinko</dc:creator>
				<category><![CDATA[announcement]]></category>
		<category><![CDATA[metadata]]></category>
		<category><![CDATA[api]]></category>
		<category><![CDATA[broadband]]></category>
		<category><![CDATA[data]]></category>
		<category><![CDATA[database]]></category>
		<category><![CDATA[db]]></category>
		<category><![CDATA[fcc]]></category>
		<category><![CDATA[linkeddata]]></category>

		<guid isPermaLink="false">http://dubinko.info/blog/?p=894</guid>
		<description><![CDATA[Good news for big data fans. The FCC has released APIs to several large databases involving broadband statistics, spectrum licenses, and some related topics. I haven&#8217;t had a chance for a close look yet; perhaps we can do that together. Link. -m]]></description>
			<content:encoded><![CDATA[<p>Good news for big data fans. The FCC has <a title="Calling all developers! FCC releases APIs for key databases " href="http://arstechnica.com/web/news/2010/09/calling-all-developers-fcc-releases-apis-for-key-databases.ars">released</a> APIs to several large databases involving broadband statistics, spectrum licenses, and some related topics. I haven&#8217;t had a chance for a close look yet; perhaps we can do that together. <a href="http://reboot.fcc.gov/developer">Link</a>. -m</p>
]]></content:encoded>
			<wfw:commentRss>http://dubinko.info/blog/2010/09/09/fcc-opens-its-databases/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Eulogy for SearchMonkey</title>
		<link>http://dubinko.info/blog/2010/08/22/eulogy-for-searchmonkey/</link>
		<comments>http://dubinko.info/blog/2010/08/22/eulogy-for-searchmonkey/#comments</comments>
		<pubDate>Mon, 23 Aug 2010 06:07:18 +0000</pubDate>
		<dc:creator>mdubinko</dc:creator>
				<category><![CDATA[announcement]]></category>
		<category><![CDATA[metadata]]></category>
		<category><![CDATA[microformats]]></category>
		<category><![CDATA[microsoft]]></category>
		<category><![CDATA[search]]></category>
		<category><![CDATA[yahoo]]></category>
		<category><![CDATA[bing]]></category>
		<category><![CDATA[RIP]]></category>
		<category><![CDATA[searchmonkey]]></category>

		<guid isPermaLink="false">http://dubinko.info/blog/?p=887</guid>
		<description><![CDATA[This is indeed a sad day for all of us, for on October 1, a great app will be gone. Though we hardly had enough time during his short life to get to know him, like the grass that withers and fades, this monkey will finish his earthly course. I know he left many things [...]]]></description>
			<content:encoded><![CDATA[<p>This is indeed a sad day for all of us, for on October 1, a great app will be <a href="http://www.ysearchblog.com/2010/08/17/news-about-our-searchmonkey-program/">gone</a>. Though we hardly had enough time during his short life to get to know him, like the grass that withers and fades, this monkey will finish his earthly course.</p>
<div class="wp-caption alignright" style="width: 190px"><a title="Updated SearchMonkey logo by mdubinko, on Flickr" href="http://www.flickr.com/photos/mdubinko/4911814062/"><img title="SearchMonkey updated logo" src="http://farm5.static.flickr.com/4101/4911814062_c7dd2a2c17_m.jpg" alt="Updated SearchMonkey logo" width="180" height="240" /></a><p class="wp-caption-text">Photo by Micah</p></div>
<p>I know he left many things undone, for example only enhancing 60% of the delivered result pages. He never got a chance to finish his life&#8217;s ambition of promoting RDFa and microformats to the masses or to be the killer app of the (lower-case) semantic web. You could say he will live on as &#8220;some of this structured data processing will be supported natively by the Microsoft platform&#8221;. Part of the monkey we loved will live on as enhanced results continue to flow forth from the Yahoo/Bing alliance.</p>
<p>The SearchMonkey Alumni group on LinkedIn is filled with wonderful mourners. Micah Alpern wrote there:</p>
<p style="padding-left: 30px;">I miss the team, the <a href="http://vimeo.com/3288386">songs</a>, and the aspiration to solve a hard problem. Everything else is just code.</p>
<p>Isaac Asimov was reported to have said &#8220;<em>If my doctor told me I had only six minutes to live, I wouldn&#8217;t brood. I&#8217;d type a little faster.</em>&#8221; Today we can identify with that sentiment. Keep typing.</p>
<p>-m</p>
]]></content:encoded>
			<wfw:commentRss>http://dubinko.info/blog/2010/08/22/eulogy-for-searchmonkey/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>&#8220;Google syntax&#8221; for semantic queries?</title>
		<link>http://dubinko.info/blog/2010/06/09/google-syntax-for-semantic-queries/</link>
		<comments>http://dubinko.info/blog/2010/06/09/google-syntax-for-semantic-queries/#comments</comments>
		<pubDate>Wed, 09 Jun 2010 07:14:42 +0000</pubDate>
		<dc:creator>mdubinko</dc:creator>
				<category><![CDATA[everythingismiscellaneous]]></category>
		<category><![CDATA[intentional web]]></category>
		<category><![CDATA[Mark Logic]]></category>
		<category><![CDATA[metadata]]></category>
		<category><![CDATA[search]]></category>
		<category><![CDATA[software]]></category>
		<category><![CDATA[stuff]]></category>
		<category><![CDATA[google]]></category>
		<category><![CDATA[rdf]]></category>
		<category><![CDATA[searchmonkey]]></category>
		<category><![CDATA[semantics]]></category>
		<category><![CDATA[semweb]]></category>

		<guid isPermaLink="false">http://dubinko.info/blog/?p=831</guid>
		<description><![CDATA[Thought experiment: are there any commonly-expressed semantic queries&#8211;the kind of queries you&#8217;d run over a triple store, or perhaps a SearchMonkey-annotated web site&#8211;expressible in common type-in-a-searchbox query grammar? As a refresher, here&#8217;s some things that Google and other search engines can handle. The square brackets represent the search box into which the queries are typed, [...]]]></description>
			<content:encoded><![CDATA[<p>Thought experiment: are there any commonly-expressed semantic queries&#8211;the kind of queries you&#8217;d run over a triple store, or perhaps a <a href="http://developer.yahoo.com/searchmonkey/">SearchMonkey</a>-annotated web site&#8211;expressible in common type-in-a-searchbox query grammar?</p>
<p>As a refresher, here are some things that Google and other search engines can handle. The square brackets represent the search box into which the queries are typed, not part of the queries themselves.</p>
<p>[term]</p>
<p>[term -butnotthis]</p>
<p>[term1 OR term2]</p>
<p>["phrase term"]</p>
<p>[term1 OR term2 -"but not this" site:dubinko.info filetype:html]</p>
<p>So what kind of semantic queries would be usefully expressed in a similar way, avoiding SPARQL and the like? For example, maybe [by:"Micah Dubinko"] could map to a document containing a triple like &lt;this document&gt; &lt;dc:author&gt; &#8220;Micah Dubinko&#8221;. What other kinds of graph queries are interesting, common, and simple to express like this? Comments welcome.</p>
<p>-m</p>
]]></content:encoded>
			<wfw:commentRss>http://dubinko.info/blog/2010/06/09/google-syntax-for-semantic-queries/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Balisage contest: solving the wikiml problem</title>
		<link>http://dubinko.info/blog/2010/05/30/balisage-contest-solving-the-wikiml-problem/</link>
		<comments>http://dubinko.info/blog/2010/05/30/balisage-contest-solving-the-wikiml-problem/#comments</comments>
		<pubDate>Sun, 30 May 2010 20:00:48 +0000</pubDate>
		<dc:creator>mdubinko</dc:creator>
				<category><![CDATA[announcement]]></category>
		<category><![CDATA[intentional web]]></category>
		<category><![CDATA[languages]]></category>
		<category><![CDATA[Mark Logic]]></category>
		<category><![CDATA[metadata]]></category>
		<category><![CDATA[standards]]></category>
		<category><![CDATA[writing]]></category>
		<category><![CDATA[xml]]></category>

		<guid isPermaLink="false">http://dubinko.info/blog/?p=829</guid>
		<description><![CDATA[I wish I could say I had something to do with the planning of this: part of Balisage 2010 is a contest to &#8220;encourage markup experts to review and to research the current state of wiki markup languages and to generate a proposal that serves to de-babelize the current state of affairs for the long [...]]]></description>
			<content:encoded><![CDATA[<p>I wish I could say I had something to do with the planning of this: part of Balisage 2010 is a <a href="http://www.balisage.net/contest.html">contest</a> to &#8220;encourage markup experts to review and to research the current state of wiki markup languages and to generate a proposal that serves to de-babelize the current state of affairs for the long haul.&#8221; To enter, you must propose a set of concrete steps (organizational, social, and/or technological) that will enable wiki content interchange, a real WYSIWYG editor, and/or wiki syntax standardization.</p>
<p>This pushes all of my buttons. It&#8217;s got structured documents, Web, parser geekery, writing, engineering, and standards. There&#8217;s a bunch of open source prior art, including <a href="http://sourceforge.net/projects/pyxmlwiki/">PyXMLWiki</a>, which I adapted from some fantastic earlier work from Rick Jelliffe.</p>
<p>Sadly, MarkLogic employees aren&#8217;t eligible to enter. Get your write-up done by July 15 and sent to <strong><em>balisage-2010-contest at marklogic dot com</em></strong>. The winner will be announced at Balisage and will take home some serious prize winnings, and also will be strongly encouraged (but not required) to give a brief summary (~10 minutes) of their winning entry.</p>
<p>Can&#8217;t wait to see what comes out of this. -m</p>
]]></content:encoded>
			<wfw:commentRss>http://dubinko.info/blog/2010/05/30/balisage-contest-solving-the-wikiml-problem/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Metadata FTW</title>
		<link>http://dubinko.info/blog/2009/11/05/metadata-ftw/</link>
		<comments>http://dubinko.info/blog/2009/11/05/metadata-ftw/#comments</comments>
		<pubDate>Fri, 06 Nov 2009 05:22:55 +0000</pubDate>
		<dc:creator>mdubinko</dc:creator>
				<category><![CDATA[announcement]]></category>
		<category><![CDATA[everythingismiscellaneous]]></category>
		<category><![CDATA[metadata]]></category>
		<category><![CDATA[disclosure]]></category>
		<category><![CDATA[documents]]></category>
		<category><![CDATA[public]]></category>
		<category><![CDATA[records]]></category>

		<guid isPermaLink="false">http://dubinko.info/blog/?p=688</guid>
		<description><![CDATA[Link credit goes to Joho. This looks pretty significant. The AZ Supreme Court ruled that document metadata must be disclosed under existing public records law. This may start a chain reaction with other states following suit. With the movement toward open data including data.gov and the Federal Register, this fits in well. Quite often metadata [...]]]></description>
			<content:encoded><![CDATA[<p>Link credit goes to <a href="http://www.hyperorg.com/blogger/">Joho</a>.</p>
<p>This looks pretty significant. The AZ Supreme Court <a href="http://tech.yahoo.com/news/ap/20091029/ap_on_hi_te/us_hidden_records">ruled</a> that document metadata must be disclosed under existing public records law. This may start a chain reaction with other states following suit. With the movement toward open data including <a href="http://data.gov">data.gov</a> and the Federal Register, this fits in well. Quite often metadata including creation date and author and the like make for much better searching and faceting. -m</p>
]]></content:encoded>
			<wfw:commentRss>http://dubinko.info/blog/2009/11/05/metadata-ftw/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Speaking at Northern Virginia Mark Logic User Group Oct 27</title>
		<link>http://dubinko.info/blog/2009/10/12/novamug-oct-27/</link>
		<comments>http://dubinko.info/blog/2009/10/12/novamug-oct-27/#comments</comments>
		<pubDate>Mon, 12 Oct 2009 18:49:24 +0000</pubDate>
		<dc:creator>mdubinko</dc:creator>
				<category><![CDATA[announcement]]></category>
		<category><![CDATA[Mark Logic]]></category>
		<category><![CDATA[metadata]]></category>
		<category><![CDATA[dc]]></category>
		<category><![CDATA[iswc]]></category>
		<category><![CDATA[marklogic]]></category>
		<category><![CDATA[semanticweb]]></category>
		<category><![CDATA[usergroup]]></category>
		<category><![CDATA[virginia]]></category>

		<guid isPermaLink="false">http://dubinko.info/blog/?p=670</guid>
		<description><![CDATA[Come learn more about Mark Logic and get a behind-the-scenes look at the new Application Builder. I&#8217;ll be speaking at the NOVA MUG (Northern Virginia Mark Logic User Group) on October 27. This turns out to be pretty close to the big Semantic Web conference, so I&#8217;ll stick my head in there too. Stop by [...]]]></description>
			<content:encoded><![CDATA[<p>Come learn more about Mark Logic and get a behind-the-scenes look at the new Application Builder. I&#8217;ll be speaking at the NOVA MUG (Northern Virginia Mark Logic User Group) on October 27. This turns out to be pretty close to the big <a href="http://iswc2009.semanticweb.org/">Semantic Web conference</a>, so I&#8217;ll stick my head in there too. Stop by and look me up!</p>
<p>Details at the <a href="http://developer.marklogic.com/">developer site</a>.</p>
<p>-m</p>
]]></content:encoded>
			<wfw:commentRss>http://dubinko.info/blog/2009/10/12/novamug-oct-27/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Billion triples challenge</title>
		<link>http://dubinko.info/blog/2009/09/16/billion-triples-challenge/</link>
		<comments>http://dubinko.info/blog/2009/09/16/billion-triples-challenge/#comments</comments>
		<pubDate>Thu, 17 Sep 2009 06:07:00 +0000</pubDate>
		<dc:creator>mdubinko</dc:creator>
				<category><![CDATA[announcement]]></category>
		<category><![CDATA[metadata]]></category>
		<category><![CDATA[data]]></category>
		<category><![CDATA[rdf]]></category>
		<category><![CDATA[redland]]></category>
		<category><![CDATA[scale]]></category>
		<category><![CDATA[semanticweb]]></category>
		<category><![CDATA[semweb]]></category>

		<guid isPermaLink="false">http://dubinko.info/blog/?p=644</guid>
		<description><![CDATA[I had been asking around earlier for large RDF datasets. Here&#8217;s one. Looks like a great contest to build an app around this, but unfortunately, the deadline looks like it&#8217;s soonish (1 Oct). What is it? The major part of the dataset was crawled during February/March 2009 based on datasets provided by Falcon-S, Sindice, Swoogle, [...]]]></description>
			<content:encoded><![CDATA[<p>I had been asking around earlier for large RDF datasets. Here&#8217;s <a href="http://vmlion25.deri.ie/">one</a>. Looks like a great contest to build an app around this, but unfortunately, the deadline looks like it&#8217;s soonish (1 Oct).</p>
<p>What is it?</p>
<blockquote><p>The major part of the dataset was crawled during February/March 2009 based on datasets provided by Falcon-S, Sindice, Swoogle, SWSE, and Watson using the MultiCrawler/SWSE framework. To ensure wide coverage, we also included a (bounded) breadth-first crawl of depth 50 starting from http://www.w3.org/People/Berners-Lee/card.</p>
<p>The downloaded content was parsed using the <a href="http://librdf.org/">Redland toolkit</a> with rdfxml, rss-tag-soup, rdfa parsers. We rewrote blank node identifiers to include the data source in order to provide unique blank nodes for each data source, and appended the data source to the output file. The data is encoded in <a href="http://sw.deri.org/2008/07/n-quads/">NQuads format</a> and split into chunks of 10m statements each.</p></blockquote>
<p>The page includes some fairly detailed statistics on the data breakdown. Cool. -m</p>
]]></content:encoded>
			<wfw:commentRss>http://dubinko.info/blog/2009/09/16/billion-triples-challenge/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>RDFa List Apart</title>
		<link>http://dubinko.info/blog/2009/06/23/rdfa-list-apart/</link>
		<comments>http://dubinko.info/blog/2009/06/23/rdfa-list-apart/#comments</comments>
		<pubDate>Wed, 24 Jun 2009 01:33:59 +0000</pubDate>
		<dc:creator>mdubinko</dc:creator>
				<category><![CDATA[browsers]]></category>
		<category><![CDATA[intentional web]]></category>
		<category><![CDATA[metadata]]></category>
		<category><![CDATA[alistapart]]></category>
		<category><![CDATA[birbeck]]></category>
		<category><![CDATA[mark]]></category>
		<category><![CDATA[rdf]]></category>
		<category><![CDATA[rdfa]]></category>

		<guid isPermaLink="false">http://dubinko.info/blog/?p=569</guid>
		<description><![CDATA[A great introduction article. Maybe it&#8217;s just the crowd I hang with, but RDFa looks like it&#8217;s moving from trendy to serious tooling. -m]]></description>
			<content:encoded><![CDATA[<p>A great <a href="http://www.alistapart.com/articles/introduction-to-rdfa/">introduction</a> article. Maybe it&#8217;s just the crowd I hang with, but RDFa looks like it&#8217;s moving from trendy to serious tooling. -m</p>
]]></content:encoded>
			<wfw:commentRss>http://dubinko.info/blog/2009/06/23/rdfa-list-apart/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>VoCamp Wrap-up</title>
		<link>http://dubinko.info/blog/2009/06/19/vocamp-wrap-up/</link>
		<comments>http://dubinko.info/blog/2009/06/19/vocamp-wrap-up/#comments</comments>
		<pubDate>Sat, 20 Jun 2009 05:45:08 +0000</pubDate>
		<dc:creator>mdubinko</dc:creator>
				<category><![CDATA[aswemaythink]]></category>
		<category><![CDATA[everythingismiscellaneous]]></category>
		<category><![CDATA[intentional web]]></category>
		<category><![CDATA[metadata]]></category>
		<category><![CDATA[yahoo]]></category>
		<category><![CDATA[celik]]></category>
		<category><![CDATA[rdbms]]></category>
		<category><![CDATA[rdf]]></category>
		<category><![CDATA[semantic]]></category>
		<category><![CDATA[tantek]]></category>
		<category><![CDATA[vocamp]]></category>
		<category><![CDATA[web]]></category>

		<guid isPermaLink="false">http://dubinko.info/blog/?p=567</guid>
		<description><![CDATA[I spent 2 days at the Yahoo! campus at a VoCamp event, my first. Initially, I was dismayed at the schedule. Spend all the time the first day figuring out why everybody came? It seemed inefficient. But having gone through it, the process seems productive, exactly the way that completely decentralized groups need to get [...]]]></description>
			<content:encoded><![CDATA[<p>I spent 2 days at the Yahoo! campus at a <a title="Sunnyvale VoCamp 2009" href="http://vocamp.org/wiki/VoCampSunnyvale2009">VoCamp</a> event, my first. Initially, I was dismayed at the schedule. Spend all the time the first day figuring out why everybody came? It seemed inefficient. But having gone through it, the process seems productive, exactly the way that completely decentralized groups need to get things done. Peter Mika did a great job moderating.</p>
<p>Attendees numbered about 35, and came from widely varying backgrounds from librarian to linguist to professor to student to CTO, though uniformly geeky. With <a href="http://www.semantic-conference.com/">SemTech</a> this week, the timing was right, and the number of international attendees was impressive.</p>
<p>In community development, nothing gets completely decided just because a few people met. But progress happens. The first day was largely exploratory, but also covered plenary topics that nearly everyone was interested in. Namely:</p>
<ul>
<li>Finding, choosing, and knowing when to create vocabularies</li>
<li>Mapping from one vocabulary to another</li>
<li>RDBMS to RDF mapping</li>
</ul>
<p>Much of the shared understanding of these discussions is captured on various wiki pages connected to the one at the top of this article.</p>
<p>For day 2, we split into smaller working groups with more focused topics. I sat in on a discussion of Common Tag (which still feels too complex to me, but does fulfill a richer use case than rel-tag). Next, some vocabulary design, planning a microformat (and eventual RDF vocab) to represent code documentation: classes, functions, parameters, and the like. Tantek Çelik espoused the &#8220;scientific method&#8221; of vocab design: would a separate group, in similar circumstances, come up with the same design? If the answer is &#8216;yes&#8217;, then you probably designed it right. The way to make that happen is to focus on the basics, keeping everything as simple as possible. If any important features are missed, you will find out quickly. The experience of getting the simple thing out the door will provide the education needed to make the more complicated follow-on version a success.</p>
<p>From the wrap-up: if you are designing a vocabulary, the most useful thing you can do is NOT to unleash a fully-formed proposal on the world, but rather to capture the discussion around it. What were the initial use cases? What are people currently doing? What design goals were explicitly left off the table, or deferred to a future version, or immediately shot down? It&#8217;s better to capture multiple proposals, even if fragmentary, and let lots of people look them over and gravitate toward the best design.</p>
<p>Lastly, some cool things overheard:</p>
<p>&#8220;Relational databases? We call those &#8216;legacy&#8217;.&#8221;</p>
<p>&#8220;The socially-accepted schema is fairly consistent.&#8221;</p>
<p>&#8220;It&#8217;s just a map, it&#8217;s not the territory.&#8221;</p>
<p>-m</p>
]]></content:encoded>
			<wfw:commentRss>http://dubinko.info/blog/2009/06/19/vocamp-wrap-up/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>A nugget from _A Canticle for Leibowitz_</title>
		<link>http://dubinko.info/blog/2009/05/15/a-nugget-from-leibowitz/</link>
		<comments>http://dubinko.info/blog/2009/05/15/a-nugget-from-leibowitz/#comments</comments>
		<pubDate>Sat, 16 May 2009 04:14:26 +0000</pubDate>
		<dc:creator>mdubinko</dc:creator>
				<category><![CDATA[everythingismiscellaneous]]></category>
		<category><![CDATA[languages]]></category>
		<category><![CDATA[math]]></category>
		<category><![CDATA[metadata]]></category>
		<category><![CDATA[writing]]></category>
		<category><![CDATA[canticle]]></category>
		<category><![CDATA[dialog]]></category>
		<category><![CDATA[doubt]]></category>
		<category><![CDATA[knowability]]></category>
		<category><![CDATA[leibowitz]]></category>
		<category><![CDATA[satire]]></category>

		<guid isPermaLink="false">http://dubinko.info/blog/?p=536</guid>
		<description><![CDATA[This brilliant bit is almost a throwaway paragraph on page 304, near the end. [Two men in a satirical dialog] managed only to demonstrate that the mathematical limit of an infinite sequence of &#8220;doubting the certainty with which something doubted is known to be unknowable  when the &#8216;something doubted&#8217; is still a preceding statement &#8216;unknowability&#8217; [...]]]></description>
			<content:encoded><![CDATA[<p>This brilliant bit is almost a throwaway paragraph on page 304, near the end.</p>
<p style="padding-left: 30px;">[Two men in a satirical dialog] managed only to demonstrate that the mathematical limit of an infinite sequence of &#8220;doubting the certainty with which something doubted is known to be unknowable when the &#8216;something doubted&#8217; is still a preceding statement &#8216;unknowability&#8217; of something doubted,&#8221; that the limit of this process at infinity can only be equivalent to a statement of <em>absolute certainty</em>, even though phrased as an infinite series of negations of certainty.</p>
<p>It&#8217;s not like the whole book is like this&#8230;far from it. But it is chock full of little gems.</p>
<p>-m</p>
]]></content:encoded>
			<wfw:commentRss>http://dubinko.info/blog/2009/05/15/a-nugget-from-leibowitz/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Google Rich Snippets powered by RDFa</title>
		<link>http://dubinko.info/blog/2009/05/12/google-rich-snippets-powered-by-rdfa/</link>
		<comments>http://dubinko.info/blog/2009/05/12/google-rich-snippets-powered-by-rdfa/#comments</comments>
		<pubDate>Wed, 13 May 2009 04:43:09 +0000</pubDate>
		<dc:creator>mdubinko</dc:creator>
				<category><![CDATA[commercialism]]></category>
		<category><![CDATA[google]]></category>
		<category><![CDATA[intentional web]]></category>
		<category><![CDATA[languages]]></category>
		<category><![CDATA[metadata]]></category>
		<category><![CDATA[microformats]]></category>
		<category><![CDATA[search]]></category>
		<category><![CDATA[yahoo]]></category>
		<category><![CDATA[rdfa]]></category>
		<category><![CDATA[rich]]></category>
		<category><![CDATA[searchmonkey]]></category>
		<category><![CDATA[snippets]]></category>

		<guid isPermaLink="false">http://dubinko.info/blog/?p=532</guid>
		<description><![CDATA[The new feature called rich snippets shows that SearchMonkey has caught the eye of the 800 pound gorilla. Many of the same microformats and RDF vocabularies are supported. It seems increasingly inevitable that RDFa will catch on, no matter what the HTML5 group thinks. -m]]></description>
			<content:encoded><![CDATA[<p>The new feature called <a href="http://googlewebmastercentral.blogspot.com/2009/05/introducing-rich-snippets.html">rich snippets</a> shows that SearchMonkey has caught the eye of the 800 pound gorilla. Many of the same microformats and RDF vocabularies are supported. It seems increasingly inevitable that RDFa will catch on, no matter what the HTML5 group thinks. -m</p>
]]></content:encoded>
			<wfw:commentRss>http://dubinko.info/blog/2009/05/12/google-rich-snippets-powered-by-rdfa/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Playing with Wolfram Alpha</title>
		<link>http://dubinko.info/blog/2009/05/03/playing-with-wolfram-alpha/</link>
		<comments>http://dubinko.info/blog/2009/05/03/playing-with-wolfram-alpha/#comments</comments>
		<pubDate>Mon, 04 May 2009 02:14:19 +0000</pubDate>
		<dc:creator>mdubinko</dc:creator>
				<category><![CDATA[commercialism]]></category>
		<category><![CDATA[intentional web]]></category>
		<category><![CDATA[metadata]]></category>
		<category><![CDATA[search]]></category>
		<category><![CDATA[alpha]]></category>
		<category><![CDATA[inference]]></category>
		<category><![CDATA[knowledge]]></category>
		<category><![CDATA[wolfram]]></category>

		<guid isPermaLink="false">http://dubinko.info/blog/?p=517</guid>
		<description><![CDATA[I&#8217;ve been experimenting with the preview version of Wolfram Alpha. It&#8217;s not like any current search engine because it&#8217;s not a search engine at all. Others have already written more eloquent things about it. The key feature of it is that it doesn&#8217;t just find information, it infers it on the fly. Take for example [...]]]></description>
			<content:encoded><![CDATA[<p>I&#8217;ve been experimenting with the preview version of <a href="http://www.wolframalpha.com">Wolfram Alpha</a>. It&#8217;s not like any current search engine because it&#8217;s not a search engine at all. Others have already <a href="http://searchengineland.com/wolfram-alpha-fact-engine-18431">written</a> more eloquent things about it.</p>
<p>The key feature of it is that it doesn&#8217;t just find information, it infers it on the fly. Take for example the query</p>
<p style="padding-left: 30px;">next solar eclipse in Sunnyvale</p>
<p>AFAIK, nobody has ever written a regular web page describing this important (to me) topic. Try it in <a href="http://search.yahoo.com/search?p=next+solar+eclipse+in+Sunnyvale">Yahoo!</a> or <a href="http://www.google.com/search?q=next+solar+eclipse+in+Sunnyvale">Google</a> and see for yourself. There are a few potentially interesting links based on the abstracts, but they turn out to be spammy. Wolfram Alpha figures out that I&#8217;m talking about the combination of a concept (&#8220;solar eclipse&#8221;) and a place (&#8220;Sunnyvale, CA&#8221;, but with an offer to switch to Sunnyvale, TX) and combines the two. The result is a simple answer&#8211;4:52 pm PDT  |  Sunday, May 20, 2012 (3.049 years from now). Hey, that&#8217;s sooner than I thought! Besides the date, there are many related facts and a cool map.</p>
<p>This is in contrast to SearchMonkey, which I helped create, in two main areas:</p>
<ol>
<li>Wolfram Alpha uses metadata to produce the result, then renders it through a set of pre-arranged renderers. The response is facts, not web pages.</li>
<li>SearchMonkey focuses on sites providing their own metadata, while Wolfram Alpha focuses on hand-curation.</li>
</ol>
<p>Search engines have been striving to do a better job at fact-queries. Wolfram&#8217;s work shows that an approach disjoint from finding web pages in an index can be hugely useful.</p>
<p>The engineers working on this have a sense of humor too. The query</p>
<p style="padding-left: 30px;">1.21GW</p>
<p>returns a page that includes the text &#8220;power required to operate the flux capacitor in the DeLorean DMC-12 time machine&#8221; as well as a useful comparison (~ 0.1 x the power of space shuttle at launch).</p>
<p>Yahoo! and Google do various kinds of internal &#8220;query rewriting&#8221;, but usually don&#8217;t let you know other than in the broadest terms (&#8220;did you mean &#8230;&#8221;). Wolfram Alpha shows a diagram of what it understood the query to be. The diagrams make it evident that <em>something like</em> the RDF model is in use, but without peeking under the hood, it&#8217;s hard to say anything definitive.</p>
<p>One thing I wonder about is whether Wolfram Alpha creates a dynamic (as was a major goal of SearchMonkey) that gives web authors a reason to put more metadata in their sites&#8211;a killer app, if you will. It&#8217;s not clear at this early date how much web crawling or site metadata extraction (say, RDFa) plays into the curation process.</p>
<p>In any case, Wolfram Alpha is something to watch. It&#8217;s set to launch publicly this month. -m</p>
]]></content:encoded>
			<wfw:commentRss>http://dubinko.info/blog/2009/05/03/playing-with-wolfram-alpha/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Wolfram Alpha</title>
		<link>http://dubinko.info/blog/2009/03/08/wolfram-alpha/</link>
		<comments>http://dubinko.info/blog/2009/03/08/wolfram-alpha/#comments</comments>
		<pubDate>Sun, 08 Mar 2009 20:47:54 +0000</pubDate>
		<dc:creator>mdubinko</dc:creator>
				<category><![CDATA[AI]]></category>
		<category><![CDATA[aswemaythink]]></category>
		<category><![CDATA[commercialism]]></category>
		<category><![CDATA[intentional web]]></category>
		<category><![CDATA[languages]]></category>
		<category><![CDATA[Mark Logic]]></category>
		<category><![CDATA[math]]></category>
		<category><![CDATA[metadata]]></category>
		<category><![CDATA[software]]></category>
		<category><![CDATA[yahoo]]></category>
		<category><![CDATA[alpha]]></category>
		<category><![CDATA[anewkindofscience]]></category>
		<category><![CDATA[mathematica]]></category>
		<category><![CDATA[nks]]></category>
		<category><![CDATA[nlp]]></category>
		<category><![CDATA[practice]]></category>
		<category><![CDATA[query]]></category>
		<category><![CDATA[rdf]]></category>
		<category><![CDATA[rdfa]]></category>
		<category><![CDATA[research]]></category>
		<category><![CDATA[semweb]]></category>
		<category><![CDATA[wolfram]]></category>
		<category><![CDATA[wolframalpha]]></category>

		<guid isPermaLink="false">http://dubinko.info/blog/?p=456</guid>
		<description><![CDATA[The remarkable (and prolific) Stephen Wolfram has an idea called Wolfram Alpha. People used to assume the &#8220;Star Trek&#8221; model of computers: that one would be able to ask a computer any factual question, and have it compute the answer. Which has proved to be quite distant from reality. Instead But armed with Mathematica and [...]]]></description>
			<content:encoded><![CDATA[<p>The remarkable (and prolific) Stephen Wolfram has an <a href="http://blog.wolfram.com/2009/03/05/wolframalpha-is-coming/">idea</a> called Wolfram Alpha. People used to assume the &#8220;Star Trek&#8221; model of computers:</p>
<p style="padding-left: 30px;">that one would be able to ask a computer any factual question, and have it compute the answer.</p>
<p>Which has proved to be quite distant from reality. Instead, in Wolfram&#8217;s words:</p>
<p style="padding-left: 30px;">But armed with <em>Mathematica</em> and NKS [<a title="Own it. But never have been able to justify picking up a copy of Mathematica (yet)" href="http://www.amazon.com/exec/obidos/ASIN/1579550088/dubinkoinfo-20">A New Kind of Science</a>] I realized there’s another way: explicitly implement methods and models, as algorithms, and explicitly curate all data so that it is immediately computable.</p>
<p style="padding-left: 30px;">It’s not easy to do this. Every different kind of method and model—and data—has its own special features and character. But with a mixture of <em>Mathematica</em> and NKS automation, and a lot of human experts, I’m happy to say that we’ve gotten a very long way.</p>
<p>I&#8217;m still a SearchMonkey guy at heart, so I wonder how familiar Wolfram&#8217;s team is with existing Semantic Web research and practice&#8211;because at a high level this seems very much like RDF with suitable queries thereupon. If that&#8217;s a good characterization, that&#8217;s A Good Thing, since practical application has been one of SemWeb&#8217;s weak spots.</p>
<p>-m</p>
]]></content:encoded>
			<wfw:commentRss>http://dubinko.info/blog/2009/03/08/wolfram-alpha/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Defining the Prime RDFa use case (without mentioning RDFa)</title>
		<link>http://dubinko.info/blog/2009/01/10/defining-the-prime-rdfa-use-case-without-mentioning-rdfa/</link>
		<comments>http://dubinko.info/blog/2009/01/10/defining-the-prime-rdfa-use-case-without-mentioning-rdfa/#comments</comments>
		<pubDate>Sun, 11 Jan 2009 01:23:44 +0000</pubDate>
		<dc:creator>mdubinko</dc:creator>
				<category><![CDATA[intentional web]]></category>
		<category><![CDATA[metadata]]></category>
		<category><![CDATA[html5]]></category>
		<category><![CDATA[microformats]]></category>
		<category><![CDATA[rdfa]]></category>
		<category><![CDATA[webstandards]]></category>

		<guid isPermaLink="false">http://dubinko.info/blog/?p=417</guid>
		<description><![CDATA[At least, that&#8217;s how I&#8217;ve summarized John Allsopp&#8217;s article on HTML5 semantics. -m]]></description>
			<content:encoded><![CDATA[<p>At least, that&#8217;s how I&#8217;ve summarized John Allsopp&#8217;s <a href="http://alistapart.com/articles/semanticsinhtml5">article</a> on HTML5 semantics. -m</p>
]]></content:encoded>
			<wfw:commentRss>http://dubinko.info/blog/2009/01/10/defining-the-prime-rdfa-use-case-without-mentioning-rdfa/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>

