Archive for the 'metadata' Category

Wednesday, January 26th, 2011

Explosive growth of RDFa

Some great data from my one-time colleague Peter Mika. Based on data culled from 12 billion web pages, RDFa is on 3.5 percent of them, even after discounting “trivial” uses of it. Just look at how much that dark blue bar shot up since the last measurement, some 18 months earlier.

Also of note: eRDF has dropped off the map. hAtom and hReview are continuing their climb.

-m

Thursday, September 9th, 2010

FCC opens its databases

Good news for big data fans. The FCC has released APIs to several large databases involving broadband statistics, spectrum licenses, and some related topics. I haven’t had a chance for a close look yet, perhaps we can do that together. Link. -m

Sunday, August 22nd, 2010

Eulogy for SearchMonkey

This is indeed a sad day for all of us, for on October 1, a great app will be gone. Though we hardly had enough time during his short life to get to know him, like the grass that withers and fades, this monkey will finish his earthly course.

Updated SearchMonkey logo

Photo by Micah

I know he left many things undone, for example only enhancing 60% of the delivered result pages. He never got a chance to finish his life’s ambition of promoting RDFa and microformats to the masses or to be the killer app of the (lower-case) semantic web. You could say he will live on as “some of this structured data processing will be supported natively by the Microsoft platform”. Part of the monkey we loved will live on as enhanced results continue to flow forth from the Yahoo/Bing alliance.

The SearchMonkey Alumni group on LinkedIn is filled with wonderful mourners. Micah Alpern wrote there

I miss the team, the songs, and the aspiration to solve a hard problem. Everything else is just code.

Isaac Asimov was reported to have said “If my doctor told me I had only six minutes to live, I wouldn’t brood. I’d type a little faster.” Today we can identify with that sentiment. Keep typing.

-m

Wednesday, June 9th, 2010

“Google syntax” for semantic queries?

Thought experiment: are there any commonly-expressed semantic queries–the kind of queries you’d run over a triple store, or perhaps a SearchMonkey-annotated web site–expressible in common type-in-a-searchbox query grammar?

As a refresher, here are some things that Google and other search engines can handle. The square brackets represent the search box into which the queries are typed, not part of the queries themselves.

[term]

[term -butnotthis]

[term1 OR term2]

[“phrase term”]

[term1 OR term2 -“but not this” site:dubinko.info filetype:html]

So what kind of semantic queries would be usefully expressed in a similar way, avoiding SPARQL and the like? For example, maybe [by:”Micah Dubinko”] could map to a document containing a triple like <this document> <dc:author> “Micah Dubinko”. What other kinds of graph queries are interesting, common, and simple to express like this? Comments welcome.
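As a thought-experiment sketch, here's one way such operators might be translated into SPARQL-style triple patterns. The operator names and the operator-to-predicate mapping are hypothetical, just following the by: example above:

```python
import re

# Hypothetical mapping from search-box operators to RDF predicates.
# These names are illustrative, not an established syntax.
OPERATOR_PREDICATES = {
    "by": "dc:author",
    "about": "dc:subject",
}

def query_to_patterns(query):
    """Translate e.g. by:"Micah Dubinko" into SPARQL-ish triple patterns."""
    patterns = []
    for op, value in re.findall(r'(\w+):"([^"]+)"', query):
        predicate = OPERATOR_PREDICATES.get(op)
        if predicate:
            patterns.append(f'?doc {predicate} "{value}" .')
    return patterns

print(query_to_patterns('by:"Micah Dubinko"'))
# → ['?doc dc:author "Micah Dubinko" .']
```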

-m

Sunday, May 30th, 2010

Balisage contest: solving the wikiml problem

I wish I could say I had something to do with the planning of this: part of Balisage 2010 is a contest to “encourage markup experts to review and to research the current state of wiki markup languages and to generate a proposal that serves to de-babelize the current state of affairs for the long haul.”  To enter, you must propose a set of concrete steps (organizational, social, and/or technological) that will enable wiki content interchange, a real WYSIWYG editor, and/or wiki syntax standardization.

This pushes all of my buttons. It’s got structured documents, Web, parser geekery, writing, engineering, and standards. There’s a bunch of open source prior art, including PyXMLWiki, which I adapted from some fantastic earlier work from Rick Jelliffe.

Sadly, MarkLogic employees aren’t eligible to enter. Get your write-up done by July 15 and sent to balisage-2010-contest at marklogic dot com. The winner will be announced at Balisage and will take home some serious prize winnings, and also will be strongly encouraged (but not required) to give a brief summary (~10 minutes) of their winning entry.

Can’t wait to see what comes out of this. -m

Thursday, November 5th, 2009

Metadata FTW

Link credit goes to Joho.

This looks pretty significant. The AZ Supreme Court ruled that document metadata must be disclosed under existing public records law. This may start a chain reaction with other states following suit. With the movement toward open data including data.gov and the Federal Register, this fits in well. Quite often metadata including creation date and author and the like make for much better searching and faceting. -m

Monday, October 12th, 2009

Speaking at Northern Virginia Mark Logic User Group Oct 27

Come learn more about Mark Logic and get a behind-the-scenes look at the new Application Builder. I’ll be speaking at the NOVA MUG (Northern Virginia Mark Logic User Group) on October 27. This turns out to be pretty close to the big Semantic Web conference, so I’ll stick my head in there too. Stop by and look me up!

Details at the developer site.

-m

Wednesday, September 16th, 2009

Billion triples challenge

I had been asking around earlier for large RDF datasets. Here’s one. Looks like a great contest to build an app around this, but unfortunately, the deadline looks like it’s soonish (1 Oct).

What is it?

The major part of the dataset was crawled during February/March 2009 based on datasets provided by Falcon-S, Sindice, Swoogle, SWSE, and Watson using the MultiCrawler/SWSE framework. To ensure wide coverage, we also included a (bounded) breadth-first crawl of depth 50 starting from http://www.w3.org/People/Berners-Lee/card.

The downloaded content was parsed using the Redland toolkit with rdfxml, rss-tag-soup, rdfa parsers. We rewrote blank node identifiers to include the data source in order to provide unique blank nodes for each data source, and appended the data source to the output file. The data is encoded in NQuads format and split into chunks of 10m statements each.
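A rough Python sketch of the blank-node rewriting described above. The label scheme is my guess, not the crawl's actual one, and the whitespace split assumes no spaces inside literals:

```python
def rewrite_bnodes(nquad_line, source_id):
    """Prefix each blank node label with its data source, so labels
    stay unique when quads from many sources are merged.
    Naive tokenization: assumes no whitespace inside literals."""
    out = []
    for token in nquad_line.split():
        if token.startswith("_:"):
            token = f"_:{source_id}x{token[2:]}"   # e.g. _:b1 -> _:src42xb1
        out.append(token)
    return " ".join(out)

line = '_:b1 <http://xmlns.com/foaf/0.1/name> "Tim" <http://example.org/> .'
print(rewrite_bnodes(line, "src42"))
# → _:src42xb1 <http://xmlns.com/foaf/0.1/name> "Tim" <http://example.org/> .
```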

The page includes some fairly detailed statistics on the data breakdown. Cool. -m

Tuesday, June 23rd, 2009

RDFa List Apart

A great introduction article. Maybe it’s just the crowd I hang with, but RDFa looks like it’s moving from trendy to serious tooling. -m

Friday, June 19th, 2009

VoCamp Wrap-up

I spent 2 days at the Yahoo! campus at a VoCamp event, my first. Initially, I was dismayed at the schedule. Spend all the time the first day figuring out why everybody came? It seemed inefficient. But having gone through it, the process seems productive, exactly the way that completely decentralized groups need to get things done. Peter Mika did a great job moderating.

Attendees numbered about 35, and came from widely varying backgrounds from librarian to linguist to professor to student to CTO, though uniformly geeky. With SemTech this week, the timing was right, and the number of international attendees was impressive.

In community development, nothing gets completely decided just because a few people met. But progress happens. The first day was largely exploratory, but also covered plenary topics that nearly everyone was interested in. Namely:

  • Finding, choosing, and knowing when to create vocabularies
  • Mapping from one vocabulary to another
  • RDBMS to RDF mapping

Much of the shared understanding of these discussions is captured on various wiki pages connected to the one at the top of this article.

For day 2, we split into smaller working groups with more focused topics. I sat in on a discussion of Common Tag (which still feels too complex to me, but does fulfill a richer use case than rel-tag). Next, some vocabulary design, planning a microformat (and eventual RDF vocab) to represent code documentation: classes, functions, parameters, and the like. Tantek Çelik espoused the “scientific method” of vocab design: would a separate group, in similar circumstances, come up with the same design? If the answer is ‘yes’, then you probably designed it right. The way to make that happen is to focus on the basics, keeping everything as simple as possible. If any important features are missed, you will find out quickly. The experience of getting the simple thing out the door will provide the education needed to make the more complicated follow-on version a success.

From the wrap-up: if you are designing a vocabulary, the most useful thing you can do is NOT to unleash a fully-formed proposal on the world, but rather to capture the discussion around it. What were the initial use cases? What are people currently doing? What design goals were explicitly left off the table, or deferred to a future version, or immediately shot down? It’s better to capture multiple proposals, even if fragmentary, and let lots of people look them over and gravitate toward the best design.

Lastly, some cool things overheard:

“Relational databases? We call those ‘legacy’.”

“The socially-accepted schema is fairly consistent.”

“It’s just a map, it’s not the territory.”

-m

Friday, May 15th, 2009

A nugget from _A Canticle for Leibowitz_

This brilliant bit is almost a throwaway paragraph on page 304, near the end.

[Two men in a satirical dialog] managed only to demonstrate that the mathematical limit of an infinite sequence of “doubting the certainty with which something doubted is known to be unknowable when the ‘something doubted’ is still a preceding statement ‘unknowability’ of something doubted,” that the limit of this process at infinity can only be equivalent to a statement of absolute certainty, even though phrased as an infinite series of negations of certainty.

It’s not like the whole book is like this…far from it. But it is chock full of little gems.

-m

Tuesday, May 12th, 2009

Google Rich Snippets powered by RDFa

The new feature called rich snippets shows that SearchMonkey has caught the eye of the 800 pound gorilla. Many of the same microformats and RDF vocabularies are supported. It seems increasingly inevitable that RDFa will catch on, no matter what the HTML5 group thinks. -m

Sunday, May 3rd, 2009

Playing with Wolfram Alpha

I’ve been experimenting with the preview version of Wolfram Alpha. It’s not like any current search engine because it’s not a search engine at all. Others have already written more eloquent things about it.

The key feature of it is that it doesn’t just find information; it infers it on the fly. Take for example the query

next solar eclipse in Sunnyvale

AFAIK, nobody has ever written a regular web page describing this important (to me) topic. Try it in Yahoo! or Google and see for yourself. There are a few potentially interesting links based on the abstracts, but they turn out to be spammy. Wolfram Alpha figures out that I’m talking about the combination of a concept (“solar eclipse”) and a place (“Sunnyvale, CA”, but with an offer to switch to Sunnyvale, TX) and combines the two. The result is a simple answer–4:52 pm PDT | Sunday, May 20, 2012 (3.049 years from now). Hey, that’s sooner than I thought! Besides the date, there are many related facts and a cool map.

This is in contrast to SearchMonkey, which I helped create, in two main areas:

  1. Wolfram Alpha uses metadata to produce the result, then renders it through a set of pre-arranged renderers. The response is facts, not web pages.
  2. SearchMonkey focuses on sites providing their own metadata, while Wolfram Alpha focuses on hand-curation.

Search engines have been striving to do a better job at fact-queries. Wolfram’s approach shows that a system disjoint from finding web pages in an index can be hugely useful.

The engineers working on this have a sense of humor too. The query

1.21GW

returns a page that includes the text “power required to operate the flux capacitor in the DeLorean DMC-12 time machine” as well as a useful comparison (~ 0.1 x the power of space shuttle at launch).

Yahoo! and Google do various kinds of internal “query rewriting”, but usually don’t let you know other than in the broadest terms (“did you mean …”). Wolfram Alpha shows a diagram of what it understood the query to be. The diagrams make it evident that something like the RDF model is in use, but without peeking under the hood, it’s hard to say something definitive.

One thing I wonder about is whether Wolfram Alpha creates a dynamic (as was a major goal of SearchMonkey) of giving web authors a reason to put more metadata in their sites–a killer app if you will. It’s not clear at this early date how much web crawling or site metadata extraction (say RDFa) plays into the curation process.

In any case Wolfram Alpha is something to watch. It’s set to launch publicly this month. -m

Sunday, March 8th, 2009

Wolfram Alpha

The remarkable (and prolific) Stephen Wolfram has an idea called Wolfram Alpha. People used to assume the “Star Trek” model of computers:

that one would be able to ask a computer any factual question, and have it compute the answer.

Which has proved to be quite distant from reality. Instead:

But armed with Mathematica and NKS [A New Kind of Science] I realized there’s another way: explicitly implement methods and models, as algorithms, and explicitly curate all data so that it is immediately computable.

It’s not easy to do this. Every different kind of method and model—and data—has its own special features and character. But with a mixture of Mathematica and NKS automation, and a lot of human experts, I’m happy to say that we’ve gotten a very long way.

I’m still a SearchMonkey guy at heart, so I wonder how much Wolfram’s team is familiar with existing Semantic Web research and practice–because at a high level this seems very much like RDF with suitable queries thereupon. If that’s a good characterization, that’s A Good Thing, since practical application has been one of SemWeb’s weak spots.

-m

Saturday, January 10th, 2009

Defining the Prime RDFa use case (without mentioning RDFa)

At least, that’s how I’ve summarized John Allsopp’s article on HTML5 semantics. -m

Tuesday, December 30th, 2008

RDFa parser in XQuery now open source

After a delay, the code to my RDFa parser in XQuery is now available under an Apache license. Go get it. This is some of the earliest XQuery code I ever wrote, so go easy on me. It follows the earlier work on a functional definition of RDFa. And feel free to send in patches. -m

Tuesday, December 9th, 2008

XML 2008 liveblog: Using RDFa for Government Information

Mark Birbeck, Web Backplane.

Problem statement: You shouldn’t have to “scrape” government sites.

Solution: RDFa

<div typeof="arg:Vacancy">
  Job title: <span property="dc:title">Assistant Officer</span>
  Description: <span property="dc:description">To analyse... </span>
</div>

This resolves to two full RDF triples. No separate feeds, uses existing publishing systems. Two of the most ambitious RDFa projects are taking place in the UK. Flexible arrangements possible.
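As a sketch, the resulting triples would look roughly like this in Turtle (prefix declarations omitted, blank node label arbitrary; the typeof attribute also types the node):

```turtle
_:vacancy a arg:Vacancy .
_:vacancy dc:title "Assistant Officer" .
_:vacancy dc:description "To analyse... " .
```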

Steps: 1. Create vocabulary. 2. Create demo. 3. Evangelize.

Vocabulary under Google Code: Argot Hub. Reuse terms (dc:title, foaf:name) where possible, developed in public.

Demos: Yahoo! SearchMonkey, (good for helping not-so-technical people to “get it”) then a Drupal hosted one (a little more control).

Next level, a new server that aggregates specific info (like all job openings for Electricians), including geocoding. Ubiquity RDFa helps here.

Evangelizing: Detailed tutorials. Drupal code will go open source. More opportunities with companies currently screen-scraping. More info @ rdfa.info.

Q&A: Asking about predicate overloading (dc:title). A general SemWeb issue. Context helps. Is RDFa tied to HTML? No, SearchMonkey itself uses RDFa–it’s just attributes.

-m

Tuesday, December 9th, 2008

XML 2008 liveblog: Sentiment Analysis in Open Source Information for the US Government

Ronald Reck, SAP; Kenneth Sall, SAIC

“I wish I knew when people were saying bad things about me.” Sentiment analysis. Kapow used initially. From 800k news articles (from 1996 and 1997), extracted 450M RDF assertions. The 13 Reuters standard metadata elements not used in this case. Used Redland for heavy RDF lifting. Inxight ThingFinder (commercial) for entity extraction, supplemented with enumerated lists (Bush Cabinet, Intelligence Agencies, negative adjectives, positive admire verbs, etc.) End result was RDF/XML.

(Kenneth takes the mic) SPARQL Sentiment Query Web UI. Heavy SPARQL ahead… Redland hasn’t implemented the UNION operator yet, making the examples more convoluted.

PREFIX sap: <http://iama.rrecktek.com/ont/sap#>
SELECT ?ent ?type ?name
WHERE {
  ?ent sap:Method "Name Catalog" .
  ?ent sap:Type ?type .
  ?ent sap:Name ?name
}

Difficult learning curve. Need ability to do substring from entity URI -> article URI.

Next steps: current news stories. Leverage existing metadata. RDF at the sentence level. Improve name catalogs. Use rule-based pattern matching engine. Slides.

-m

Friday, October 24th, 2008

Online etymology database

I’ve been playing lately with this site, and it’s a fantastic resource. The word carboy probably comes from Persian qarabah “large flagon.” Who knew? -m

Saturday, August 23rd, 2008

MarkLogic RDFa parser

This post will be continuously updated to contain the most recent details about an XQuery 1.0 RDFa parser I wrote for Mark Logic. It follows the Functional RDFa pattern.

At present there is little to say, but eventually code and more will be available. Stay tuned.

-m

Friday, August 8th, 2008

It would be awesome if somebody…

It would be awesome if someone made a site that catalogued all the common mis-encodings. Even in 2008, I see these things all over the web–mangled quotation marks, apostrophes, em-dashes. I’d love to see a pictorial guide.

curly apostrophe looks like ?’ – original encoding=_________ mislabeled as __________ .

That sort of thing. Surely somebody has done this already, right? -m
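For what it's worth, one of the commonest cases can be demonstrated in a few lines of Python: UTF-8 bytes re-decoded as Windows-1252, which turns a curly apostrophe into the classic three-character mess:

```python
# A curly apostrophe (RIGHT SINGLE QUOTATION MARK), mangled by writing
# UTF-8 bytes and then reading them back as Windows-1252.
curly = "\u2019"
mangled = curly.encode("utf-8").decode("cp1252")
print(mangled)  # → â€™

# The repair, once you know which pair of encodings was involved:
repaired = mangled.encode("cp1252").decode("utf-8")
print(repaired == curly)  # → True
```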

Thursday, August 7th, 2008

Great comment on the eRDF 1.1 discussion

On the eRDF discussion posting, Toby Inkster, an implementer of eRDF, talks about why it’s bad to steal the id attribute, and why RDFa is better suited for general purpose metadata. Worth a read. -m

Monday, August 4th, 2008

Implementing RDFa in XQuery

Through the weekend I put most of the final touches on an implementation of RDFa in XQuery. The implementation is based on the functional specification of RDFa, an offshoot of the excellent work coming out of the W3C task force.

The spec contains a procedural description of the parsing algorithm, and several have successfully followed it to arrive at a conforming implementation. But you would have a tough time explaining RDFa to someone that way. The functional description sort of fell out of the way I described RDFa to people.

“When you see an element with XXXX, you generate a triple, using SSSS as the subject, PPPP as the predicate, and OOOO as the object.”

Which arguably is the more natural way to express the algorithm for functional languages like XQuery or XSLT. Fill in the right blanks and you pretty much have it. In practice, it’s somewhat more complicated, but not nearly so much as with other W3C specs.
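Filling in the blanks of that sentence, here's a toy functional sketch in Python. This is a drastic simplification of real RDFa processing (no CURIE resolution, chaining, or subject inheritance), just the shape of the rule:

```python
def element_triples(elem, current_subject):
    """'When you see an element with @property, you generate a triple,
    using the in-scope subject as subject, @property as predicate,
    and the element's text as object.' One rule, no recursion."""
    attrs, text = elem
    if "property" in attrs:
        subject = attrs.get("about", current_subject)
        return [(subject, attrs["property"], text)]
    return []

elem = ({"property": "dc:title"}, "Assistant Officer")
print(element_triples(elem, "<doc>"))
# → [('<doc>', 'dc:title', 'Assistant Officer')]
```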

I hope to make the code available soon. You’ll hear about it first here.

I’ll write more when I’m not exhausted. :-) -m

Monday, July 28th, 2008

eRDF 1.1 Proposal Discussion

The W3C RDFa specification is now in Candidate Recommendation phase, with an explicit call for implementations (of which there are several). Momentum for RDFa is steadily building. What about eRDF, which favors the existing HTML syntax over new attributes?

There’s still a place for a simpler syntactic approach to embedding RDF in HTML, as evidenced by projects like Yahoo! SearchMonkey. And eRDF is still the only game in town when it comes to annotating RDF within HTML-without-the-X.

One thing the RDFa folks did was define src as a subject-bearing node, rather than an object. At first I didn’t like this inversion, but the more I worked with it, the more it made sense. When you have an image, which can’t have children in (X)HTML, it’s very often useful to use the src URL as the subject, with a predicate of perhaps cc:license.

So I propose one single change to eRDF 1.1. Well, actually several changes, since one thing leads to another. The first is to specify that you are using a different version of eRDF. A new profile string of:

"http://purl.org/NET/erdf11/profile"

The next is changing the meaning of a src value to be a subject, not an object. Perhaps swapping the subject and object. Many existing uses of eRDF involving src already involve properties with readily available inverses. For example:

<!-- eRDF 1.0 -->
<img class="foaf.depiction" src="http://example.org/picture" />

<!-- eRDF 1.1 -->
<img src="http://example.org/picture" class="foaf.depicts" />

With the inherent limitations of existing syntax, the use case of having a full image URL and a license URL won’t happen. But XHTML2 as well as an HTML5 proposal suggest that adding href to many elements might come to pass. In which case this possibility opens:

<img src="http://example.org/picture" class="cc.license"
href="http://creativecommons.org/licenses/by/2.0/" />

Comments? -m

Thursday, July 3rd, 2008

Yahoo! now indexes RDFa

I haven’t seen an announcement about this, but try the following query on Yahoo Search: [searchmonkeyid:com.yahoo.rdf.rdfa] (link). It shows documents containing RDFa, with Digg at the top. Since this is a Searchmonkey ID, it’s also usable in Searchmonkey to actually extract the metadata and use it to customize search results.

Does your site use RDFa yet? -m

Friday, June 20th, 2008

RDFa is a Candidate Recommendation

The result of tons of work by lots of smart people. Go forth and implement. And I need to put in a plug for Metadata for Grandma which (indirectly, as it turned out) influenced the spec. RDFa is already a big deal, used in places like SearchMonkey. The subset of RDFa used by SearchMonkey is 100% conforming to the CR.

I’ll have more thoughts and perhaps implementation notes on this later. -m

Wednesday, May 14th, 2008

Reminder: SearchMonkey developer launch party Thursday

Reminder: Thursday evening at Yahoo! Sunnyvale headquarters is the launch party for the developer-facing side of SearchMonkey. In case you haven’t been paying attention, SearchMonkey is a new platform that lets developers craft their own awesomized search results. If you’re interested in SEO or general lowercase semantic web tools, you’ll love it. Meet me there. Upcoming link. Party starts at 5:30. -m

Update: The developer tool is live. Rasmus has a nice walkthrough.

Monday, April 28th, 2008

SearchMonkey in private beta

I haven’t mentioned it yet, but SearchMonkey (now an official name, not just a project name) is in external limited beta. Keep an eye on ysearchblog, lots more technical content is on the way. -m

Thursday, March 13th, 2008

The (lowercase) semantic web goes mainstream

So today Yahoo! announced a major facet of what I’ve been working on lately: making the web more meaningful. Lots of fantastic coverage, including TechCrunch and ReadWriteWeb (and others, please link in the comments), and supportive responses and blog posts across the board. It’s been a while since I’ve felt this good about being a Yahoo.

So what exactly is it?

A few months ago I went through the pages on this very blog and added hAtom markup. As a result of this change…well, nothing happened. I had a good experience learning about exactly what is involved in retrofitting an existing site with microformats, but I didn’t get any tangible benefit. With the “SearchMonkey” platform, any site using microformats, or RDFa or eRDF, is exposed to developers who can enhance search results. An enhanced result won’t directly make my site rank higher in search, but it will most certainly make it prone to more clicks, and ultimately more readership, more inlinks, and better organic ranking.

How about some questions and answers:

Q: Is this Tim Berners-Lee‘s vision of the Semantic Web finally getting fulfilled?

A: No.

Q: Does this presuppose everybody rushing to change their sites to include microformats, RDF, etc?

A: No. After all, there is a developer platform. Naturally, developers will have an easier time with sites that use official and community standards for structuring data, but there is no obligation for any site to make changes in order to participate and benefit.

Q: Why would a site want to expose all its precious data in an easily-extractable way?

A: Because within a healthy ecosystem it results in a measurable increase in traffic and customer satisfaction. Data on the public web is already extractable, given enough eyeballs. An openness strategy pays off (of which SearchMonkey is an existence proof).

Q: What about metacrap? We can never trust sites to provide honest metadata.

A: The system does have significant spam deterrents built in, of which I won’t say more. But perhaps more importantly, the plugin nature of the platform uses the power of the community to shape itself. A spammy plugin won’t get installed by users. A site that mixes in fraudulent RDFa metadata with real content will get exposed as fraudulent, and users will abandon ship.

Q: Didn’t ask.com prove that having a better user interface doesn’t help gain search market share?

A: Perhaps. But this isn’t about user interface–it’s about data (which enables a much better interface).

Q: Won’t (Google|Microsoft|some startup) just immediately clone this idea and take advantage of all the new metadata out there?

A: I’m sure these guys will have some kind of response, and it’s true that a rising tide lifts all boats. But I don’t see anyone else cloning this exactly. The way it’s implemented has a distinctly Yahoo! appeal to it. Nobody has cloned Yahoo! Answers yet, either. In some ways, this is a return to roots, since Yahoo! started off as a human-guided directory. SearchMonkey is similar, except a much broader group of people can now participate. And there are some specific human, technical and financial reasons why as well, but I suggest inviting me out for beers if you want specifics. :-)

Disclaimer: as always, I’m not speaking for my employer. See the standard disclaimer. -m

Update: more Q and A

Q: How is SearchMonkey related to the recently announced Yahoo! Microsearch?

A: In brief, Microsearch is a research project (and a very cool one) with far-reaching goals, while SearchMonkey is targeted as imminently shipping software. I frequently talk to and compare notes with Peter Mika, the lead researcher for Microsearch.

Monday, March 10th, 2008

Dear readers…

You are awesome. Just sayin’. -m
