XML 2008 liveblog: Sentiment Analysis in Open Source Information for the US Government

Ronald Reck, SAP; Kenneth Sall, SAIC

“I wish I knew when people were saying bad things about me.” Sentiment analysis. Kapow used initially. From 800k news articles (from 1996 and 1997), extracted 450M RDF assertions. The 13 standard Reuters metadata elements were not used in this case. Used Redland for heavy RDF lifting. Inxight ThingFinder (commercial) for entity extraction, supplemented with enumerated lists (Bush Cabinet, intelligence agencies, negative adjectives, positive admire verbs, etc.). End result was RDF/XML.

(Kenneth takes the mic) SPARQL Sentiment Query Web UI. Heavy SPARQL ahead… Redland hasn’t implemented the UNION operator yet, making the examples more convoluted.

PREFIX sap: <http://iama.rrecktek.com/ont/sap#>
SELECT ?ent ?type ?name
WHERE {
?ent sap:Method "Name Catalog" .
?ent sap:Type ?type .
?ent sap:Name ?name
}
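
For comparison, a rough sketch of how a query like this might read on an engine that does support UNION, which is part of standard SPARQL. Only "Name Catalog" comes from the example above; the second sap:Method value, "ThingFinder", is a made-up placeholder and not something shown in the talk.

```sparql
PREFIX sap: <http://iama.rrecktek.com/ont/sap#>
SELECT ?ent ?type ?name
WHERE {
  # Match entities found by either extraction method.
  { ?ent sap:Method "Name Catalog" . }
  UNION
  { ?ent sap:Method "ThingFinder" . }   # hypothetical value, for illustration only
  # Shared patterns apply to whichever branch matched.
  ?ent sap:Type ?type .
  ?ent sap:Name ?name .
}
```

Without UNION, each alternative generally has to be run as its own query and the results merged outside the engine.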

Steep learning curve. Need a way to take a substring of an entity URI to get the article URI.
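
A sketch of what that substring step could look like with SPARQL 1.1 string functions (STR, STRBEFORE, CONTAINS, plus BIND), which were standardized after this talk. It assumes, purely for illustration, that an entity URI is the article URI followed by a "#fragment"; the actual URI layout wasn't shown.

```sparql
PREFIX sap: <http://iama.rrecktek.com/ont/sap#>
SELECT ?ent ?article
WHERE {
  ?ent sap:Name ?name .
  # Assumption: entity URIs look like <article-URI>#<entity-id>.
  FILTER (CONTAINS(STR(?ent), "#"))
  # Keep everything before the "#" and turn it back into an IRI.
  BIND (IRI(STRBEFORE(STR(?ent), "#")) AS ?article)
}
```

If the entity URIs use a different separator or a path segment instead of a fragment, the string passed to STRBEFORE would need to change accordingly.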

Next steps: current news stories. Leverage existing metadata. RDF at the sentence level. Improve name catalogs. Use rule-based pattern matching engine. Slides.

-m
