Archive for the 'commercialism' Category
Wednesday, October 7th, 2009
Fed Thread is a front end for the newly XMLified Federal Register. Why is this a big deal? The Federal Register is a daily publication of the goings-on of the US government. It’s a primary source for all kinds of things that normally only get rehashed through news organizations. And it is bulky–nobody can read through it on a regular basis. A yearly subscription (printed) would cost nearly $1000 and fill over 80,000 pages.
Having it in XML enables all kinds of searching, syndication, and annotation via flexible front ends like this one. Yay for transparency. -m
Permalink
Filed under announcement, IPR, search, xml
Sunday, September 27th, 2009
An editor’s view on the modern publishing market, how it’s changing, and challenges any book faces in running the gauntlet of publication. Worth a read. -m
Permalink
Filed under commercialism, writing
Sunday, August 9th, 2009
Here’s the scenario:
The night before a long flight, I upload my personal files into a freshly charged Kindle 2. To preserve the battery, I switch off wireless and in the bag it goes. The next day, on the plane, I open the Kindle…and it’s showing an entirely depleted battery, exclamation point and all. Can you spot the design flaw?
-m
Permalink
Filed under amazon, annoyance
Thursday, August 6th, 2009
According to this page, it’s here. At least the source code is. You heard it here first. -m
Permalink
Filed under amazon, ebook
Saturday, July 11th, 2009
Several folks have been pointing to this article, which has some choice quotes along the lines of:
If we examine the nontrivial-sized DBMS markets, it turns out that current relational DBMSs can be beaten by approximately a factor of 50 in most any market I can think of.
My employer is specifically mentioned:
Even in XML, where the current major vendors have spent a great deal of energy extending their engines, it is claimed that specialized engines, such as Mark Logic or Tamino, run circles around the major vendors
And it’s true, but don’t take my word for it. :-) The DBMS world has lots of inertia, but don’t let that blind you to seeing another way to solve problems. Particularly if that extra 50x matters. -m
Permalink
Filed under commercialism, Mark Logic, xml
Tuesday, July 7th, 2009
Come join me at the Demo Jam at Balisage this year. August 11 at 6:30 pm. There will be lots of cool demos, judged by audience participation. I’d love to see you there. -m
Permalink
Filed under announcement, Mark Logic, xml
Saturday, June 27th, 2009
The Steorn 300 program is underway, and yes, I am one of the 300 looking at their information which is coming out in once-a-week bursts in the form of educational modules. So far, nothing interesting. Some basic physics lessons, and somewhat more interesting forum activity.
But all signs seem to be pointing in the wrong direction for a miraculous breakthrough. A jury of members selected by Steorn recently unanimously stated:
The unanimous verdict of the Jury is that Steorn’s attempts to demonstrate the claim have not shown the production of energy. The jury is therefore ceasing work.
Even this announcement raises more questions, and at this point in the game, more questions are not a good thing. Of the 22 original jury members, apparently only 16 were left at the end. Those 16 were unanimous, but what did the other 6 think? Were they booted as dissenters? Also, the jury was allegedly never presented with actual hardware, which seems completely crazy and counterproductive from the standpoint of the company that convened the jury.
This kind of story has unfolded many times before, and it doesn’t end well. I’ve spent many hours debunking energy claims and perpetual motion devices. But hey, the company says they are proceeding with plans to commercialize the technology by the end of 2009. No matter what happens, it will be interesting to watch. -m
Previously: How Orbo works, When the experimenter wants to believe, and The downside of free energy.
Permalink
Filed under announcement, commercialism
Thursday, June 25th, 2009
I’m thrilled to announce that MarkLogic 4.1, and with it my project App Services, is here. Top-of-the-post props go out to Colleen, David, and Ryan, who made it happen.
You might already know that MarkLogic Server is a super-powerful database slash search engine powering projects like MarkMail. (But did you know there’s a free-as-in-beer edition?) The next step is to make it easier to use and build your own apps on top of the server.
The first big piece is the Search API, which lets you do “Google-style” searches over your content like this:
search:search("MP3 OR iPod AND color:black -Zune")
The built-in grammar includes AND, OR, parens for grouping, - for negation, quotations for phrases, and easy ways to define facets like date:today or author:"Bill Shakespeare" or GPA:3.95. By passing in additional options, you can redefine the grammar and control all aspects of the search and how the results are returned. Numerous grass-roots efforts at doing something like this had begun to spring up, so the time was right to come out with an officially-sanctioned API. For those developers who haven’t seen the light yet and don’t fancy XQuery, an API like this is a huge benefit.
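To make the shape of that grammar a bit more concrete, here is a toy tokenizer in Python. It is only a sketch of the idea and has nothing to do with how the Search API is actually implemented inside MarkLogic; the names `Token`, `TOKEN`, and `tokenize_query` are made up for illustration, and it only splits a query into pieces without building a real query tree.

```python
import re
from typing import List, NamedTuple, Optional


class Token(NamedTuple):
    kind: str                      # "operator", "paren", "term", or "phrase"
    value: str
    facet: Optional[str] = None    # e.g. "color" in color:black
    negated: bool = False          # True when prefixed with "-"


# Hypothetical pattern for a toy "Google-style" grammar (an assumption,
# not MarkLogic's actual grammar definition).
TOKEN = re.compile(r'''
    (?P<neg>-)?                    # optional leading minus for negation
    (?:(?P<facet>\w+):)?           # optional facet prefix
    (?:"(?P<phrase>[^"]*)"         # a quoted phrase
      |(?P<word>[^\s()"]+))        # or a bare word
    |(?P<paren>[()])               # grouping parenthesis
''', re.VERBOSE)


def tokenize_query(query: str) -> List[Token]:
    """Split a query string into operators, parens, terms, and phrases."""
    tokens = []
    for m in TOKEN.finditer(query):
        if m.group("paren"):
            tokens.append(Token("paren", m.group("paren")))
            continue
        phrase, word = m.group("phrase"), m.group("word")
        facet, negated = m.group("facet"), m.group("neg") is not None
        if word in ("AND", "OR") and facet is None and not negated:
            # Bare AND / OR are treated as boolean operators.
            tokens.append(Token("operator", word))
        elif phrase is not None:
            tokens.append(Token("phrase", phrase, facet, negated))
        else:
            tokens.append(Token("term", word, facet, negated))
    return tokens
```

Calling `tokenize_query('MP3 OR iPod AND color:black -Zune')` yields plain terms for MP3 and iPod, operator tokens for OR and AND, a term with facet "color" for color:black, and a negated term for Zune. A real implementation would go on to handle precedence and grouping, which this sketch leaves out.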
The next piece builds on the Search API to offer a graphical App Builder tool that produces a simplified MarkMail-type app around your content. It looks like this:

The App Builder itself is based on XForms via the excellent XSLTForms library and REST, making it a full-blown XRX application.
Lots more info, videos, screencasts, articles, and more are coming soon.
You can start playing with this now by visiting the download page. Under the Community License, you can put 10 gigs of content into it for noncommercial production free-as-in-beer.
Enjoy! I’ll be catching my breath for the next two months*. -m
* Not really
Permalink
Filed under announcement, Mark Logic
Friday, June 19th, 2009
I spent 2 days at the Yahoo! campus at a VoCamp event, my first. Initially, I was dismayed at the schedule. Spend all the time the first day figuring out why everybody came? It seemed inefficient. But having gone through it, the process seems productive, exactly the way that completely decentralized groups need to get things done. Peter Mika did a great job moderating.
Attendees numbered about 35, and came from widely varying backgrounds from librarian to linguist to professor to student to CTO, though uniformly geeky. With SemTech this week, the timing was right, and the number of international attendees was impressive.
In community development, nothing gets completely decided just because a few people met. But progress happens. The first day was largely exploratory, but also covered plenary topics that nearly everyone was interested in. Namely:
- Finding, choosing, and knowing when to create vocabularies
- Mapping from one vocabulary to another
- RDBMS to RDF mapping
Much of the shared understanding of these discussions is captured on various wiki pages connected to the one at the top of this article.
For day 2, we split into smaller working groups with more focused topics. I sat in on a discussion of Common Tag (which still feels too complex to me, but does fulfill a richer use case than rel-tag). Next, some vocabulary design, planning a microformat (and eventual RDF vocab) to represent code documentation: classes, functions, parameters, and the like. Tantek Çelik espoused the “scientific method” of vocab design: would a separate group, in similar circumstances, come up with the same design? If the answer is ‘yes’, then you probably designed it right. The way to make that happen is to focus on the basics, keeping everything as simple as possible. If any important features are missed, you will find out quickly. The experience of getting the simple thing out the door will provide the education needed to make the more complicated follow-on version a success.
From the wrap-up: if you are designing a vocabulary, the most useful thing you can do is NOT to unleash a fully-formed proposal on the world, but rather to capture the discussion around it. What were the initial use cases? What are people currently doing? What design goals were explicitly left off the table, or deferred to a future version, or immediately shot down? It’s better to capture multiple proposals, even if fragmentary, and let lots of people look them over and gravitate toward the best design.
Lastly, some cool things overheard:
“Relational databases? We call those ‘legacy’.”
“The socially-accepted schema is fairly consistent.”
“It’s just a map, it’s not the territory.”
-m
Permalink
Filed under aswemaythink, everythingismiscellaneous, intentional web, metadata, yahoo
Sunday, June 14th, 2009
I’m sticking around Sunnyvale, but am selling my house. It’s a smaller “starter home” place good for a small family. It’s close to Yahoo!, Google, Ebay, Cisco, and lots of other South Bay companies. It’s in a great neighborhood with lots of parks, restaurants (Giovanni’s Pizza just down the street is fantastic), and a nearby movie theater. If you know anyone moving into the area and looking for a place, here’s a chance to short-circuit a lot of the hassle and get straight into a well-cared-for place from a reputable seller.
I’m hesitant to post my address and pictures of my house, etc. here. Email me if you want to see more. -m
Permalink
Filed under announcement, commercialism
Wednesday, June 10th, 2009
The central thesis of The Inmates are Running the Asylum by Alan Cooper is dead on: engineers get too wrapped up in their own worlds, and left entirely to their own whims can easily make a product incomprehensible to ordinary folks. For this reason alone, it’s worth reading.
But I do question parts of his thesis. He (with tongue in cheek) posits the existence of another species of human, called Homo Logicus. Stepping on to an airplane, Homo Logicus turns left into the cockpit with a million buttons but ultimate control over every aspect of the plane. Regular Homo Sapiens, on the other hand, turn right and tuck themselves into a chair–no control but at least they can relax.
But if there were only one “species” of Homo Logicus, members (like me) would never experience usability issues in software created by fellow Logicians. Yet ordinary fax machines give me fits. The touch-screen copier at work instills dread in my heart. And the software I need to use to file expense reports–written by enterprise software geeks probably very similar to me–is a usability nightmare. Words fail me in expressing my disdain for this steaming heap of fail.
The book is sub-titled “Why High-Tech Products Drive Us Crazy”, but one doesn’t have to look very far to find similar usability bugs in the low-tech world. Seth Godin, for example, likes to talk about different things in life that Just Don’t Work, along with reasons why. Some examples:
- airport cab stand (75 cabs, 75 people, and it takes an hour)
- “don’t operate heavy machinery” warning on dog’s prescription medicine
- excessive fine print on liability agreements–intentionally hard to read and figure out
- official “Vote for Pedro” shirts that look nothing like the ones in the movie
- more examples on the web site
If anything, I think Cooper’s work doesn’t go far enough. It is relatively short on good examples, stretching out only four examples over four chapters. If examples of properly-designed software are so hard to come up with, then there are bigger problems in play (that would need to be dealt with by something more manifesto than book).
The book is now 5 years old. Perhaps it’s time for an update. Particularly in the world of web software, lots has happened in 5 years. Flickr. Gmail. Yahoo Pipes. Google Docs. Even SearchMonkey. Instead of focusing on pointing at crappy software, I’d like to see more emphasis on properly-done interfaces. More delving into nuance, and common factors behind why both high-tech and low-tech products miss the mark.
But maybe that’s just me. -m
Permalink
Filed under commercialism, everythingismiscellaneous, hardware, review, stuff
Thursday, June 4th, 2009
I was shocked today to find out that one of my old friends from the Yahoo Search days was let go in the last round. He’s simply brilliant and would have been one of the last people I would have expected the managers-in-purple to do without.
At the same time, I’m getting hounded by recruiters–five so far just this week.
So let me put these two forces against each other and see if they cancel out. To any former Yahoos: get in touch with me and I’ll do what I can to hook you up with a cool opportunity. This offer is good for June and July–after that I can’t reasonably say I’ll have time for matchmaking. Send me your CV via email and I’ll get started. No promises on results, but I’ll do what I can. :-)
-m
Permalink
Filed under announcement, commercialism, yahoo
Wednesday, June 3rd, 2009
Balisage, formerly Extreme Markup, is the kind of conference I’ve always wanted to attend.
Historically, my employers have not been involved enough in the deep kinds of topics at this conference (or have been too cash-strapped, but let’s not go there) to justify spending a week on the road. So I’m glad that’s no longer the case: Mark Logic is sponsoring the conference this year. I’m looking forward to the show, and since I’m not speaking, I might be able to relax a little and soak in some of the knowledge.
See you there! -m
Permalink
Filed under announcement, Mark Logic, xml
Friday, May 22nd, 2009
From Brewster Kahle. Good read, so to speak. -m
Permalink
Filed under annoyance, commercialism, google, IPR
Thursday, May 21st, 2009
Another anniversary this week, one year at Mark Logic. Much of it in stealth mode, but more details of what I’ve been up to are forthcoming. -m
Permalink
Filed under announcement, commercialism, Mark Logic
Tuesday, May 12th, 2009
The new feature called rich snippets shows that SearchMonkey has caught the eye of the 800-pound gorilla. Many of the same microformats and RDF vocabularies are supported. It seems increasingly inevitable that RDFa will catch on, no matter what the HTML5 group thinks. -m
Permalink
Filed under commercialism, google, intentional web, languages, metadata, microformats, search, yahoo
Sunday, May 10th, 2009
As of today, I have been out of Yahoo! for a full year. And what a year it’s been… I guess that means I’m now free to recruit…any good XML people still wearing purple? -m
Permalink
Filed under announcement, yahoo
Sunday, May 3rd, 2009
I’ve been experimenting with the preview version of Wolfram Alpha. It’s not like any current search engine because it’s not a search engine at all. Others have already written more eloquent things about it.
The key feature of it is that it doesn’t just find information, it infers it on the fly. Take for example the query
next solar eclipse in Sunnyvale
AFAIK, nobody has ever written a regular web page describing this important (to me) topic. Try it in Yahoo! or Google and see for yourself. There are a few potentially interesting links based on the abstracts, but they turn out to be spammy. Wolfram Alpha figures out that I’m talking about the combination of a concept (“solar eclipse”) and a place (“Sunnyvale, CA”, but with an offer to switch to Sunnyvale, TX) and combines the two. The result is a simple answer–4:52 pm PDT | Sunday, May 20, 2012 (3.049 years from now). Hey, that’s sooner than I thought! Besides the date, there are many related facts and a cool map.
This is in contrast to SearchMonkey, which I helped create, in two main areas:
- Wolfram Alpha uses metadata to produce the result, then renders it through a set of pre-arranged renderers. The response is facts, not web pages.
- SearchMonkey focuses on sites providing their own metadata, while Wolfram Alpha focuses on hand-curation.
Search engines have been striving to do a better job at fact-queries. Wolfram Alpha shows that an approach disjoint from finding web pages in an index can be hugely useful.
The engineers working on this have a sense of humor too. The query
1.21GW
returns a page that includes the text “power required to operate the flux capacitor in the DeLorean DMC-12 time machine” as well as a useful comparison (~ 0.1 x the power of space shuttle at launch).
Yahoo! and Google do various kinds of internal “query rewriting”, but usually don’t let you know other than in the broadest terms (“did you mean …”). Wolfram Alpha shows a diagram of what it understood the query to be. The diagrams make it evident that something like the RDF model is in use, but without peeking under the hood, it’s hard to say something definitive.
One thing I wonder about is whether Wolfram Alpha creates a dynamic (as was a major goal of SearchMonkey) of giving web authors a reason to put more metadata in their sites–a killer app if you will. It’s not clear at this early date how much web crawling or site metadata extraction (say RDFa) plays into the curation process.
In any case Wolfram Alpha is something to watch. It’s set to launch publicly this month. -m
Permalink
Filed under commercialism, intentional web, metadata, search
Wednesday, April 29th, 2009
I found this explanation the most readable I’ve seen yet. She has slides too. The settlement itself has been recently delayed, which seems like a good idea for something of this magnitude. -m
Permalink
Filed under commercialism, google, IPR
Sunday, April 26th, 2009
Thanks to those who wrote in with bug reports about the XForms Validator: something changed recently and made the inserted Google Ads script confuse browsers, resulting in a blank page where you’d expect results. I’ve turned off the response-page ads, which were only getting in the way, and the problem seems to have vanished. Carry on. :-) -m
Permalink
Filed under announcement, google, standards, XForms
Saturday, April 25th, 2009
Lots of news reports about Geocities claim it was purchased for “4 billion” dollars. But not really–that’s a pretty hefty rounding from 3.57 B. Also, that wasn’t cash, but magic boom-time inflated stock. Yahoo was at $335.875 on announcement, so the deal amounted to about 10.6 million shares. Or at today’s values, a little over $150 million. Your call on whether they got their money’s worth. -m
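For anyone who wants to check the arithmetic, here is a quick back-of-envelope sketch in Python. The deal value and announcement price come from the post; the "today" price of $14.50 is a placeholder I am assuming purely for illustration, and the calculation assumes the share count stays unchanged between then and now.

```python
# Figures from the post.
deal_value_usd = 3.57e9            # reported value of the Geocities deal
price_at_announcement = 335.875    # Yahoo share price on announcement

# Number of Yahoo shares the deal amounted to.
shares = deal_value_usd / price_at_announcement


def value_at(price_per_share: float) -> float:
    """Value of the same number of shares at a different share price.

    Assumes the share count is unchanged, which is a simplification.
    """
    return shares * price_per_share


# Hypothetical current price, chosen only to illustrate the calculation.
assumed_price_today = 14.50

print(f"shares: {shares / 1e6:.1f} million")
print(f"value at assumed price: ${value_at(assumed_price_today) / 1e6:.0f} million")
```

Plug in whatever the closing price is on the day you read this to get your own number.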
Permalink
Filed under commercialism, yahoo
Friday, April 24th, 2009
I’ve always thought that the EXSLT model of developing community specifications worked well. Now a critical mass of folks has come together on a similar effort, aimed at providing extensions usable in XPath 2.0, XSLT 2.0, XQuery, and other XPath-based languages like XProc. Maybe even XForms.
Check it out, subscribe to the mailing list, and help out if you can. -m
Permalink
Filed under announcement, Mark Logic, standards, XQuery
Sunday, April 19th, 2009
60 Minutes covers it. Disclaimer: haven’t seen it, the video doesn’t even play in my browser. Let me know if you have better success in viewing. -m
Permalink
Filed under commercialism, hardware
Saturday, April 11th, 2009
Google for RIAA, get this first result:
Trade group that claims to represent the US recording industry. Details on services, members, executives profiles, statistics, and contact information.
“Claims to” represent the US recording industry? The word “claims”, accurate as it may be, appears nowhere on their front page. :-)
-m
Permalink
Filed under commercialism, google, IPR
Tuesday, April 7th, 2009
I enjoyed this post, from Jeremy Allison as it turns out. It talks about how GPL software is “the new BSD” when it comes to cloud computing, since redistribution of the software doesn’t happen, and thus doesn’t trigger the relevant clauses of the GPL. Any old company can use, re-use, and modify the software without sharing the code in the original spirit of the license. The community’s response–something I need to keep a closer eye on–is the AGPL, or Affero license. It works similarly to the GPL, but is triggered by remote use of the software, not just distribution, preserving the work’s copyleftedness even in cloud computing situations. -m
Permalink
Filed under commercialism, IPR, standards, trends
Saturday, April 4th, 2009
This article states:
The analysts determined YouTube’s bandwidth costs by assuming that 375 million unique visitors would visit the site in 2009, with 20 percent of those users consuming 400 kilobits per second of video at any given time. That works out to 30 million megabits being served up per second. That’s a heck of a lot of bandwidth to devote to videos of sneezing pandas.
Do you honestly believe that YouTube is sending out 30 terabits per second (to put it another way, fully saturating roughly 200,000 OC3 connections)? That on average, every single user who counts as a unique visitor in 2009 spends 20% of 24hrs = 4.8 hours actually downloading video, every day of every week?
Gesundheit. -m
Update: the quoted article indeed gets it wrong, though it appears the original Credit Suisse analyst report was estimating peak usage, not a running average. Still doesn’t smell right. Updating the article and title to point the finger at the right people.
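The numbers in the quoted passage are easy to sanity-check. This is a rough sketch in Python using only the figures from the article, plus one assumption of mine: the OC-3 line rate of 155.52 Mbit/s, which the article does not mention. Note that 30 million megabits per second works out to 30 terabits per second.

```python
# Figures from the quoted article.
unique_visitors = 375e6        # unique visitors assumed for 2009
fraction_streaming = 0.20      # share of those users streaming at any given time
kbps_per_stream = 400          # kilobits per second per active stream

# Aggregate bandwidth implied by those assumptions.
total_kbps = unique_visitors * fraction_streaming * kbps_per_stream
total_mbps = total_kbps / 1e3  # megabits per second
total_tbps = total_kbps / 1e9  # terabits per second

# Assumption (not in the article): an OC-3 link carries 155.52 Mbit/s.
OC3_MBPS = 155.52
oc3_links = total_mbps / OC3_MBPS

# If 20% of all visitors are streaming at any moment, then on average
# each visitor is streaming for 20% of every day.
hours_per_day = fraction_streaming * 24

print(f"{total_mbps / 1e6:.0f} million Mbit/s = {total_tbps:.0f} Tbit/s")
print(f"about {oc3_links:,.0f} OC-3 links at full line rate")
print(f"{hours_per_day:.1f} hours of video per visitor per day")
```

At full line rate that comes to a bit under 200,000 OC-3 links, the same ballpark as the figure above, and the 4.8 hours per day falls straight out of the 20 percent assumption.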
Permalink
Filed under annoyance, commercialism, google, stuff, trends
Sunday, March 8th, 2009
The remarkable (and prolific) Stephen Wolfram has an idea called Wolfram Alpha. People used to assume the “Star Trek” model of computers:
that one would be able to ask a computer any factual question, and have it compute the answer.
Which has proved to be quite distant from reality. Instead:
But armed with Mathematica and NKS [A New Kind of Science] I realized there’s another way: explicitly implement methods and models, as algorithms, and explicitly curate all data so that it is immediately computable.
It’s not easy to do this. Every different kind of method and model—and data—has its own special features and character. But with a mixture of Mathematica and NKS automation, and a lot of human experts, I’m happy to say that we’ve gotten a very long way.
I’m still a SearchMonkey guy at heart, so I wonder how much Wolfram’s team is familiar with existing Semantic Web research and practice–because at a high level this seems very much like RDF with suitable queries thereupon. If that’s a good characterization, that’s A Good Thing, since practical application has been one of SemWeb’s weak spots.
-m
Permalink
Filed under AI, aswemaythink, commercialism, intentional web, languages, Mark Logic, math, metadata, software, yahoo
Saturday, March 7th, 2009
Your search – :-) – did not match any documents.
Suggestions:
More collected Geek Thoughts at http://geekthoughts.info.
Permalink
Filed under geekthoughts, google
Tuesday, March 3rd, 2009
With apologies to a real news site. (02-27) 16:14 PST SEATTLE, (AP)
Amazon.com Inc. changed course Friday and said it would allow copyright holders to decide whether they will permit their works to be read aloud by the latest laryngeal apparatus, a feature that has been under development for several thousand years.
The move comes nearly two weeks after a group representing authors expressed concern that the feature, which was intended to be able to read every book, blog, magazine and newspaper out loud, would undercut separate audiobook sales. The average American can use their larynx to read text in a somewhat stilted voice.
Amazon said in a statement that it, too, has a stake in the success of the audiobook market, and pointed to its Brilliance Audio and Audible subsidiaries, which publish and sell professionally recorded readings.
“Nevertheless, we strongly believe many rights holders will be more comfortable with the text-to-speech feature if they are in the driver’s seat,” the company said.
Amazon is working on the technical changes needed for authors and publishers to turn text-to-speech off for individual titles.
The Web retailer also said the text-to-speech feature is legal — and wouldn’t require Amazon to pay out additional royalties — because a book read aloud doesn’t constitute a copy, a derivative work or a performance.
More collected Geek Thoughts at http://geekthoughts.info.
Permalink
Filed under amazon, announcement, annoyance, commercialism, geekthoughts, IPR
Tuesday, March 3rd, 2009
Dear Amazon, Speaking as an author myself, you not only made a bad choice, you set a precedent in the wrong direction. The Authors Guild doesn’t speak for me, nor do I want them to. TTS is only going to get better. The last thing we need is another backward industry fighting progress. -m
Permalink
Filed under amazon, annoyance, IPR