At first glance, this seems to be the Snow Leopard of Tinderbox releases–lots of behind-the-scenes technology updates and largely the same core features. If you’re looking for a way to get more organized, it’s worth a look. Link. -m
Archive for the 'announcement' Category
Friday, December 11th, 2009
Thursday, December 10th, 2009
Celebrating 500 posts since I went to WordPress in May 2006. Prior to that, an additional 730 posts as I floated through a typical evolution of blogging platforms:
- Easy start: blogger (299 posts in 24 months)
- Succumbing to the desire to roll your own (259 posts in 12 months)
- Realizing that rolling your own is too difficult: Pyblosxom (172 posts in 12 months)
- Moving to a mature platform you don’t need to worry about much: WordPress (500 posts in 42+ months)
Friday, November 27th, 2009
Richard Attenborough’s epic biopic is available to watch instantly on Netflix, but only until November 30. Recommended viewing for the weekend. -m
Sunday, November 8th, 2009
If this site is accurate, it’s now possible to have superconducting material at household freezer temperatures: 254 K, or a tiny bit below 0 °F. From power lines to maglevs to supercolliders to energy storage, the potential applications boggle the mind. -m
Note: I’m having trouble finding independent verification of this, other than what appears to be re-hashes of the superconductor.org article. If you have any additional proof or refutation, please post it in the comments.
Thursday, November 5th, 2009
Link credit goes to Joho.
This looks pretty significant. The AZ Supreme Court ruled that document metadata must be disclosed under existing public records law. This may start a chain reaction with other states following suit. With the movement toward open data including data.gov and the Federal Register, this fits in well. Quite often, metadata such as creation date and author makes for much better searching and faceting. -m
Wednesday, October 21st, 2009
I’ll be speaking next Tuesday (Oct 27) at the Northern Virginia MarkLogic User Group (NOVAMUG). Here’s what I’ll be talking about.
Application Builder consists of two main parts: Search API to enable Google-style search string processing, and the actual UI wizard that steps users through building a complete search app. It uses a number of technologies that have not (at least not up until now!) been widely associated with MarkLogic. Why some technologies that seem like a perfect fit for XML apps are less used in the Mark Logic ecosystem is anyone’s guess, but one thing App Builder can contribute to the environment is some fresh DNA. Maybe your apps can benefit from these as well.
Relax NG. XML Schema has its roots deep in the XML infrastructure. The type system of XQuery and XSLT 2.0 is based on it. Even XForms has ties to it. But for all its wide reach, XML Schema 1.0 has some maddening limitations, and “takes some getting used to” before one can sight read it. By contrast, the appendices of many recent W3C specifications use the highly readable Relax NG compact syntax to describe content models in a way that is equally human- and machine-readable.
What are these limitations I speak of? XML Schema 1.1 goes a long way toward resolving these, but isn’t yet widely in use. Take this example, the content model of the <options> element from Search API:
start = Options | Response

# Root element
OptionsType = (
    AdditionalQuery? &
    Annotation* &
    ConcurrencyLevel? &
    Constraint* &
    Debug? &
    DefaultSuggestionSource? &
    Forest* &
    Grammar? &
    Operator* &
    PageLength? &
    QualityWeight? &
    ReturnConstraints? &
    ReturnFacets? &
    ReturnMetrics? &
    ReturnQtext? &
    ReturnQuery? &
    ReturnResults? &
    ReturnSimilar? &
    SearchOption* &
    SearchableExpression? &
    SortOrder* &
    SuggestionSource* &
    Term? &
    TransformResults?
)
The start line indicates that, within this namespace, there are two possible root elements, either <options> or <response> (not shown here). An instance with a root of, say, search:annotation is by definition not valid. Try representing that in XML Schema.
The definition of OptionsType allows a wide variety of child elements, some zeroOrMore times, others optional (zero or one occurrence), with no ordering restrictions at all between anything. XML Schema can’t represent this either. James Clark’s trang tool converts Relax NG into XML Schema, and has to approximate this as an xsd:choice with maxOccurs="unbounded", so the elements that can only occur once are not schema-enforced. The Relax NG description of the content model, besides being more readable, therefore contains more information than the closest XML Schema. So particularly for XML vocabularies that are optimized for human use, Relax NG is a good choice for schema development.
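To make the lost information concrete, here is a minimal sketch in Python (not XQuery, and not anything from the Search API itself). It models a handful of the children as plain strings; the lowercase hyphenated names and both function names are my own illustrative assumptions. The first check behaves like the interleave pattern, the second like a repeated choice.

```python
from collections import Counter

# A small subset of the OptionsType content model: name -> (min, max).
# max of None means unbounded. Element names here are illustrative assumptions.
INTERLEAVE_MODEL = {
    "additional-query": (0, 1),  # AdditionalQuery?
    "annotation": (0, None),     # Annotation*
    "constraint": (0, None),     # Constraint*
    "grammar": (0, 1),           # Grammar?
    "page-length": (0, 1),       # PageLength?
}


def valid_interleave(children):
    """Any order is accepted, but per-element occurrence limits are enforced."""
    counts = Counter(children)
    if any(name not in INTERLEAVE_MODEL for name in counts):
        return False
    for name, (lo, hi) in INTERLEAVE_MODEL.items():
        n = counts[name]
        if n < lo or (hi is not None and n > hi):
            return False
    return True


def valid_choice_unbounded(children):
    """Repeated-choice approximation: only checks that each name is allowed."""
    return all(name in INTERLEAVE_MODEL for name in children)


doubled = ["page-length", "constraint", "page-length"]
print(valid_interleave(doubled))        # the interleave-style check rejects this
print(valid_choice_unbounded(doubled))  # the repeated choice lets it through
```

The second function is the shape of what the converted schema can enforce: membership, but not "at most once".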
Out of line validation. So if XML Schema doesn’t fully describe the <options> node, how can authors be sure they have constructed one correctly? Take a step back: even if XML Schema could fully represent the content model, for performance reasons you wouldn’t want to repeatedly validate the node on every query. The options node tends to change infrequently, mainly during a development cycle. Both of these problems can be solved with out-of-line validation: a separate function call search:check-options().
Inside this function you’ll find a validate expression that will make as much use of the Schema as it can, but also much more. The full power of XQuery can be leveraged against the proposed <options> node to check for errors or inconsistencies, and provide helpful feedback to the developer. Since it happens out-of-line, these checks can take substantially longer than actually handling the query based on them. The code can go as in-depth as it needs to without performance worries. This is a useful technique in many situations. One potential shortfall is that people might forget to call your validation function, but in practice this hasn’t been too much trouble.
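The general shape of out-of-line validation, sketched in Python rather than XQuery: a thorough checker that is called during development, and a hot path that simply trusts its input. The option names and the two specific checks below are hypothetical stand-ins of mine, not what search:check-options() actually verifies.

```python
from collections import Counter


def check_options(options):
    """Thorough (possibly slow) checks, run out-of-line during development.

    Returns a list of human-readable problems; an empty list means no issues
    were found. The checks here are hypothetical examples.
    """
    problems = []
    page_length = options.get("page-length")
    if page_length is not None and (
        not isinstance(page_length, int) or page_length < 1
    ):
        problems.append("page-length must be a positive integer")
    names = Counter(
        c["name"] for c in options.get("constraints", []) if "name" in c
    )
    for name in sorted(n for n, count in names.items() if count > 1):
        problems.append(f"duplicate constraint name: {name}")
    return problems


def search(query, options):
    """Hot path: uses the options as given and performs no validation."""
    return {"query": query, "page-length": options.get("page-length", 10)}


opts = {"page-length": 0, "constraints": [{"name": "color"}, {"name": "color"}]}
for problem in check_options(opts):
    print(problem)
```

The trade-off is the one described above: search() stays cheap on every call, at the cost of relying on the developer to run check_options() whenever the options change.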
Higher-order functions. The predecessor to Search API had the problem that it was so popular that users would modify it to suit their unique requirements, which led to dozens of minor variations floating around in the wild. Different users have different needs and expectations for the library, and making a one-size-fits-all solution is perhaps not possible. One way to relieve this kind of feature pressure is to provide enough extension hotspots to allow all the kinds of tweaks that users will want, preserving the mainline code. This involves prediction, which is difficult (especially about the future). But a good design makes this possible.
Look inside the built app and you will find a number of function pointers, implemented as a new datatype xdmp:function. XQuery 1.1 will have a more robust mechanism for this, but it might be a while before this is widespread. By modifying one file, changing a pointer to different code, nearly every aspect of the application can be adjusted.
Similarly, a few hotspots in the Search API can be customized, to hook in different kinds of parsers or snippet-generators. This powerful technique can take your own apps to the next level.
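For readers who think better in code, here is the hotspot idea as a minimal Python sketch. It is an analogy only: the hook names, the table of hooks, and the toy search are all assumptions of mine, and the real app uses xdmp:function values in XQuery rather than anything shown here.

```python
def default_parser(qtext):
    """Default parse hook: split the query text on whitespace."""
    return qtext.split()


def default_snippet(doc, max_chars=40):
    """Default snippet hook: the first max_chars characters of the document."""
    return doc[:max_chars]


# The single place to customize: hook name -> function (hypothetical names).
HOOKS = {"parse": default_parser, "snippet": default_snippet}


def run_search(qtext, docs, hooks=HOOKS):
    """Mainline code: never edited by users, only steered through the hooks."""
    terms = [t.lower() for t in hooks["parse"](qtext)]
    hits = [d for d in docs if all(t in d.lower() for t in terms)]
    return [hooks["snippet"](d) for d in hits]


def shouting_snippet(doc):
    """A user-supplied replacement for the snippet hook."""
    return doc.upper()


docs = ["Black iPod nano", "Zune player"]
custom = dict(HOOKS, snippet=shouting_snippet)  # swap one pointer, keep the rest
print(run_search("ipod black", docs))
print(run_search("ipod", docs, custom))
```

Swapping one entry in the table changes behavior without touching run_search, which is the point: the variations live in user code instead of in forks of the library.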
Wednesday, October 21st, 2009
XForms 1.1 is now a full W3C Recommendation. Compared to version 1.0, which went live a bit more than 6 years ago, version 1.1 offers lots of road-tested tools that make development easier and more powerful, including new datatypes and XPath functions, a significantly more powerful submission subsystem, and a more flexible event model.
And XSLTForms already supports almost all of the new goodies. -m
Wednesday, October 14th, 2009
In loving memory.
Monday, October 12th, 2009
Come learn more about Mark Logic and get a behind-the-scenes look at the new Application Builder. I’ll be speaking at the NOVA MUG (Northern Virginia Mark Logic User Group) on October 27. This turns out to be pretty close to the big Semantic Web conference, so I’ll stick my head in there too. Stop by and look me up!
Details at the developer site.
Wednesday, October 7th, 2009
Fed Thread is a front end for the newly XMLified Federal Register. Why is this a big deal? It’s a daily publication of the goings-on of the US government. It’s a primary source for all kinds of things that normally only get rehashed through news organizations. And it is bulky–nobody can read through it on a regular basis. A yearly subscription (printed) would cost nearly $1000 and fill over 80,000 pages.
Having it in XML enables all kinds of searching, syndication, and annotation via flexible front ends like this one. Yay for transparency. -m
Tuesday, September 22nd, 2009
Wednesday, September 16th, 2009
I had been asking around earlier for large RDF datasets. Here’s one. Looks like a great contest to build an app around this, but unfortunately, the deadline looks like it’s soonish (1 Oct).
What is it?
The major part of the dataset was crawled during February/March 2009 based on datasets provided by Falcon-S, Sindice, Swoogle, SWSE, and Watson using the MultiCrawler/SWSE framework. To ensure wide coverage, we also included a (bounded) breadth-first crawl of depth 50 starting from http://www.w3.org/People/Berners-Lee/card.
The downloaded content was parsed using the Redland toolkit with rdfxml, rss-tag-soup, rdfa parsers. We rewrote blank node identifiers to include the data source in order to provide unique blank nodes for each data source, and appended the data source to the output file. The data is encoded in NQuads format and split into chunks of 10m statements each.
The page includes some fairly detailed statistics on the data breakdown. Cool. -m
Thursday, August 27th, 2009
As part of the 300 program, Steorn recently released specific details about their technology, which was pretty much the whole point of the 300. The general reaction has been vaguely positive and appreciative (like this posting), though there is a huge self-selection bias in play.
Their key operating principle is clever and unlike anything I’ve seen in my armchair studies of supposed magnetic motion machines. But it’s complicated, in a way that is like the EM equivalent of the three-body problem. In other words, their description is neither obviously wrong nor right. Any time you have moving magnetic fields and pulsating electromagnet currents, hard-to-predict interactions tend to happen. There’s also a host of measurement difficulties, including properly accounting for power factors and complex number phasors for power input/output in inductive circuits.
Friday, August 7th, 2009
I’m heading off to beautiful downtown Montréal this weekend for Balisage, my first appearance at this particular conference. If you’re heading there too, look me up. -m
Monday, August 3rd, 2009
I hate moving at high speed with multiple large chunks of metal in close formation.
I hate the sound of traffic. The smell.
I hate it when people jump in a car to drive somewhere a block away.
I hate driving. I hate parking. I hate SUVs.
Also, getting a root canal leaves me in a foul mood.
More collected Geek Thoughts at http://geekthoughts.info.
Friday, July 24th, 2009
I’m noodling around with requirements and exploring existing work toward a solution for “decentralized extensibility” on xml-dev, particularly for HTML. The notion of “Java-style” syntax, with reverse DNS names and all, has come up many times in the context of these kinds of discussions, but AFAICT has never been fully fleshed out. This is ongoing, slowly, in available time–which has been a post or two per week. (In case there is any doubt, this is a spare-time effort not connected with my employer.)
Check it out and add your knowledge to the thread. -m
Tuesday, July 7th, 2009
Come join me at the Demo Jam at Balisage this year. August 11 at 6:30 pm. There will be lots of cool demos, judged by audience participation. I’d love to see you there. -m
Saturday, June 27th, 2009
The Steorn 300 program is underway, and yes, I am one of the 300 looking at their information which is coming out in once-a-week bursts in the form of educational modules. So far, nothing interesting. Some basic physics lessons, and somewhat more interesting forum activity.
But all signs seem to be pointing in the wrong direction for a miraculous breakthrough. A jury of members selected by Steorn recently unanimously stated:
The unanimous verdict of the Jury is that Steorn’s attempts to demonstrate the claim have not shown the production of energy. The jury is therefore ceasing work.
Even this announcement raises more questions, and at this point in the game, more questions is not a good thing. Of the 22 original jury members, apparently only 16 were left at the end. Those 16 were unanimous, but what did the other 6 think? Were they booted as dissenters? Also allegedly the jury was never presented with actual hardware, which seems completely crazy and counterproductive from the standpoint of the company that convened the jury.
This kind of story has unfolded many times before, and it doesn’t end well. I’ve spent many hours debunking energy claims and perpetual motion devices. But hey, the company says they are proceeding with plans to commercialize the technology by the end of 2009. No matter what happens, it will be interesting to watch. -m
Thursday, June 25th, 2009
I’m thrilled to announce that MarkLogic 4.1, and with it my project App Services, is here. Top-of-the-post props go out to Colleen, David, and Ryan, who made it happen.
You might already know that MarkLogic Server is a super-powerful database slash search engine powering projects like MarkMail. (But did you know there’s a free-as-in-beer edition?) The next step is to make it easier to use and build your own apps on top of the server.
The first big piece is the Search API, which lets you do “Google-style” searches over your content like this:
search:search("MP3 OR iPod AND color:black -Zune")
The built-in grammar includes AND, OR, parens for grouping, - for negation, quotations for phrases, and easy ways to define facets like date:today or author:"Bill Shakespeare" or GPA:3.95. By passing in additional options, you can redefine the grammar and control all aspects of the search and how the results are returned. Numerous grass-roots efforts at doing something like this had begun to spring up, so the time was right to come out with an officially sanctioned API. For those developers who haven’t seen the light yet and don’t fancy XQuery, an API like this is a huge benefit.
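To give a feel for what processing such a string involves, here is a rough Python sketch of the first step only: splitting the query text into classified tokens. It is not the Search API grammar, it says nothing about operator precedence, and the regular expression, the tuple layout, and the tokenize name are all my own assumptions.

```python
import re

# One token: an optional minus, an optional facet prefix such as color:,
# then a quoted phrase or a bare word. Parentheses are tokens on their own.
TOKEN = re.compile(r'''
    (?P<neg>-)?
    (?:(?P<facet>[A-Za-z]\w*):)?
    (?:"(?P<phrase>[^"]*)"|(?P<word>[^\s()"]+))
    |
    (?P<paren>[()])
''', re.VERBOSE)


def tokenize(qtext):
    """Return a list of (kind, value, facet, negated) tuples."""
    tokens = []
    for m in TOKEN.finditer(qtext):
        if m.group("paren"):
            tokens.append(("paren", m.group("paren"), None, False))
            continue
        facet = m.group("facet")
        negated = m.group("neg") is not None
        if m.group("phrase") is not None:
            tokens.append(("phrase", m.group("phrase"), facet, negated))
        elif m.group("word") in ("AND", "OR") and facet is None and not negated:
            tokens.append(("operator", m.group("word"), None, False))
        else:
            tokens.append(("term", m.group("word"), facet, negated))
    return tokens


for token in tokenize('MP3 OR iPod AND color:black -Zune'):
    print(token)
```

A real implementation would go on to build a query tree from these tokens, which is where the grammar and its precedence rules come in.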
The next piece builds on the Search API to offer a graphical App Builder tool that produces a simplified MarkMail-type app around your content. It looks like this:
The App Builder itself is based on XForms via the excellent XSLTForms library and REST, making it a full-blown XRX application.
Lots more info, videos, screencasts, articles, and more are coming soon.
You can start playing with this now by visiting the download page. Under the Community License, you can put 10 gigs of content into it for noncommercial production free-as-in-beer.
Enjoy! I’ll be catching my breath for the next two months*. -m
* Not really
Tuesday, June 23rd, 2009
When I get time, I want to watch all of this program on fora.tv from Dave McLean in SF who talks about how to make beer, why it tastes like it does, and why some people prefer various styles of beer.
It’s a good follow-up to the NHC reception I made it to last week, with a 3 course dinner (each made with and served with a different beer), a lecture by the highly entertaining Brewing Scientist Charlie Bamforth, and a tasting panel of 20 different additives as palate training.
Even if you’re busy, take some time to appreciate the things you might otherwise enjoy without thinking. -m
Sunday, June 14th, 2009
I’m sticking around Sunnyvale, but am selling my house. It’s a smaller “starter home” place good for a small family. It’s close to Yahoo!, Google, Ebay, Cisco, and lots of other South Bay companies. It’s in a great neighborhood with lots of parks, restaurants (Giovanni’s Pizza just down the street is fantastic), and a nearby movie theater. If you know anyone moving into the area and looking for a place, here’s a chance to short-circuit a lot of the hassle and get straight into a well-cared-for place from a reputable seller.
I’m hesitant to post my address and pictures of my house, etc. here. Email me if you want to see more. -m
Thursday, June 4th, 2009
I was shocked today to find out that one of my old friends from the Yahoo Search days was let go in the last round. He’s simply brilliant and would have been one of the last people I would have expected that the managers-in-purple could do without.
At the same time, I’m getting hounded by recruiters–five so far just this week.
So let me put these two forces against each other and see if they cancel out. To any former Yahoos: get in touch with me and I’ll do what I can to hook you up with a cool opportunity. This offer is good for June and July–after that I can’t reasonably say I’ll have time for matchmaking. Send me your CV via email and I’ll get started. No promises on results, but I’ll do what I can. :-)
Wednesday, June 3rd, 2009
Balisage, formerly Extreme Markup, is the kind of conference I’ve always wanted to attend.
Historically my employers have not been involved enough in the deep kinds of topics at this conference (or have been too cash-strapped, but let’s not go there) to justify spending a week on the road. So I’m glad that’s no longer the case: Mark Logic is sponsoring the conference this year. I’m looking forward to the show, and since I’m not speaking, I might be able to relax a little and soak in some of the knowledge.
See you there! -m
Saturday, May 30th, 2009
About a week ago I moved XForms Institute over to Subversion. Now the entire site is under version control, with a local copy I can edit. Publishing is as easy as logging in and running the command ‘svn up’. Honestly, I should have done this long ago. And any future sites I work on will use this approach too–it’s fantastic.
If you notice any glitches, let me know. -m
Thursday, May 21st, 2009
Another anniversary this week, one year at Mark Logic. Much of it in stealth mode, but more details of what I’ve been up to are forthcoming. -m
Friday, May 15th, 2009
My plan is to start unsubstantiated rumors about the SUV Flu, and by association cause people to distance themselves from said implicated devices. You can help.
Despite denials from the CDC (you know they’re in the pockets of the auto industry), SUV flu is serious and spreading fast. You might already have it and not realize it. The SUV flu spreads primarily through close contact with gas-guzzling vehicles, such as so-called Sport Utility Vehicles. California has been hit the hardest, with sources reporting that in a small hamlet outside of Oxnard over 3000 drivers have been seen staggering away from their parked vehicles, and further reports indicate that up to 80 top epidemiologists nationwide are stranded and unable to commute to work.
Transmission occurs primarily via exhaust emission and requires close contact between source and recipient because contaminants do not remain suspended in the air and generally rise directly to the ozone layer. Contact with contaminated surfaces (including bucket seats and 4-wheel-drive shifters) is another possible source of transmission.
The estimated incubation period is unknown and could range from 1-7 days, but more likely 3 years or 36,000 miles.
Patients with uncomplicated disease due to confirmed (or unconfirmed) SUV flu virus infection have experienced inflated ego, increased road rage, chronic lack of consideration for others, decreased awareness of nearby traffic, fatigue, vomiting, or diarrhea. In West Palm Beach, 95% of patients with SUV flu met the case definition of opprobrism.
Anyone showing signs–however faint–of possible SUV flu should pull over, immediately self-diagnose, and proclaim the results on Twitter, Facebook, MySpace, or a nearby blog. If you are somehow still disease-free, carefully avoid contamination vectors mentioned above. Please help spread the warning about this dangerous disease, using the hashtag #suvflu.
Be careful out there.
More collected Geek Thoughts at http://geekthoughts.info.
Sunday, May 10th, 2009
As of today, I have been out of Yahoo! for a full year. And what a year it’s been… I guess that means I’m now free to recruit…any good XML people still wearing purple? -m
Monday, May 4th, 2009
The universe is deeply, fundamentally weird. At the quantum level, all kinds of non-intuitive effects are the building blocks of, well, everything. So what if not just observing, but believing in a particular outcome could influence the actual outcome of an experiment?
Something like that could explain a lot: many of the claims of perpetual motion machines, cold fusion a la Fleischmann and Pons, the placebo effect, Steorn Orbo technology (previous discussion), and numerous similar endeavors. Who’s to say that some aspect of what we call consciousness doesn’t involve some kind of probability manipulation?
The conventional scientific method would be at a loss to deal with such a situation. True Believers would proclaim miraculous results from their experiments, but Skeptics would be unable to reproduce the results. Strong skeptics would set up million dollar rewards to prove crackpottish claims under “controlled conditions”, and nobody would ever collect.
Such a conceit is the basis for a story I’m working on. The first drafts were written 18 months ago, as part of NaNoWriMo 2007. I may be ready for some early reviewers by the summer. Interested? -m
Monday, April 27th, 2009
- Don’t panic. Panic == not thinking clearly.
- Avoid Twitter until symptoms subside. Probably HuffPost and Drudge too.
- Think ahead. If you don’t already have an Emergency Preparedness Kit assembled, well, that was kind of dumb. Over your next few trips to the grocery store, gradually get stuff for one.
- Don’t believe everything you read on the internet. If in doubt, ask a doctor.
- The Time.com article is pretty even-handed, worth a read.
Sunday, April 26th, 2009
Thanks to those who wrote in with bug reports about the XForms Validator: something changed recently and made the inserted Google Ads script confuse browsers, resulting in a blank page where you’d expect results. I’ve turned off the response-page ads, which were only getting in the way, and the problem seems to have vanished. Carry on. :-) -m