At first glance, this seems to be the Snow Leopard of Tinderbox releases–lots of behind-the-scenes technology updates and largely the same core features. If you’re looking for a way to get more organized, it’s worth a look. Link. -m
Archive for the 'software' Category
Sunday, November 29th, 2009
The Model Endpoint Template (MET) organizational pattern for XRX apps
One of the lead bullets describing why XForms is cool always mentions that it is based on a Model View Controller framework. When building a full XRX app, though, MVC might not be the best choice to organize things overall. Why not?
Consider a typical XRX app, like MarkLogic Application Builder. (You can download your copy of MarkLogic, including Application Builder, under the community license at the developer site.) For each page, the cycle goes like this:
- The browser requests a particular page, say the one that lets you configure sorting options in the app you’re building
- The page loads, including client-side XForms via JavaScript
- XForms requests the project state as XML from a designated endpoint; this becomes the XForms Instance Data
- Stuff happens on the page that changes the client-side state
- Just before leaving the page, XML representing the updated state is HTTP PUT back to the endpoint
The benefit of this approach is that you are dealing with XML all the way through, no impedance mismatches like you might find on an app that awkwardly transitions from (say) relational data to Java objects to urlencoded name/value pairs embedded in HTML syntax.
So why not do this in straight MVC? Honestly, MVC isn’t a bad choice, but it can get unwieldy. If an endpoint consists of separate model+view+controller files, and each individual page consists of separate model+view+controller files, it adds up to a lot of stuff to keep track of. In truly huge apps, this much attention to organization might be worth it, but most apps aren’t that big. Thus the MET pattern.
Model: It still makes sense to keep the code that deals with particular models (closely aligned with Schemas) as a separate thing. All of Application Builder, for example, has only one model.
Endpoint: The job of an endpoint is to GET and PUT (and possibly POST and DELETE) XML, or other equivalent resource bundles depending on how many media types you want to deal with. It combines an aspect of controllers by being activated by a particular URL and views by providing the data in a consistent format.
Template: Since XForms documents already contain MVC mechanics, it is not a high-payoff situation to further use MVC to construct the XForms and XHTML wrapper themselves. The important stuff happens within XForms, and then you need various templating mechanisms, for example to provide consistent headers, footers, and other pieces across multiple pages. For this, an ordinary templating mechanism suffices. I can imagine dynamic assembly scenarios where this wouldn’t be the case, but again, many apps don’t need this kind of flexibility, and the complexity that comes along with it.
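To make the Endpoint role a little more concrete, here is a minimal sketch, written in Python rather than XQuery. The `XmlEndpoint` class, its `get` and `put` methods, and the `/project/state` URL are all hypothetical names chosen for illustration; none of this is taken from Application Builder.

```python
import xml.etree.ElementTree as ET


class XmlEndpoint:
    """Hypothetical endpoint: one URL, XML in and XML out."""

    def __init__(self, url, initial_xml):
        self.url = url
        # fromstring raises ET.ParseError if the XML is not well-formed.
        self._state = ET.fromstring(initial_xml)

    def get(self):
        # View aspect: always hand back the state in one consistent format.
        return ET.tostring(self._state, encoding="unicode")

    def put(self, xml_text):
        # Controller aspect: accept a replacement state, rejecting malformed XML.
        self._state = ET.fromstring(xml_text)


# The page cycle: GET the state, change it client-side, PUT it back.
endpoint = XmlEndpoint("/project/state", "<project><sort by='date'/></project>")
doc = ET.fromstring(endpoint.get())
doc.find("sort").set("by", "title")
endpoint.put(ET.tostring(doc, encoding="unicode"))
```

The controller-ish and view-ish pieces sit side by side in one small unit, which is the trade the MET layout makes on purpose.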
What about separation of concerns? Oh yeah, what about it? :-) Technically both Endpoints and Templates violate classical SOC. In an XRX app, this typically doesn’t lead to the kinds of spaghetti situations that it might otherwise. Endpoints are self-contained, and can focus on doing just one thing well; with limited scope comes limited ability to get into trouble. For those times when you need to dig into the XQuery code of an endpoint, it’s actually helpful to see both the controller and view pieces laid out in one file.
As for Templates, simplicity wins. With the specifics of models and endpoints peeled away, the remaining challenge in developing individual pages is getting the XForms right, and again, it’s helpful to minimize the number of files one XForms page is split across. YAGNI applies to what’s left, at least in the stuff I’ve built.
So, I’ve been careful in the title to call this an “organizational pattern”, not a “design pattern” or an (ugh) “architectural pattern”. Nothing too profound here. I’d be happy to start seeing XRX apps laid out with directory names like “models”, “endpoints”, and “templates”.
What do you think? Comments welcome.
-m
Saturday, September 26th, 2009
Geek Thoughts: stability theory
My personal stability theory, as it applies to software engineering: in a multilayered software architecture, the likelihood that layer N works well can be expressed as a probability (less than 1 in practice) relative to the lower level layer N-1. For example, if you attempt to write a mission critical Tcl app on a flaky Tcl interpreter, you’re in for some long nights. Via multiplication, a corollary is that the more layers a system has, the less likely it is to work well. (As an aside, I’m not arguing that all software architectures should have fewer layers–other forces outside the scope of this article work against systems with too few layers.)
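The multiplication can be sketched in a few lines of Python. The 0.99 per-layer figure is an assumption picked for illustration, not a measurement, and the sketch treats each layer's probability as independent of the others. The `system_reliability` name is hypothetical, and `math.prod` needs Python 3.8 or later.

```python
import math


def system_reliability(layer_probabilities):
    """Chance the top layer works well: the product of the per-layer probabilities."""
    return math.prod(layer_probabilities)


# Assumed figure: each layer works well 99% of the time, given the layer below it.
three_layers = system_reliability([0.99] * 3)
ten_layers = system_reliability([0.99] * 10)

print(f"3 layers:  {three_layers:.3f}")   # 0.970
print(f"10 layers: {ten_layers:.3f}")     # 0.904
```

Even with very reliable layers, the ten-layer stack is noticeably shakier than the three-layer one.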
Joel said something similar lately in the article The Duct Tape Programmer. There is a strong tendency for many coders to over-engineer a system, building towering heights of abstraction. In contrast, a Duct Tape Programmer gets the job done by making something ugly (and with fewer layers) but at least it works. So far this is a fit with what stability theory predicts.
But then he speaks out against unit testing, referring to it in similar terms to the extravagant tower. Quoting JWZ: “If there’s no unit test the customer isn’t going to complain about that.” Here stability theory makes a different prediction. Particularly in the lower levels of the system, flakiness is disastrous. You have to be sure that your foundation is stable before building upon it, or you’re in for keyboard-on-forehead-induced head trauma. This is true no matter how tight the deadlines are or how much pressure is on. In fact, when you don’t have time for a write-over, it’s even more important to get it right the first time.
The top accomplishment for a coder is shipping software. Duct Tape Programmers make this happen by avoiding needless complexity, which is a great principle to live by. I’m reminded of a saying attributed to Brian Kernighan:
Debugging is twice as hard as writing the code in the first place. Therefore, if you write the code as cleverly as possible, you are, by definition, not smart enough to debug it.
Debugging, or more generally making software that works well all the way to the user-facing layer, is hard. Anything that provides fundamental assertions about the stability of your foundation is a useful tool, so don’t slack off on the unit testing.
What about you? Have you found stability theory to be supported by the facts? Comment below.
-m
Wednesday, April 22nd, 2009
TextWrangler and special characters
Hey readers, all seven of you, can you help me out?
I’m perhaps finally switching to a Mac-native text editor, TextWrangler, or if I really like it, BBEdit. Within that app, what’s the easiest way to enter unusual characters not found on a keyboard, say š (Latin s with háček) or ḫ (h-breve below)? In jEdit, one can set up longer strings that get automatically converted into harder-to-type ones. What’s the equivalent in TextWrangler or BBEdit? -m
Sunday, March 8th, 2009
Wolfram Alpha
The remarkable (and prolific) Stephen Wolfram has an idea called Wolfram Alpha. People used to assume the “Star Trek” model of computers:
that one would be able to ask a computer any factual question, and have it compute the answer.
Which has proved to be quite distant from reality. Instead:
But armed with Mathematica and NKS [A New Kind of Science] I realized there’s another way: explicitly implement methods and models, as algorithms, and explicitly curate all data so that it is immediately computable.
It’s not easy to do this. Every different kind of method and model—and data—has its own special features and character. But with a mixture of Mathematica and NKS automation, and a lot of human experts, I’m happy to say that we’ve gotten a very long way.
I’m still a SearchMonkey guy at heart, so I wonder how familiar Wolfram’s team is with existing Semantic Web research and practice–because at a high level this seems very much like RDF with suitable queries thereupon. If that’s a good characterization, that’s A Good Thing, since practical application has been one of SemWeb’s weak spots.
-m
Wednesday, January 7th, 2009
On porting WebPath to Python 3k
I’ve started looking into porting the WebPath code (and eventually XForms Validator) over to Python 3. The first step is external libraries, of which there is only one. WebPath uses the lex.py module from PLY. I had got it into my head that Python 2.x and 3.x were thoroughly incompatible, but leave it to the remarkable David Beazley to blow that assumption out of the water: the latest version of lex.py from SVN works in both 2.x and 3.x.
From there the included 2to3 tool was easy enough to run. (Relatively more difficult was getting 2.6 and 3.0 versions of Python frameworks installed on Mac, but even that wasn’t too bad.) The tool made some moderate changes, and I can run the unit tests, and a few even pass!
The primary remaining problem stems from code where the documentation is a little unclear, and my inexperience is severe. The part of the code in platonicweb.py that reads nasty, grotty HTML via Tidy and produces a clean DOM throws an exception every time. Seems to be a mismatch between String and Byte (encoded string) types, but manifested as a failed XML parse. Sans exception handling, the code looks like:
    page = urllib.request.urlopen(fullurl)
    markup = page.read()
    dom = xml.dom.minidom.parseString(markup)
urlopen() returns a file-like object, but the docs didn’t seem clear on whether it’s like a file opened in byte or string mode. In any case, I’m almost certainly doing it wrong. Suggestions?
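For what it's worth, in current Python 3 releases the object returned by `urllib.request.urlopen()` reads like a file opened in binary mode, so `read()` hands back `bytes`. The sketch below uses `io.BytesIO` as a stand-in for the response so it runs without a network connection; the sample markup and the UTF-8 encoding are assumptions. It is not a diagnosis of the exception above, which may just as well come from the Tidy step.

```python
import io
import xml.dom.minidom

# Stand-in for the response object; like urlopen()'s result, read() returns bytes.
page = io.BytesIO("<html><body><p>caf\u00e9</p></body></html>".encode("utf-8"))

markup = page.read()            # bytes, not str
text = markup.decode("utf-8")   # decode explicitly; the encoding is assumed here
dom = xml.dom.minidom.parseString(text)

# The non-ASCII character survives the round trip through bytes.
paragraph = dom.getElementsByTagName("p")[0].firstChild.data
```

With a real response, the charset would normally come from the HTTP headers or the document itself rather than being hard-coded.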
-m
Tuesday, December 30th, 2008
RDFa parser in XQuery now open source
After a delay, the code to my RDFa parser in XQuery is now available under an Apache license. Go get it. This is some of the earliest XQuery code I ever wrote, so go easy on me. It follows the earlier work on a functional definition of RDFa. And feel free to send in patches. -m
Monday, December 8th, 2008
XML 2008 non-liveblog: Content Authoring Schemas
I was on the panel with Bob DuCharme, Frank Miller, and Evan Lenz discussing content authoring, from DITA to DocBook with some WordML sprinkled in for good measure. It was a good discussion, nothing earth-shaking. This session was laptopless, so I don’t have any significant notes. -m
Friday, December 5th, 2008
Mystery Python Theatre 3K
Thursday, October 30th, 2008
XiX (XForms in XQuery)
I’m pondering implementing the computational parts of the XForms Model in XQuery. Doing so in a largely functional environment poses some challenges, though. Has anybody tackled this before? How about in any functional language, including ML, Haskell, Scheme, XSLT, or careful Python?
I borrowed the book Purely Functional Data Structures from a friend–this looks to be a good start. What else is out there? Comment below. -m
Monday, October 20th, 2008
Running jEdit on Mac Java 1.6
I haven’t seen this anywhere else: jEdit doesn’t start up under the recent Mac Java 1.6. It bounces in the dock a few times then goes away.
The solution: manually run the main jar with java -jar path-to/jedit.jar, which will work. Go to the plugin manager and delete the MacOSX plugin. Java integration is good enough in 1.6 that this really isn’t needed anyway. Quit jEdit and now it will start up fine the usual way. -m
Monday, October 13th, 2008
Software narratives: write better software by watching movies
Without any exception I can think of: every top-notch software developer I know is also a skilled technical writer. Technical writing requires skill in choosing words, constructing sentences and paragraphs, and putting together the pieces in the right order to most effectively present the material.
In contrast, narrative writing requires an eye towards the bigger picture, an overall story arc. To put it another way, beginnings, middles, and ends. Hollywood screenwriters have got this down to a science, dividing screenplays into three acts. Next time you visit the movies, look for the parts and how they connect.
Act I, comprising about 1/4 of the whole work, introduces the characters and situation. Between Act I and Act II a key event happens to propel the story forward. Neo swallows the pill. Luke Skywalker finds his Aunt and Uncle killed. In Act II, comprising about 1/2 of the story, the “real story” begins. Another key moment happens to introduce the final Act III, which culminates during the final 1/4 of the story. Three acts: beginning, middle, and end. Other aspects of fiction writing, say characterization, are relatively less important in technical narratives.
A great introduction to these concepts is Syd Field’s Screenplay, to give one a broader view on what story is really all about, and why some stories move people more than others. Many of the concepts apply equally to software narratives. And like I wrote about earlier, such narratives are a powerful (if underused) tool in software development. -m
Friday, October 10th, 2008
More mobile XForms goodness
I haven’t tried this, but these guys claim to have a solution where
The form definitions are saved and exchanged as XForms, and the data as XForm[s] models. The data can be exchanged over http (if the phone users can afford GPRS and have a data connection) or over compressed SMS messages.
Sounds like they have the right idea… -m
Wednesday, October 1st, 2008
Evernote: the final piece of “it’s the data, stupid” clicks into place
Evernote now has import/export (in an XML format), meaning it now passes the generation test for data availability and lock-in-avoidance, as I wrote about some years ago. There’s a server API, as well as client-side scripting. I need to look into the details more, but as a start it looks like a home run. -m
Update: looking at the actual export XML, I’m disappointed. Each note is CDATA-escaped XML? Why???
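Here is a small Python sketch of why CDATA-escaped notes are a nuisance: the inner markup arrives as a plain string, so a consumer has to parse twice. The sample document is a simplified, hypothetical stand-in, not the actual Evernote export format.

```python
import xml.etree.ElementTree as ET

# Hypothetical export: the note body is XML wrapped in a CDATA section.
export = """<export>
  <note>
    <title>Groceries</title>
    <content><![CDATA[<body><item>milk</item><item>eggs</item></body>]]></content>
  </note>
</export>"""

outer = ET.fromstring(export)
content = outer.find("note/content")

# After the first parse the note body is only text; no child elements are visible.
child_count = len(content)

# A second parse is needed to reach the structure inside the note.
inner = ET.fromstring(content.text)
items = [item.text for item in inner.findall("item")]
```

A single XPath or XQuery expression over the export can't reach the items directly, which is the disappointing part.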
Thursday, September 25th, 2008
The power of narrative in software development
I’m working on a piece of software that, while not the answer to world peace, is still pretty neat and approaches a specific problem in a fresh way. The project is at the stage where it needs to get unveiled to early adopters in the target audience. So how does one introduce possibly unfamiliar concepts in the form of a new API?
The approach we ended up using for the initial documentation is essentially a narrative–telling a story. Narrative fills the gap between use case and solution in an engaging way. People are naturally inclined to listen to stories, and to expect certain story structures, such as having a beginning, middle, and end with suitable transitions. Thus, if the listener senses a gap in the story, it’s easy for them to speak up. When the story works, people find it easier to map their personal story on to the narrative, leading to better absorption of new concepts, and a more positive impression of the software.
And it’s working. So far we’ve gotten far more useful feedback than we would have otherwise. Even before showing others, the exercise of writing the narrative has exposed gaps and flaws in our thinking, leading to a better, more cohesive design.
If you think back about how you learned about, say, object oriented programming, or event-driven programming, likely there was a story or detailed use case involved that helped you get on board with a new way of thinking. Software + story: It’s a powerful combination, I recommend it.
BTW, my team is hiring full-time positions. Especially if you’ve got XML skills, you could be part of this team. Send me email if interested. -m
Wednesday, September 17th, 2008
The case for native higher-order functions in XQuery
The XQuery Working Group is debating the need for higher-order functions in the language. I’m working on honing my description of why this is an important feature. Does this work? What would work better?
Imagine you are writing a smallish widget app, in an environment without a standard library. When you need to sort your widgets, you’d write a simple function with a signature like sort(sequence-of-widgets). That’s great.
Now imagine you find your app to be steadily growing. An accumulation of smaller one-off solutions won’t work anymore; you need a general solution. What you’ll end up with is something like qsort in C, which takes a pointer to a comparator function. By providing different comparators, you can sort anything any way you like, all through only a single sort function. C and C++ have something like this, as do PHP, Python, Java, JavaScript, and even assembly language. XSLT has it, as proven by Dimitre.
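As a rough illustration of the qsort-style pattern, here it is in Python. The widget data and the names `sort_with`, `by_weight`, and `by_name_desc` are made up for the example.

```python
from functools import cmp_to_key


def by_weight(a, b):
    """Comparator in the style of C's qsort: negative, zero, or positive."""
    return a["weight"] - b["weight"]


def by_name_desc(a, b):
    # Positive when a's name sorts before b's, so names come out in reverse order.
    return (a["name"] < b["name"]) - (a["name"] > b["name"])


def sort_with(items, comparator):
    # One general sort; the ordering is supplied by the caller as a function.
    return sorted(items, key=cmp_to_key(comparator))


widgets = [
    {"name": "sprocket", "weight": 5},
    {"name": "cog", "weight": 3},
    {"name": "gear", "weight": 7},
]

lightest_first = [w["name"] for w in sort_with(widgets, by_weight)]
reverse_alpha = [w["name"] for w in sort_with(widgets, by_name_desc)]
```

The sort function never changes; only the function passed in does. That is the capability the higher-order functions proposal is about.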
XQuery doesn’t. It should, because people are now using it for more than short queries. People are writing programs in it. -m
P. S. Comment please.
Tuesday, July 15th, 2008
Top Down Operator Precedence in Python
This article made my day. Very similar approach to what I did in WebPath, but even cleaner. Great explanation and performance numbers. -m
P.S. Thanks to Crock for pointing this out.
Thursday, May 29th, 2008
XRX
Bumped into XRX today. XForms + REST + XQuery. I like the sound of this, and XForms on the client just got a whole bunch easier…
I’m seeing multiple signs that the confluence of XForms and XQuery has legs. (And REST just plain makes sense in any situation). -m
Wednesday, May 28th, 2008
XForms Validator on Google App Engine?
I registered ‘xfv’ on Google App Engine. Too bad there don’t appear to be any significant XML libraries supported. I have XPath covered by my pure-python WebPath, but what about Relax NG? Anyone know of anything in pure python? -m
Wednesday, April 30th, 2008
Quote of the day
“Rails is a lot of fun, and lets me do cool new things – but it’s hard to eat it.”
-m
Monday, April 28th, 2008
SearchMonkey in private beta
I haven’t mentioned it yet, but SearchMonkey (now an official name, not just a project name) is in external limited beta. Keep an eye on ysearchblog, lots more technical content is on the way. -m
Monday, March 3rd, 2008
WebPath and Wikipedia
The WebPath bug reports continue to roll in. For one, queries against *.wikipedia.* don’t seem to work. You get something back, but it has no resemblance to the page you were looking for. The problem comes from the W3C tidy service that I use, specifically that the (understandably overworked and understaffed) admins at the Wikimedia Foundation seem to have blocked it. It seems like more than a simple IP or user-agent-based block. I’ve emailed them about it but haven’t heard back yet.
So, this highlights the limitation of having a single-source converter in the Platonic Web module of WebPath. So I turn to my readers: do you know of any other tidy servers? Or converters of a non-tidy origin? For any of these to work, they need to return clean XML corresponding to the original page (as opposed to, say, returning something with big headers/footers or ampersand-encoded). This seems like an outstanding need for the open source community.
Please comment below with ideas. Thanks! -m
UPDATE: heard back from the Wikipedia admins, and although professional and helpful-as-can-be-expected, they won’t be changing anything on their end. Still looking for more open source options.
Wednesday, February 13th, 2008
WebPath on next.yahoo
It’s been an exhausting past couple of weeks, but life goes on. WebPath made front page at next.yahoo. I’m starting to get feedback from developers who are actually using it, filing bugs, suggesting features, and it’s gratifying. The community is still building up. Won’t you join too? -m
Thursday, January 24th, 2008
WebPath wants to be free (BSD licensed, specifically)

WebPath, my experimental XPath 2.0 engine in Python, is now an open source project with a liberal BSD license. I originally developed this during a Yahoo! Hack Day, and now I get to announce it during another Hack Day. Seems appropriate.
The focus of WebPath was rapid development and providing an experimental platform. There remains tons of potential work left to do on it…watch this space for continued discussion. I’d like to call out special thanks to the Yahoo! management for supporting me on this, and to Douglas Crockford for turning me on to Top Down Operator Precedence parsers. Have a look at the code. You might be pleasantly surprised at how small and simple a basic XPath 2 engine can be. So, who’s up for some XPath hacking?
Code download. (Coming to SourceForge with CVS, etc., in however many days it takes them to approve a new project) I hope this inspires more developers to work on similar projects, or better yet, on this one! -m
Monday, January 7th, 2008
Yahoo! introduces mobile XForms
Admittedly, their marketing folks wouldn’t describe it that way, but essentially that’s what was announced today. (documentation in PDF format, closely related to what-used-to-be Konfabulator tech; here’s the interesting part in HTML) The press release talks about reaching “billions” of mobile consumers; even if you don’t put too much emphasis on press releases (you shouldn’t) it’s still talking about serious use of and commitment to XForms technology.
Shameless plug: Isn’t it time to refresh your memory, or even find out for the first time about XForms? There is this excellent book available in printed format from Amazon, as well as online for free under an open content license. If you guys express enough interest, good things might even happen, like a refresh to the content. Let’s make it happen.
From a consumer standpoint, this feels like a welcome play against Android, too. Yahoo! looks like it’s placing a bet on working with more devices while making development easier at the same time. I’ll bet an Android port will be available, at least in beta, before the end of the year.
Disclaimer: I have been out of Yahoo! mobile for several months now, and can’t claim any credit for or inside knowledge of these developments. -m
P. S. Don’t forget the book.
Monday, December 24th, 2007
OLPC is here
I’m taking some time off from work to relax a bit. And just in time for that, my OLPC arrived. Check out the photoset on Flickr. It’s an impressive little machine, and I’m very happy to have got this instead of a Kindle. :)
-m
Friday, December 21st, 2007
XML 2007 buzz: XForms 1.1
One whole evening of the program was devoted to XForms, focused around the new 1.1 Candidate Recommendation. I admit that some of the early 1.1 drafts gave me pause, but these guys did a good job cleaning up some of the dim corners and adding the right features in the right places. This is worth a careful look. -m
Sunday, December 16th, 2007
Slides from XML 2007: WebPath: Querying the Web as XML
Here are the slides from my presentation at XML 2007, dealing with an implementation of XPath 2.0 in Python. I hope to have even more news in this area soon.
WebPath (html)
WebPath (OpenDocument, 4.7 megs)
Did you notice that OpenOffice has a nice slide export that generates both graphically accurate slides and highly indexable and accessible text versions? -m
Thursday, November 29th, 2007
XPath 2.0 implementation details
Well, my plans for a series of postings about details of implementing XPath 2.0 fell rather short, so let’s skip straight to the good stuff.
An article by Mike Kay giving the details of the Saxon architecture. On the surface it’s about performance, but it also has an excellent section on internals. Worth a look. This has been quite influential for me, and maybe you too. -m
Monday, November 5th, 2007
A better name for CURIEs (?)
“Compact Clark Notation”. (Inspired by reading this) -m