Archive for the 'python' Category
Wednesday, January 7th, 2009
I’ve started looking into porting the WebPath code (and eventually XForms Validator) over to Python 3. The first step is external libraries, of which there is only one. WebPath uses the lex.py module from PLY. I had got it into my head that Python 2.x and 3.x were thoroughly incompatible, but leave it to the remarkable David Beazley to blow that assumption out of the water: the latest version of lex.py from SVN works in both 2.x and 3.x.
From there the included 2to3 tool was easy enough to run. (Relatively more difficult was getting 2.6 and 3.0 versions of Python frameworks installed on Mac, but even that wasn’t too bad.) The tool made some moderate changes, and I can run the unit tests, and a few even pass!
The primary remaining problem stems from code where the documentation is a little unclear, and my inexperience is severe. The part of the code in platonicweb.py that reads nasty, grotty HTML via Tidy and produces a clean DOM throws an exception every time. Seems to be a mismatch between String and Byte (encoded string) types, but manifested as a failed XML parse. Sans exception handling, the code looks like:
import urllib.request
import xml.dom.minidom

page = urllib.request.urlopen(fullurl)
markup = page.read()
dom = xml.dom.minidom.parseString(markup)
urlopen() returns a file-like object, but the docs didn’t seem clear on whether it’s like a file opened in byte or string mode. In any case, I’m almost certainly doing it wrong. Suggestions?
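For what it's worth, here is a minimal sketch of the bytes-versus-string question. It assumes current Python 3 behavior (not necessarily what 3.0 did), and it uses hard-coded bytes as a stand-in for page.read() so nothing touches the network. The helper name parse_markup is made up for illustration. In Python 3, read() on the object returned by urlopen() gives bytes, and parseString() accepts bytes directly, so a failed parse may point to markup that isn't well-formed XML rather than to the type itself.

```python
import xml.dom.minidom
from xml.parsers.expat import ExpatError


def parse_markup(markup):
    """Hypothetical helper: parse bytes or str, returning None on ill-formed XML."""
    try:
        return xml.dom.minidom.parseString(markup)
    except ExpatError:
        return None


# Stand-in for page.read(): UTF-8 encoded bytes, not str.
well_formed = "<p>caf\u00e9</p>".encode("utf-8")
# Typical tag soup: <br> is never closed, so this is not well-formed XML.
tag_soup = b"<p>line one<br>line two</p>"

dom = parse_markup(well_formed)
text = dom.documentElement.firstChild.data  # the parser hands back str
broken = parse_markup(tag_soup)             # None: the parse failed
```

If this matches what is going on, the thing to check would be whether the Tidy step really emits well-formed XML before the markup reaches minidom.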
-m
Filed under languages, python
Friday, December 5th, 2008
The long-awaited Python 3.0 is out. It fixes almost every annoyance I have with the language, particularly around Unicode handling, which is important in the kinds of projects I work on.
Now, to revisit some of my Open Source projects… -m
Filed under python
Wednesday, September 17th, 2008
The XQuery Working Group is debating the need for higher-order functions in the language. I’m working on honing my description of why this is an important feature. Does this work? What would work better?
Imagine you are writing a smallish widget app, in an environment without a standard library. When you need to sort your widgets, you’d write a simple function with a signature like sort(sequence-of-widgets). That’s great.
Now imagine you find your app to be steadily growing. An accumulation of smaller one-off solutions won’t work anymore; you need a general solution. What you’ll end up with is something like qsort in C, which takes a pointer to a comparator function. By providing different comparators, you can sort anything any way you like, all through a single sort function. C and C++ have something like this, as do PHP, Python, Java, JavaScript, and even assembly language. XSLT has it, as proven by Dimitre.
XQuery doesn’t. It should, because people are now using it for more than short queries. People are writing programs in it. -m
P. S. Comment please.
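To make the argument concrete, here is a rough Python sketch of the single general sort. The Widget type, its fields, and the function names are invented for illustration; sorted(), its key parameter, and functools.cmp_to_key are standard Python. Python's usual idiom is a key function rather than a qsort-style comparator, so the sketch shows both.

```python
from collections import namedtuple
from functools import cmp_to_key

# Hypothetical widget type, invented for this example.
Widget = namedtuple("Widget", ["name", "weight", "price"])

widgets = [
    Widget("sprocket", 3.0, 9.99),
    Widget("gear", 1.5, 4.50),
    Widget("cog", 2.0, 12.00),
]


def sort_widgets(items, key_function):
    """One general sort; the caller supplies the ordering as a function."""
    return sorted(items, key=key_function)


def compare_price_descending(a, b):
    """qsort-style comparator: negative if a sorts first, positive if b does."""
    return (b.price > a.price) - (b.price < a.price)


by_weight = sort_widgets(widgets, lambda w: w.weight)
by_name = sort_widgets(widgets, lambda w: w.name)
by_price_desc = sorted(widgets, key=cmp_to_key(compare_price_descending))
```

The sort routine never changes; only the function handed to it does. That is the capability a language without higher-order functions has to work around.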
Filed under languages, python, standards, XQuery
Tuesday, July 15th, 2008
This article made my day. Very similar approach to what I did in WebPath, but even cleaner. Great explanation and performance numbers. -m
P.S. Thanks to Crock for pointing this out.
Filed under python, xpath
Wednesday, May 28th, 2008
I registered ‘xfv’ on Google App Engine. Too bad there don’t appear to be any significant XML libraries supported. I have XPath covered by my pure-Python WebPath, but what about Relax NG? Anyone know of anything in pure Python? -m
Filed under announcement, google, languages, python, XForms, xpath
Wednesday, February 13th, 2008
It’s been an exhausting past couple of weeks, but life goes on. WebPath made front page at next.yahoo. I’m starting to get feedback from developers who are actually using it, filing bugs, suggesting features, and it’s gratifying. The community is still building up. Won’t you join too? -m
Filed under announcement, python, xpath, yahoo
Thursday, January 24th, 2008

WebPath, my experimental XPath 2.0 engine in Python, is now an open source project with a liberal BSD license. I originally developed this during a Yahoo! Hack Day, and now I get to announce it during another Hack Day. Seems appropriate.
The focus of WebPath was rapid development and providing an experimental platform. There remain tons of potential work left to do on it… watch this space for continued discussion. I’d like to call out special thanks to the Yahoo! management for supporting me on this, and to Douglas Crockford for turning me on to Top Down Operator Precedence parsers. Have a look at the code. You might be pleasantly surprised at how small and simple a basic XPath 2 engine can be. So, who’s up for some XPath hacking?
Code download. (Coming to SourceForge with CVS, etc., in however many days it takes them to approve a new project.) I hope this inspires more developers to work on similar projects, or better yet, on this one! -m
Filed under announcement, intentional web, python, xml, xpath, yahoo
Monday, December 24th, 2007
I’m taking some time off from work to relax a bit. And just in time for that, my OLPC arrived. Check out the photoset on Flickr. It’s an impressive little machine, and I’m very happy to have got this instead of a Kindle. :)
-m
Filed under commercialism, hardware, python
Sunday, December 16th, 2007
Here are the slides from my presentation at XML 2007, dealing with an implementation of XPath 2.0 in Python. I hope to have even more news in this area soon.
WebPath (html)
WebPath (OpenDocument, 4.7 megs)
Did you notice that OpenOffice has a nice slide export that generates both graphically accurate slides and highly indexable and accessible text versions? -m
Filed under announcement, intentional web, python, software, xml, xpath
Friday, September 21st, 2007
Watch this space for details. I’ll be speaking about something related to Python and XPath 2.0. Watch this blog for tidbits on the subject. :) -m
Filed under announcement, python, software, standards, stuff, xml
Monday, July 23rd, 2007
Based on Doug Crockford’s chapter in Beautiful Code, I wanted to take a crack at implementing Top Down Operator Precedence in Python. After all, Python and JavaScript are quite similar, right?
Not really. As you can imagine, Doug’s code makes great use of JavaScript’s strengths, in this case the ability to assign new methods to any object. For an initial version, I wanted to make the Python version behave the same way, as opposed to a deeper redesign that would be more pythonic. (That would come later.)
My initial approach was a __getattr__ method that consisted simply of return getattr(self.prototype, name). When reattaching a new method to an instance, I needed an extra wrapper, done through a wrap method which consisted of return new.instancemethod(method, self, self.__class__). It would be used like this: obj.method = obj.wrap(some_func).
This caused a subtle problem that took me a while to track down. In JavaScript, any function can reference the built-in this variable, which works whether the function is bound to some specific object or not. (Even global functions are bound to the global object.) But Python doesn’t have such a keyword. The language prefers the explicit, and uses an explicitly passed parameter, called by convention self. The call to wrap a specific function also had the effect of binding the self parameter to that particular object, even if it later became a prototype for some other object. This manifested itself as all kinds of broken behavior. For example, the original code has a global scope object, and every time a new scope was entered, the global pointed to a newer object that kept a reference to the rest of the scope chain. But in the object’s methods, self pointed to something different than the global. Messy.
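Here is a small reconstruction of that pitfall. It is a sketch pieced together from the description above, not the original code: the Proto class, the label attribute, and the describe function are invented for illustration, and types.MethodType stands in for new.instancemethod, which no longer exists in Python 3.

```python
import types


class Proto:
    """Object that delegates missing attributes to a prototype (reconstruction)."""

    def __init__(self, prototype=None, label=""):
        self.prototype = prototype
        self.label = label

    def __getattr__(self, name):
        # Only called when normal lookup fails; delegate to the prototype.
        prototype = self.__dict__.get("prototype")
        if prototype is None:
            raise AttributeError(name)
        return getattr(prototype, name)

    def wrap(self, function):
        # Binds the function's self to *this* object, permanently.
        return types.MethodType(function, self)


def describe(self):
    return self.label


parent = Proto(label="parent")
parent.describe = parent.wrap(describe)

child = Proto(prototype=parent, label="child")
result = child.describe()  # "parent", not "child": self is still the prototype
```

The lookup through child succeeds, but what comes back is a method already bound to parent, so self inside describe never refers to child.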
Before I get into solutions, I’d like to see what readers say. How would you go about implementing prototypal inheritance in Python? And what is a more pythonic way to accomplish the same thing? Comment below. Thanks! -m
Filed under python, software, stuff
Wednesday, January 24th, 2007
I’ve always had a thing for text analysis.
- the 352
- and 250
- to 225
- of 188
- in 118
- a 108
- we 100
- is 76
- our 75
- that 72
Source. -m
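A count like this takes only a few lines of Python. The sketch below is not the script used for the numbers above; the sample sentence and the top_words name are placeholders, and the tokenizing rule (lowercase runs of letters and apostrophes) is just one reasonable assumption.

```python
import re
from collections import Counter


def top_words(text, n=10):
    """Return the n most frequent words as (word, count) pairs."""
    words = re.findall(r"[a-z']+", text.lower())
    return Counter(words).most_common(n)


# Placeholder text; substitute the document you want to analyze.
sample = "The cat and the dog and the bird saw the fish"
counts = top_words(sample, 2)
```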
Filed under languages, python, stuff
Tuesday, September 19th, 2006
Filed under python
Wednesday, September 6th, 2006
Check out this script. -m
Filed under python, software, standards