Archive for the 'software' Category

Friday, January 26th, 2007

UBL Swinger

An easy to use UBL Editor. Has anyone tried it? -m

Wednesday, January 24th, 2007

Histogram of top 10 words used in the 2007 State of the Union address:

I’ve always had a thing for text analysis.

  • the 352
  • and 250
  • to 225
  • of 188
  • in 118
  • a 108
  • we 100
  • is 76
  • our 75
  • that 72
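
For anyone who wants to try this at home, here is a minimal Python sketch of a top-N word count. The tokenizing rule (lowercase everything, split on anything that isn't a letter or a straight apostrophe) and the sample sentence are my own assumptions, so counts from a real transcript may not match the list above exactly.

```python
import re
from collections import Counter


def top_words(text, n=10):
    """Return the n most frequent words in text as (word, count) pairs."""
    # Lowercase so "The" and "the" are counted together.
    # Assumption: a "word" is a run of ASCII letters and straight apostrophes.
    words = re.findall(r"[a-z']+", text.lower())
    return Counter(words).most_common(n)


# Made-up sample text, not taken from the actual address.
sample = "the state of the union is strong and the work of the nation goes on"
for word, count in top_words(sample, 3):
    print(word, count)
```

Feed it the full text of a speech instead of `sample` and the default `n=10` gives a top-10 list.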

Source. -m

Tuesday, January 23rd, 2007

Does XPath 2.0 exist outside of Java?

So, about a year ago, I wanted to use XPath 2.0 on a project. Turns out no non-toy, non-alpha versions existed except in Java land (where Saxon is quite good). Has the situation changed at all? Anything on the horizon? Libxml2? Anybody?? -m

Tuesday, January 23rd, 2007

My .02 on Wikipedia and nofollow

The nofollow setting on an outbound link should be a user-editable option, subject to the same community process as all other content on Wikipedia. (Site guidelines, dispute resolution, restricted editing on certain articles for unregistered users, etc.) By default, links would get nofollow, but over time, they could be ‘blessed’, perhaps after a certain amount of time or human review. Wasn’t this how nofollow was supposed to work in the first place?
The community process works. Why maneuver around it? -m

Monday, January 8th, 2007

Yahoo! + Opera = Crazy Delicious

(Press release) Starting today, Y! is the exclusive search partner for Opera Mini across more than 100 countries. The release also names “oneSearch”, going live later in Q1–definitely something to keep an eye on. -m

Sunday, November 26th, 2006

Micah visiting UC Berkeley

This Wednesday, I’m visiting Berkeley to speak with visiting professor Erik Wilde and his School of Information students. It’s an open-ended discussion, but will almost certainly center on XForms, the intentional web, and related information flow technologies. If you’re in Berkeley this Wednesday, drop me a line. -m

Sunday, October 1st, 2006

Yahoo! + SoftBank: watch this space

Today SoftBank Mobile launched a new mobile service, delivering tons of Yahoo! Japan content, powered by Yahoo! US technology, to SoftBank Mobile phones. This is notable for a few reasons:

  • In the past, content of this caliber has been inside paid walled gardens in Japan. Opening this up could be the tipping point for a shake-up in one of the most amazing mobile markets.
  • This is the first time a carrier has worked so closely with a content provider. If this works out (and leading signs are very good), it could be a model for the rest of the world.
  • I’ve seen some of the new hardware from SoftBank Mobile. The phones are great and–through tight Y! integration–go a long way toward solving longstanding UI problems related to the mobile web.
  • Number portability is coming to Japan, I believe beginning on October 24. Once this gets momentum, user bases could shift rapidly. Today is the ideal time to be playing a strong card.
  • Apple rumors continue to swirl around SoftBank. I’m giddy at the thought of iPods accessing the web through my code. :-)

So, watch this space. More good things are coming. -m

Tuesday, September 19th, 2006

Shiver me timbers! Python 2.5 be here

Link. -m

Monday, September 11th, 2006

Turning Point

For the first time today, I momentarily wished that jEdit had a particular Emacs key binding, not the other way around. -m

Wednesday, September 6th, 2006

Concentré XML Tools

I’ve written before about the xslt2xforms project by Sébastien Cramatte. The project is not only still alive, but expanded into an entire utility kit including a PHP5 framework and forming “a complete xforms/xml toolbox based only on w3c standards”. Check it out on sourceforge. -m

Wednesday, September 6th, 2006

Disk Usage, Python, and SVG (oh my)

Check out this script. -m

Friday, September 1st, 2006

Should hospitals censor internet access?

Most of the censorship stories you hear on the news involve public libraries, but right now I’m writing this from a hospital, which has free wi-fi. Someone providing a service like this has latitude to do pretty much as they please, including censorship, but is it a good idea?

The system here evidently consists of a monitor observing every HTTP access, either forwarding it on or bouncing it to another server, one that seems to be down. That second server, referred to only by numeric IP, has yet to actually respond, so trying to load any page on a blocked site requires a lengthy timeout of about two minutes before landing on a browser error page with a URL something like this:

http://10.226.37.60:9014/actionpage?basictype=block&epochseconds=1157135546&
requestedurl=http%3A%2F%2Fbriefcase.yahoo.com%2F&categorylist=170&
categorydescriptionlist=Personal%20Network%20Storage&useripaddress=172.26.0.95&
username=&actiontaken=block&actionreason=by-category&actionreasondata=170&
replayhash=oBvk1MZaKDcrs6zo2FyPDg%3D%3D
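
The block URL is more readable once its query string is decoded. Here is a rough Python sketch using only the standard library; the function name `decode_block_url` is my own, and the field meanings are guessed from their names, since I have no documentation for the filter product.

```python
from datetime import datetime, timezone
from urllib.parse import parse_qs, urlsplit

# The URL from the error page, with the line breaks removed.
block_url = (
    "http://10.226.37.60:9014/actionpage?basictype=block&epochseconds=1157135546&"
    "requestedurl=http%3A%2F%2Fbriefcase.yahoo.com%2F&categorylist=170&"
    "categorydescriptionlist=Personal%20Network%20Storage&useripaddress=172.26.0.95&"
    "username=&actiontaken=block&actionreason=by-category&actionreasondata=170&"
    "replayhash=oBvk1MZaKDcrs6zo2FyPDg%3D%3D"
)


def decode_block_url(url):
    """Return the query parameters of a block-page URL as a plain dict."""
    query = urlsplit(url).query
    # keep_blank_values=True preserves the empty "username" field.
    params = parse_qs(query, keep_blank_values=True)
    # Each parameter appears once here, so keep only the first value.
    return {name: values[0] for name, values in params.items()}


info = decode_block_url(block_url)
print(info["requestedurl"])              # http://briefcase.yahoo.com/
print(info["categorydescriptionlist"])   # Personal Network Storage

# Assumption: "epochseconds" is a Unix timestamp for the blocked request.
blocked_at = datetime.fromtimestamp(int(info["epochseconds"]), tz=timezone.utc)
print(blocked_at.isoformat())
```

So the filter hands the browser the requested URL, the category that triggered the block, and the client's internal IP address, all in the clear.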

Let’s take a look at what kind of sites this inane system prevents hospital visitors from viewing directly:

  • flickr.com (“Personal Pages”) — because honestly, who in a maternity ward would ever need to upload pictures of something?
  • 360.yahoo.com (“Dating&Personal”) — because who in a maternity ward would consider posting to a blog?
  • my.yahoo.com (“Portal Site”) — because who, away from home for a few days, might want to check up on news of the world around them?
  • thinkbabynames.com (“Personal Pages”) — thankfully, this dangerous and immoral content too has been shielded from the eyes of maternity ward visitors.

At some point, somebody must have pointed out a flaw in their system–that any named site can also be viewed through a numeric IP. Instead of actually thinking about the problem, they also banned all numeric IPs, even for sites that would otherwise work.
The upside to retarded filtering is that it’s easy to get around. Techniques that work here include using a search engine cached page, Coral Cache (.nyud.net:8080), SSH tunneling, VPN, and adding a new entry to hosts to access the same site under a different name. The access is so slow, however (hmm… in a way another form of censorship) that the strain of the additional measures often leads to timeouts and various other errors.
Fortunately, the filtermasters haven’t caught on to dubinko.info yet, thus allowing this post to appear. I hear that site is pretty subversive.

What’s the net?

  1. It’s obvious their list of sites to filter is woefully generic, not at all adjusted to the environment in which people will be actually using the system. And still, I’d wager they’re paying someone fistfuls of cash to keep updating the generic list.
  2. I can imagine there are a few sites on the internets that wouldn’t be appropriate in this environment.  The majority of well-adjusted adults are perfectly capable of choosing not to visit those sites.
  3. In cases where supervision is needed, it is effective on a one-on-one basis, often parent-to-child. Witness how many ways there are to easily bypass the filters: software, particularly bad software, isn’t clever enough to replace human judgement.
  4. Yay for the mobile web, which allowed me to upload my pictures anyway.

-m

Friday, August 25th, 2006

Spam update

I dug into my mail configuration a bit more and made a few changes. In the past, I had been lazy, so when I needed new email addresses like webmaster at xformsinstitute.com and contact at xformsinstitute.com, I just set up a catch-all. I knew catch-alls would collect lots of spam, but I didn’t know (until now) that the particular skew of the spam would be such that it tends to get around the filters.
So all the catch-alls are turned off. I set up explicit forwards for used email addresses, and I think I got them all, but if you get a bounce from any email address on any of my sites, let me know. After another 24 hours, I had:

  • 38 spam incorrectly delivered to inbox (manually marked as spam)
  • 120 messages automatically delivered to spam folder
  • 1 of the above incorrectly (manually marked as not-spam)
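
To put rates on these counts, here is a small Python sketch that runs the same arithmetic on this experiment and on the one from the August 24 post. It assumes the "incorrectly" line counts legitimate messages that landed in the spam folder; the function name `filter_stats` is just for illustration.

```python
def filter_stats(missed, auto_filed, false_positives):
    """Return (recall, precision) for a 24-hour spam filter run.

    missed          -- spam that reached the inbox
    auto_filed      -- messages the filter put in the spam folder
    false_positives -- legitimate messages among auto_filed (assumed)
    """
    true_positives = auto_filed - false_positives
    total_spam = true_positives + missed
    recall = true_positives / total_spam      # share of all spam that was caught
    precision = true_positives / auto_filed   # share of the spam folder that was spam
    return recall, precision


runs = [
    ("Aug 24, with catch-alls", (96, 257, 3)),
    ("Aug 25, explicit forwards", (38, 120, 1)),
]
for label, counts in runs:
    recall, precision = filter_stats(*counts)
    print(f"{label}: caught {recall:.1%} of spam, spam folder {precision:.1%} accurate")
```

By this measure the catch rate moves from about 72.6% to about 75.8%, so most of the gain is in sheer volume: 38 messages to sift through instead of 96.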

A significant improvement. I wonder if it’s worth resetting the training data from scratch at this point? -m

Thursday, August 24th, 2006

Anyone know why Thunderbird Spam filtering is working poorly?

Yes, I’ve been painstakingly training positive and negative cases for weeks. This is a standard TBird setup on IMAP with the adaptive filter enabled. Here are the results from a 24-hour experiment:

  • 96 spam incorrectly delivered to inbox (manually marked as spam)
  • 257 messages automatically delivered to spam folder
  • 3 of the above incorrectly (manually marked as not-spam)

Is this typical performance, or has something gone bad? Sifting through ~100 spammy messages a day is bad; losing 3 important things a day is worse. -m

Wednesday, August 9th, 2006

My new favorite quote

How hard could this be? A six month project if three engineers are doing it in a garage. Five years if you put one hundred programmers on it.

Guy Kawasaki

-m

Tuesday, August 8th, 2006

Python @ Yahoo!

This is excellent: a Python Developer Center at Yahoo!. -m

Saturday, August 5th, 2006

Django 0.95

Django has always seemed to resonate with me more than Rails. I’ve loved it ever since Simon Willison pointed me to it. A new version is out, and Bill de Hóra has a superb writeup. -m

Thursday, August 3rd, 2006

Thunderbird thinks this message might be an email scam.

Hmm, this seems like a new feature, auto-installed after my last mail client restart. Unfortunately, there’s no “what’s this?” link for further information.

I find it interesting that the scam message wasn’t also labeled as “Junk”. Also, for some reason, the word ‘scam’ feels unexpectedly slangy in this setting. Great feature, I just wish it was a little more transparent. -m

Thursday, July 13th, 2006

Virtual PC is free

According to the authoritative site. Looks like the virtualization market is getting interesting. -m

Saturday, June 24th, 2006

Linking from HTTP headers

From mnot: the return of the Link: headers, last seen in RFC 2068, and a new header, Link-Template, which has me salivating over the possibilities.

I wonder, will this lead to better libraries for dealing with HTTP headers? Or at least better developer understanding of the benefits of not just taking whatever Apache or Tomcat or whatever yields by default?  -m

Sunday, June 18th, 2006

Help me make Emacs Not Painful

I spend a Pareto portion of my work day in three applications: jEdit, Firefox, and a terminal.

I hang around Emacs (and VI)-loving folks all day. Emacs. jEdit. Emacs. jEdit. The tension is palpable. :)

Maybe their influence is starting to rub off on me. Here’s what I want: Dear readers, can you share tips in the comments for achieving any of these in Emacs?

  • I keep about 20 files open at a time, in multiple “sessions”. With one dropdown in jEdit, I can switch to a different 20 files in a different session, all open and ready for editing. When I start the editor, I don’t need to individually open files.
  • I use a plugin to show a bunch of tiny tabs at the bottom, so I can see what’s open at a glance.
  • Text selection with shift+arrow keys, and copy and paste with Ctrl+C and Ctrl+V. PgUp and PgDn working. (Just like my web browser)
  • Ctrl+W to close a tab or workspace. Ctrl+T to open a new tab. (Just like my web browser)
  • Ctrl+S to save (Just like my…you get the picture)
  • I’m not a heavy mouse user, but when I do use a mouse, I should at least be able to select text with it.
  • Line numbers showing on each line.
  • Nice fonts (no small feat on BSD).
  • Here’s the kicker: I want to attach in from a remote computer (on a different OS) and have the same experience, same files already open, and so on. Here, jEdit isn’t helping (unless I go VNC, but that’s a big hammer…)

I’ve talked about this before, though my environment now is a little different. (For one, I am now making basic use of GNU Screen for my terminal sessions.) Basically, I want an editor that works like all the other software I use all day, instead of making me remember an entirely different set of key bindings. Every extra bit of my limited wetware storage claimed by my tools detracts from the stuff I really need to be thinking about. Comments? -m

Friday, June 16th, 2006

The last monolithic OS

A while back, documenting my Windows XP SP 2 horror story, I mused about when Microsoft would have to throw out the code base and start fresh. Now, I see this, with additional commentary from Rick Jelliffe. Hmm. -m

Tuesday, June 6th, 2006

InfoPath and “Design Once”

New features in InfoPath 2007 make me smile:

  • Design once to work on browser and client
  • Object model the same across client and server

Both things I worked on extensively for Cardiff LiquidOffice in 2003-2004. ‘Cept we had design once and write out to DHTML, PDF, or InfoPath. :) -m

Wednesday, May 17th, 2006

InfoPath going mobile?

Seen on Bill Trippe’s blog.

Gray Knowlton, who identified himself as a Senior Product Manager for InfoPath 2007, said the next version of SharePoint will “include InfoPath Forms Services, which will render InfoPath forms to browsers and html-enabled mobile devices, and this will not require InfoPath on the form fillers’ desktop, nor will it require any advance download on the part of the person completing the form.”

This is, as far as I know, breaking news. Nice work, Bill!

Now, the big question is, how well will it work outside of IE? -m