Lightning Search, a Web Application
Take one part lightning-fast full-text search, one part XML, and one
part rich client. Mix thoroughly... It's strange to have a search form
with no submit button, but wonderful.
If you've ever spent any time on Orkut or Gmail, you'll have noticed that
Google has perfected the art of eliminating needless round-trip
refreshes.
The indexing is provided by David Mertz's gnosis code, as seen in
Text Processing in Python. It's plenty fast on a TiBook, but
searches on complete words only. It might be useful to have a stemming
search, though that would probably require tries or some other data
structure in order to be fast enough.
Link: http://gnosis.cx/TPiP/ .
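To make the whole-word limitation concrete, here is a minimal sketch of an inverted index. It is not the gnosis indexer; every name in it (build_index, search_whole_words, search_prefix) is made up for illustration. The prefix lookup over a sorted vocabulary is only a crude stand-in for stemming, and it uses a sorted list with bisect rather than a trie.

```python
import bisect
import re
from collections import defaultdict

def build_index(docs):
    """Map each lowercased whole word to the set of document ids containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for word in re.findall(r"\w+", text.lower()):
            index[word].add(doc_id)
    return index

def search_whole_words(index, query):
    """Return ids of documents that contain every whole word in the query."""
    words = re.findall(r"\w+", query.lower())
    if not words:
        return set()
    return set.intersection(*[index.get(w, set()) for w in words])

def search_prefix(index, vocabulary, prefix):
    """Return ids of documents containing any word that starts with prefix.

    vocabulary must be the sorted list of the index's words.
    """
    prefix = prefix.lower()
    hits = set()
    # Every word sharing the prefix sits in one contiguous run of the sorted list.
    for word in vocabulary[bisect.bisect_left(vocabulary, prefix):]:
        if not word.startswith(prefix):
            break
        hits |= index[word]
    return hits

# Hypothetical sample data, not taken from the real site.
docs = {
    1: "XForms is part of XHTML",
    2: "Searching forms with Python",
    3: "A search form with no submit button",
}
index = build_index(docs)
vocabulary = sorted(index)
```

With this sample data, a whole-word query for "search" misses the document that only says "Searching", while search_prefix finds both.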
The XML is provided by a few dozen lines of Python, which implement an
HTTP server, including GET access to the data, and an interface to the
indexing/searching code.
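A rough sketch of what such a server could look like, using only the Python standard library. This is not the actual code behind the site: the /search path, the q parameter, the XML element names, and the toy ENTRIES data are all assumptions made for the example.

```python
import re
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.parse import urlparse, parse_qs
from xml.sax.saxutils import escape, quoteattr

# Hypothetical stand-in for the real index: entry id -> entry text.
ENTRIES = {
    "lightning-search": "Lightning Search, a web application",
    "itunes-gigo": "iTunes GIGO: duplicate artists",
}

def find_entries(query):
    """Toy search: ids of entries containing every query word as a whole word."""
    words = re.findall(r"\w+", query.lower())
    if not words:
        return []
    return sorted(
        entry_id for entry_id, text in ENTRIES.items()
        if all(w in re.findall(r"\w+", text.lower()) for w in words)
    )

def results_to_xml(query, entry_ids):
    """Serialize search results as a small XML document."""
    lines = ['<?xml version="1.0" encoding="utf-8"?>',
             "<results query=%s>" % quoteattr(query)]
    for entry_id in entry_ids:
        lines.append("  <hit id=%s>%s</hit>"
                     % (quoteattr(entry_id), escape(ENTRIES[entry_id])))
    lines.append("</results>")
    return "\n".join(lines)

class SearchHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        url = urlparse(self.path)
        if url.path != "/search":
            self.send_error(404, "Not found")
            return
        # Take the first q= value, or an empty query if none was sent.
        query = parse_qs(url.query).get("q", [""])[0]
        body = results_to_xml(query, find_entries(query)).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "application/xml; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

def serve(port=8000):
    """Run the sketch server on localhost until interrupted."""
    HTTPServer(("127.0.0.1", port), SearchHandler).serve_forever()
```

Calling serve() and requesting /search?q=lightning on port 8000 should return a small XML document listing the matching entry.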
The client piece uses XMLHTTP, the superglue of the Web, to submit a
new lightning query every time the actual text in the query field
changes, then dynamically display the results. It certainly feels
faster to not have a full-page roundtrip, and the feeling of immediacy
helps you experiment with different queries, which has already led to
one interesting discovery within my own data.
Hey, what about XForms? As it is, this works 'out of the box' with
Firefox or Safari. With a minor tweak, it would work in IE as well,
though I don't anticipate ever accessing this via IE. XForms could
probably do it without a bit of script on the client, and would be
easier to maintain, but as it stands today, it wouldn't work 'out of
the box'.
It's interesting--XForms is kind of subversive in that it can be (or
in some cases, already has been) implemented on top of almost any web
app platform, like IE+HTC+XMLHTTP, or Mozilla+XUL+WebExtras, or Flash,
or soon, XAML/Avalon. -m
Dubinko on the net
Other than my family, I don't personally know of any other Dubinkos.
But the Internet is great for research...
First http://croliday.com/kroatien/kroatien/kroatien/beleg/dubinko.htm
which appears to be some kind of hotel or house for rent. Anyone
fluent in German care to translate?
At http://w3.ouhsc.edu/cahsc/Minutes/jan24.htm we have Laura Dubinko,
apparently in Oklahoma.
From MSNBC http://www.msnbc.com/news/537169.asp?0sp=n3b5&cp1=1 we
learn of a space museum tour guide, Yelena Dubinko.
Something sports-related: http://www.zone.ee/saktartu/eestimv.htm,
Tatjana Dubinko.
A fellow author: http://www.raamatukoi.ee/cgi-bin/isik?432 Svetlana
Dubinko writes books to help Russian-speakers get up to speed on
English.
Looking for love: http://www.1forlove.com/love0283.shtml is Olga
Dubinko.
On Amazon full-text search, you can find a reference to a "Dubinko, G
A" who wrote a 1966 research paper on DNA synthesis, referenced from a
$344 book.
Then, there's Vladimir Dubinko, who I have briefly corresponded with.
http://dubinko.info/vladimir.dubinko.html
Have you ever met another Dubinko? Can you translate any of these
links? Let me know. -m
Elliotte Rusty Harold lauds XForms
ERH likes what he sees about XForms at WWW2004. Some quotes:
http://www.cafeconleche.org/oldnews/news2004May25.html
"...going to be a key part of development in the very near future,
with an exponential growth rate for the next couple of years."
"Unlike the semantic web, it does not require learning completely new
and unfamiliar areas of technology..."
"...much better designed than HTML forms ever were."
"...XForms is a compelling enough story to displace IE."
-m
Fiction
A short story.
Movie run. The friends gather to make the all-important decision about
which movie to see. They banter about, as friends do, and finally
manage rough consensus, though a few stragglers in the back keep
unusually quiet.
The group settles into the darkened theater, just in time for the
previews. The opening credits roll. Suddenly, the stragglers stand up.
"Hey, we don't like this movie, let's go somewhere else." -m
Web Applications and Compound Documents
Must-read: the Mozilla/Opera position paper. And naturally my two
pence.
First: http://www.w3.org/2004/04/webapps-cdf-ws/papers/opera.html
Like nearly all real-world requirements, there are
mutually contradictory aspects, so it will be interesting to see how
things get resolved. -m
InfoWorld Innovators Award
I am thrilled to have won the InfoWorld Innovators 2004 award for my
work on XForms.
http://infoworld.com/reports/21SRinnov04.html
The only other of this year's awardees that I've met is Miguel de
Icaza. Congrats! -m
CNet RSS?
What's up with the CNet RSS feeds? As viewed in NetNewsWire, they have
just a headline. No link. No summary. Is it just me, or does this miss
the point of having RSS in the first place? -m
iTunes GIGO
A surprising 7% of all artists listed in my iTunes library are
duplicates with minor variations.
There's "B B King" vs. "B.B. King" vs. "B. B. King", of course, but
the most annoying thing is a preponderance of backticks instead of
normal apostrophe characters. I see these things everywhere, but
most notably in CDDB. Once the bad data gets into a system like that,
it's pretty hard to get it out. -m
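A small sketch of how duplicates like these could be detected, assuming the artist names have already been exported as a plain list of strings. The normalization rules here are guesses at the common variations, not anything iTunes or CDDB actually does.

```python
import re
from collections import defaultdict

# Characters often typed in place of a plain apostrophe (an assumed, partial list).
APOSTROPHE_LIKE = "`\u2018\u2019"

def normalize_artist(name):
    """Reduce an artist name to a key that ignores punctuation and spacing quirks."""
    key = name.lower()
    for ch in APOSTROPHE_LIKE:
        key = key.replace(ch, "'")
    key = key.replace(".", " ")              # treat periods as spaces
    return re.sub(r"\s+", " ", key).strip()  # collapse runs of whitespace

def find_duplicates(artists):
    """Group artist names by normalized key; keep groups with more than one spelling."""
    groups = defaultdict(set)
    for name in artists:
        groups[normalize_artist(name)].add(name)
    return {key: sorted(names) for key, names in groups.items() if len(names) > 1}
```

Feeding it the three B. B. King spellings above yields a single group, which makes the variants easy to review and merge by hand.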
XML, Hack Thyself
Today XMLhack went on hiatus, a sign of the general health of the XML
world, and an occasion meriting a rant.
XML is in tough shape. In short, it's not fun to work with anymore.
Now, "fun" may sound juvenile, but it's important--things that aren't
fun get studiously avoided whenever possible, and engineers are
incredibly good at redefining what's possible. XMLhack was run by
volunteers, including me. When a volunteer effort is no longer fun,
it's better for it to gracefully go dark than linger on in digital
malnourishment.
What happened? In the last 6 years Moore's law has given us CPUs
about 16 times more powerful than in 1998, when XML--that sleek,
SGML-bloat-avoiding tiger--was unleashed. One person could easily keep
all of XML in her head, even including the basics of related
technologies like CSS. Now you'd need a team of about 16 to keep on
top of things, and it's getting worse.
What happened? For one, XML Namespaces set a terrible (and
much-imitated) precedent of exposing things in markup--making
machine-processing easier and human-processing harder. That's the
wrong way--it's the computers, not the humans, that are getting more
powerful practically by the second. XML hit the need-to-refactor point
much earlier on the curve than SGML ever did.
What's happening? More WS-horribleKluge "specifications" than I can
keep up with, even enough to know what the acronyms mean.
DTD-dependence that lingers on, zombie-like. Popular tools that fail
to meet even basic conformance. Multiple, fragmented, incompatible
versions of RSS.
Is anyone interested enough in reviving XML? Is anyone probing
techniques like alternate namespace approaches (Java and Python, for
example, seem to have good namespace handling) enough to come up with
a cogent proposal for work on 'XML-next'? Sounds like a promising
research project to me.
In the final message on XMLhack, Edd Dumbill provides some light at
the end of the tunnel: XML coverage lives on at http://www.xml.com,
all the "cool" URIs on the site will remain active, and finally, never
say never. -m
iTunes backup hint
When copying your iTunes library off to another server for backup
purposes, first do this: in the search box for the entire "Library",
search for " (a double-quote character). If you see any, rename them
or delete those characters. This will prevent pesky file errors when
accessing that portion of the music tree. -m
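The same check can be run against the music folder on disk. This is a minimal sketch that assumes the offending quotes end up in file or folder names; the function name and the example path in the note below are placeholders.

```python
import os

def paths_with_double_quotes(root):
    """Yield paths under root whose file or directory name contains a double quote."""
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            if '"' in name:
                yield os.path.join(dirpath, name)
```

For example, looping over paths_with_double_quotes("/path/to/iTunes Music") and printing each result gives a list of names to fix before copying.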
The Upgrade Treadmill
It started out as a simple hard drive upgrade...
But my ancient BIOS just looked at 200 gigs, and said 'huh?'. So, I
could either fiddle with getting a new PCI card to run under Linux, or
just get a whole new motherboard. Well, I needed one anyway. Oh, and
the motherboard needs a new processor and new memory. Fine. Except the
thing doesn't start--not even a flicker of a POST on the screen. Turns
out the new board doesn't like the old power supply. And it hardly
makes sense to buy just a power supply without a new case. So, all
said and done, I have an entirely new computer, except for the CD
drive and monitor. Damage: $450 and four round trips to various
computer stores.
On the positive side, a major thumbs-up to trusty Red Hat 8, which
auto-detected all the hardware changes without a hitch, and kept on
plugging along. I could use a better video driver for the ABit VA-10
onboard video, but since this thing is a server with the screen off
99% of the time, I can't even complain about that. Still, the new SuSE
9.1 beckons... -m
Life, the Universe, and Everything
File this away for later: http://www.astrobio.net/news/article966.html
From the Vatican Observatory. Interesting patterns of the intersection
between religion and science. -m
What I'm Reading
You can tell a lot about someone by what they read. I recently got my
pick of any 5 O'Reilly books, and it goes even beyond the 'What I'm
Reading' links on the web site and RSS feed...
I picked:
Cascading Style Sheets, 2nd ed
Lex & Yacc
Hardcore Java
Java Cryptography
Security Warrior
A bit Java-heavy, I'll admit. I chalk that up mostly to already having
most other O'Reilly books in areas of interest! The CSS book is as
good as you'd expect. The Lex book is a follow-on to Text Processing
in Python and the Dragon book. And Security Warrior just looked
interesting, something new to try with little risk. -m
With great power comes ...
Help me fill in the blank. Send email to my listed contact
address. Bonus points for humor. -m
Update: Luther says "Microsoft stock options". Any more?
Ferengi Rules of Acquisition
XML Hacks
Market-driven TLDs
There are some discussions on www-tag about how (and if) to handle new
top-level domains, like .mobi. Here's my idea...
Do we really need more TLDs? Maybe we need fewer. Like one. Then
everyone would be free to choose and use second and third-level names
that have meaning within the purpose for which they're used.
OK, so that's a bit over the top. But that scenario isn't much
different than just allowing unlimited top-level domains. So, what
would happen if the existing TLDs were managed more-or-less as-is, but
new ones could be had by anyone, for say US$1 million to a trust fund
for advancement of the Internet? Each TLD would then be administered
by the organization that owns it, including managing subdomains.
Example: the Coalition for Youth Safety could buy .kids, then hand
out free or low-cost subdomains, and set up a Terms of Use agreement
that states what kind of content is or is not allowed under that
domain. Per the agreed-upon terms, violators get their name yanked
until they comply. Now, if someone comes along and thinks this is
censorship, they are free to buy their own TLD, and run it however
they wish.
If someone wants to try to set up .mobi, they are free to go for it. I
imagine they'd run into technical issues defining any kind of
restriction for what kinds of devices can and can't access the content,
but the market will deal with that kind of shortsightedness
appropriately. Bad ideas will die out on their own, and the entry fee
will discourage some of the most bone-headed ones.
Does this devalue existing domains? Yes and no. It does break down
some of the artificial limitations in the current system. On the other
hand, it opens up all kinds of new market possibilities.
http://www.microsoft/ anyone? -m
The Grand List of Overused Science Fiction Clichés
Rebuttal to Hixie
Ian Hickson is a sharp guy, but he swings and misses with this post...
In a posting on comp.mozilla.devel.layout (no link, google it if you
want), he writes:
"XForms does not in any way alleviate the need for server-side
validations since the server can never trust the client and therefore
has to do all the validation anyway."
OK, so instead of the classic approach of writing an unmaintainable
pile of JavaScript on the client and an unmaintainable pile of Perl on
the server, now you can write one unmaintainable pile of XForms and
run it in both places. Seems like a win to me. :P
"Note that XForms' leveraging of so many standards is one of its main
downfalls as far as implementations go (you have to implement god only
knows how many specs before XForms is even on the radar)."
"god only knows"?? That sounds like an emotionally-driven argument
rather than a factual one. Here's some non-divine facts: there are 16
normative references in XForms. This compares to 26 for HTML 4.01. And
the 16 includes references made purely for terminology and background,
like the requirements document, RFC 2119, and XHTML Modularization.
It basically comes down to a few obvious things: XML, XMLNS (yep, it's
a W3C spec), and RFC 2388 for multipart/form-data. Then the real work:
XPath, XML Events, RFC 2387 for multipart/related, and a sliver of WXS
datatypes, avoiding the gnarly bits for XForms Basic. That will get
you much more than 'on the radar'.
"XForms has zero synergy with HTML..."
XForms is part of XHTML 2.0.
There's plenty more that could be said, if it was worth debating every
little point.
XForms isn't perfect. Like all recent W3C work, it's heavily
influenced by namespaces, and not for the better.
I don't think XForms is a perfect fit in the browser core, at least
not right away. What we need is better plug-in systems so that upon
encountering a web page with xforms:model, or svg:rect, or
whatever, the proper component can be downloaded and installed. Now,
there's a competitive feature that would put IE to shame. -m
XHTML in SVG
Sun Policy on Public Discourse
Tim Bray: http://www.tbray.org/ongoing/When/200x/2004/05/02/Policy
Things like this pile up in background browser windows, until I am
able to "do something" with them. Recording them in a readily
searchable place is a good choice. World-readable turns out to be not
so bad either. -m
Wicked Problems
Dictionary of Algorithms and Data Structures
Picked this up from Mark Baker's feed, but I'll need it here when I
search for it later. http://www.nist.gov/dads/ -m
Cross Platform Hedge
I've worked with some amazing software to store all my stuff. In
roughly chronological order: Zoot, Microsoft Office OneNote,
NoteTaker, and StickyBrain. As great as all these are, they run on
only a single platform (the first two on Windows, the last two on OS
X). Even though I'm currently happy with OS X, I won't settle for a
single-platform solution.
Data, the stuff I care about here, is too important to limit to a
single platform. It's still all about the data. -m
It's the data, stupid
Welcome to the first new entry written from my own personal content
management system, code-named "It's the data, stupid". Unlike most
other CMSes, the content remains fully accessible, even when none of
the software is actually running. This is accomplished by storing all
the important data as plain text, UTF-8 files in directories.
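To illustrate why plain UTF-8 files keep the content reachable without the CMS, here is a minimal sketch that reads entries straight off disk. The layout it assumes (one .txt file per entry, first line as the title) is hypothetical, as is the load_entries name; the actual file layout is not described here.

```python
from pathlib import Path

def load_entries(root):
    """Read every .txt file under root as one entry: first line is the title."""
    entries = []
    for path in sorted(Path(root).rglob("*.txt")):
        text = path.read_text(encoding="utf-8")
        # Split off the first line; everything after it is the body.
        title, _, body = text.partition("\n")
        entries.append({
            "path": str(path),
            "title": title.strip(),
            "body": body.strip(),
        })
    return entries
```

Nothing here depends on a database or a running service. Any tool that can read text files, from grep to a full-text indexer, can work on the same data.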
Technically, it's pretty bare metal. It's largely based on David
Mertz's Text Processing in Python, which includes several plain-text
to XHTML converters. Add in some XSLT templates to produce a fully
formatted XHTML page, as well as the RSS, and you're off.
Why do this? Basically, it gives me a platform I have full control
over, down to the last line of source, as well as local storage of my
information in a format of my choosing. It's part of a broader effort
on my part to get all of my notes, writings, journals, news clippings,
and anything else textual under a common umbrella that can last me the
rest of my life. Side projects include a full-text indexer and local
access to all information and actions via REST interface.
Along with this goes a tiny site redesign and a new location for the
RSS feed (which is now 1.0--and should be redirecting.) If you notice
any problems, send me email at the address on the web page. -m
(repost) Write IE browser extension in XForms
The indefatigable Mark Birbeck pointed me to
this: http://www.formsplayer.com/products/add-ons.html --a toolkit to
write IE sidebars in pure XForms. Included are Amazon and Google
search. This is a sign of changes to come in the development of
Internet Apps. -m
Terms of use
For external use only. I doubt
the enforceability of click-through licenses anyway. Copyright 2004 Micah Dubinko. All rights reserved.