I’ve been a fan of David Mertz since I devoured (and practically lived out of) his book Text Processing in Python. So I was thrilled at the chance to be a technical reviewer for his new book Cleaning Data for Effective Data Science: Doing the other 80% of the work with Python, R, and command-line…
Category: everythingismiscellaneous
According to Newtonian gravitation, the attraction between two bodies is proportional to the product of their masses and inversely proportional to the square of the distance between them. Einstein refined this somewhat, but as long as there aren’t crazy speeds or non-flattish spacetime involved, Newton’s formulation is accurate. As far as we know. I read this…
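The inverse-square relationship described above can be written out in a few lines of Python. This is only a minimal sketch of Newton's formulation: the function name `gravitational_force` and the sample masses are hypothetical, chosen for illustration, and nothing here accounts for Einstein's refinements.

```python
# Newtonian gravitation: F = G * m1 * m2 / r**2
# G is the gravitational constant in m^3 kg^-1 s^-2.
G = 6.67430e-11


def gravitational_force(m1_kg, m2_kg, distance_m):
    """Attractive force in newtons between two point masses (hypothetical helper)."""
    if distance_m <= 0:
        raise ValueError("distance must be positive")
    return G * m1_kg * m2_kg / distance_m ** 2


# Illustrative values only: doubling the distance cuts the force to a quarter.
near = gravitational_force(5.0, 7.0, 2.0)
far = gravitational_force(5.0, 7.0, 4.0)
print(near / far)
```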
Filed under need-to-try this: A homebrew version of a popular hacker drink called Club Mate. I already have a carbonator cap and CO2 setup, as part of my beer brewing hardware. One variation I would experiment with is cutting down on the sugar. Even though Club Mate isn’t very sweet, it still has a fair…
I can’t blog about secret projects I’m working on, so how about something completely different? I’ve improved my fitness level substantially over the last five years. (On index cards, I have my daily weight and body fat percentage, according to the bathroom scale, back to November 2009). Here are some things I’ve learned: Moving counts. A…
Naming is hard to do well, almost as hard as designing good software in the first place. Take for instance the term ‘node’, which depending on the context can mean: A fundamental unit of the DOM (Document Object Model) used in creating rich HTML5 applications. A basic unit of the Semantic Web–a thing you can say stuff…
Having been recently accused of “vile” habits in regard to tea-drinking, I feel that I need to clear the air. :) I’ve never been officially tested, but I am almost certainly a supertaster. (This explains, among other things, my aversion to most vegetables and my status as a nationally ranked beer judge). I’ve never been…
I’ve seen lots of discussion for and against link shorteners, but not specifically this line of argument: Let me grab a random shortened link from Twitter. Don’t go away, I’ll be right back. http://bit.ly/b1fYi1 OK, that’s six characters in the domain, a slash, and six more characters. 50 years from now, if bit.ly is still…
The opening day of the conference was not Balisage proper, but a separate symposium on “XML for the long haul”. Some interesting tidbits overheard, in no particular order… “it is not necessarily clear that this approach would capture the difference between the ridiculous and the merely implausible.” Complexity: what is the relationship between complexity…
Steve Martin leaves an awesome list of demands for venue staff when he’s on tour, including BEVERAGE SERVICE must include a thoughtful assortment of meads and bendy straws. IMPORTANT NOTE: Bendy straws must be strong enough to be able to be used as blowguns. ADDITIONAL IMPORTANT NOTE: Local paramedic aid may be required. Read the…
Thought experiment: are there any commonly-expressed semantic queries–the kind of queries you’d run over a triple store, or perhaps a SearchMonkey-annotated web site–expressible in common type-in-a-searchbox query grammar? As a refresher, here are some things that Google and other search engines can handle. The square brackets represent the search box into which the queries are typed,…
If you dig a bit, there’s all kinds of interesting background material about the terrible disaster ongoing in the Gulf of Mexico. For example, a map of the thousands of rigs and tens of thousands of miles of pipelines. Some of the best infographics are from BP itself. And for when you can no longer stand the…
Phrase seen in this article about whether video games are art, and Roger Ebert’s opinions thereon. “Video games by their nature require player choices, which is the opposite of the strategy of serious film and literature…” Hmm, Mr. Ebert doesn’t seem to be up on the concept of hypertext, which has manifold connections with cinema….
Let’s say you have a box that (completely legally) spits out 1 dollar per day. I’m using “box” in an abstract sense here: maybe it’s an investment or a business opportunity. How much would you pay for this box? In other words, what’s its fair market value? What if it spit out one dollar per…
Celebrating 500 posts since I went to WordPress in May 2006. Prior to that, an additional 730 posts as I floated through a typical evolution of blogging platforms: Easy start: blogger (299 posts in 24 months) Succumbing to the desire to roll your own (259 posts in 12 months) Realizing that rolling your own is…
If this site is accurate, it’s now possible to have superconducting material at household freezer temperatures: 254 K, or a tiny bit below 0 °F. From power lines to maglevs to supercolliders to energy storage, the potential applications boggle the mind. -m Note: I’m having trouble finding independent verification of this, other than what appears to be…
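The temperature claim in the superconductor post is easy to check with the standard Kelvin-to-Fahrenheit conversion. A small sketch, with a hypothetical helper name `kelvin_to_fahrenheit`:

```python
def kelvin_to_fahrenheit(kelvin):
    """Convert kelvin to degrees Fahrenheit (hypothetical helper for illustration)."""
    celsius = kelvin - 273.15
    return celsius * 9 / 5 + 32


# 254 K is the figure quoted in the post.
print(round(kelvin_to_fahrenheit(254), 2))  # -2.47
```

So 254 K does come out a little below 0 °F, as the post says.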
Link credit goes to Joho. This looks pretty significant. The AZ Supreme Court ruled that document metadata must be disclosed under existing public records law. This may start a chain reaction with other states following suit. With the movement toward open data including data.gov and the Federal Register, this fits in well. Quite often metadata…
In case any of the 7 regular readers here aren’t following xml-dev, check out and add to the discussion about Pragmatic Namespaces, proposed as a solution for the “distributed extensibility” problem in HTML5. For years people have been pointing to Java as the model for how XML namespaces should work, so this proposal goes that…
I spent 2 days at the Yahoo! campus at a VoCamp event, my first. Initially, I was dismayed at the schedule. Spend all the time the first day figuring out why everybody came? It seemed inefficient. But having gone through it, the process seems productive, exactly the way that completely decentralized groups need to get…
The central thesis of The Inmates are Running the Asylum by Alan Cooper is dead on: engineers get too wrapped up in their own worlds, and left entirely to their own whims can easily make a product incomprehensible to ordinary folks. For this reason alone, it’s worth reading. But I do question parts of his…
This brilliant bit is almost a throwaway paragraph on page 304, near the end. [Two men in a satirical dialog] managed only to demonstrate that the mathematical limit of an infinite sequence of “doubting the certainty with which something doubted is known to be unknowable when the ‘something doubted’ is still a preceding statement ‘unknowability’…
This article seems encouraging. I’ve never been able to come to grips with the anti-CF bias of the scientific community. Sure, a few researchers made fools of themselves two decades ago, but what has that got to do with falsifiable hypotheses? A small amount of research goes on with minimal funding, under the newer name…
This is fantastic. Brian May (yes THAT Brian May) not only blogs, but talks about all kinds of challenging subjects. Like how and why space and time are linked. Worth a read. -m
I’m (just barely) enough of a writer that I can spend cycles on Steorn’s claims without being branded a crackpot. After all, the novel I’m working on involves a similar device being invented 4,000 years ago. It’s all research. Imagine if Earth’s gravitational field, instead of being a constant 1.0G, rocked back and forth between…
Honestly, I don’t even need to write a punchline for this one, it sounds so much like the setup of a Monty Python-esque joke. Give it your best shot in the comments… -m
Bob DuCharme, Innodata Isogen Content analysis: why? You’ve “inherited” content. Need to save time or effort. Handy tool 1: “sort”. As in the Unix command line tool. (Even Windows) Handy tool 2: “uniq -c” (flag -c means include counts) Elsevier contest: interface for reading journals. Download a bunch of articles, and see what’s all in…
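For anyone without the Unix tools handy, the `sort` plus `uniq -c` combination from the notes above can be approximated in Python. This is a rough sketch, not anything shown in the talk: the function name `count_lines` and the sample element names are made up for illustration.

```python
from collections import Counter


def count_lines(lines):
    """Return (count, value) pairs ordered by value, roughly like `sort | uniq -c`."""
    counts = Counter(line.rstrip("\n") for line in lines)
    return [(counts[value], value) for value in sorted(counts)]


# Hypothetical input: element names pulled from some inherited content.
sample = ["para", "title", "para", "emph", "para", "title"]
for count, value in count_lines(sample):
    print(f"{count:7d} {value}")
```

To run it over a real file, pass an open file object instead of the sample list, since iterating a file yields its lines.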
Mark Birbeck, Web Backplane. Problem statement: You shouldn’t have to “scrape” government sites. Solution: RDFa <div typeof="arg:Vacancy"> Job title: <span property="dc:title">Assistant Officer</span> Description: <span property="dc:description">To analyse… </span> </div> This resolves to two full RDF triples. No separate feeds, uses existing publishing systems. Two of the most ambitious RDFa projects are taking place in the UK….
Ronald Reck, SAP; Kenneth Sall, SAIC “I wish I knew when people were saying bad things about me.” Sentiment analysis. Kapow used initially. From 800k news articles (from 1996 and 1997), extracted 450M RDF assertions. The 13 Reuters standard metadata elements not used in this case. Used Redland for heavy RDF lifting. Inxight ThingFinder (commercial)…
I won a bronze medal (white ribbon actually) in the Mixed Styles category for my Dusseldorf Altbier, the first non-mead-related beverage I’ve ever entered. It’s a deep copper-colored ale made with a special Alt yeast and with a strong balance of clean malt and hops. There are very few bottles of it left at this…
A special comment. My most vivid memory of my late Grandpa. Even after retiring, Grandpa needed to do small jobs around town to make ends meet. One was cleaning a small sporting goods store. Once, with all the excitement of visiting family from out of town (that would be us), he forgot to clean one…
I’ve been playing lately with this site, and it’s a fantastic resource. The word carboy probably comes from Persian qarabah “large flagon.” Who knew? -m