OpenDataDay Hackathon DC!

So I went down to DC this weekend to participate in the Open Data Day Hackathon!  There were some tremendous projects proposed, but the one that caught my eye from the start of the day was one proposed by Jim Harper of the Cato Institute to track down the genealogy of legislation put forth in Congress.

Basically, the goal is to programmatically find similar passages in multiple bills.  This can be used for many purposes, including looking at sections in large omnibus bills and getting an idea of whether the things that get shoehorned into them have been proposed previously, and what happened then.
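To make the idea of "similar passages" a bit more concrete, here is a minimal sketch of one very simple way to score two passages against each other: break each into overlapping three-word "shingles" and compare the sets (Jaccard similarity). This is not the Latent Semantic Analysis approach our team worked on, and the function names are mine, made up for illustration.

```javascript
// Split text into lowercase words, then build the set of overlapping
// n-word "shingles" (n consecutive words joined by a space).
function shingles(text, n = 3) {
  const words = text.toLowerCase().match(/[a-z0-9$]+/g) || [];
  const result = new Set();
  for (let i = 0; i + n <= words.length; i++) {
    result.add(words.slice(i, i + n).join(' '));
  }
  return result;
}

// Jaccard similarity: size of the intersection over size of the union.
// Two empty sets are treated as having no similarity.
function jaccard(a, b) {
  if (a.size === 0 && b.size === 0) return 0;
  let shared = 0;
  for (const item of a) {
    if (b.has(item)) shared++;
  }
  return shared / (a.size + b.size - shared);
}

// Score two passages between 0 (no shared shingles) and 1 (identical shingle sets).
function passageSimilarity(textA, textB, n = 3) {
  return jaccard(shingles(textA, n), shingles(textB, n));
}
```

Something like `passageSimilarity(sectionFromBillA, sectionFromBillB)` would then give a rough score for ranking candidate matches. Passages shorter than three words produce no shingles and score 0.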

So, our team largely consisted of myself, Alexander Furnas of the Sunlight Foundation, and John Bloch of 10up, with guidance from Jim Harper (previously mentioned, of the Cato Institute) and Molly Bohmer (also of the Cato Institute), and with Kirsten Gullickson providing some clarification on the way the XML data we were working with was structured.

I spent my time building a MySQL database and a PHP import script that could map all the relevant data from the XML files into it.

Alexander worked primarily in Python, fleshing out a way of doing Latent Semantic Analysis on the data we’ve extracted, to sort out what is similar to what and where we can find meaning in it.

John spent his time working on a front-end for the final dataset, to help end-users get something useful out of the data we’re building.

The data that we were pulling from can be readily accessed by anyone through the Library of Congress.

I’m currently putting some finishing touches on the DB structure, but when that’s done, I’ll be releasing that and the import script in a subsequent post, as well as a SQL dump for the final accumulated and sorted data — ripe for data mining.  As the day was wrapping up, I had someone come to me inquiring about data mining for references to money allocated in appropriations bills and the like, and I was able to very quickly do a MySQL query along the lines of

SELECT * FROM `resolution_text` WHERE `text` LIKE '$%'

to find anything that started with a dollar sign followed by an amount.  That was run against a very limited data set of three million rows or so.  The final data set will be much larger.
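One thing worth noting about that pattern: `LIKE '$%'` only matches rows whose text begins with a dollar sign, so an amount buried mid-sentence needs a second pass. Here is a rough sketch of what that pass could look like once rows have been fetched. The row shape (`{ text: ... }`) and the function names are assumptions of mine for illustration, not part of the import script.

```javascript
// Matches a dollar sign followed by digits, with optional thousands
// separators and an optional two-digit cents part,
// e.g. "$1,500,000" or "$27.50".
const DOLLAR_AMOUNT = /\$\d{1,3}(?:,\d{3})+(?:\.\d{2})?|\$\d+(?:\.\d{2})?/g;

// Return every dollar amount found in a piece of text, as numbers.
function extractDollarAmounts(text) {
  const matches = text.match(DOLLAR_AMOUNT) || [];
  return matches.map((m) => Number(m.replace(/[$,]/g, '')));
}

// Mirror of the SQL pattern LIKE '$%': keep rows whose text begins with "$".
// Assumes each row is an object with a string `text` property.
function rowsStartingWithDollar(rows) {
  return rows.filter((row) => row.text.startsWith('$'));
}
```

Running `extractDollarAmounts` over each row's text would pick up amounts anywhere in the passage, not just at the start.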

Jonathan Coulton, Baby Got Back, Glee, and Copyright

Disclaimer!  I am not a lawyer!  These are just my musings; if you ARE a lawyer, I’d love to hear back from you as to whether I’m on track.  Also, I call myself a Code Monkey.  That’s also a song by JoCo.  It’s awesome, and you should listen to it.

If you’re here, I’m going to assume you’ve heard some details on the current situation of Glee ripping off Jonathan Coulton’s cover of Baby Got Back.  If not, read JoCo’s summary first.

My understanding of the general consensus is that as the “cover” is a licensed cover, he doesn’t have any specific rights to protect it from Glee using it.

The musical arrangement that the covered lyrics were set to was 100% original, and JoCo released a Karaoke track that omits all of the covered lyrics.

It is my contention that the Karaoke track is not a cover, and is instead a wholly original work, and as such, JoCo owns rights to the melody to which his cover was set.

Let me rephrase it another way:

If I write a little tune that I find to be catchy, and release it, I would own the rights to it.  If, later, I purchased the rights to cover a song, and put the lyrics of the song to my completely unrelated tune, would I still have rights to my original tune?  Or would the fact that I happened to combine the two rob me of the rights to my original tonal creation?

If you believe I would lose my rights, then consider this: suppose I licensed my tune under a non-commercial Creative Commons license (as JoCo did), and a third party took it and did a non-commercial cover version of a different song to said tune.  Would that then rob me of my rights to the tune?  Can the actions of an unrelated third party licensing it rob the original rights-holder of his rights to the licensed tune?

If you have a different answer to each of the last two questions, I’ve gotta ask why.  Because, for me, the answer to both of them seems to be a firm “Yes, I should keep the rights to the tune.”

In fact, that is why the law reads:

A compulsory license includes the privilege of making a musical arrangement of the work to the extent necessary to conform it to the style or manner of interpretation of the performance involved, but the arrangement shall not change the basic melody or fundamental character of the work, and shall not be subject to protection as a derivative work under this title, except with the express consent of the copyright owner.

As such, I question whether the portion of JoCo’s Baby Got Back that was a wholly new melody (that was ripped off by Glee) would suffer the same shackling to the original rights holder, when I would consider that melody to not be a derivative work, and the ‘cover’ to in fact be a derivative work (as it has a wholly new melody).

The law says that it can’t be a derivative work if it keeps the original basic melody.  JoCo didn’t.  So — derivative work?

The best phrased response to the current GPL spat between WordCamps and Envato

As stated by Chip Bennett:

I will preface my comments by saying that I disagree completely with the approach the WordPress Foundation is taking here. The problem is a disagreement between the WPF and Envato, and developers are merely caught in the crossfire.

This approach makes developers choose between putting food on the table and being a persona non grata to the WPF, or else risking their legitimate revenue stream, and be in the WPF’s good graces. Unfortunately, for Jake and thousands of developers like him, the WPF’s good graces don’t put food on the table.

And while the tactic may ultimately work, there are only so many times you can turn the 50-mm barrels on the rank-and-file in the community itself, and not have adverse affects.

That said, I take issue with Envato’s stance, as well:

To my mind, it doesn’t make sense that a regular license sold on ThemeForest should give such a buyer the right to on-sell a creator’s work at that volume – if only for the simple reason that volume reselling can significantly reduce demand for the original work.

You are arbitrarily restricting the ability of your marketplace suppliers to offer their work under the license of their choice. The way I read this, your real concern is that Envato would lose commissions if Themes in their marketplace were offered as 100% GPL, and led to downstream distribution. If that is the real concern, it may or may not be valid, but it is disingenuous to couch such concern as concern for your marketplace sellers.

If that is *not* the real concern, then I don’t see how any real concern exists. Just let your marketplace sellers *choose* to offer their works under 100% GPL. Put up huge banners decrying the risks of doing so. Strongly suggest that they don’t do so. Rail against the GPL all you want. Make them click through 3 “are you sure?” dialogue boxes.

But offer the choice.

I guarantee you that the WordPress Theme developers who opt-in to offering their works under a 100% GPL license do so under full understanding of the license terms, and either disagree with your risk assessment, or have evaluated the risk-reward differently. You don’t need to “protect” them from the license.

Just offer them the choice.

This.  A thousand times this.

Draw Something Cool

So I’ve had a lot of awesome feedback for the “Draw Something Cool” bit that I’ve added to my Contact Form.  It’s in actuality just the Signature add-on for GravityForms!

That being said, here are some of the best images that I’ve had people submit through it thus far:


PVP Redesign has UX Problems

So PVP, a webcomic that I’ve followed since about 2002 (back when it was still physically published by Dork Storm Press) just launched a site redesign.

And it looks hideous.

Not in the way that most people would say hideous, mind you, but rather from a User Experience (UX) perspective.  The header on the interior pages is 215px tall, and on the front page it is 415px.  When you add that to the typical browser/OS overhead of 100px or so, that means on the homepage, the content is starting about 515px down the screen.

Worse yet, Scott Kurtz (creator of PVP) has done that highly obnoxious thing (that I seem to recall him venting a couple years ago about Penny-Arcade doing) — moving the single piece of content that 95% of your visitors are coming to your site to view off of your front-page, thereby making them click through to the new page, doubling your page views, and doubling your Ad Impressions. (And doubling the aggravation of everyone who wants to just see today’s comic) (with doublemint gum)

Yes, the new site looks more modern and fresh.  However, from the perspective of any visitor, its usability has gone way down.

So as I hate criticizing things unless I’m ready to step up to the plate and do something to help, here’s a JS `bookmarklet` that you can drag to your browser bar to see how much nicer pvponline could look without that hideous massive header obscuring the screens of most of his audience.

Because let’s face it, Scott — most of your audience isn’t viewing the website on a 27″ iMac like you.

Without further ado (Yes, I do tend to ramble) here’s your bookmarklet:


Here’s the code that it executes:

+'#header .content,#headerSub .content{height:auto; padding:0;}'
+'#header .content > #adLeaderboard,#headerSub .content > #adLeaderboard,'
+'#header .content > h1,#headerSub .content > h1,'
+'#header .content > #featured{display:none;}'
+'#header .content .nav,#headerSub .content .nav{position:relative; top:0;}');

And the CSS that it puts in the page:

#headerSub {background:#000000;}
#header .content,
#headerSub .content {height:auto; padding:0;}
#header .content > #adLeaderboard,
#headerSub .content > #adLeaderboard,
#header .content > h1,
#headerSub .content > h1,
#header .content > #featured {display:none;}
#header .content .nav,
#headerSub .content .nav {position:relative; top:0;}
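For anyone curious how a bookmarklet gets CSS like this into a page, here is a minimal sketch of the general technique: build the rules into one string, drop them into a new style element, and append that to the page head. The function names (`buildCss`, `injectCss`, `toBookmarklet`) are mine for illustration and are not necessarily what the bookmarklet above actually uses.

```javascript
// The CSS rules listed above, joined into one string.
function buildCss() {
  return [
    '#headerSub{background:#000000;}',
    '#header .content,#headerSub .content{height:auto; padding:0;}',
    '#header .content > #adLeaderboard,#headerSub .content > #adLeaderboard,' +
      '#header .content > h1,#headerSub .content > h1,' +
      '#header .content > #featured{display:none;}',
    '#header .content .nav,#headerSub .content .nav{position:relative; top:0;}'
  ].join('');
}

// Create a <style> element holding the CSS and append it to <head>.
// `doc` is passed in so the function can be tried outside a browser;
// in a real page you would pass the global `document`.
function injectCss(doc, css) {
  const style = doc.createElement('style');
  style.textContent = css;
  doc.head.appendChild(style);
  return style;
}

// Wrap a snippet of JavaScript as a javascript: URL usable as a bookmark.
function toBookmarklet(source) {
  return 'javascript:' + encodeURIComponent('(function(){' + source + '})();');
}
```

In a browser console, `injectCss(document, buildCss())` applies the styles to the current page. Because the rules are only appended to the page in memory, a reload puts everything back the way it was.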

By the way, Scott, if you ever read this, I know you consider yourself a professional. Which makes it all the more aggravating when you are eternally incapable of having your webcomics actually follow a posting schedule. Occasionally, they’ll be up by noon on the day that they’re scheduled for. But those times are rare. Far more commonly, they go up around 7pm, if you actually make the date they’re meant to be up for, instead of publishing them late and backdating them.

If you want to consider yourself a professional, then why are you chronically unreliable in having the fruits of your labor up, when every single one of the other webcomics I follow does manage it, day in, and day out?

For the record, the other webcomics I read are as follows:

  • XKCD
  • CandiComics
  • QuestionableContent
  • Sinfest
  • Nodwick/FFN
  • OutThere


An Apple Anecdote

Friday evening, I ordered a MacBook Pro for my wife.

Figuring it was about $1000 anyways, I just tacked on one-day shipping.

Saturday, when I get the notice that my order has shipped, it says that it’s going to my old address across the state from me. About five hours away. So I call Apple, and explain the situation, and they say that I should go to, enter the Order ID number, and shipping zip code, and it’ll let me redirect the shipment, but as the shipment hasn’t actually been processed by FedEx yet, I’ll have to wait to do it on Sunday. Alrighty, I think. Not a problem.

I change it on Sunday.

Monday morning I check the tracking number, and find that it’s still going to my old address across the state.

I call Apple to ask what’s up, and they say that the request went through to FedEx just after 4am Monday morning. To make up for my having this problem, they’re refunding my $27 or whatever that I had paid as a shipping fee.  “Ah well, it’ll come on Tuesday”, I say.

I call FedEx around noon-ish, and they say that they never received any requests from Apple to change the shipping address whatsoever, but it may just need time to propagate through.

Tuesday Morning. Tracking number says it’s still going to my old address.

I call Apple. Speak with a wonderful young lady named Lisa, who puts me on hold while she calls FedEx directly, and then comes back saying that FedEx has assured her the package will be delivered to my present address on Wednesday. And to make up for this, Lisa tells me, Apple is going to send us a free case for the MacBook Pro. Same one-day shipping as the original order. “Great!”, I say.

A couple hours later, when I receive the email confirmation that the case has shipped, where do you suppose the new parcel is getting shipped to?

Yup. You’ve guessed it.

My old address.

UPDATE:  As of Wednesday Morning, the package is -still- at the holding facility across the state.  I just called FedEx to ask WTF, and it seems that the facility got the request to redirect the package yesterday, they just didn’t act on it.  So Marcy (very wonderful woman that she is) called them up (while I was on hold) (yay) and ensured (theoretically) that it would get forwarded today, and arrive tomorrow.


And all this to get a package that was originally shipped from a warehouse in Middletown, PA — a mere 20 minutes down the road from me.