Why not?

I like to ask researchers whether they think the release of code should be encouraged, if not required, for technical papers. My argument (stolen from somewhere) is, “would you allow someone to publish an analysis of Hamlet but not allow anyone to see Hamlet itself?” The main argument for publishing the code (beyond helping the world as a whole) is that people can check your work, which I hear is a part of this science stuff in “computer science.”
       
Often they’re against it. The two reasons I hear are “my code sucks” and “we’ve patented the technique.” I can also imagine, “I don’t want those commercial fatcats stealing my code,” to which I say, “put some ridiculous license on it, then.” If the reason is, “I want to publish to enhance my resume and reputation, but I also want to keep it all secret because I’m going to make money off it,” then choose A or B; you can’t have both (or shouldn’t, in my Utopian fantasy world).

Don’t worry about code quality. I love the line “there are codebases that suck, and there are codebases that aren’t used.” This quote was by a lead programmer on one of the best-selling videogame development platforms, Unity3D; he got it from someone else. Show us the code, we won’t laugh (much). It doesn’t have to be easy to build. For example, MeshLab, for me at least, is about impossible to build, and has (or had – they’ve improved considerably over the years) some horrific bugs, but I still appreciate that the code is available to look at. I also use the program a lot; I just reached my hundredth use of it this week.
       
It takes a few minutes to slap your source files onto GitHub and costs nothing. If you’re worried about code quality, don’t – you’re in good company: about 90% of all code on GitHub is crap (Sturgeon’s Law), including my own (the executable of which gets like 15,000 downloads a month). Notch’s $2.5 billion code for Minecraft sucks. Let it go.
      
Patents: I admit to not liking most software patents, perhaps all. But that’s irrelevant, or should be. If you’re embarrassed to admit you have a patent on some algorithm, that shouldn’t stand in the way of others understanding your research – deal with your shame. The point of a patent is that you are revealing the process. In return your idea is protected for a number of years. This is as opposed to a trade secret, where the process is kept quiet. A patent stops others from using your idea without paying you a licensing fee. However, your part of the bargain is to explain the idea. A trade secret risks someone reverse engineering your clever idea, for which you have little protection. Obvious, but people seem to forget that.
      
I expect these arguments are entirely convincing and code publication still won’t happen, due to pride and lawyers. No one likes to show off their dirty laundry. And lawyers will see no benefit to revealing code: “What’s this ‘research’ stuff you’re talking about? We’re making I.P. here, not research. Releasing code will increase the risk of undetected infringement by others of our I.P., or, worse yet, we might be found to be infringing on someone else’s algorithm patent.”
      
Ah well, I tried. Now get off my lawn.

One thought on “Why not?”

  1. Chad Capeland

    I think the conferences and publishers need to require the code to be published. If SIGGRAPH required it as part of the guidelines for submissions, they’d still get a ton. It would just become a standard practice.

    Also note that if the research is done under a government grant, the code is usually required to become public. Government funding isn’t as prevalent in computer graphics, but where it is used, the standard has already been set.
