Code is data, and it always has been

Written by Piers Cawley

I’m just back from the first Scotland on Rails conference, and a jolly fine conference it was too. Much kudos is due to Alan, Graeme, Abdel and Paul. It was hard to believe that this was the first conference these guys have run and I think all my fellow delegates hope it won’t be the last. As I said in the Pub on Saturday night, I’d had a talk proposal knocked back and, in that situation, it’s terribly easy to find yourself sitting in a session thinking “Bloody hell, my talk would have been better than this!”, but not at this conference.

A phrase that cropped up a couple of times was the old saw that “Data == Code” - it’s very hard to avoid the idea once you start talking about code generation, Domain Specific Pidgins, parsing… I first came across the idea in The Structure And Interpretation of Computer Programs, where it’s elegantly phrased as “Data is just dumb code, and code is just smart data”. Traditionally, the idea seems to be attributed to John McCarthy, the inventor of Lisp. But it’s older than that. Way older than that. The idea is actually older than Computer Science. It lies at the core of Turing’s original paper *On Computable Numbers, with an Application to the Entscheidungsproblem*, in which he invents computer science on the way to proving that it’s impossible to decide algorithmically whether a given statement of arithmetic is true or false.
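
To make that concrete, here’s a toy Ruby sketch of my own (nothing from SICP or McCarthy, and the `double` method is invented purely for illustration): a string of source code is ordinary data that you can measure and search, right up until you hand it to `eval`, at which point it becomes code.

    # A string of Ruby source is plain data: we can measure it and search it
    # like any other string.
    source = "def double(n); n * 2; end"
    puts source.length
    puts source.include?("def")

    # ...until we hand it to the interpreter, at which point it becomes code.
    eval(source)
    puts double(21)   # => 42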

In the course of the paper, Turing posits what has become known as the Halting Problem:

> Given a description of a program and a finite input, decide whether the program finishes running or will run forever, given that input.

Turing’s proof runs something like this:

Suppose we have a subroutine `halts?(code, data)` which solves the halting problem. Let’s use that to write something like:

    def counter_example(code)
      if halts?(code, code)
        loop {}   # spin forever
      else
        return
      end
    end

    counter_example(STDIN.read)


and ask the question “What happens when we run `ruby counter_example.rb < counter_example.rb`?” If `halts?` reckons that `counter_example` would halt, given itself as input, then `counter_example` will enter an infinite loop, but if `halts?` reckons that it would enter an infinite loop, then it will halt. Which is a contradiction. Which means that there can be no subroutine `halts?`, which means that maths is hard enough to be interesting and occasionally undecidable.

Look at how the proof works - it’s built around the idea that code can be treated as data. In fact, you could say that the Turing Machine looks like it does because Turing was working backwards from this core idea, aiming to describe a sufficiently powerful machine that could obviously treat its own description as data. Certainly, when you compare the clarity of his proof that the halting problem is undecidable (given the idea of the universal Turing machine) with the contortions required to make mathematics swallow its own tail in similar fashion so that Gödel could prove his Incompleteness Theorem, the power of treating code as data is hard to miss.
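
To see how little machinery that framing needs, here’s a toy sketch of my own (not Turing’s actual construction; the `INCREMENT` table and `run` interpreter are invented for illustration). The “machine” is nothing but a hash of transitions, data you could just as easily read off a tape, and a few lines of Ruby animate it. This particular table increments a binary number.

    # A toy Turing machine whose "program" is pure data: a hash mapping
    # [state, symbol] pairs to [symbol_to_write, direction, next_state].
    INCREMENT = {
      [:scan,  "0"] => ["0", :right, :scan],
      [:scan,  "1"] => ["1", :right, :scan],
      [:scan,  "_"] => ["_", :left,  :carry],
      [:carry, "1"] => ["0", :left,  :carry],
      [:carry, "0"] => ["1", :left,  :halt],
      [:carry, "_"] => ["1", :left,  :halt],
    }

    # A few lines of interpreter are enough to animate any such table.
    def run(program, tape)
      state, head = :scan, 0
      until state == :halt
        if head < 0                 # grow the tape leftwards if we fall off the edge
          tape.unshift("_")
          head = 0
        end
        symbol = tape[head] || "_"  # blank cells to the right read as "_"
        write, move, state = program.fetch([state, symbol])
        tape[head] = write
        head += (move == :right ? 1 : -1)
      end
      tape.join.delete("_")
    end

    puts run(INCREMENT, "1011".chars)   # => "1100"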

So, if you want to know who the idea that code is data is due to, the answer (as is so often the case in our field) is Turing.

### Postscript

Incidentally, Turing is also responsible for the first ever bug - his original implementation of a Universal Turing Machine had a couple, one of which is probably a dumb typo (which even I could spot when I read the paper). Another is more subtle, but still fixable. Somewhat delightfully, a young grad student ([Donald W Davies](http://en.wikipedia.org/wiki/Donald_Davies), who invented packet switching) spotted these bugs and told Turing:

> I ... found a number of quite bad programming errors, in effect, in the specification of the machine that he had written down, and I had worked out how to overcome these. I went along to tell him and I was rather cock-a-hoop ... I thought he would say 'Oh fine, I'll send along an addendum.' But in fact he was very annoyed, and pointed out furiously that really it didn't matter, the thing was right in principle, and altogether I found him extremely touchy on this subject.

Nearly fifty years later Davies wrote and published a debugged version of the code, which you can find in [The Essential Turing](amazon:0198250797). One lesson to draw from the above is that getting annoyed at people pointing out trivial bugs in example code is also at least as old as computer science. Rather splendidly, there's also a story of the chap who wrote the first ever assembler getting a serious telling off from Turing because the computer's time was too valuable to waste on converting symbols into machine code when humans were perfectly capable of doing it themselves. Who knows, maybe Turing's contention was actually true back in the days of the Manchester Baby...

    