Just A Summary

Piers Cawley Practices Punditry

Just a thought

Posted by Piers Cawley Fri, 26 Sep 2008 04:46:00 GMT

There’s a refactoring principle that states that, when you start doing the same thing for the third time, you should refactor to remove the duplication.

I’m starting to wonder if there’s a Smalltalk principle which states that, when you start doing the same thing the second time, you should search the image for the obviously named method (or use the method finder to find some candidate by feeding it some inputs and an expected answer), because the odds are good that it’s only the second time for you – it might be the millionth time for the image you’re working in.
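The method finder idea can be sketched in Ruby, purely as an illustration. `find_methods` below is a hypothetical helper, not part of Squeak or of Ruby's standard library: it tries every public method defined directly on the receiver's class against a copy of the receiver and keeps the ones that give the expected answer.

```ruby
# Hypothetical helper, loosely modelled on the idea of a method finder:
# feed it a receiver, an expected answer and optional arguments, and it
# returns the names of the methods that produce that answer.
def find_methods(receiver, expected, *args)
  receiver.class.public_instance_methods(false).select do |name|
    arity = receiver.method(name).arity
    # A negative arity means optional arguments: -n - 1 are required.
    fits = arity >= 0 ? args.size == arity : args.size >= -arity - 1
    next false unless fits

    begin
      # Work on a dup so that mutating methods cannot damage the original.
      receiver.dup.public_send(name, *args) == expected
    rescue StandardError
      false # wrong argument types, missing block and so on: not a candidate
    end
  end
end

candidates = find_methods('hello', 'HELLO')
puts candidates.inspect
```

The list for this call should include `:upcase` alongside a few near-synonyms. Blindly calling methods like this is only reasonable for small value objects; anything with side effects would need more care.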

A tiny ruby niggle

Posted by Piers Cawley Sun, 09 Sep 2007 07:51:00 GMT

You know what? I’m starting to miss compulsory semicolons as statement terminators in Ruby.

“What?” I hear you say. “But not needing semicolons is one of Ruby’s cardinal virtues! Are you mad?”

I don’t think so, but maybe you’ll disagree after I explain further.

Here’s a piece of code that I might write if semicolons were the only way of terminating a statement:

Category.should_receive(:find_by_permalink)
  .with('foo')
  .and_return(mock_category);

Or how about a complex find query

def find_tags_for(tag_maker, order = 'count')
  klass = tag_maker.class
  find :all
    , :select => 'tags.*, count(tags.id) count'
    , :group => Tag.sql_grouping
    , :joins => 
        "LEFT JOIN taggings ON "
      + "      tags.id = taggings.tag_id "
      + "LEFT JOIN bookmarks ON "
      + "      bookmarks.id = taggings.taggable_id "
      + "  AND taggings.taggable_type = 'Bookmark' "
      + "LEFT JOIN #{klass.table_name} ON "
      + "      #{klass.table_name}.id = bookmarks.#{klass.to_s.underscore}_id"
    , :conditions => conditions_for(tag_maker)
    , :order => (order == 'count') 
        ? 'count(tags.id) desc, tags.name' 
        : "tags.name"
    , :readonly => true
    ;
end

I first came across the idea of the leading comma in Damian Conway’s excellent Perl Best Practices. The idea is that, by leading with the comma it’s very easy to add a new argument to an argument list or hash specification without having to remember to stick a comma on the end of the preceding line if it was at the end, and also, the leading comma makes it very plain that the line is a continuation of its predecessor in some way.

To make the examples work in Ruby, you have to add a \ to the end of each line that has a continuation, so the first example has to be written:

Category.should_receive(:find_by_permalink) \
  .with('foo')                              \
  .and_return(mock_category);

Lining up the \s helps to stop them disappearing, but it’s an awful faff.
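For anyone who wants to try the escaped form, here is a small runnable sketch. `Expectation` is a made-up stand-in for a mock object so that the chain runs without RSpec; only the line-continuation syntax matters. The second chain uses leading dots with no backslashes, which Ruby 1.9 and later accept for method chains (this applies to the leading `.` only, not to leading commas).

```ruby
# Expectation is a made-up stand-in for a mock object, so the chain can run
# without RSpec; only the line-continuation syntax is the point here.
class Expectation
  attr_reader :calls

  def initialize
    @calls = []
  end

  def should_receive(name)
    @calls << [:should_receive, name]
    self
  end

  def with(*args)
    @calls << [:with, *args]
    self
  end

  def and_return(value)
    @calls << [:and_return, value]
    self
  end
end

# Trailing backslashes: each escaped newline joins the next line on.
escaped = Expectation.new.should_receive(:find_by_permalink) \
  .with('foo')                                               \
  .and_return(:mock_category)

# Leading dots, no backslashes: parsed as one statement on Ruby 1.9 and later.
dotted = Expectation.new.should_receive(:find_by_permalink)
  .with('foo')
  .and_return(:mock_category)

puts escaped.calls == dotted.calls
```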

What tends to happen (in the rails source especially) is that ruby programmers simply don’t break their lines up. A quick search of the rails source finds plenty of lines more than 160 characters long.

Of course, some will argue that it doesn’t matter, that the old 80 column limit is a silly hangover from the days of steam when the only way to interact with your code was through an 80 column, green phosphor terminal. They have a point. An arbitrary line limit is silly, and we should get over it, especially in source code. However, unless you’re going to go around with every window open to its maximum width, lines will wrap, and they won’t do it nicely, or respect the indentation conventions of your language. Long lines are murder in diffs too: finding the point of difference is so much easier when your eye doesn’t have to scan an epic line.

It’s a shame there’s no way of forcing ruby’s parser to require semicolons as statement terminators for those programmers like me who think that the restriction that a statement must end with a semicolon is worth the freedom to break lines where we like without needing to escape every line break. It’s a shame too that popular tools like Textmate are so clumsy when it comes to dealing with line breaks. I would attempt to hold Emacs up as a paragon in this respect, but its Ruby mode tends to get a wee bit bemused once you start breaking lines, so that’s no good.

Domino theory

It’s amazing how far reaching seemingly simple language design decisions can be, isn’t it? Just getting rid of the need to terminate statements with a semicolon has an enormous effect on the way code in ruby looks. I’m just not sure that it looks better.

Maybe Smalltalk got it really right – they chose to use the most valuable syntactic character of all, the space, to denote sending a message. That freed up the . for use as a statement (sentence?) terminator. Then that freed up ; for use in one of Smalltalk’s most distinctive patterns – the cascade. Where a Rails programmer might write:

form_for(@comment) do |f|
  f.input(:author)
  f.input(:title)
  f.input(:body)
end

A Smalltalk programmer might eliminate the need for a temporary variable by doing:

Comment>>printOn: html

  (html formFor: self)
    input: #author;
    input: #title;
    input: #body.

All those input: ... messages get sent to the result of html formFor: self. Once you get the hang of it, it’s a really sweet bit of syntax.
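Ruby has no cascade syntax, but the effect can be approximated. The sketch below uses `instance_eval`, which runs a block with the receiver as `self`, so every bare `input` goes to the same object without a block variable. `FormBuilder` and this `form_for` are made-up stand-ins, not the Rails helpers.

```ruby
# FormBuilder and form_for are made-up stand-ins, not the Rails helpers.
class FormBuilder
  attr_reader :fields

  def initialize(model)
    @model = model
    @fields = []
  end

  def input(name)
    @fields << name
    name
  end
end

def form_for(model)
  FormBuilder.new(model)
end

form = form_for(:comment)

# Inside the block self is the form, so each input goes to the same
# receiver, much as the messages in a cascade do.
form.instance_eval do
  input :author
  input :title
  input :body
end

puts form.fields.inspect # prints [:author, :title, :body]
```

The trade-off is that `self` changes inside the block, so methods and instance variables of the calling object are out of reach there. A cascade has no such cost.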

Incidentally, there’s been some discussion on the squeak mailing lists of a companion to the cascade, which would use a ;; as a sort of ‘pipe’. The idea is to be able to replace code like:

((self collect: [:each | each wordCount]) 
    inject: 0 into: [:total :each | total + each]) 
        printOn: aStream.

with

self collect: [:each | each wordCount]
    ;; inject: 0 into: [:total :each | total + each]
        ;; printOn: aStream.

(NB: Please ignore what those code snippets do, because that’s gruesome. Concentrate on how they do it).

Nobody’s quite proposed going as far as Haskell does with its Monads, which can be thought of as a magical land where the meaning of the semicolon changes according to what sort of Monad you’re in. (In an IO monad for instance, the semicolon imposes an evaluation order. In some other monad, the semicolon could just as easily denote a backtracking point). Then again, there’s nothing to stop the dedicated Smalltalker implementing something Monadish – every Smalltalk class can specify how its methods should be compiled after all…

In conclusion…

I’m not sure I’ve got a real conclusion for all this. I’m mostly musing. However, I do think it’s useful to think carefully about restrictions and what they free us to do as programmers. Lispers will wax lyrical about the way that their language’s pared down syntax lets them do amazing things with macros. Smalltalkers will defend to the death the idea that the only way to do anything is to send messages to objects. Pythonistas love their syntactic whitespace. Haskellers love their static typing (admittedly, they have an incredibly flexible notation for expressing type that leaves most other programming languages standing).

And any English speaker with ears will know that a poem like Dylan Thomas’s Do Not Go Gentle Into That Good Night gains much of its power from its form, the villanelle, one of the most restricted forms of poetry there is. Two lines repeating through the poem and a staggering number of rhymes to find:

Do not go gentle into that good night,
Old age should burn and rave at close of day;
Rage, rage against the dying of the light.

Though wise men at their end know dark is right,
Because their words had forked no lightning they
Do not go gentle into that good night.

Good men, the last wave by, crying how bright
Their frail deeds might have danced in a green bay,
Rage, rage against the dying of the light.

Wild men who caught and sang the sun in flight,
And learn, too late, they grieved it on its way,
Do not go gentle into that good night.

Grave men, near death, who see with blinding sight
Blind eyes could blaze like meteors and be gay,
Rage, rage against the dying of the light.

And you, my father, there on the sad height,
Curse, bless me now with your fierce tears, I pray.
Do not go gentle into that good night.
Rage, rage against the dying of the light.

If that’s not making a virtue of a restriction, I don’t know what is.

Cheat all you want, but don't get caught

Posted by Piers Cawley Wed, 20 Jun 2007 22:16:00 GMT

As far as I can tell, one of the Smalltalk optimizers’ mottoes is “Cheat all you want, but don’t get caught”.

Well, this morning, I caught Squeak with its hand in the till.

One way I attempt to bootstrap myself towards understanding of code is to try and make it better, if that makes sense. So, I’d run SLint over the OmniBrowser package and was trying to shorten a method. One thing that struck me as rather ugly was a piece of code that ran like this:

|selection|
...
selection := OBChoiceRequest prompt: nil labels: usersNames values: users.
selection ifNotNil: [selection browse].

So I thought I’d tweak ifNotNil: so that, if its receiver isn’t nil, it will pass itself as an argument into the block, which will let me rewrite that ugly code with:

(OBChoiceRequest 
    prompt: nil
    labels: usersNames
    values: users) ifNotNil: [:selection| selection browse].

So, I went to have a look at the implementation of ifNotNil: and found that it was already doing exactly what I was after.

At this point, I had a slight premonition of danger, so I brought up a workspace and tried to print the result of running 10 ifNotNil: [:i| i + 10] and got a compiler error, complaining that ifNotNil: takes a 0 argument block. Which isn’t what the implementation of ifNotNil: thinks.

I’d caught Squeak’s optimizer cheating.

What appears to happen is that Squeak catches conditional code and rewrites it before passing it to the compiler. The rewritten code uses VM level primitives where possible. I needed to fix it so that it would only rewrite any calls to ifNotNil: with a zero argument block.
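The run-time behaviour wanted from ifNotNil: can be sketched in Ruby. This is only an analogue under stated assumptions: `if_not_nil` is a hypothetical helper, and it decides when called by inspecting the block's arity, whereas the Squeak fix described here lives in the compiler.

```ruby
# Hypothetical Ruby analogue of ifNotNil:, not Squeak code.
# A zero-argument block is simply run; a one-argument block is handed the
# value; a nil value skips the block altogether.
def if_not_nil(value, &block)
  return nil if value.nil?

  block.arity.zero? ? block.call : block.call(value)
end

with_argument    = if_not_nil(10) { |i| i + 10 }
without_argument = if_not_nil(10) { :no_argument_needed }
skipped          = if_not_nil(nil) { |i| i + 10 }

puts with_argument       # prints 20
puts without_argument    # prints no_argument_needed
puts skipped.inspect     # prints nil
```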

It took a while, but my local image now optimizes ifNotNil: correctly (the ifNotNil:ifNil: and ifNil:ifNotNil: forms are another matter, but I shall live). Now, if I can just work out where to submit the changeset to…

I have mixed feelings about this. On the one hand, I’ve just changed something in the workings of Squeak, on the other, it’s not been quite as easy as I’d expected it to be. It seems that, if you go poking around in methods that are defined on ProtoObject, don’t be surprised if changing things doesn’t quite do what you expect.

Maybe I should have just written an ifNotNilDo: taking a single argument block, but that just felt ugly… Ho hum.

That was fun

Posted by Piers Cawley Wed, 25 Apr 2007 06:57:00 GMT

On Monday I was down in Brighton for a Brighton Coding Dojo where I had a crack at doing Kata four in Smalltalk.

It took a while to find the balance, but once we got going I think it went well.

We got stuck in what seemed like the strangest places though. At one point, I had a method that did almost exactly what I wanted for a new method I was writing, so I called up the method in the browser, changed the selector and the few bits that needed fixing up, and accepted the changes.

Uproar! “Wha? What’ve you done to testGetMnT?”

“It’s still there, look.” I said, pulling testGetMnT up in the browser.

“But…!”

People were impressed by OmniBrowser’s refactoring tools and slightly boggled by the sheer number of instance methods on Object.

Because we dived straight in, people got a wee bit stuck on the syntax as well. Next time I do something like this, I’ll spend more time walking through what’s going on in each line of code, until people get a bit more secure, and I’ll start handing the keyboard off to other pairs way sooner. Once I did that, it became far more apparent which bits were sticking points.

The session certainly confirmed my opinion that you can read all you like about Smalltalk, but you won’t really get it until you see it in motion.

So, once I have some tuits of the appropriate shape, I’m planning on making a longish screencast of me running through Kata four, with commentary. It won’t be an exemplary example of a Smalltalk user getting the very best out of the toolset; I’m very much a beginner myself, but I hope it’ll give you a feel for why you should at least try Smalltalk for yourself. I also hope that any experienced Smalltalkers watching the screencast will be able to give me some tips on better ways of using the tool.

The "Yes is No" Problem

Posted by Piers Cawley Wed, 04 Apr 2007 10:33:00 GMT

login: pdcawley
password: *****
Yes is no and no is yes. 
Do you want to delete all your files? [y]: _

Rereading What’s wrong with Ruby, I realised that the meat of Matthew Huntbach’s worry about dynamic classes boils down to the ‘yes is no’ problem. Given just the source code of a Ruby program, it can be tricky to work out what the eventual program ‘looks like’.

Ruby source can be thought of as a program that you run on the Ruby kernel which creates a bunch of classes, objects and relationships between them. In static languages, once a class has been created, that’s it: the only way to change it is to recompile and restart the program. In dynamic languages, no class is ever really ‘finished’; it can be modified at any time.

Done right, that can be a great thing. It’s instructive to look at the way ActiveRecord’s behaviour gets built up in Ruby. ActiveRecord::Base doesn’t really do all that much, it concerns itself with the nuts and bolts of marshalling data to and from the database and building the basic accessors and support methods needed to do that. Its implementations of basic methods like #create, #update and #save are positively naïve. New behaviours are then layered on by including other modules. Modules like ActiveRecord::Locking::Optimistic, ActiveRecord::Validations, ActiveRecord::Callbacks and others all layer behaviour on top of these methods, usually by renaming the old method, so #update becomes #update_without_lock and ActiveRecord::Locking::Optimistic#update_with_lock becomes the new #update, at least until ActiveRecord::Timestamp changes its name to #update_without_timestamps and installs the behaviour that it needs.
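The renaming dance can be sketched in a few lines. `Record`, `OptimisticLocking` and `Timestamps` below are made-up stand-ins whose method names mirror the ones above; none of this is Rails code, and the real modules do a good deal more.

```ruby
# Made-up stand-ins for ActiveRecord::Base and two of its modules.
class Record
  attr_reader :log

  def initialize
    @log = []
  end

  # The naive base implementation.
  def update
    @log << :write_row
    true
  end
end

module OptimisticLocking
  def self.included(base)
    base.class_eval do
      alias_method :update_without_lock, :update # keep the old behaviour
      alias_method :update, :update_with_lock    # install the new one
    end
  end

  def update_with_lock
    @log << :check_lock_version
    update_without_lock
  end
end

module Timestamps
  def self.included(base)
    base.class_eval do
      alias_method :update_without_timestamps, :update
      alias_method :update, :update_with_timestamps
    end
  end

  def update_with_timestamps
    @log << :touch_updated_at
    update_without_timestamps
  end
end

# The order of these includes decides how the layers nest.
class Record
  include OptimisticLocking
  include Timestamps
end

record = Record.new
record.update
puts record.log.inspect # prints [:touch_updated_at, :check_lock_version, :write_row]
```

Swapping the two `include` lines swaps the first two log entries, which is why load order matters so much.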

It can make the code harder to understand though. Say you’re looking at ActiveRecord::Timestamp#update_with_timestamps and you want to look at how #update_without_timestamps does its thing. If you simply grep for def update in the source code, you’ll find the implementation in ActiveRecord::Base and miss out on the alterations introduced by the locking module. You can’t simply look at the source code, you have to look at the order in which it was executed, and that can be a headache. Once you understand that you need to look at load order, and you find where that is defined, things get much easier, but things can get seriously scary when plugins come into play too.

What’s needed (or at least desirable) is a way of understanding what the state of the program and its classes will be at runtime. It’s easy in static languages, it’ll look like the source code says it looks. It looks like the source code says it will look in Ruby too, but Ruby source code can get a good deal more twisty. Even attr_accessor has the potential to blow minds…

Would you believe it’s a solved problem? It’s been solved for years too. Heck, the solution is older than David Heinemeier Hansson.

Counsel of Perfection

I’m thinking of Smalltalk, but I’m pretty sure that the Lisp Machine folks had similar tools at their disposal. In a Smalltalk image, you can browse the source code at any time, and it always reflects the current specification of the system. In a Smalltalk implementation of ActiveRecord, as soon as you rename #update to #updateWithoutLocking that’s what it’s called. When, later, you’re looking at the definition of #update and you want to know what #updateWithoutLocking looks like, you can browse straight to it. The knowledge that #updateWithoutLocking used to be called #update hasn’t been lost, you’ll find it in your changes, but it’s irrelevant to how the system behaves now.

I’ve been admiring Smalltalk from afar for a long time now, but until recently it wasn’t a language I spent any time programming in. I was more likely to think “How does Smalltalk do that?”, then fire up my local Squeak image and do a little fossicking about until I had a grasp on how things worked. Then I’d close it down and do something similar (or curse because I couldn’t) in Perl or Ruby. You can learn a great deal just doing that; if nothing else I’ve learned that I like Smalltalk’s message selector syntax a great deal.

Then, a few weeks ago, I watched a couple of screencasts of experienced Smalltalk developers doing their thing and… wow. I’ve always known that Smalltalkers rave about the environment, but I’d not really seen it in full flow.

So, I’ve been doing a few Code Katas in Squeak and starting to get a feel for working in the environment. It’s really, really lovely. Watch this space for more ‘coo er gosh’ type posts. Who knows, maybe even an embarrassing screencast – if only so I can get some feedback from experienced Smalltalkers on what I’m doing wrong.

What about Ruby?

Short of someone coming up with a Smalltalk style interactive development environment for Ruby in the next half hour, this isn’t an immediately useful solution to the problem in Ruby.
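Ruby's reflection does cover a small corner of what the image browser offers: a running program can be asked which definition of a method is currently in force, and where it came from. A minimal sketch, with `Account` as a made-up class (`source_location` needs Ruby 1.9 or later):

```ruby
# Account is a made-up class; the second definition stands in for a plugin
# or a later-loaded file reopening it.
class Account
  def update
    :original
  end
end

first_line = Account.instance_method(:update).source_location[1]

class Account
  def update
    :patched
  end
end

current = Account.instance_method(:update)
file, line = current.source_location

puts "Account#update is currently defined at #{file}:#{line}"
puts "defined on: #{current.owner}"
puts "defined directly on Account: #{Account.instance_methods(false).inspect}"
```

This reports where the live definition sits, but unlike a Smalltalk image it says nothing about the definitions it replaced.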

But is it really a problem?

Having taken the time to explain why dynamic modification of classes can be a problem, I should point out that I don’t think it’s really a problem in practice. Sure, you can do Very Bad Things with it, but so what? You can do the same bad things (and more) in Smalltalk, if you couldn’t, it would be (nearly?) impossible to implement the Smalltalk environment in the first place. In the absence of an interactive programming environment, you need to take more care with structuring and organizing your code. Impose some standards on yourself – in Rails for instance, the various ActiveFoo classes are structured similarly: first there’s the ‘active_foo.rb’ file, which is responsible for loading up all the packages that define the class (generally found in the ‘active_foo’ sub directory) and includeing them in ActiveFoo::Base in the correct order. This convention helps comprehension. About the only thing I’d like to see further is some documentation in each module describing the behaviour they expect of any class they’re included in – behaviour like that of ActiveRecord::Validations is just too useful not to be pinched for other classes.

As always with these things, it boils down to the good taste and skill of the programmer. Write code that communicates your intent clearly to your fellow programmers – mere compilation is the least of your worries.

Updates

In the comments, Phil Toland has come up with links to some Smalltalk screencasts and a long presentation from Avi Bryant that includes some interaction with a Squeak image.

NB: For ‘ruby’, you can read ‘pretty much every dynamic OO language ever’. Except Matthew Huntbach didn’t write an article called “What’s wrong with pretty much every dynamic OO language ever”.


