Ruby Substrings and Testing Legacy Code

01 Apr 2008

Recently Josh Cronemeyer and I were working on writing a game in Ruby. Gosu, a 2D game library for Ruby and C++, and Chipmunk, a 2D physics engine, do much of the heavy lifting, so we thought it would be a fun Saturday afternoon project. However, the examples had no tests, so when we changed some things and, of course, they stopped working, we were all sorts of clueless as to why.

Well, wrapping tests around legacy code is not a lot of fun, but it is an effective way of debugging. Here’s how I like to do it:
Step one:
Identify potentially troublesome code (As in “What the hell is that doing?”).
Step two:
Write some tests that verify the functionality of the code.
Step three:
Pull out the offending code into a method or methods.

There’s a fair bit of interplay between steps two and three: testing one piece may require extracting it first.

Step four:
Either you’ve found your problem or you’ve accomplished four things:
1. You really understand the troublesome code.
2. You’ve refactored it into a more readable version.
3. You’ve put some code under test.
4. You at least know where the problem isn’t.

I adopted this strategy after reading Michael Feathers’ “Working Effectively with Legacy Code”, which is good reading.
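
Steps two and three can be sketched roughly as below. This is only an illustration: `initial_of` is a hypothetical helper, not code from the Gosu or Chipmunk examples, and Minitest is assumed as the test framework. The tests simply pin down what the extracted code does today, the kind of test Feathers calls a characterization test.

```ruby
require "minitest/autorun"

# Hypothetical helper pulled out of a larger legacy method (step three).
# The name and behavior are made up for illustration.
def initial_of(name)
  name[0, 1] # start at index 0, take 1 character
end

# Tests that record what the extracted code currently does (step two).
class InitialOfTest < Minitest::Test
  def test_returns_first_character_as_a_string
    assert_equal "h", initial_of("hello")
  end

  def test_returns_empty_string_for_empty_input
    assert_equal "", initial_of("")
  end
end
```

If a test like this fails, you have found your problem; if it passes, you at least know where the problem isn’t.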

Eventually we found the problem to be an erroneous substring.
“hello”[0] doesn’t return “h”
Ex:

irb(main):001:0> "hello"[0]
=> 104

Which is the character code of ‘h’, so we tried:

irb(main):002:0> "hello"[0,1]
=> "h"
irb(main):003:0> "hello"[4,5]
=> "o"

Aha! At first we thought that “string”[x,y] gives you the characters between positions x and y. But no: we got some really strange behavior until we realized that “string”[x,y] gives you y characters starting at position x. “hello”[4,5] gives “o” because if you ask for 5 characters starting at position 4, there’s only one left, which Ruby is only too happy to hand over. “hello”[0,1] is the way to get the first letter, just as “hello”[1,1] is how to get the second letter.
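
Here is a small sketch of the two forms side by side. The variable names are just for illustration. The range form is the one that matches what we first expected, characters between two positions. Worth knowing: from Ruby 1.9 onward “hello”[0] returns “h” rather than a character code, while the two-argument form shown here behaves the same either way.

```ruby
word = "hello"

# [start, length]: up to `length` characters beginning at index `start`
first  = word[0, 1]  # "h"
second = word[1, 1]  # "e"
tail   = word[4, 5]  # "o" (only one character is left from index 4)

# [first..last]: a range picks the characters between two positions, inclusive
middle = word[1..3]  # "ell"

puts first, second, tail, middle
```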

Now one of the things that drew me to Ruby in the first place was its intuitiveness. Often I found that if I didn’t know how something worked, I could just write what I thought would work and it did. But substrings are a rare anomaly – I find I have to look up how they work every time. And I still make mistakes even after I check the Ruby docs. Does anyone know the history behind this weird syntax?