I hit the Aeron hard and asked my pair Liz “what’s up?” I’d been sick and she had been working on a tricky Datomic datalog query in my absence. She claimed to have solved the whole problem but the tests would not pass when run as a whole vs. when run individually.
A classic problem with a big Clojure twist. I, being no stranger to this terribleness, eventually shook the sick out of my head and asked if she had tried running the singular test file by itself. Good news: it failed too. What? I’ll explain. We use a classic trick here at Backstop when running tests in the Clojure REPL: ‘:reload-all’
Otherwise we’d spend all day waiting for the JVM to start up every time we run our tests. It’s a cool trick but it has its limitations. We were clearly dealing with one of those limitations as the tests would fail once in a fresh REPL but succeed on a second run. Those who’ve already guessed the answer please apply for a job here. For all us other mere mortals who need to think it through I’ll present this example of what we were trying to do and what happened:
But if you asked for time-bomb again:
Everything is, er, fine?!? Except ‘a-vector’ has disappeared and a ‘function’ has returned different things for the same lack of input. So, of course, “concat” is part of the problem here. As Stuart Sierrapointed out in http://stuartsierra.com/2015/04/26/clojure-donts-concat, “concat” is a lazily evaluated join so it can hide many bombs.
If we had instead used “into,” which is not lazily evaluated, all would have blown up in a nice normal manner:
Well, not a great error but at least the line number would have given us a clue as to where the problem was instead of the stack trace starting where ever the “time-bomb” was evaluated.
However, “a-vector” is defined by the time “time-bomb” is evaluated. So why doesn’t the lazy “concat” use the available “a-vector”? “Concat” seems to freeze the temporary nature of clojure.lang.Var$Unbound and not evaluate a later defined “a-vector.” Weird. Is this a bug? Or just a known evaluation order thing? I verified this behavior in 1.5,1.6,1.7, and 1.8 so it seems here to stay.
In case you are wondering, without a “declare” this code would blow up right away on the right line number:
Why did we use declare instead of just order the code so it compiles? As a team, we’ve adopted a code standard where we try to push private functions down to the bottom of a file and “declare” helps us declare things that will be coming to the compiler/jvm/macro-magic-factory. It works really well with “defn” so we applied similar principles to public “def”s (for use outside the namespace) and “def”s we didn’t think should be part of a namespace’s API.
The combination of testing in the REPL, code conventions, the lazy nature of “concat,” and when “def” gets evaluated created a lot of confusion for us. Hopefully, this blog post helps you figure out some similar problems.