Nomad Adapter for Rails ActiveJob

23 Feb 2021

Migrating a Legacy Application to Docker Part 2

In a previous post, I talked about our team’s long-running import/batch process, which is triggered by customer request. We use Resque to make this import survive deployments of new code: it is easy to tell a Resque worker to finish what it is doing and shut down by sending it the QUIT signal (see the Resque documentation). This all changes with a move to Docker/containers.

Conventional deployment of containers is NOT friendly to the idea of keeping a container around for a few hours while a process finishes. However, we are using HashiCorp’s Nomad to orchestrate our containers and it has “parameterized jobs” which adapt nicely. Furthermore, Rails’ ActiveJob has such a straightforward interface that a queue adapter can be written in under 50 lines:

require 'net/http'
require 'uri'
require 'json'
require 'base64'

module ActiveJob
  module QueueAdapters
    class NomadAdapter
      def enqueue(job)
        serialized_job_json = job.serialize.to_json
        payload = Base64.strict_encode64 serialized_job_json

        uri = URI.parse("#{ENV['NOMAD_API_BASE_URI']}/v1/job/some-parameterized-job/dispatch")
        header = {'Content-Type': 'application/json'}
        body = {
          "Payload": payload,
          "Meta": {
            "some_meta_data": ENV["ENVIRONMENT_META_DATA"]
          }
        }

        http = Net::HTTP.new(uri.host, uri.port)
        http.use_ssl = (uri.scheme == 'https') # required when the Nomad API is served over TLS
        request = Net::HTTP::Post.new(uri.request_uri, header)
        request.body = body.to_json
        http.request(request)
      end

      # Scheduling a job for a later time is not supported by this adapter.
      def enqueue_at(_job, _timestamp)
        raise NotImplementedError, "enqueue_at is not implemented for NomadAdapter"
      end
    end
  end
end

Now we need to switch the long-running process over to the Nomad scheduler. To do that, add self.queue_adapter = :nomad to the job class that inherits from ApplicationJob. Nomad decodes the Base64 payload before handing it to the task, and the ActiveJob framework code takes care of turning the JSON back into job objects. A pretty seamless drop-in replacement.
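To make the encoding steps concrete, here is a minimal sketch of the round trip using only the standard library. The hash is a hypothetical stand-in for what job.serialize returns (a real one carries more keys), and the helper names encode_payload and decode_payload are made up for illustration. As I understand it, Nomad's HTTP API expects the payload Base64 encoded and writes the decoded bytes to the task's payload file.

```ruby
require 'json'
require 'base64'

# Hypothetical helper: what the adapter does before dispatching.
def encode_payload(serialized_job)
  Base64.strict_encode64(serialized_job.to_json)
end

# Hypothetical helper: undo the Base64 step, then parse the JSON.
# The resulting hash is the shape ActiveJob::Base.execute expects.
def decode_payload(payload)
  JSON.parse(Base64.strict_decode64(payload))
end

# Illustrative stand-in for job.serialize; not a complete ActiveJob hash.
example_job = {
  'job_class'  => 'ImportJob',
  'job_id'     => 'example-id',
  'queue_name' => 'default',
  'arguments'  => [42, { 'dry_run' => false }]
}

payload = encode_payload(example_job)
round_tripped = decode_payload(payload)
puts round_tripped == example_job
```

Because the job data is plain JSON by the time it reaches the worker, the job's arguments need to be things ActiveJob can serialize in the first place.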

This code doesn’t implement the ‘run later at a specific time’ functionality of ActiveJob. I’m not actually sure how to do that with Nomad’s abilities but, thankfully, my project doesn’t have a need for that.

I should also note that this is a happy path implementation with no logging. The code above assumes that the Nomad API always accepts the command. A production version would need to handle non-200 responses and should do some logging. Also, if the Nomad instance has some sort of access control (and it should), a token will need to be passed along in the headers for authorization.
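A rough sketch of what that hardening might look like is below. The module name NomadDispatch, its two methods, and the DispatchError class are all hypothetical and not part of the adapter above. The one real detail is the X-Nomad-Token header, which is where Nomad's ACL system looks for a token.

```ruby
require 'net/http'
require 'uri'
require 'json'
require 'logger'

# Hypothetical helpers; a sketch, not the adapter we actually run.
module NomadDispatch
  class DispatchError < StandardError; end

  # Builds the dispatch request and attaches an ACL token when one is given.
  def self.build_request(uri, body, token: nil)
    request = Net::HTTP::Post.new(uri.request_uri, { 'Content-Type' => 'application/json' })
    request['X-Nomad-Token'] = token if token
    request.body = body.to_json
    request
  end

  # Logs the outcome and raises unless Nomad answered with a 2xx status.
  def self.check_response!(response, logger: Logger.new($stdout))
    code = response.code.to_i
    if (200..299).cover?(code)
      logger.info("Nomad accepted the dispatch (HTTP #{code})")
      response
    else
      logger.error("Nomad rejected the dispatch (HTTP #{code}): #{response.body}")
      raise DispatchError, "Nomad returned HTTP #{code}"
    end
  end
end
```

Inside enqueue, the last line would become something like NomadDispatch.check_response!(http.request(request)), with the token read from wherever your secrets live.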

Of course, one still has to write a parameterized job for Nomad. Definitely read the documentation, but here is a simplified version of what we use:

job "some-parameterized-job" {
  type = "batch"
  parameterized {
    payload = "required"
  }
  group "parameterized-worker" {
    restart {
      # Restarting imports without customer permission is bad
      attempts = 0
      mode = "fail"
    }
    task "parameterized-job" {
      driver = "docker"
      template {
        data = <<-EOH
          ENVIRONMENT_META_DATA="This template stanza is pretty handy for setting ENV vars"
          NOMAD_API_BASE_URI="URI of Nomad API"
          EOH
        destination = "secrets/file.env"
        env = true
      }
      config {
        image = "a-docker-image"
      }
      dispatch_payload {
        file = "import-options.json"
      }
    }
  }
}

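The image that runs the job needs an entrypoint (or cmd) that reads the payload file and hands the parsed JSON to ActiveJob. Nomad writes the dispatch payload relative to the task's local directory, so the path may need adjusting for your image:

bundle exec rails runner "ActiveJob::Base.execute JSON.load(File.read('import-options.json'))"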
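If the one-liner gets awkward, the same step can live in a small script passed to rails runner. This is a sketch under a few assumptions: the helper name load_dispatched_job is made up, the fallback directory 'local' is a guess for running outside Nomad, and the file name matches the dispatch_payload stanza above. NOMAD_TASK_DIR is the environment variable Nomad sets to the task's local directory.

```ruby
require 'json'

# Hypothetical helper: find the dispatch payload and parse it.
# env is injectable so the lookup can be exercised outside of Nomad.
def load_dispatched_job(file_name = 'import-options.json', env: ENV)
  task_dir = env.fetch('NOMAD_TASK_DIR', 'local') # fallback is an assumption
  path = File.join(task_dir, file_name)
  raise ArgumentError, "payload file not found: #{path}" unless File.file?(path)

  JSON.parse(File.read(path))
end

# Inside `rails runner`, the parsed hash would be handed to ActiveJob:
#   ActiveJob::Base.execute(load_dispatched_job)
```

Failing loudly when the file is missing seems preferable to letting a JSON parse error hide the real problem.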

I honestly thought swapping out Resque for Nomad would be much harder. For once, implementation was easier than predicted.