Testing your test data

Ever had the situation where a test that has run reliably for a while suddenly starts failing and nobody knows why? You spend hours digging through the code with no luck, then later find out somebody tweaked a fixture or factory for another test and broke yours. Ideally, your team runs the tests before every checkin and a CI server is constantly making sure everything stays green, so these things get caught right away. However, even on projects with the best of intentions, we all know there are times when that just doesn’t happen.

The best way I’ve found to ease this pain is a pair of simple practices: hardcode your assumptions, and test the test data. It only takes a few seconds to add these simple tests, but it’s worth it the first time one of them fails and you can pinpoint the problem immediately. Let me explain what I mean by both of these things.

A quick note about where my test data comes from: I tend to use a mix of fixtures and factories. I’m not sure I can quantify why I use both, or what determines when I reach for one vs. the other. I definitely prefer fixtures for static reference data and for things like standard users or data scenarios I need across a lot of different tests. Factories fit my brain better for customized, dynamic scenarios that only need to be created for small subsections of the tests. Mixing and matching works well for me; YMMV.

Hardcoding assumptions simply means turning your assumptions into assertions. I love RSpec, so that’s what I’ll use to illustrate. If I set up a describe or context block whose text assumes a data condition, then the accompanying before block should ALWAYS assert that condition so you know it’s true. Assume I’m testing a users controller, and different behavior should happen based on whether the current user is an admin. Here’s how that would look:

describe UsersController do
  describe "#index" do
    context "current user is an admin" do
      before do
        # ensure that the user for this test is an admin
        some_user.should be_admin
      end
      it "...."
    end
    context "current user is not an admin" do
      before do
        # ensure that the user for this test is not an admin
        some_other_user.should_not be_admin
      end
      it "...."
    end
  end
end

If you’re not used to seeing describe/context blocks broken out that explicitly, don’t get hung up on it; that’s just my preferred style. The important part is that I have an assumption in my context description that needs to be true in order for the tests to be accurate. However, if I don’t harden that assumption by asserting it, somebody else could change the fixture/factory that user comes from and break my assumption without my knowledge, and it wouldn’t be clear that the test broke because of data rather than code.
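
To make that concrete, here’s a rough sketch of how the admin context might look once the user comes from a real fixture. The users(:bocephus) fixture is the admin user from the fixture specs later in this post; login_as and the index expectation are hypothetical stand-ins for whatever your app actually does:

describe UsersController do
  describe "#index" do
    context "current user is an admin" do
      before do
        # bocephus is the admin user fixture used later in this post
        @some_user = users(:bocephus)
        # harden the assumption stated in the context description
        @some_user.should be_admin
        # sign the user in however your suite does it (login_as is a stand-in helper)
        login_as @some_user
      end

      it "should render the full user list" do
        get :index
        response.should be_success
      end
    end
  end
end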

The second related practice I like is to test the data. I’d love to know how many times some test breaks because somebody added a new validation, and the fixture/factory didn’t get updated to provide that data so any save on that model fails. This seems to have bitten me more with controller tests than anything else, and it was rarely obvious what the real cause of the failure was initially. I learned this trick when I worked at Grockit and it’s part of everything I do now. Create a set of fixture and factory specs that just do a simple validity test on all of the generated test data. In most cases, you are assuming the source of your test data will return valid data to you, so bake that assumption into an assertion as well so you have a hard failure if things break down. I have a spec/fixture_specs directory containing specs for every fixture in spec/fixtures, and the same goes for the factories I create.

Here’s what a typical fixture spec looks like:

require File.expand_path(File.dirname(__FILE__) + '/../spec_helper')
context "All User fixtures" do
  specify "are loaded" do
    User.count.should be > 0
  end
  specify "are valid" do
    User.all.each do |user|
      user.should be_valid
    end
  end
end
context "individual fixtures"
  specify "are loadable by name" do
    [:bocephus, :pharoah].each do |name|
      users(name).should_not be_nil
    end
  end
  it "bocephus should be an admin" do
    users(:bocephus).should be_admin
  end
end

The first two blocks are critical and will catch a number of problems: a migration that tweaked the table and made my fixtures unloadable by the database, a new validation that renders them invalid, etc. Finding the error in this spec is much more direct, and easier to fix, than wondering why a controller’s create action suddenly fails to save the form data. The second two blocks are useful at times, but should only be added when necessary. When I first created this project I had an issue with the fixtures loading at all, so I created the first block to help me get that working. If there are global assumptions about specific fixture users, hardcoding those assumptions here creates a single point of failure that may explain lots of other failures.

My approach to factories is similar, but I’ve found a trick for creating a global factory spec that handles most of this work. Here’s what mine looks like:

require File.expand_path(File.dirname(__FILE__) + '/../spec_helper')
Factory.factories.keys.each do |factory|
  describe "#{factory.to_s.titleize} Factory" do
    describe "default #{factory}" do
      attr_reader :model
      before do
        @model = Factory.build(factory)
      end
      it "should not be nil" do
        model.should_not be_nil
      end
      it "should be valid" do
        model.should be_valid
      end
      it "should be able to save without error" do
        model.save!
      end
    end
  end
end
#
# When you need to test a specific factory, uncomment this and set the appropriate factory.
# Then run a focused spec from the before block.
#
# describe "Focused Factory" do
#   describe "default" do
#     attr_reader :model
#     before do
#       @model = Factory.build(:factory_to_debug)
#     end
#
#     it "should not be nil" do
#       model.should_not be_nil
#     end
#
#     it "should be valid" do
#       model.should be_valid
#     end
#
#     it "should be able to save without error" do
#       model.save!
#     end
#   end
# end

This will actually run a similar spec for every factory you have defined. The downside is that if one of them fails, you can’t easily debug it using this global spec. That’s why there is a specific version commented out at the bottom. If you get a failure, you can comment out the top half, uncomment the bottom half and change the :factory_to_debug to whatever factory has the failure, then switch it back once it is working. Along with this global factory spec, I still have specific ones for factories that need unique assertions just as I do with fixture specs.
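
When a factory needs unique assertions beyond simple validity, it gets its own spec alongside the global one, just like the fixture specs above. Here’s a minimal sketch of what one might look like; the :admin_user factory and its admin flag are hypothetical, just to show the shape:

require File.expand_path(File.dirname(__FILE__) + '/../spec_helper')

# Hypothetical factory-specific spec for assertions the global validity spec can't cover.
describe "Admin User Factory" do
  attr_reader :model

  before do
    # assumes an :admin_user factory is defined
    @model = Factory.build(:admin_user)
  end

  it "should build an admin user" do
    model.should be_admin
  end

  it "should be able to save without error" do
    model.save!
  end
end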

I have seen the benefit of this to the point where it’s a standard part of how I work now. Give it a shot and see if it doesn’t help narrow down your test failures faster.

RSpec pending goodness

I’ve always liked the way RSpec succinctly clarifies the intentions of your tests in such a non-computer fashion that writing tests in fact turns into a requirements-gathering exercise. At RubyConf last week David Chelimsky showed off some new syntax sugar that I absolutely loved. For all I know it could have been in RSpec from the start, but it was new to me and it’s now a regular part of my repertoire. It’s the pending block syntax.

The ability to mark tests as pending has always been a strength of RSpec over any other testing framework I’ve used. It’s easy enough to rename a test method so it doesn’t execute, but before RSpec I’d never worked with a framework that lets you mark a test as pending and then reminds you that you still have work to come back to. Too often renamed tests get forgotten. Here are the many ways to mark tests as pending.

describe "Test some object with pending tests" do
  it "should make sure 2 + 2 = 4" do
    4.should == (2 + 2)
  end
  it "should make sure the second derivative of something calcs fine"
end

That will run with one green test, and one yellow test marked ‘PENDING: Not Yet Implemented’.

(That’s the output you get from TextMate’s most excellent spec runner.) Another way to do the same thing, but add some context, is as follows:

describe "Test some object with pending tests" do
  it "should make sure 2 + 2 = 4" do
    4.should == (2 + 2)
  end
  it "should make sure the second derivative of something calcs fine" do
    pending "This test can't be implemented until derivatives make sense to me"
  end
end

It still marks the spec as pending, but instead of ‘Not Yet Implemented’, it gives the text you specified (in this case: ‘This test can’t be implemented until derivatives make sense to me’ — seemed like a good reason to me).

Both of these forms are very useful for marking entire specs as pending. I tend to use the first form when I’m initially trying to capture all the specs I need to write to implement a piece of functionality. Then I come back and implement each spec and get it passing, having already captured what I think is the entire scope of functionality in the pending specs. I’ve seen people use the second form (the pending keyword) a lot to skip a spec that has suddenly started failing. There is a better way. Instead of marking the whole spec as pending, add a block to the pending keyword and use it to wrap the part of the spec that is misbehaving. You get a double benefit. Take the following test, for instance:

describe "Test some object with pending tests" do
  it "should make sure 2 + 2 = 4" do
    4.should == (2 + 2)
  end
  it "should make sure the second derivative of something calcs fine" do
    pending "This test can't be implemented until derivatives make sense to me"
  end
end

It fails, of course, because 3 does not equal 1 + 1.

Ignoring the trivial nature of the spec, suppose the failing expectation lives in a library, or in some code you depend on somebody else to fix. Take the failing spec and wrap the broken part in a pending block with an appropriate comment, like so:

describe "Test some object with pending tests" do
  it "should make sure 2 + 2 = 4" do
    4.should == (2 + 2)
  end
  it "should follow the rules of basic arithmetic" do
    4.should be < 5
    9.should_not == 76
    pending "need to alter the rules of the universe to make this happen" do
      3.should == (1 + 1)
    end
  end
end

Run that and the suite is green again, with the example reported as pending.

Now, not only does the spec pass, you get a huge benefit. While it appears that the entire test is skipped as pending, it’s not. The expectations outside the pending block are still evaluated, and the spec will fail if any of them are broken. Another unexpected benefit is that the code inside the pending block is actually run on each spec execution. As soon as it passes, you’ll get an error notification telling you that a spec you expected to fail didn’t. In the above example, if I change "3.should == (1 + 1)" to "3.should == (2 + 1)" so the code inside the pending block now passes, the run fails with exactly that notification.
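
Here’s that same spec with only the pending expectation changed:

describe "Test some object with pending tests" do
  it "should make sure 2 + 2 = 4" do
    4.should == (2 + 2)
  end
  it "should follow the rules of basic arithmetic" do
    4.should be < 5
    9.should_not == 76
    pending "need to alter the rules of the universe to make this happen" do
      # this expectation now passes, so RSpec complains that the pending block no longer fails
      3.should == (2 + 1)
    end
  end
end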

Consider it a friendly reminder to come back and clean up your pending specs as soon as they no longer need to be pending. That way you can stay lean and mean and keep those specs green.

If this is the first time you’ve seen RSpec in action, you don’t know what you’re missing. If you like to read up on cool stuff, head on over to the RSpec Homepage. If you’d rather sit back and let some smart person explain it to you and show you how to get it all set up, I’d say this would be a better bet.

Enjoy!