Time to Bring Back Fixtures
Why I decided to give them a try again
Rail Fastening by Frank Vincentz is licensed under CC BY-SA 3.0
Rails fixtures long ago got relegated to the “bad idea” heap for a number of valid reasons, mainly the difficulty in managing the ids of records and any association foreign keys. Also, early Rails tests, before any factory libraries came along, often had many fixtures, leading to brittle and confusing tests. Many of these problems were fixed with the “foxy fixtures” update to Rails many years ago, but the damage was done and fixtures never really recovered from their early form.
The alternatives, specifically factories and factory_girl, grew very quickly and are now the de facto way to manage test data. Unfortunately, because factories have to create data for every test, and because it is so easy to set up very deep object associations under a single factory invocation, many apps are now in the situation where the tests create so much data that they are unbearably slow. Not only are factories entering rows into the database but they’re also allocating Ruby objects, running ActiveRecord callbacks and validations (often across associations), and then garbage collecting all of these objects. For every test. This stack has proven difficult to optimize or improve.
I’ve long been sick of the explosion of factories and database activity in Rails tests, so for a new project I decided to try fixtures again. It has been a fantastic success. For those who aren’t aware of the benefits, fixtures are database records that Rails loads at the beginning of a test run. Then, each individual test is run under a database transaction which gets rolled back at the end, automatically undoing all changes. Once the fixture records are stored there is zero database setup cost per test, and doing nothing is always significantly faster than doing something!
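To make that lifecycle concrete, here is a toy sketch in plain Ruby. It is not how Rails implements fixtures: ToyDatabase and transaction_with_rollback are hypothetical stand-ins for a real database and a rolled-back transaction, used only to show that the "fixture" rows are inserted once while each test's own changes vanish afterwards.

```ruby
# Toy stand-in for a database: a single "table" held in an Array.
# ToyDatabase is hypothetical and exists only for this illustration.
class ToyDatabase
  attr_reader :rows

  def initialize
    @rows = []
  end

  def insert(row)
    @rows << row
  end

  # Run the block, then throw away whatever it changed,
  # the way a rolled-back transaction would.
  def transaction_with_rollback
    snapshot = @rows.dup
    yield
  ensure
    @rows = snapshot
  end
end

db = ToyDatabase.new

# "Fixtures": inserted once, before any test runs.
db.insert(name: "basics")
db.insert(name: "advanced")

# Three "tests", each adding one record of its own.
counts_seen_by_tests = []
3.times do |i|
  db.transaction_with_rollback do
    db.insert(name: "temp-#{i}")
    counts_seen_by_tests << db.rows.size
  end
end

puts "rows seen inside each test: #{counts_seen_by_tests.inspect}"
puts "rows left after all tests:  #{db.rows.size}"
```

Every test sees the two shared rows plus its own record, and only the two shared rows remain afterwards, so there is nothing to set up again for the next test.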
But what are claims without data? I decided to get some comparative data to show the real benefits of fixtures over factory usage. The app in question is still very small, making this a perfect test bed. There are four total tables and only ever one level of required association. Where “=>” means “requires”, the data model looks like the following:
Program
User => Program
Drill => Program
DrillResult => Drill
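A model this small translates into a handful of fixture rows. The sketch below is only an illustration: the labels (basics, jason, warmup, first_try) and the attribute names are hypothetical, and the fixture files are inlined as strings instead of living under test/fixtures. It shows the idea behind label-based references, where each row names its parent by label and ids are derived from the label. The CRC32-based hashing is an assumption about how Rails derives integer ids, not code taken from the app.

```ruby
require "yaml"
require "zlib"

# Hypothetical fixture files, one per table, inlined as YAML strings.
FIXTURES = {
  "programs" => <<~YAML,
    basics:
      name: Basics
  YAML
  "users" => <<~YAML,
    jason:
      name: Jason
      program: basics
  YAML
  "drills" => <<~YAML,
    warmup:
      name: Warm-up
      program: basics
  YAML
  "drill_results" => <<~YAML
    first_try:
      score: 7
      drill: warmup
  YAML
}.transform_values { |text| YAML.safe_load(text) }

# Which table each association key points at.
ASSOCIATIONS = { "program" => "programs", "drill" => "drills" }.freeze

# Assumption: a stable integer id derived from the label (CRC32 mod 2**30 - 1).
MAX_ID = 2**30 - 1

def label_id(label)
  Zlib.crc32(label.to_s) % MAX_ID
end

# Turn labelled records into rows with concrete ids and foreign keys.
rows = FIXTURES.flat_map do |table, records|
  records.map do |label, attrs|
    row = { "table" => table, "id" => label_id(label) }
    attrs.each do |key, value|
      if (target = ASSOCIATIONS[key])
        raise KeyError, "unknown #{key} label: #{value}" unless FIXTURES[target].key?(value)
        row["#{key}_id"] = label_id(value)
      else
        row[key] = value
      end
    end
    row
  end
end

rows.each { |row| puts row.inspect }
```

Because an id depends only on the label, a child row can compute its foreign key without the parent row having been inserted first, which is what removes the manual id bookkeeping.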
Our current fixtured test suite is as follows. There are two suites we run: the unit tests, then the acceptance (Capybara + RackTest) tests. The two test suites run individually, loading Rails twice, thus pushing the total run time up a few seconds.
] time rake
Run options: --seed 50585
# Running:
....................................................................................................................................................
Fabulous run in 1.924801s, 76.8911 runs/s, 140.2742 assertions/s.
148 runs, 270 assertions, 0 failures, 0 errors, 0 skips
Run options: --seed 48865
# Running:
.........................................
Fabulous run in 4.681990s, 8.7570 runs/s, 29.2611 assertions/s.
41 runs, 137 assertions, 0 failures, 0 errors, 0 skips
real 0m16.684s
user 0m13.609s
sys 0m2.444s
I then took our test suite and converted it entirely to factories and compared the test run time.
] time rake
Run options: --seed 32619
# Running:
......................................................................................................................................................
Fabulous run in 3.135450s, 47.8400 runs/s, 87.3878 assertions/s.
150 runs, 274 assertions, 0 failures, 0 errors, 0 skips
Run options: --seed 4724
# Running:
.........................................
Fabulous run in 5.148810s, 7.9630 runs/s, 26.6081 assertions/s.
41 runs, 137 assertions, 0 failures, 0 errors, 0 skips
real 0m19.090s
user 0m15.211s
sys 0m2.615s
Over multiple runs I averaged a 10% slowdown with factories. In a test suite where no test builds more than ten total records, this was eye opening. I’ve worked on projects where a single create(:user) ended up creating up to fifteen(!) records in the database. For every test! If just four factories and a few associations slows a test suite down this much, imagine how much time is spent in bigger applications just in data setup.
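For the single pair of runs shown above, the difference can be worked out directly from the reported times. The short Ruby sketch below only does that arithmetic; the hash keys are labels chosen here, and the result describes this one pair of runs, not the multi-run average. Note also that the factory run executed 150 unit tests against 148 for the fixture run.

```ruby
# Times in seconds, copied from the two `time rake` runs above.
fixtures  = { unit: 1.924801, acceptance: 4.681990, real: 16.684 }
factories = { unit: 3.135450, acceptance: 5.148810, real: 19.090 }

# Percentage by which the factory run is slower than the fixture run.
slowdown = fixtures.to_h do |key, base|
  [key, ((factories[key] / base - 1) * 100).round(1)]
end

slowdown.each do |key, pct|
  puts format("%-10s %5.1f%% slower with factories", key, pct)
end
```

For this pair of runs the unit suite is roughly 63% slower with factories, the acceptance suite roughly 10%, and total wall-clock time roughly 14%. The wall-clock figure is diluted by loading Rails twice, which costs the same either way.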
So can you start using fixtures in a suite already full of factories? Sure! We’ve started experimenting with exactly that and have already shaved many minutes off of a rather slow suite (45+ minutes). If you’re wondering how to use fixtures with Capybara / Selenium-webdriver tests (where DatabaseCleaner and truncation are the norm), we’ve found this super handy snippet floating around the Internet that forces all threads in the stack to share the same database connection:
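# Make every thread share the main thread's database connection, so the
# browser-driven app server sees the data inside the test's open transaction.
# Relies on a private ConnectionPool method, so it may not exist in every Rails version.
ActiveRecord::ConnectionAdapters::ConnectionPool.class_eval do
  def current_connection_id
    Thread.main.object_id
  end
end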
Caveat!
Of course, as with any recommendation, it’s important to use fixtures correctly and in moderation. I do not recommend that every factory be replaced with a fixture, as you will quickly end up with a test suite that’s difficult to work with and to reason about. Factories, especially with factory_girl, are fantastic for creating valid one-off records for an individual test and we still make heavy use of them. My current guidelines for when to use factories and when to move to fixtures are as follows:
- Start with factories.
- As you notice objects created for a majority of tests, move these into fixtures.
- Keep fixtures to as small a data subset as possible.
The one major downside of fixtures is the disconnect of the tests from the data they use, though this is often a problem with factories as well. Also, be aware that while you can specify fixtures for individual tests (fixtures :users, :drills), you may find yourself creating tests that pass individually but fail in the full suite run. This is because fixtures are loaded before any test suite runs, so Rails pulls together and combines every fixtures call to know what needs to be in the database for all tests. If your test is expecting a table to be empty, it may not be anymore.
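One way to sidestep the empty-table assumption is to assert on how much a count changes, not on its absolute value. The plain-Ruby sketch below uses an Array as a hypothetical stand-in for a table that already holds fixture rows; add_drill is likewise a made-up piece of code under test. In a real Rails test, assert_difference expresses the same idea.

```ruby
# Stand-in for a table that already holds fixture rows when the test starts.
drills = [{ name: "warmup" }, { name: "sprints" }]

# Hypothetical code under test: adds exactly one drill.
add_drill = ->(table, name) { table << { name: name } }

before = drills.size
add_drill.call(drills, "cooldown")

# Brittle: only true if the table started out empty.
absolute_check = drills.size == 1
# Robust: true no matter how many fixture rows were already there.
delta_check = drills.size - before == 1

puts "absolute count check passes: #{absolute_check}"
puts "delta check passes:          #{delta_check}"
```

The absolute check fails as soon as any fixture rows exist, while the delta check keeps passing whether the test runs alone or as part of the full suite.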
Fixtures are no longer a feature to be feared or ostracized. They are fantastic for keeping your tests fast, though like everything (including factories) please use them carefully. If you’ve got a long-running test suite that you’re trying to speed up, give fixtures a try.
Comments
Thanks for sharing a good experience on using fixtures. Could you share how you set up a project that uses both fixtures and factories? And how do you reference them both in the same spec/test?
I agree with your sentiment, I went fixtures -> factories -> back to fixtures. Now I just see them as the better way to handle Rails test data. They have proved to be more maintainable than factories.
Using the thread connection sharing was not successful for us.
We randomly ran into deadlock situations (using postgres/rails 4).
Unfortunately, fixtures need transactions, so they are not useful for us in acceptance tests :(
@Samnang: In our experience there’s no special set up required to use both at the same time. Fixtures and (in our case) FactoryGirl have non-conflicting APIs so it’s simple to use both in a test.
@Stefan: Sorry that didn’t work out for you. I haven’t myself done extensive testing with this snippet though if I run into similar situations and I find a solution I’ll be sure to post about it.
Great article Jason. I would throw out that I think the speed issues here aren’t so much caused by factories, as it is a problem with the way specs (at least RSpec) work. Every it/test reruns the entire setup! This is largely because rspec missed the ‘When/Act’ concept in their DSL. If it had that, then you could assume your Before/Setup/Arrange/Given and When/Act steps would just run once for all of the relevant specs. Obviously, this requires that your assertions not mutate state, but assertions shouldn’t have been doing that anyway.
Rspec-Given does this for the rspec world, and I highly recommend it. It won’t be quite as fast as 1 set of factories for an entire suite, but that (sensibly) wasn’t what you were recommending either.
Lastly, seeds.rb is just a ruby file. You could use FactoryGirl to define your seeds as well. If you like the object mother pattern, FactoryGirl’s traits can be used to succinctly sculpt detailed object graphs.
Great, data-driven post, Jason.
Hi Jason,
We had similar thoughts and a different experience. We have ~4700 tests, many of those are acceptance or model and hit the db. There are certain things we do in just about every acceptance test (create a user, create an organization, etc) so we extracted a few of those common things into fixtures and replaced the factory calls with appropriate finds.
We found that individually, tests sped up. When run on the local dev machine, we gained about 10 seconds off of a 12 minute run, though that was inconsistent.
When running on our build agents, which run the tests in parallel in 11 processes at once, we found that our test run INCREASED from 3:30 to ~4 minutes.
Each case is different, but I suspect that there is enough overhead in the db find and perhaps a difference in overhead of transactions when there is data in the db, that, at least in our case, this did not work out for the better. If you’re creating A LOT of data in each test I can imagine it would be a win, but in our case, where it was maybe 3-4 records we were able to fixturize, it was no benefit.
I use fixtures in complex apps because it helps to see one-day-of-your-app with data, relations, examples. When you know this one-day it’s easy to test: you already know what to expect.
How does your test suite speed compare when running a single spec or a single spec file? This speed is significantly more important in a TDD workflow.
I prefer factories, but I think that no matter your preference, you need to wield your chosen tool with care. A user factory that creates 15 associated records by default is no more inherent to factories than a giant fixture file is inherent to fixtures.
I use a hybrid solution where I create my fixtures from factories. I define the fixtures in a file with factories, then fill up the DB before testing with a rake task.
I kind of get the benefit of both worlds.
Mysql2::Error: This connection is in use by: #<Thread:0x00000110a48dd0
Any idea how to deal with this one?
This happens in a capybara ajax test. Obviously an ajax call makes a separate request, but we have only one busy connection in the pool due to the ConnectionPool monkey patch.
How do your Capybara tests work on a page which sends many ajax requests to load? This is a typical situation: you go to a page and a few requests fetch some data from the server through an API to load the entire page.
It is only one ajax call. The test first loads the page, which is a search interface, and then submits the search request via ajax, which refreshes the results area. This test works with database cleaner; however, with this ActiveRecord monkey patch + transactional fixtures it fails. I was wondering whether the author had some experience with ajax tests and this setup.
Thanks, Jason, for pointing that out. We are using fixtures AND factories in our projects as well (for a couple of years now), and it’s a perfect mix regarding testing speed and ease-of-use. Highly recommended!
Fixtures may be faster to run, but in the context of a large app they are much harder to maintain. I’m really flummoxed how anyone considers it feasible to use a single database state that all tests can be run against.
@max I tried to make it very clear that I do not recommend using only fixtures and I’m especially not recommending putting all of your test data in fixtures. We’ve all been there and it doesn’t work.
Instead, as per my Caveats section, almost every application sets up the exact same initial data for a vast majority of tests. You can move small subsets of data into fixtures for a significant speed increase. Fixtures and Factories are not mutually exclusive. Use both where they fit.