Reusable Cucumber Steps
Mike Swieton recently posted Never say “Click”, advocating the use of custom steps over browser-centric steps such as When I press "submit". I know and respect Mike and a lot of the Atomic Object team; they’re a bright bunch. I appreciate where he’s coming from; however, if you follow his suggestions you are going to create many unnecessary custom steps. And I firmly disagree with that.
You waste time debugging custom steps
Building reusable steps is one of the rare ideological goals that actually works very well in practice. Let’s split up Cucumber’s architecture into three parts. You have the code under test, the cucumber scenario, and your cucumber steps. When you have a failure, it can crop up at either the scenario level or the step level. Debugging steps can be a pain in the ass because a scenario can use a hodge-podge of steps that are littered across multiple files AND require the context of the steps executed before them.
Some code to illustrate my point
Let’s revive our scenario from our previous post.
Given a product named "Some Product"
When I add "Some Product" to my favorites
Then I should see "Some Product" in my favorites
Sure, when I read this scenario, it makes sense at a high level. But if I have a problem, I’m going to have an assertion error buried somewhere in the steps rather than at the top-level scenario. If the error were at the scenario level, it would be apparent that I should start investigating my code under test. Compare that with the same scenario written with small, reusable steps:
Given the following product exists:
| Name |
| Some Product |
When I go to the home page
And I follow "Some Product"
And I press "Add to Favorites"
And I follow "Favorites"
Then I should see "Some Product"
Now since all steps are small and reusable, the burden of the precondition and assertion falls on the scenario itself.
What do I mean by "reusable steps"?
Are your steps only being used 3 or 4 times? Are they only being used from a single scenario? If you answer “yes” to a large number of your steps, you may be able to do better.
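One rough way to answer those questions is to count how many scenario lines each step definition matches. The following is only a toy sketch: it assumes you have already collected your step patterns as Ruby regexes and your feature lines as strings. `STEP_PREFIX`, `usage_counts`, and the sample data are hypothetical names for illustration, not part of Cucumber.

```ruby
# Toy sketch: count how many scenario lines each step pattern matches.
# STEP_PREFIX and usage_counts are hypothetical names, not Cucumber API.
STEP_PREFIX = /\A\s*(?:Given|When|Then|And|But)\s+/

def usage_counts(patterns, feature_lines)
  # Keep only step lines and strip the leading Gherkin keyword.
  step_texts = feature_lines.grep(STEP_PREFIX).map { |line| line.sub(STEP_PREFIX, "").strip }
  # Pair each pattern's source with the number of step lines it matches.
  patterns.map { |pattern| [pattern.source, step_texts.count { |text| pattern.match?(text) }] }.to_h
end

# Sample data (assumed for illustration only).
patterns = [
  /^I follow "([^"]*)"$/,
  /^I press "([^"]*)"$/,
  /^I add "([^"]*)" to my favorites$/
]
feature_lines = [
  'When I follow "Some Product"',
  'And I press "Add to Favorites"',
  'And I follow "Favorites"',
  'When I add "Some Product" to my favorites'
]

usage_counts(patterns, feature_lines).each do |source, uses|
  puts "#{uses}\t#{source}"
end
```

With this sample data the generic follow step is counted twice, while the custom favorites step is used once. On a real project, steps that sit at one or two uses are the candidates to look at.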
Here’s another test. Compare the lines of code (LOC) ratio between your features and your steps using these two commands:
# Print the total lines of code in your features
find features -iname "*.feature" | xargs egrep -i "(Given|When|Then|And)" | wc -l
# Print the total number of steps you have defined
find features/step_definitions -iname "*.rb" | xargs egrep -i "(Given|When|Then)" | wc -l
Here are the ratios on one of our client projects that I feel does a good job of reusing steps: 2045 feature LOC and 176 step LOC, or 11.6:1. Let’s check out a couple of other open source projects: rspec has a ratio of 18.2:1 and vcr has a ratio of 31.4:1.
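The same comparison can be sketched in Ruby if you would rather get the ratio in one go. This is a minimal sketch, not a replacement for the commands above: it only counts lines that start with a step keyword, so it is stricter than the greps and the numbers may differ slightly. `FEATURE_STEP`, `STEP_DEFINITION`, `count_matching`, `lines_under`, and `step_ratio` are made-up names, and the `features/` layout is assumed.

```ruby
# Lines in a .feature file that start with a Gherkin step keyword.
FEATURE_STEP = /\A\s*(?:Given|When|Then|And|But)\b/
# Lines in a step definition file that open a step definition.
STEP_DEFINITION = /\A\s*(?:Given|When|Then)\b/

# Count the lines that match the given pattern.
def count_matching(lines, pattern)
  lines.count { |line| line =~ pattern }
end

# Read every line of every file matching the glob (empty if nothing matches).
def lines_under(glob)
  Dir.glob(glob).flat_map { |path| File.readlines(path) }
end

# Feature-steps-to-step-definitions ratio, rounded to one decimal place.
# Returns nil when there are no step definitions to divide by.
def step_ratio(feature_steps, step_definitions)
  return nil if step_definitions.zero?
  (feature_steps.to_f / step_definitions).round(1)
end

feature_steps    = count_matching(lines_under("features/**/*.feature"), FEATURE_STEP)
step_definitions = count_matching(lines_under("features/step_definitions/**/*.rb"), STEP_DEFINITION)
ratio = step_ratio(feature_steps, step_definitions)

puts "feature steps:    #{feature_steps}"
puts "step definitions: #{step_definitions}"
puts(ratio ? "ratio: #{ratio}:1" : "ratio: n/a (no step definitions found)")
```

Plugging in the client project numbers from above, `step_ratio(2045, 176)` gives 11.6.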
Reusable steps are the true benefit of Cucumber
One argument I hear from many people new to cucumber is that it adds a lot more overhead without much benefit compared to traditional assertion frameworks like test/unit and rspec. You know what, when you approach a 1:1 feature line-to-step ratio, I completely agree with them! That doesn’t mean cucumber is bad, it means that often our first inclination is to write custom steps instead of finding ways to refactor the ones we have. Cucumber’s primary benefit is building a comprehensive test suite from reusable steps.
So in conclusion, my own code and the projects I have been part of have continued to validate that reusable steps are a big win. When testing Rails, stick with cucumber’s web steps, and “click” the crap out of it.
Update
I can understand how this post can be construed as an argument against the plain-text readability of cucumber features. I don’t feel I’m arguing against that, I don’t think they are mutually exclusive goals. I do appreciate readable features, and I have recently come across relish and think the principle behind it is pretty slick. Here are the docs to both rspec and vcr, the two libraries I referenced above.
I know that some developers use Cucumber as a plain-text communication bridge between developers and non-developers, but in my experience and attempts that just has not stuck. Plain-text or not, the format of Given/When/Then still feels structured enough to give off that code smell. I still feel Cucumber is a valuable testing tool, and we use it on our own projects constantly, even if those features are never shown to non-developers. There is the case for readability among developers, and regarding relish above I think that has merit. However, adding one-use custom steps in order to convey intent to another developer misses the point, which is the purpose of this article.
Comments
Hey Zach,
Great thoughts!
I find it interesting that everyone in this discussion advocates reusable steps - but we don’t all agree about what constitutes “reusable”.
The feature to step LOC number is an interesting metric. I’m not convinced that it’s a good metric for test quality (at least not in isolation).
A ratio like ~10:1 for unit test code using rspec would probably be appropriate because, following the principle of single responsibility, the classes under test should mostly all be different from each other. However, in a system-level test from a user’s perspective, it seems like there would be a lot more cross-cutting concerns (navigation, logon, creation of entities which are inputs to reports or dashboards, common widgets).
My gut feeling is that the level of commonality is such that a 10:1 ratio may be ok, but I can’t help but feel that if you have a 30:1 ratio you have probably repeated yourself a lot in your tests.
I think your comment about where errors show up (in the feature versus the step) is a really good point, but I think it’s a less important tradeoff than allowing tests to be updated quickly when the app changes (as is given when you’ve got many reusable steps).
Thanks for the great feedback. Any thoughts?
Have you seen this presentation[1]?
If you seriously think re-usable steps are the “true” benefit of Cucumber I’m afraid that, in my humble opinion, you’ve missed the point.
[1] http://skillsmatter.com/podcast/agile-testing/refuctoring-your-cukes
@Mike Swieton: Hey Mike, thanks for the feedback. You have a good point on the metric ratios. I do feel metrics are a useful tool for insight but shouldn’t be the end-all be-all. Regarding “repeating yourself” in cucumber features: possibly. It depends on the feature AND how you define the steps. Some steps can be very versatile when applied to a scenario, and the features don’t feel like they repeat each other; they feel like they flow naturally.
Regarding where errors show up: when I encounter a broken scenario, my first thought is “Ok, what’s going on here”, and starting at the scenario level takes fewer brain cycles to get to the root of the issue than starting within a step definition. That’s coming from my experience, so others’ may differ.
@Matt Wynne: Hey Matt, that was actually our stance early on; sadly, in practice it hasn’t held up. Clients don’t like to read, let alone write, features. We have had success using our features with a client to convey what we’re talking about in order to help the conversation, but those clients are few and far between.
I understand that non-developer readability is a goal, but it just hasn’t happened for us despite our attempts.
The benefit of Cucumber is being able to express features in plain English, not the ability to re-use steps. You can re-use steps in any testing framework. If you take away that layer of English, Cucumber is really no different than just capybara/webrat with rspec or test::unit.
I completely agree with Zach as far as the value of reusable steps. In my experience, custom steps can easily become too complex and hide how an interface actually works. My general rule is to write my cukes (especially Whens and Thens) as if I’m explaining the feature to my mom over the phone.
…and depending on the interface, sometimes that takes lots of steps!
I’m just getting into Cucumber, so bear with me.
My newbie perspective is that Cucumber is directed at high level testing. The goal of the steps is to create a lexicon or vocabulary that a client, user, or QA team member can write tests specifying intent rather than specific actions. These should be at a fairly high level of abstraction and the implementation details are developed by the software developer.
Example:
Step #1 (high level - implemented for client, users, QA)
Given I am on the login page
When I enter "Bob" credentials
And submit the login form
Then I see my personal homepage
Step #2 (define steps - implemented by developers)
…
When /^I enter "([^"]*)" credentials$/ do |username|
# ruby code baby, let's apply our DRY principles here!
# look up user by username
# set username
# set password
# handle multi factor auth
end
In my example I am thinking the client, user, QA, and developers all have a pow-wow writing up a bunch of features. Then the developers return to their stand-up desks and implement the steps which were defined at a high level. There is a certain risk here that the developers did not take decent enough notes and their implementations of the steps do not adequately define the behavior the client/user intended. However, when multi-factor auth gets implemented, the credentials step only needs to change once for all of the high-level scenarios which use it, rather than in every scenario which spells out each action explicitly.
So, sounds like I am taking Mike’s side this time. But wait! There’s more!
Perhaps there is a place for both methods. When I am interested in login behavior, I write a scenario which fills in each field of the login form in detail and checks for minutiae, such as proper error messages and other behavior. But I also then put together another step which does the login process as a black box, which gets used 99% of the time when I’m interested in navigating to a feature which I must be logged in to test.
To maintain an article/comment timeline for new readers, I’ve posted an update to the article regarding my thoughts on the readability of plain-text features.
“In my experience, custom steps can easily become too complex”
So can your Ruby code, unless you are disciplined about design and refactoring. When I say what I’m about to say, I’m not saying “you”, and I’m also not saying “this doesn’t happen to me,” but if your custom steps become too complex then you’re doing it wrong.
Either way, there is duplication to manage. With declarative steps the duplication is in the step definitions (i.e. in Ruby), whereas imperative steps put it directly in the scenarios (i.e. in Gherkin). In my experience, Ruby is easier to refactor than Gherkin.
Sometimes it feels right to abstract things with one step (Given I am logged in), while on other occasions you might want to break it down into smaller steps (to stress specific actions). It is all about readability. As a developer, I find pure rspec (w/o cucumber) perfectly understandable, but when there are non-technical people involved in the project, cucumber provides them with an easy way to read and write stories in plain English.
Btw, there is a typo: resuable
@Milan Dobrota: Thanks Milan. Fixed.