Writing good tests (part 1)

Everyone knows how NOT to write unit tests, either you know the rules and avoid pitfalls, or you don't know the rules and struggle with hard to maintain unit tests. The same rules apply to higher level tests, but there are also other things to consider. We, in the team, have learnt them (the hard way, of course) and would like to share, so that you can avoid those traps.

Keep your specification small

Imagine you're writing some kind of calculation module for your e-commerce business. For every order you ship, it should calculate how much it costs you to actually do it. After all, it's not only the articles you're sending, but you need to put it into some kind of box, you need to ship it using DHL, GLS, UPS or whichever partner you have, packing itself also costs either manual work of some person(s) or time of some robot you bought, etc.

Recently business asked us to change the logic in such module. We thought, let's do it BDD way, so adapt the test cases first and only then try to fix the application. This was even more desirable as we got a new team member. She would first learn the application by reading test scenarios, then adapt them to current requirements and then adapt the application itself. Perfect idea.

And then I saw the feature file (we're using Cucumber):

Scenario: Check parcel cost calculation in € for Fulfillment center HOE when there is one parcel with one article

  Given We have a fulfillment center: HOE

  And We have those articles available
    | ArticleName | LengthInMm | WidthInMm | HeightInMm | weightInG |
    | A1          | 100        | 100       | 100        | 100       |

  And We have those boxes available
    | PackingType | WeightInG | LengthInMm | WidthInMm | HeightInMm | Big   | Cost |
    | PBX1        | 100       | 200        | 200       | 200        | false | 1    |

  And We have this warehouse costs configuration
    | PackingType | Big   | Bulk  | ParcelCommissioningCost | OrderLineCommissioningCost | SmallArticlePickCommissioningCost | BigArticlePickCommissioningCost | ParcelOutboundCost | Currency |
    | PBX1        | false | false | 0.1                     | 0.2                        | 0.3                               | 0.4                             | 1                  | EUR      |

  And We have those DSPs with shipping costs available
    | DspName | DeliveryCountry | OverweightThresholdInG | StandardParcelPrice | HeavyParcelPrice | ShippingCostCurrency |
    | DPD     | DE              | 30000                  | 5                   | 10               | EUR                  |

  And We have a parcel that is packed in a box: PBX1

  And The parcel is sent to country: DE

  And The parcel is sent with delivery service provider: DPD

  And The parcel contains the following articles
    | ArticleName | BrickType     | QuantityInBrick | BrickLengthInMm | BrickWidthInMm | BrickHeightInMm | BrickWeightInG |
    | A1          | SingleArticle | 1               | 100             | 100            | 100             | 100            |
    | A1          | SingleArticle | 1               | 100             | 100            | 100             | 100            |
    | A1          | SingleArticle | 1               | 100             | 100            | 100             | 100            |
    | A1          | SingleArticle | 1               | 100             | 100            | 100             | 100            |
    | A1          | SingleArticle | 1               | 100             | 100            | 100             | 100            |
    | A1          | SingleArticle | 1               | 100             | 100            | 100             | 100            |

  When We call Parcel Cost Calculator

  Then There was no exception

  And The box cost is: 1 EUR

  And The number of small picks are: 6

  And The number of big picks are: 0

  And The detailed parcel production costs are
    | ParcelCommissioningCost | TotalOrderLineCommissioningCosts | TotalArticlePicksCommissioningCosts | ParcelOutboundCost |
    | 0.1                     | 0.2                              | 1.8                                 | 1                  |

  And The detailed shipping costs are
    | DspCost | TotalLinehaulCost |
    | 5       | 0                 |

  And The final parcel cost is: 9.1 EUR

Even I couldn't understand what it was all about, although I was one of the developers working on it, and it wasn't longer than a year ago. There are multiple problems in here:

  1. although there's a headline, it's not really clear what each test scenario is about,
  2. we're overloading the one reading it with all implementation details,
  3. every tests is testing the same things over and over again.

There are two aspects here: a human and a machine. First two problems are related to people, business, developers, QA, who would like to understand what this application is doing. In short: it's impossible; one simply can't see the forest for the trees. Last two issues (second one applies to both) are making tests maintenance a hassle. If you need to change one feature, you'll need to rewrite all test scenarios. And you also need to provide data not important for some scenarios.

Why can't we write a scenario like this:

Scenario Outline: Calculating shipping costs
  Given a parcel of weight <weight>
  And DSP with weight threshold of <threshold>
  And cost of shipping parcels, regular: <regular_cost>, overweight: <overweight_cost>
  When Parcel Cost Calculator is executed
  Then shipping cost is: <final_shipping_cost>
  | weight | threshold | regular_cost | overweight_cost | final_shipping_cost |

Now the scenario is much more readable for humans, it's to the point, doesn't contain irrelevant information and it can still be used for automated tests (syntax is of Cucumber, I will not be explaining it here, check their documentation).

Please notice, that right now this scenario has a much better scope: shipping costs. All other information is omitted as not relevant, we don't need to know to which country we're sending the parcel, it's not important how the box is named or its dimensions, etc. And it reads much more "English".

Of course, out of one scenario like I've shown before, we had to create multiple scenarios. Or rather: we had to rewrite every scenario in order to have every feature covered.

Tests implementation

Previously tests implementation was almost trivial. As every line was providing all the data required by the module, all we needed to do was just load it into database and execute the calculation.

Of course, as we're providing far less data in the scenario itself, test implementation will be much different and would need to provide much more "default" data. We decided to use some generic builder or helper class, that by default create all necessary data, but in "neutral" way. And by "neutral" I mean something similar to null object pattern. Using that, you can, for example, say: new Builder().build() and you'll have everything your application needs, so that it will not fail. Of course, most values will be zeros or empty, so that they will not have any impact on the final result. For us, for calculating cost, the result should be exactly 0.

As every scenario requires its own specific data, in all 'Given' steps you need to update the builder with those values, for example like this: builder.withBoxCosts(5). Only after you've entered all the data, you call builder.build().

Possible drawbacks

Our scenarios right now are much more concise because we're omitting all unnecessary data. But what if that actually makes those scenarios less readable? What if business, not seeing those individual values, seems to be lost and stops believing in those, after all, test scenarios? You can always use 'Background' feature from Cucumber or something similar in your tool of choice. That way you actually provide the data, but outside of the scenario. Then you'd only need to feed the data into your builder and override default values with those from scenario itself.

Second problem could be, that the implementation gets more difficult. Previously we just entered all the data that the database requires, including foreign keys, so that we knew what we were referring to. Now it gets trickier, because we need to provide some kind of "default" values for all fields not provided. We also need to manually keep track of all relationships. We also need to have all those builders available, so that, at any point in time during data preparation, we can change any value.


We've just started our journey towards better testing. We believe we can see the bright future, but there's still a long way ahead of us. We still don't know if the method described above is the best one, or even a good one. So far it looks much better, that the previous approach. But this is only step one towards, who knows, maybe living documentation?