A Field Guide to Unit Testing: Overview
THE UNIT TESTING SERIES
As a server developer at PicCollage, what I'm most proud of among all the things I've developed is my relationship with testing.
I didn't start my software career loving tests. When I first started writing them, it was a lot of trouble: I wasn't familiar with the methodology, and the benefits of writing tests weren't readily obvious. It took me some time to warm up to the idea of testing. Once I stopped seeing it as a necessary evil, however, I started to see how powerful a tool and mindset it is. If nothing else, it improves code quality significantly.
From Survival to Sophistication
Much like the evolution of civilization, my mental journey went through three stages, the How, Why and Where: How to write tests? Why should we write tests? Where should we start refactoring our tests? (Yes, there will be urges to refactor your tests once you've gone through the first two phases.)
This series is an attempt to answer the first question: How to write a test? We'll set up a restaurant to be our example throughout the series. Specifically, a restaurant that makes French toast. With that, we'll go through the following topics in this article.
- The universe of software testing
- The testing pyramid
- Pillars for a good unit test
Note: the examples will be in Ruby and RSpec (a Ruby testing framework). It is totally fine if you're not familiar with them. Their friendly syntax should be pseudo-codey enough to understand. I'll also explain the parts that aren't instantly digestible.
In the beginning there was a restaurant. Everything in this restaurant is kept to a minimum. It serves only French toast, and has three entities:
module Restaurant
  class Server
    def take_order(order); end
    def serve(dish); end
  end

  class Cook
    def make_french_toast; end
  end

  class StorageRoom
    def self.get(ingredient); end
    def self.put(ingredient); end
  end
end
Server has two functions:
- take_order(order) receives the order and passes it to Cook.
- serve(dish) takes the dish and serves it to a customer.
Cook has one function:
- make_french_toast, since it's all this restaurant offers. To make French toast, Cook will get the ingredients (eggs and bread) from StorageRoom.
StorageRoom has two static/class functions:
- get(ingredient) retrieves ingredient from a persistent datastore;
- put(ingredient) is the other way around. It puts ingredient into the datastore.
Below is an illustration of their relationships:
Now, how do we write tests for this restaurant?
SUT: Subject under Test
To make life easier for both you and me, let's introduce a term first: SUT. SUT stands for Subject Under Test. It is the piece of code that a test case is testing. For example, if we want to write a test to make sure Cook can make French toast as expected, the make_french_toast function will be the SUT.
This term will stay with us throughout the series.
The universe of software testing
Before diving into the actual test code (don't worry, there will be some), let's take a bird's-eye view of software testing first.
The universe of software testing can be roughly divided into three types: acceptance tests*, integration tests, and unit tests. Below is a diagram of the scope of SUT that each type covers.
* Acceptance tests are sometimes referred to as end-to-end tests. Conceptually, the two differ, though the boundary is fuzzy; in practice, they are often treated as the same.
The unit test, our star in this series, is the smallest component in the testing universe. Each unit test covers a, well, unit of code. A very important trait of a unit test is isolation: it tests the SUT in isolation, that is, under the assumption that everything else works.
Think about the restaurant. Suppose a sudden fire burns the whole restaurant down to ashes. The unit test for make_french_toast should still pass, because this test only cares that the SUT works provided everything else does. It cares about nothing else.
How do I know if it's safe to dine in this restaurant if we test everything in isolation? Well, if the restaurant is really burnt to ashes, or simply has a server who steals a bite from the plate every time they serve, other unit tests that cover that specific functionality will fail (e.g., the test case '#serve presents the dish without modifications to the dish' will certainly fail).
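To make the idea of isolation a bit more concrete, here is a minimal sketch in plain Ruby (not RSpec, so that it runs on its own). Everything beyond the method names from the restaurant skeleton is my assumption for illustration: the method bodies, the injectable storage keyword argument on Cook, and the hand-rolled FakeStorageRoom. The real StorageRoom here is deliberately "burnt down", yet the check on make_french_toast still passes because the SUT never touches it.

```ruby
# Hypothetical implementation: the restaurant skeleton only has empty methods.
module Restaurant
  class StorageRoom
    # The "burnt-down" real dependency. The unit check below never calls it.
    def self.get(ingredient)
      raise "the restaurant burnt down, no #{ingredient} left"
    end
  end

  class Cook
    # Assumption: storage is injectable so a test can swap in a fake.
    def initialize(storage: StorageRoom)
      @storage = storage
    end

    def make_french_toast
      eggs  = @storage.get(:eggs)
      bread = @storage.get(:bread)
      { dish: "French toast", made_from: [eggs, bread] }
    end
  end
end

# A hand-rolled fake that always has every ingredient on hand.
class FakeStorageRoom
  def self.get(ingredient)
    ingredient
  end
end

# The SUT is Cook#make_french_toast; everything else is assumed to work.
cook  = Restaurant::Cook.new(storage: FakeStorageRoom)
plate = cook.make_french_toast
puts plate[:dish]
```

In RSpec you would more likely stub StorageRoom.get than inject a fake, but the principle is the same: the test decides what "everything else works" looks like.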
An integration test, like a unit test, asserts a certain behavior of a piece of code; unlike a unit test, however, it does not test the SUT in isolation. An integration test expects the right behavior of the SUT using the SUT's real dependencies.
So in a burnt-down restaurant, an integration test for make_french_toast will fail, and quite quickly: at the cook's first attempt to step into the kitchen, let alone getting the ingredients from the StorageRoom.
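Here is a rough sketch of that difference in plain Ruby. The method bodies, the OutOfStock error, the burn_down! helper, and the in-memory Hash standing in for the persistent datastore are all assumptions of mine. The point is only that Cook talks to the real StorageRoom, so the state of the storage room decides whether the check passes.

```ruby
module Restaurant
  # Hypothetical error class, not part of the original skeleton.
  class OutOfStock < StandardError; end

  class StorageRoom
    # Assumption: an in-memory Hash stands in for the persistent datastore.
    @shelves = Hash.new(0)

    def self.put(ingredient)
      @shelves[ingredient] += 1
    end

    def self.get(ingredient)
      raise OutOfStock, "no #{ingredient} left" if @shelves[ingredient].zero?
      @shelves[ingredient] -= 1
      ingredient
    end

    # Hypothetical helper that simulates the fire.
    def self.burn_down!
      @shelves.clear
    end
  end

  class Cook
    # No fake here: the cook uses the real StorageRoom.
    def make_french_toast
      eggs  = StorageRoom.get(:eggs)
      bread = StorageRoom.get(:bread)
      { dish: "French toast", made_from: [eggs, bread] }
    end
  end
end

# Scenario 1: a stocked storage room, so the integration check passes.
Restaurant::StorageRoom.put(:eggs)
Restaurant::StorageRoom.put(:bread)
stocked_result = Restaurant::Cook.new.make_french_toast

# Scenario 2: the restaurant burns down, and the same call now fails.
Restaurant::StorageRoom.burn_down!
burnt_result =
  begin
    Restaurant::Cook.new.make_french_toast
  rescue Restaurant::OutOfStock => e
    e.message
  end

puts stocked_result[:dish]
puts burnt_result
```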
The acceptance test, a concept that has an intricately ambiguous relationship with the end-to-end test, is the testing method that covers a piece of functionality from end to end. In simpler words, it tests the whole thing from the perspective of the end user of the system. It takes everything into consideration: network, device-specific quirks, UI, etc.
An acceptance test case for our restaurant will be 'Ordering a French toast from this restaurant gets a plate of French toast'.
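As a very loose sketch of that test case, the snippet below plays the customer and only touches the outermost entry point, Server. A real acceptance test would go through the actual UI, network, and datastore; here all three classes live in one process, and the method bodies, the in-memory shelves, and the way take_order hands the result to serve are my assumptions rather than the restaurant's real wiring.

```ruby
module Restaurant
  class StorageRoom
    # Assumption: in-memory stock instead of a persistent datastore.
    @shelves = { eggs: 2, bread: 2 }

    def self.get(ingredient)
      raise "no #{ingredient} left" unless @shelves.fetch(ingredient, 0) > 0
      @shelves[ingredient] -= 1
      ingredient
    end
  end

  class Cook
    def make_french_toast
      StorageRoom.get(:eggs)
      StorageRoom.get(:bread)
      "French toast"
    end
  end

  class Server
    def initialize(cook: Cook.new)
      @cook = cook
    end

    # Hypothetical wiring: pass the order to the cook, then serve the result.
    def take_order(order)
      raise ArgumentError, "we only make French toast" unless order == "French toast"
      serve(@cook.make_french_toast)
    end

    def serve(dish)
      "a plate of #{dish}"
    end
  end
end

# The "customer" only knows about the server, nothing behind it.
served = Restaurant::Server.new.take_order("French toast")
puts served
```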
Let's look at this diagram again, and apply the concept to our Restaurant. Below are some example test cases we will write for each type of test we just defined.
- [Unit test] Cook#make_french_toast with all ingredients returns a plate of French toast.
- [Integration test] Cook#make_french_toast returns a plate of French toast.
- [Acceptance test] When a customer orders a French toast, they will be served a plate of French toast with fork and knife.
We will dig deeper into the difference between our unit test case and integration test case, as you might think the two look like the same test.
The testing pyramid
Now, let's look at the three types of tests from another perspective: their roles in a codebase. The pyramid above is how you should distribute/use them for your code.
At the bottom we have unit tests. Unit tests should take up the biggest proportion, as they (ideally) cover every piece of code in every scenario (more on scenarios later). They are the first and most fundamental guard that makes sure your code works.
Acceptance and integration tests should be the fewest, for two main reasons, both stemming from the fact that they use the SUT's real dependencies. First, they are painful to execute. Some acceptance tests are executed manually (hi QA engineers), while integration tests require extra setup to run, database connections being a common example. Second, they are painful to debug. Using real dependencies means that when a test fails, the root cause can be anywhere in the SUT's web of dependencies.
We should, therefore, not use acceptance tests and integration tests to catch any bugs that can be caught by unit tests. This is also why the pyramid has unit tests at the bottom. If we have a robust set of unit tests, we can be very efficient in debugging. Even if an acceptance test fails, we will have better navigation to the root cause since we don't have to inspect the components that our unit tests are already covering.
With that said, we will focus on unit testing for the remainder of the series. In the next section, we will explore the three criteria for a good unit test and expand on each of them in the next article.
Three pillars for a good unit test
Here it comes, the fundamental question we need to answer: What is a good unit test?
There are many traits you can find that describe a good unit test. The Art of Unit Testing by Roy Osherove, the authoritative guide to unit testing, itself has about 10 items on its "good test" list. We will not attempt to enumerate them in this article because (1) I only bring this up as proof that writing a good unit test is hard, as there are many boxes it needs to check to pass as "good"; (2) no one is going to remember all of those 10+ criteria; and (3) I cannot remember them. What we'll do instead is consolidate them into three guidelines. Yes, three short, sweet guidelines. If we follow those guidelines, our tests will check all the boxes and be good. Here they are.
- A good unit test is trustworthy.
- A good unit test is maintainable.
- A good unit test is readable.
Huh? I can hear you say. Too general? Too vague? Guilty as charged. But don't leave just yet. Let's walk through them one by one, and put them in the context of unit testing. By the end of this article, I hope you'll understand why I chose those three terms to be the pillars of a good test.
A good unit test is trustworthy
Being trustworthy is the raison d'être of a test. We want to be confident that when the tests for a piece of code pass, the code really works, and works the way it's intended. On the other hand, we also want to be confident that when a test fails, it is exactly the SUT that fails to work, not anything else.
Three topics need to be covered to accomplish the above two important goals:
- Definition of a unit
- Testing scenarios
- Test isolation
A good unit test is maintainable
Maintainability is important in not just unit testing, but the whole software universe. In the scope of unit testing, though, there are some specific measures that we can leverage to achieve maintainability. These are:
- Testing one thing at a time
A good unit test is readable
A unit test shouldn't content itself only with being trustworthy and maintainable, but it should have a bigger vision: documentation. Think about it, a robust set of unit tests covers every scenario for every unit that the code contains. In essence, the test cases are the SUTs' documentation, because they narrate how the SUTs should behave.
Amazing, I know, right?
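To give a feel for tests doubling as documentation, here is a tiny sketch in plain Ruby. The check helper and RESULTS list are a hypothetical mini-harness I made up so the snippet runs on its own (in RSpec, the strings passed to it play the same role), and the body of serve is assumed. Read only the descriptions it prints and you already know how serve is supposed to behave.

```ruby
# Hypothetical mini-harness: records each description with its outcome.
RESULTS = []

def check(description)
  passed = yield
  RESULTS << [description, passed ? "pass" : "FAIL"]
end

class Server
  # Assumed behavior: a well-behaved server hands over the dish untouched.
  def serve(dish)
    dish
  end
end

check("Server#serve presents the dish without modifications to the dish") do
  Server.new.serve("French toast") == "French toast"
end

check("Server#serve hands over whatever dish it was given") do
  Server.new.serve("burnt toast") == "burnt toast"
end

# The printed lines read like a short spec of Server#serve.
RESULTS.each { |description, status| puts "#{status}: #{description}" }
```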
To make our unit tests into a source of documentation, we need to guarantee their readability. If we follow the best practice of the following two topics, readability can be achieved as easily as you can ever dream of.
- Test structure
In the next article, we will expand on the three criteria for a good unit test (trustworthiness, maintainability, readability), and see how they can be applied to our restaurant. See you there!