Video: Chewing the Fat Review - Testing with Luke Cook and Piers Sinclair (7 min)
Watch the extended cut (32 min).
If you ask your manager, developers, clients or other stakeholders whether they think testing is important, you're very likely to get a resounding "yes, of course!". The question of why they believe it's important is usually more difficult for them to answer, though.
Everyone thinks they know what "testing" is. Like most things, though, there isn't a shared understanding of what testing really means across the IT industry.
Distinguishing "testing" and "checking" is a great way to help build this shared understanding when we're talking about this critical part of the software development process.
Without a good understanding of testing and its limitations, it's easy for clients and customers to believe that we "test everything" - but there's a problem with this belief:
Complete (or 100% or exhaustive) testing is impossible.
There is a common misconception that you can automate all the testing.
While there can be great value in using automation in testing, human capabilities are still required for key testing skills such as evaluation, experimentation, and exploration.
"Critical distance" refers to the difference between one perspective and another, the difference between two ways of thinking about the same thing.
You know how easy it is for someone else to spot things - both good and bad - about your work that you haven’t noticed. You're "too close" to the work to be truly critical of it.
Developers naturally have a builder mindset focused on success, while testers are likely to have a more skeptical, critical-thinking mindset.
The critical distance between the mindset of a developer and a tester is important for excellent testing.
We know that complete testing is impossible, so how do we decide which finite set of tests to perform out of the infinite cloud of possible tests for a story, feature or release?
This problem can seem overwhelming, but focusing on risk is a good approach, so let's look at risk-based testing.
In a Scrum team, every member of the team has some responsibility for the quality of the deliverables of the team.
If you have a dedicated tester embedded in the Scrum team, they are not solely responsible for performing all of the types of testing required to help build a quality deliverable.
We know that complete testing is impossible, so we need ways to help us decide when to stop testing... aka when we've done "enough" testing.
"Genius sometimes consists of knowing when to stop." — Charles de Gaulle
Exploratory testing is an approach to testing that fits very well into agile teams and maximises the time the tester spends interacting with the software in search of problems that threaten its value.
Exploratory testing is often confused with random ad hoc approaches, but it has structure and is a credible and efficient way to approach testing.
Let's dig deeper, look into why this approach is so important, and dispel some of the myths around this testing approach.
A big suite of various levels of automated tests can be a great way of quickly identifying problems introduced into the codebase.
As your application changes and the number of automated tests increases over time, though, it becomes more likely that some of them will fail.
It's important to know how to handle these failures appropriately.
There are a lot of different ways to test software. Developers will often write unit or integration tests for their code, but can sometimes fall into the trap of trying to test all things the same way (aka the "golden hammer").
How much does it cost to fix a bug? It depends on when it is discovered. If the developer notices it an hour after writing it, it is cheap to fix. If the bug is discovered once it is live and lots of people are using it, it is much more expensive to fix. And if it is discovered after the developer has left the company, it is more expensive still.
Relying on a single user research method - such as only running 1-on-1 interviews - leads to blind spots. Interviews are great for depth, but they don’t reveal group dynamics, cross-team dependencies, or patterns that emerge at scale.
To design experiences that scale, you need insights that are qualitative and quantitative, attitudinal and behavioural, and drawn from both individuals and teams.
Shipping a high-risk feature without testing it with users is a gamble. If your update affects sign-up flows, checkout processes, or dashboards, you can’t afford to guess. Usability issues that slip through can cost you revenue, reputation, and rework time.
Even experienced designers and developers miss things. The only way to validate an interface is to observe real users attempting real tasks.
Good quality automated tests help your development to move faster and more safely.
Gating deployments on the successful outcomes of your automated test suites can prevent you from automatically pushing bad code into production.
Depending on your automated tests to make deployment/release decisions means that your test code must be of excellent quality.
Watching an automated UI test doing its thing can be very compelling. Combining this with the availability (and powerful marketing) of so many automated UI testing frameworks & tools often leads teams to focus too heavily on automating their testing at the UI level.
This is a classic illustration of the law of the instrument or Maslow's hammer, a cognitive bias that involves an over-reliance on a familiar tool. Abraham Maslow wrote in 1966, "If the only tool you have is a hammer, it is tempting to treat everything as if it were a nail".
While automated UI testing has its place in an overall test strategy (involving both humans and automation), you need to exercise care about how much of your testing is performed at this level.
Automation can be an awesome part of a test strategy, but not all tests are good candidates to be automated.
Not all testing can be completely automated, due to the uniquely human skills that are required (e.g. exploration, learning, experimentation). But even for those tests that can be automated, not all of them should be.
Having an awareness of the different types and levels of testing is critical to developing appropriate test strategies for your applications.
<boxEmbed
style="greybox"
body={<>
Remember that different types and levels of tests help to mitigate different types of risk in your software.
</>}
figurePrefix="none"
figure=""
/>
There are various models to help with this, most stemming from Mike Cohn's simple [automated testing pyramid](https://www.mountaingoatsoftware.com/blog/the-forgotten-layer-of-the-test-automation-pyramid).
<endIntro />
<imageEmbed
alt="Image"
size="large"
showBorder={false}
figurePrefix="none"
figure="Mike Cohn's automated testing pyramid (2009)"
src="/uploads/rules/testing-pyramid/test-pyramid-cohn.jpg"
/>
> "All models are wrong, but some are useful"
>
> \- George Box
The test pyramid is a model and, like all models, it is wrong, though it is perhaps useful.
The core idea of this model is that an effective testing strategy calls for automating checks at three different levels, supplemented by human testing.
The pyramid model shows you where proportionally more automation effort should be placed - so a good strategy would see many automated unit tests and only a few end-to-end (UI-driven) tests.
The pyramid favours automated unit and API tests as they offer greater value at a lower cost. Test cost is a function of execution time, determinism, and robustness, and it grows with the size of the system under test. Because automated unit and API tests have a minimal scope, they provide fast, deterministic feedback. In contrast, automated end-to-end and manual tests exercise a much larger system under test and produce slower, less deterministic, more brittle feedback.
Let's look at the 3 levels of automation in a little more detail.
### Unit tests
The pyramid is supported at the bottom by unit tests as unit testing is the foundation of a solid automation strategy and represents the largest part of the pyramid. Unit tests are typically written using the same language as the system itself, so programmers are generally comfortable with writing them (though you shouldn't assume they're **good** at writing them). Cohn says:
> "Automated unit tests are wonderful because they give specific data to a programmer—there is a bug and it’s on line 47. Programmers have learned that the bug may really be on line 51 or 42, but it’s much nicer to have an automated unit test narrow it down than it is to have a tester say _"There's a bug in how you're retrieving member records from the database, which might represent 1,000 or more lines of code."_ These smaller (scope) tests put positive pressure on the design of the code, since it is easier for bigger (scope) tests with poor code to pass than smaller (scope) tests with poor code." - Mike Cohn
Although writing unit tests is typically a developer task within the agile team, there is also an excellent opportunity for testers to be involved by pairing with developers to help them write better unit tests. It's a mistake to assume that developers know how to write good unit tests, since it is unlikely that they have been trained in test design techniques. The tester does not need to know the programming language; the idea is that the developer can talk through the intent of their unit tests and the tester can ask questions that may identify missing coverage or indicate logical flaws. This is an excellent use of a tester's time, since getting a good set of unit tests in place is foundational to the rest of the automation strategy.
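To make the scope of this level concrete, here's a minimal sketch of a unit test written in TypeScript with Jest. The `calculateDiscount` function and its rules are hypothetical - the point is that when a check like this fails, it points at a handful of lines rather than a whole workflow.

```ts
// Hypothetical system under test: a small pure function with simple rules
function calculateDiscount(orderTotal: number, isMember: boolean): number {
  if (orderTotal < 0) throw new Error("Order total cannot be negative");
  return isMember ? orderTotal * 0.1 : 0;
}

describe("calculateDiscount", () => {
  it("gives members a 10% discount", () => {
    expect(calculateDiscount(200, true)).toBe(20);
  });

  it("gives non-members no discount", () => {
    expect(calculateDiscount(200, false)).toBe(0);
  });

  it("rejects negative order totals", () => {
    expect(() => calculateDiscount(-1, true)).toThrow();
  });
});
```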
See [Rules to Better Unit Tests](/rules-to-better-unit-tests).
### Acceptance tests (aka "service tests" or "API tests")
The middle layer of the pyramid - variously referred to as acceptance tests, service tests or API tests - increases the scope of tests compared to unit tests and is often (as Cohn refers to it) the "forgotten layer". While there is great benefit to be gained from automating at this level, it is often ignored or overlooked in automation efforts, especially in teams that are [overly-reliant on automated UI tests](/automated-ui-testing-sparingly).
Testing at this level typically requires different tooling because the tests manipulate APIs outside of a user interface. This can make it more challenging for testers to be involved here than at the functional UI test level, but a good framework should make it possible for testers to design and write tests at the service/API level too.
Although there is great value in automated unit testing, it can cover only so much of an application's testing needs. Without service-level testing to fill the gap between unit and user interface testing, all other testing ends up being performed through the user interface, resulting in tests that are expensive to run, expensive to write, and often fragile.
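As a rough sketch (again in TypeScript with Jest, assuming Node 18+ for the built-in `fetch`), a service-level test talks to a hypothetical API endpoint directly, with no browser or UI involved:

```ts
// Hypothetical endpoint and base URL - the test exercises the API contract directly
const BASE_URL = process.env.API_BASE_URL ?? "http://localhost:5000";

describe("GET /api/members/:id", () => {
  it("returns the member as JSON", async () => {
    const response = await fetch(`${BASE_URL}/api/members/42`);

    expect(response.status).toBe(200);
    expect(response.headers.get("content-type")).toContain("application/json");

    const member = await response.json();
    expect(member.id).toBe(42);
  });

  it("returns 404 for a member that does not exist", async () => {
    const response = await fetch(`${BASE_URL}/api/members/999999`);
    expect(response.status).toBe(404);
  });
});
```

Because there's no browser in the loop, tests like these are typically faster and less brittle than UI-driven tests covering the same behaviour.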
### End-to-end/UI tests
Automated UI tests should be kept to a minimum, leveraging their value to check that important user workflows continue to work as expected while avoiding the problems associated with their overuse.
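Here is a sketch of what one of those few high-value end-to-end tests might look like, using TypeScript with Playwright (the URL, labels and credentials are made up for illustration):

```ts
import { test, expect } from "@playwright/test";

// One critical user workflow, driven through the real UI
test("a registered user can sign in and see their dashboard", async ({ page }) => {
  await page.goto("https://example.com/login");

  await page.getByLabel("Email").fill("user@example.com");
  await page.getByLabel("Password").fill("a-valid-password");
  await page.getByRole("button", { name: "Sign in" }).click();

  // The whole stack (UI, API, database) sits behind this one assertion
  await expect(page.getByRole("heading", { name: "Dashboard" })).toBeVisible();
});
```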
See [Do you remember to use automated UI testing sparingly?](/automated-ui-testing-sparingly)
### An alternative model - the bug filter (Noah Sussman)
Many different test pyramid models have been inspired by Cohn's simple original idea.
An interesting take comes from [Noah Sussman](https://infiniteundo.com/post/158179632683/abandoning-the-pyramid-of-testing-in-favor-of-a) who re-imagined the test pyramid as a bug filter (turning the pyramid on its head in the process):
<imageEmbed
alt="Image"
size="large"
showBorder={false}
figurePrefix="none"
figure="Noah Sussman's bug filter model (2017)"
src="/uploads/rules/testing-pyramid/bug-filter.jpg"
/>
Note that the area of the bug filter changes at each level. Unit tests focus solely on product code, but integration tests might include databases or external web services. End-to-end tests cover an even larger architecture. Bugs can appear from these new systems without having passed through a previous filter.
Katrina Clokie (in her book [A Practical Guide to Testing in DevOps](https://leanpub.com/testingindevops)) explains this bug filter model as follows:
> "I imagine the bugs that drop through this filter as being butterflies in all stages of their lifecycle. Unit tests are going to capture the eggs — bugs before they develop into anything of consequence. Integration tests are going to capture the caterpillars. These may have arisen from a unit test egg that has hatched in the integrated environment, or may have crawled into our platform via a third-party system. End-to-end tests capture the butterflies."
>
> \- Katrina Clokie
### Further reading
* [A Test Pyramid Heresy](https://www.linkedin.com/pulse/test-pyramid-heresy-john-ferguson-smart) by John Ferguson-Smart
* [Why I Still Like Pyramids](http://thatsthebuffettable.blogspot.com/2016/03/why-i-still-like-pyramids.html) by Marcel Gehlen

Reliable suites of automated tests can provide a lot of value to your development effort, giving fast feedback and alerting you to unexpected problems introduced by recent code changes.
The more automated the process of building, testing, deploying and delivering software becomes (and that's the direction a lot of teams are heading), the more responsibility falls on our tests. Increasingly often, our tests are the only safety net (change detector) between code being written on a developer's machine and that code ending up in a production environment. It's therefore a good idea to make sure that our tests detect the changes we want them to detect.
Automated test code ages just like any other code, though, and it's common to see teams adding more and more automated tests to their suites, without ever going back to review the existing tests to see if they're still relevant and adding value. This process of adding further tests over time often results in bloated test suites that take longer to run and require more human effort to diagnose failures.
Your automated tests require periodic attention and review — or else they're like smoke detectors, scattered throughout enormous buildings, whose batteries and states of repair are uncertain. As Jerry Weinberg said:
> "Most of the time, a non-functioning smoke alarm is behaviorally indistinguishable from one that works. Sadly, the most common reminder to replace the batteries is a fire."
>
> \- Jerry Weinberg
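As a hypothetical illustration (TypeScript with Jest), a test whose assertion has been weakened over time can keep passing while no longer detecting the regression it was written for - the automated equivalent of a smoke alarm with flat batteries:

```ts
// Hypothetical stand-in for the real system under test - imagine the
// member discount logic has silently broken
interface Order {
  total: number;
  discount: number;
}

function createOrder(total: number, _isMember: boolean): Order {
  return { total, discount: 0 }; // bug: member discount is never applied
}

it("applies the member discount", () => {
  const order = createOrder(200, true);

  // Originally: expect(order.discount).toBe(20);
  expect(order.discount).toBeDefined(); // still green - the alarm never sounds
});
```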
Exploratory Testing (ET) gives the tester much more freedom and responsibility in their testing than when following a more scripted approach.
Putting some structure around ET helps to make the approach more credible and provides a way for managers to track and review testers' work.
Session-based test management (SBTM) is a lightweight approach to the management of exploratory testing effort that defines a set of expectations for what kind of work will be done and how it will be reported.
> "[SBTM is] a way for the testers to make orderly reports and organize their work without obstructing the flexibility and serendipity that makes exploratory testing useful."
>
> \- Jon Bach
While user stories should have good acceptance criteria, checking that these criteria are met is really just the starting point before engaging in deeper testing.
Without detailed test cases, it can be difficult to work out what to test beyond the acceptance criteria. Generating test ideas and applying heuristics to guide that exploration are important skills for good exploratory testing.