How do I know my testers are doing a good job?
If you’re a manager who has testers reporting to you (or if you are a manager who has a manager who etc.), at some point you’re going to ask the question: “How do I know they are doing a good job?”
Answering that question may come with a few significant challenges attached. There’s a good chance you do not have a background in software testing yourself. You probably have a tool that can produce some numbers about testing with a dashboard highly recommended by its vendor. You also have plenty of other things on your plate, so whatever the solution is, it should not take up too much of your time. So how to proceed?
Before we go into that, though: this post is about telling if your testers are doing a good job with regard to testing. Testers are also team members, colleagues, employees, etc. This blog post is not about those things.
How not to do it
The most straightforward way to (try to) see if someone is doing a good job is to look at their outputs and/or outcomes, preferably through metrics. Note that I said “straightforward”, not “best”.
Outputs
Three typical output metrics I’ve come across in the past (luckily more through stories than direct experience) are:
- number of test cases executed (higher is better)
- number of defects reported (higher is better)
- number of rejected defects (lower is better)
Since Goodhart’s Law dates from 1975 and the “-2000 Lines Of Code” story is from 1982, I’m just going to assume we can all agree that using output metrics to evaluate people’s work is a very bad idea.
Outcomes
The only outcome metric that I know to be in use is the number of escaped defects (lower is better): which defects made it to production that we did not know about? Preferably, this comes with a threshold on the minimum level of severity.
This seems sensible. If we’re going to release with any significant bugs, we should at least know what they are. And that’s what we have testers for. However, this metric makes two important assumptions.
The first assumption is that the tester(s) have everything they need to do good testing. And that is rarely the case. So you may very well end up faulting testers for not finding bugs they were never enabled to find. And that’s poor management.
The second assumption is that there actually are significant bugs that might escape in the first place. If you have excellent developers, the defect escape rate might be low, not because your testers found them in time, but because your developers didn’t put them in there to begin with. Or they found and fixed them, before your testers got to take a look. As Charity Majors wrote: “The smallest unit of software ownership and delivery is the engineering team.”
There is a second outcome metric that is more sensible but less well known: the number of reported bugs that get fixed (higher is better). It’s a measure that covers two things: (1) is a tester able to find important bugs, and (2) are they able to advocate for them so they get fixed?
This metric rests on the same two assumptions, though. And it leans even more heavily on the first one, since it requires the tester to have sufficient access to stakeholders, so they know which bugs are important and how to best advocate for them.
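To make the two outcome metrics above a bit more concrete, here’s a minimal sketch of how you might count them. Everything in it is an assumption for illustration: the `Defect` record, its fields, the `Severity` scale and the sample data are all made up, and your bug tracker will look different.

```python
from dataclasses import dataclass
from enum import IntEnum


class Severity(IntEnum):
    # Hypothetical severity scale; use whatever your tracker defines.
    LOW = 1
    MEDIUM = 2
    HIGH = 3
    CRITICAL = 4


@dataclass(frozen=True)
class Defect:
    # Hypothetical record; the field names are assumptions, not a real tracker schema.
    title: str
    severity: Severity
    in_production: bool          # did the defect ship?
    known_before_release: bool   # did we know about it when we shipped?
    reported_by_tester: bool
    fixed: bool


def escaped_defects(defects, min_severity=Severity.HIGH):
    """Defects that reached production unnoticed, at or above the severity threshold."""
    return [
        d for d in defects
        if d.in_production
        and not d.known_before_release
        and d.severity >= min_severity
    ]


def reported_and_fixed(defects):
    """Defects that a tester reported and that actually got fixed."""
    return [d for d in defects if d.reported_by_tester and d.fixed]


# Made-up sample data, purely for illustration.
sample = [
    Defect("Checkout crashes on empty cart", Severity.CRITICAL,
           in_production=True, known_before_release=False,
           reported_by_tester=False, fixed=True),
    Defect("Typo on About page", Severity.LOW,
           in_production=True, known_before_release=False,
           reported_by_tester=False, fixed=False),
    Defect("Slow export", Severity.HIGH,
           in_production=True, known_before_release=True,
           reported_by_tester=True, fixed=False),
    Defect("Wrong VAT rounding", Severity.HIGH,
           in_production=False, known_before_release=True,
           reported_by_tester=True, fixed=True),
]

print(len(escaped_defects(sample)))     # escaped defects at HIGH or above
print(len(reported_and_fixed(sample)))  # tester-reported defects that got fixed
```

Note what the sketch cannot tell you: whether the tester had what they needed to find the escaped defect, or whether there were many bugs to find in the first place. Those are exactly the two assumptions discussed above.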
Neither output, nor outcomes
The third option is not both outputs and outcomes, but neither. When I worked at a staff/principal level, or as a quality engineer, I mostly felt as if I was evaluated neither on output, nor on outcomes.
I wasn’t evaluated on output, because I was not part of a team delivering software. So what outputs were there to measure? Not that my work did not result in certain outputs, but they were specific to my role and unique to me. So you’d need to come up with some custom output measurement. That’s a lot of effort for a single individual or a small number of them.
In any case, it makes more sense to evaluate such roles based on outcomes. But that wasn’t happening either. For the simple reason that there were no testing or quality-related initiatives or KPIs. I was expected to do good things in the areas of testing and quality, but I was never provided with a wider company goal or problem that needed addressing. (I did ask.)
So what I ended up doing was deciding for myself what was worthwhile to do, sharing that with my manager, doing those things, and then sharing the results with my manager. On the one hand, that level of freedom is really nice. On the other hand, it can make you feel you’re not crucial to the organization.
One way to do it
To be honest, I’ve been cheating a bit in the section above by solely focusing on metrics. Rather than trying to put everything in numbers (or even worse, a single number), you can evaluate someone’s output and outcomes qualitatively. Since the goal of testing is to provide information about the quality of the product, you can ask yourself if that’s what you’re getting from your testers.
If that’s not what you’re getting, that might be a you-problem, though. As Esther Derby points out, quoting Kurt Lewin: “Behavior is a function of the person and the environment.” A tester might be providing insufficient information, or irrelevant information, or they might be providing it too late. That might be a them-problem, because of a lack of skill and/or motivation. It might also be a you-problem, because you failed to provide an environment in which they can be effective.1
In that sense, a tester who fails to produce any relevant information about the product, but is able to tell you so and to tell you what they need in order to produce such information, is in fact a great tester.
1. In my experience, unfortunately, it’s often both a skill and an environment problem, mutually reinforcing each other.