Why Companies Fail at Continuous Testing

Continuous Testing is the process of executing setup, test execution, and teardown of all automated checks. These tests run on a loop, for every version control change. Each change can be released to production without additional regression testing.

–Definition by Excelon Development

Executive Summary

In our other piece, “Why Companies Fail at Test Automation,” we covered classic mistakes in software test automation. One of those categories was infrastructure: the ability to set up and run a test quickly and consistently. This paper explores that infrastructure requirement in much more detail, in the context of continuous testing.

The term Continuous Testing (CT) is not new; it was originally defined in an academic paper dating back to 2003. Since then, we have seen many companies use the term as they walk down a path toward some form of improvement. That could be more frequent releases, more value in the hands of customers earlier, or fewer defects in production. Sometimes continuous testing is mistaken for the goal itself, which leads to our first classic mistake.

Other major mistakes include treating CT as the single enabler of Continuous Delivery, trying to achieve coverage before setting up a delivery pipeline, technological incompatibility, skill/role gaps, and the lack of any process to identify and reduce defects over time.

By avoiding these traps, companies can set expectations and chart a more measured course, with an understanding of the true cost and actual benefit and a better chance of sustained long-term success. The report that follows discusses these problems in depth, including alternative ways to look at them.

Unclear Goals

Continuous testing isn’t a goal; it is an enabler. Teams that pursue CT should know what they are pursuing. The goal could be faster feedback on defects, faster releases to production, or faster development. Continuous testing is also an investment, with a payoff typically measured in months at best. Without understanding how much the company is investing, or some way to measure what the harvest will be, the results will come down to perception and politics.

Before starting a continuous testing project, have some idea of the investment, the goals, and the measure of success. Those goals may have to roll up to some greater DevOps project, as CT is unlikely to be successful on its own.

CT as Sole Enabler to Continuous Delivery

Running tests all the time in a low-quality environment will result in finding many defects quickly. Elizabeth Hendrickson, recently Vice President of R&D at Pivotal, called the problem “Better Testing – Worse Quality?” Hendrickson pointed out that when testing is a separate activity that is a “safety net,” programmers can feel empowered to skip testing. After all, someone else is doing that work. Having a magical, mythical tool to find problems can add to that perception problem.

We see at least four other areas working in concert with CT to reduce risk. These are reinforcing features of the system that lead to resiliency. Multiple deploy points top that list. With deploy points, you can make a change with some confidence that the only feature at risk is the one that changed. Microservices and separately compiled web pages are two examples of how to achieve separate deploy points. Configuration flags can enable a feature to deploy “dark,” then roll out to a limited user population or roll back. Extensive production monitoring and alerts tell the team about more problems earlier, before they can impact a large customer group. Observability makes it easier to debug complex problems in production. Those four, along with continuous testing, are self-reinforcing practices.
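To make the configuration-flag idea concrete, here is a minimal sketch in Python. It assumes a flag is a plain dictionary with hypothetical keys `name`, `enabled`, and `rollout_percent`; real flag systems differ, and none of these names come from a specific product. The function `is_enabled` is likewise illustrative.

```python
import hashlib


def is_enabled(flag: dict, user_id: str) -> bool:
    """Decide whether a feature is visible to one user.

    The flag layout (name / enabled / rollout_percent) is an assumption
    made for this sketch, not the format of any particular tool.
    """
    if not flag.get("enabled", False):
        # Deployed "dark": the code is in production but nobody sees it.
        return False
    # Hash the flag name and user id into a stable bucket from 0 to 99,
    # so the same user always gets the same answer for the same flag.
    digest = hashlib.sha256(f"{flag['name']}:{user_id}".encode()).hexdigest()
    bucket = int(digest, 16) % 100
    return bucket < flag.get("rollout_percent", 0)


# Hypothetical flag: live for roughly 10% of users.
new_checkout = {"name": "new_checkout", "enabled": True, "rollout_percent": 10}
visible = is_enabled(new_checkout, "user-42")
```

Rolling back in this sketch is a matter of setting `enabled` to false or `rollout_percent` to 0, with no redeploy of the feature code.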

Coverage Before Pipeline

If the goal is to reduce or eliminate regression testing, then the next step after the proof of concept is usually a project to cover the entire application.

Don’t do this.

Instead, make sure that there is a complete pipeline, end-to-end, that does all of the setup, execution, and teardown, along with one test. Define a process to add tests and to deal with version control. Make that test part of the definition of done. “The story isn’t done until the test has run” may become the mantra for the team.

In other words, actually implement the definition of continuous testing outlined above, instead of trying to get as much coverage as possible with classic automated testing methods. Classic methods will add automation delay to the process. If the tests run on the tester’s laptop, that tester will be essentially ineffective for minutes to hours at a time, watching tests run.

An even worse mistake is running testing that is incompatible with the chosen pipeline.

Testing not compatible with CI

The de facto standard for continuous integration (CI) pipelines is probably Jenkins. Jenkins is easy to learn, easy to download, free to run, and has a moderately active support community. Using an open source tool to compile and test does not represent the same risk as adding open source to production code. With a build tool, if something goes wrong, the team can “just” go back to running things manually or tying together a series of batch scripts. Getting a defect out of a code library that is part of an application adds several more layers of complexity.

Jenkins does, however, have a strong tie to Linux/Unix. Even if the team is running it on Windows, Jenkins likely needs to run the tool from the command line, and receive test results it can understand. In theory, yes, Jenkins could run on a Linux server and remote desktop out to a Windows machine to kick off a process, if that process can run over the command line. In practice, trying to get a remote desktop to run a slow test-process is problematic. Delaying that problem by working on building a test suite first (outlined above) will only delay the pain.
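As a rough illustration of "run from the command line and return results the CI server can understand," the Python sketch below runs a command, treats a zero exit code as a pass, and writes a small JUnit-style XML report, a format that CI servers such as Jenkins commonly consume. The function names `run_check` and `to_junit_xml`, the suite name, and the exact report layout are assumptions for this example; a real test tool would normally emit its own report.

```python
import subprocess
import sys
import xml.etree.ElementTree as ET


def run_check(name: str, command: list) -> dict:
    """Run one command-line check; exit code 0 counts as a pass."""
    proc = subprocess.run(command, capture_output=True, text=True)
    return {
        "name": name,
        "passed": proc.returncode == 0,
        "output": proc.stdout + proc.stderr,
    }


def to_junit_xml(results: list) -> str:
    """Render results as a minimal JUnit-style <testsuite> document."""
    failures = sum(1 for r in results if not r["passed"])
    suite = ET.Element(
        "testsuite",
        name="ct-sketch",  # hypothetical suite name
        tests=str(len(results)),
        failures=str(failures),
    )
    for r in results:
        case = ET.SubElement(suite, "testcase", name=r["name"])
        if not r["passed"]:
            failure = ET.SubElement(case, "failure", message="non-zero exit code")
            failure.text = r["output"]
    return ET.tostring(suite, encoding="unicode")


# Stand-in commands: a real pipeline would invoke the team's test tool here.
results = [
    run_check("always_passes", [sys.executable, "-c", "raise SystemExit(0)"]),
    run_check("always_fails", [sys.executable, "-c", "raise SystemExit(3)"]),
]
report = to_junit_xml(results)
```

If a test tool cannot be driven this way, with no GUI session and a machine-readable result, that is worth discovering before the suite is built rather than after.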

As a sort of meta-mistake, delaying the pain is a great way to make continuous testing fail. Because of the nature of GUI tests, it is very easy for the technical people to show the executives a screen that flashes by and call it testing. The amount of missing coverage from hard-to-test items, or missing automated steps in the pipeline, could be make-or-break for the project. These gaps are also easily hidden by well-meaning people who want to get that “exceeds expectations” on the annual review, and have the best of intentions to “fix it later.”

Skill/Role Gaps

As with any other new tool, the questions of who will do the work and how the tool will be added to the process are critical. While a company might think through who will use a test automation tool, who will do the pipeline work can be an even greyer area. Beyond being named, that person needs the time, space, and aptitude to learn the technology.

These tools are new and emergent. In our experience, companies that provide time and space to learn (and fail) are more likely to succeed than those that bring in contractors who already have the skill or that seek to hire new employees from outside. Contractors leave, and often the ability to deal with exceptions leaves with them. New hires can add extra steps and delays, and may also be hit-or-miss.

We have seen more success bringing in contractors in more of a consultative/trainer role, with a goal of upskilling the group. Another approach we recommend is to get the group to understand the tool, then bring in contractors to work on the backlog of tests after the team understands and is using the tool. Both of these lead to better outcomes over time.

Lack of feedback to find and resolve error

Assuming you avoid the other mistakes in this paper, one remains: the chance that continuous testing merely results in more defects found earlier. That is, the team does not improve its engineering skill, and continues to create a lot of bugs. This hampers the effectiveness of the effort, as the team could spend the time on new features but instead will spend it working on the test suite and fixing bugs.

To paraphrase Steve McConnell from the software classic Code Complete, while more testing will help, the second piece that may be missing is to create fewer bugs in the first place. McConnell actually compares better software outcomes to a weight-loss battle, and says the solution is not to stand on the scale more often, but to code better: eat less, exercise more.

One way to accomplish this is to periodically review the last 100 defects found by the team, especially those that escaped to production. Consider how each defect was injected, then make changes to prevent it, or to find it before continuous testing has a chance. Having multiple ways to catch problems will reduce how often those problems appear.
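A defect review like this can start with a simple tally. The Python sketch below assumes each defect is a dictionary with hypothetical fields `injected_by` (the team's own judgment of how the bug got in) and `escaped_to_production`; the field names, the sample records, and the function `summarize_defects` are all illustrative, not taken from any tracker.

```python
from collections import Counter


def summarize_defects(defects: list, limit: int = 100):
    """Tally the most recent defects by how they were injected.

    Returns two lists of (cause, count) pairs, most frequent first:
    one for all recent defects, one for those that reached production.
    """
    recent = defects[-limit:]  # assumes the list is ordered oldest to newest
    by_cause = Counter(d["injected_by"] for d in recent)
    escaped = Counter(
        d["injected_by"] for d in recent if d["escaped_to_production"]
    )
    return by_cause.most_common(), escaped.most_common()


# Made-up sample records, for illustration only.
sample = [
    {"id": 1, "injected_by": "unclear requirement", "escaped_to_production": True},
    {"id": 2, "injected_by": "missing unit test", "escaped_to_production": False},
    {"id": 3, "injected_by": "unclear requirement", "escaped_to_production": False},
]
by_cause, escaped = summarize_defects(sample)
```

The top entries in each list point at where a process change, such as a clearer story template or an extra unit test, would likely pay off first.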

Continuous testing is good, but even better is to use the bugs it finds as a system to continually improve the entire process.


In our other paper, “Why Companies Fail at Test Automation,” we lay out eight problems with test automation. Here we explored two of these in great detail: the infrastructure piece and, to a lesser extent, skill and role. Where that paper recommended mindfulness about the issues that are relevant, we consider continuous testing to be more clear-cut. You will need to list the gaps your organization has and make a plan to address them. In an organization with any political reality, you will need to address these issues in the correct order, to show positive progress at every step. Multi-year, or perhaps even multi-month, arguments to “stay the course” may be ineffective.

This is especially true at scale. On other issues we recommend experiments. Dozens of teams can try different methods, to “let a thousand flowers bloom.” For continuous testing, if the teams are using the same technology, we recommend a clear delivery pipeline, documentation, examples, training, and a process for teams to create tests. People need to understand and be able to repeat this process. So, make a delivery pipeline with one sample test. Then add more, likely starting with a ‘smoke’ suite as a project, and after that only for new development, written in parallel with that development, before code is released to production.

There are a lot of ways that test tooling can accelerate or improve development. This, however, was a list of ways the Continuous Testing road can fail — along with a better way.
