Optimizing Our Regression – Are We Ready?

"Our regression execution effort takes too long…" is a sentence I hear far too often from clients when trying to help them optimize their regression pool. Reducing regression testing is a real challenge, because the whole aim of this major effort is to find no defects while covering most of the product's functionality. In my experience, companies spend around 40-60% of their test execution effort on regression testing, to make sure that their clients' experience stays at least as good as it was in the last release, and preferably gets better.
But we all know the pesticide paradox, which states that our systems become more and more resilient to our regression test cases. So, as test managers, we may ask ourselves why we should run such a huge effort when the probability of finding defects keeps approaching zero. Having faced this dilemma a few times, I came to the conclusion that something must be done for us to be more effective and efficient in our regression test execution.
This article outlines the basic first steps on the road to better regression test execution and optimization of that pool.
[Published at Testing Experience Magazine, November 2009]
What is Regression testing?
Regression testing is the selective retesting of a system or product with the objective of assuring that bug fixes work correctly and that those fixes have not caused any undesirable effects on the parts of the system that haven't changed.
Regression testing also means selecting a set of tests from the regression test case pool and executing them to assure that the new system has not unintentionally introduced new defects in the parts that were not touched in the new release, thus hurting current clients.
The first definition relates to running tests within the same release; the second is about running tests that relate to the previous release.
Motivation for optimization
- Each new release of the system or product could potentially cause existing features to fail.
- Clients depend on the system or product continuing to work properly (at least as well as the one currently installed on their servers).
- A failure in the new system or product release would lower clients' confidence in the system or product, and would harm the company's brand.
- At the end of a release, or a big cycle, regression tests are run to make sure that no defects were introduced by fixes. Such defects can degrade the system's quality and the customer experience.
A few factors influence regression testing along the life cycle. If we take care while performing these activities, we stand a better chance of optimizing our regression test case pool:
- Test planning aspects exercised
- Test design – techniques used, coverage investigated
- Test Case management approach
- Test Assets maintenance
Regression test execution can amount to 12-15% of the whole system or product development budget. If we invest that much in it, we must act proactively to optimize that pool of test cases.
Process (and other) Assumptions
In order for us to start optimizing our regression pool, certain characteristics of that pool and of our process must be in place. Their main purpose is to 'guard' the regression pool from being 'damaged' while it is being built. These preventive actions must be carried out during the life cycle activities mentioned above.
The assumptions I make below are what allow me to optimize my regression pool wisely and effectively. But while they are essential to that purpose, most organizations have not yet recognized them, nor worked proactively to put them in place. Hence, they struggle to optimize their biggest effort, regression testing.
Assumption – 1
Regression test cases do not normally need to test boundary values, invalid data, etc.; they are normally designed and focused to test the system the way it was designed to work.
It is assumed that those aspects were tested before the regression test effort, in previous releases.
Note: A good system test execution phase, according to some industry statistics, may reach only 25% code coverage, and still be very robust.
Assumption – 2
Test cases of each release are designed to cover most (if not all) of the system or product functionality. Even if we exercise risk-based testing, that criterion is still part of our goals.
Non-functional testing is also conducted, to the maximum coverage possible and where relevant.
Hence, the regression pool of test cases has the potential to cover the whole of the product's capabilities.
Assumption – 3
The regression pool of test cases has minimal scenario redundancy.
This has been maintained across the different versions that went into production in the past.
Note: according to some industry statistics, the redundancy of test cases within the regression test case pool may reach 10-30%. That is mostly wasted execution time, or what I call 'ghosts' inside the regression pool.
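One way to hunt for such 'ghosts' is to compare what each test case covers. The sketch below is a minimal illustration, assuming each test case can be described by the set of scenario steps or requirement IDs it exercises. The names `find_redundant` and `pool`, and the sample IDs, are hypothetical and not taken from any specific tool.

```python
def find_redundant(pool):
    """Return {redundant_test: covering_test} for tests whose coverage
    is fully contained in another test's coverage.

    pool: dict mapping test-case name -> set of covered item IDs
    (a hypothetical representation, assumed for illustration).
    """
    redundant = {}
    # Visit larger tests first so they are kept and smaller ones are
    # compared against them; ties are broken by name for stable output.
    names = sorted(pool, key=lambda n: (-len(pool[n]), n))
    kept = []
    for name in names:
        cover = next((k for k in kept if pool[name] <= pool[k]), None)
        if cover is None:
            kept.append(name)
        else:
            redundant[name] = cover
    return redundant


pool = {
    "TC-01": {"login", "search", "checkout"},
    "TC-02": {"login", "search"},              # subset of TC-01
    "TC-03": {"login", "profile"},
    "TC-04": {"login", "search", "checkout"},  # exact duplicate of TC-01
}
redundant = find_redundant(pool)
ratio = len(redundant) / len(pool)  # share of the pool flagged as candidates
```

A flagged test is only a candidate for review: two tests may cover the same items with different data or expected results, so the final decision should stay with the test designer.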
Assumption – 4
Test cases for a release are managed according to risk, and each test case that was planned to run against a certain area of functionality of the system or product has a risk index attached to it.
Hence, regression test cases carry the same risk index attached in previous releases.
Note: risk-based test management has the potential to reduce one's test cases by 30%, while substantially increasing the DDP (Defect Detection Percentage) and the added value to customers.
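As a rough illustration of what a per-test risk index could look like, the sketch below assumes the common likelihood × impact formulation on a 1-5 scale. The article does not prescribe a formula, so this is an assumption, and the names `RegressionCase`, `risk_index`, and `select_by_risk` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RegressionCase:
    name: str
    area: str
    likelihood: int  # 1 (rare failure) .. 5 (frequent failure), assumed scale
    impact: int      # 1 (cosmetic) .. 5 (critical to clients), assumed scale

    @property
    def risk_index(self) -> int:
        # Assumed formulation: risk = likelihood x impact.
        return self.likelihood * self.impact


def select_by_risk(cases, threshold):
    """Keep test cases whose risk index meets the threshold, highest first."""
    chosen = [c for c in cases if c.risk_index >= threshold]
    return sorted(chosen, key=lambda c: (-c.risk_index, c.name))


cases = [
    RegressionCase("TC-10", "billing", likelihood=4, impact=5),
    RegressionCase("TC-11", "reports", likelihood=2, impact=2),
    RegressionCase("TC-12", "login", likelihood=2, impact=5),
    RegressionCase("TC-13", "help pages", likelihood=1, impact=1),
]
selected = [c.name for c in select_by_risk(cases, threshold=8)]
```

Because the index travels with the test case from release to release, the same threshold can be reapplied to the regression pool later, provided the likelihood and impact values are revisited when the product changes.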
Regression Test Approaches
There are a few straightforward regression test execution approaches in the market:
- Run all test cases from the previous release again.
  - Applicable to some systems or products, mainly safety-critical ones, where strict regulations and standards must be applied (GE Health, among other companies, uses this approach on some of its products).
- Run most of the regression test case pool.
  - Happens in companies that have been able to automate most of their regression pool, and probably most of their test cases.
- Run a subset of the regression pool.
  - This is the common approach.
  - Mostly it is derived from the risks attached to certain areas of the system or product.
The impact of regression testing is shown in the next figure:
<<picture>>
Figure 1: The impact of regression testing
We may use other aspects to determine which test cases from the regression pool will be run: size of the changes, area of the changes, complexity of the changes, past bug logs.
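To make these aspects concrete, here is a small sketch that picks tests by the areas touched in the release and also keeps areas with a history of defects. The data layout (`tests_by_area`, `changed_areas`, `past_defects`), the `min_defects` cut-off, and the function name are all assumptions made for illustration only.

```python
def select_for_changes(tests_by_area, changed_areas, past_defects, min_defects=3):
    """Select tests for changed areas plus areas with a defect-prone history.

    tests_by_area: dict area -> list of test names
    changed_areas: set of areas modified in this release
    past_defects:  dict area -> number of defects logged in earlier releases
    min_defects:   assumed cut-off at or above which an unchanged area
                   is still retested
    """
    selected = []
    for area in sorted(tests_by_area):
        changed = area in changed_areas
        defect_prone = past_defects.get(area, 0) >= min_defects
        if changed or defect_prone:
            selected.extend(tests_by_area[area])
    return selected


tests_by_area = {
    "billing": ["TC-20", "TC-21"],
    "login": ["TC-22"],
    "reports": ["TC-23"],
}
picked = select_for_changes(
    tests_by_area,
    changed_areas={"billing"},
    past_defects={"reports": 5, "login": 1},
)
```

Size and complexity of the changes could be folded in the same way, for example by lowering the cut-off for areas that received large or intricate changes.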
Conclusion so far: there is no single best regression test approach, and those that exist are context dependent.
But we can learn from them and find out what lies at the basis of their decision-making process. Understanding that will give us the ability to enhance and leverage our own decision-making process, and to optimize our regression pool.
Regression Test Strategy
We may classify regression testing along any dimension:
- Regression on any test level: unit, integration, system, acceptance, etc.
- Regression on any test type: functionality, usability, performance, reliability, etc.
- Regression on any configuration supported by our system or product.
Whichever of these strategies we investigate, the same characteristics and assumptions must be kept in order to maintain a 'clean' regression pool of test cases, and they must be integrated into the process during the life cycle.
Summary
Regression test execution demands that we be proactive throughout the testing life cycle, keeping in mind that the test cases we develop today, in this release, will be integrated into the regression pool of tomorrow's releases. With that in mind, and based on the analysis above, a regression strategy should rely on at least 3 major factors:
- Risk Assessment results and data
- Coverage information
- Past defects logging
Those 3 elements encapsulate most of the data we need to optimize our regression pool, but only once we can clearly say that we have implemented the assumptions mentioned above.
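One possible way to combine the three factors is a weighted score used to rank tests within an execution time budget. This is only a sketch under stated assumptions: the weights, the 0-1 normalization of each factor, and all names (`score`, `pick_within_budget`, the sample data) are hypothetical and would need calibrating against real project data.

```python
def score(test, weights=(0.5, 0.3, 0.2)):
    """Weighted sum of risk, coverage, and defect history (each in 0..1).

    The default weights are illustrative assumptions, not recommendations.
    """
    w_risk, w_cov, w_def = weights
    return w_risk * test["risk"] + w_cov * test["coverage"] + w_def * test["defects"]


def pick_within_budget(tests, budget_minutes):
    """Greedily take the highest-scoring tests that still fit the budget."""
    chosen, used = [], 0
    for test in sorted(tests, key=lambda t: (-score(t), t["name"])):
        if used + test["minutes"] <= budget_minutes:
            chosen.append(test["name"])
            used += test["minutes"]
    return chosen, used


tests = [
    {"name": "TC-30", "risk": 0.9, "coverage": 0.6, "defects": 0.8, "minutes": 30},
    {"name": "TC-31", "risk": 0.2, "coverage": 0.9, "defects": 0.1, "minutes": 20},
    {"name": "TC-32", "risk": 0.7, "coverage": 0.3, "defects": 0.5, "minutes": 45},
    {"name": "TC-33", "risk": 0.1, "coverage": 0.1, "defects": 0.0, "minutes": 10},
]
chosen, used = pick_within_budget(tests, budget_minutes=60)
```

The greedy pass is deliberately simple. It can skip a valuable long test in favour of several short ones, so the result is a starting point for a test manager's judgment rather than a final plan.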
Clearly this is only the 'opening shot'. We still have to investigate how objective the above assumptions are, and whether they fit many cases, meaning many types of products and systems. There are also open questions about how to implement and integrate those factors into a pragmatic way of establishing and maintaining the regression pool effectively and efficiently (beyond the few implementations I have already experienced and participated in). Pursuing those questions is my goal for future work and future articles.