Using The Continuous Testing Benchmark to Build Digital Confidence

Jun 16, 2020

Last month, Sauce Labs released the second annual Continuous Testing Benchmark Report, also known as the CTB. Drawing on the more than three billion tests run on our platform, the report gives quality teams real-world data and best-practice insights to help them understand how they are performing and where they can improve their testing practice. Now in its second iteration, the report also lets Sauce Labs show how our collective customer base is improving over time by comparing against the 2019 numbers.

The CTB is the only report of its kind that allows teams to compare themselves against industry benchmarks. But as more organizations focus their efforts on building digital confidence, how can teams best use the report’s data to improve their own practices and ensure they are delivering flawless digital experiences to their users? In this post, I will break down the four key components of the CTB that define testing success, and show how you can use the report to create success metrics that start your team on the path to true digital confidence.

Platform Coverage

Digital confidence centers around the idea that everyone in your organization knows the applications you’re developing work exactly as expected. But in an economy where consumers have a seemingly endless variety of browsers, operating systems, and mobile devices from which to access those apps, ensuring the experience is flawless for all users can seem daunting. 

In the CTB, Sauce Labs defines the benchmark for excellence as any test that’s run across at least five different combinations of browser/OS/device. The good news is that almost 75% of our customers achieve that benchmark in desktop tests (which is up 12% from last year), while about 63% of mobile tests meet the mark. While this is useful for general benchmarking, we encourage our customers to rely on other data points to understand what kind of platform coverage will build digital confidence in their applications. What user data do you have access to that shows where your users are coming from? What third parties are collecting information on users in your industry? Understanding your customers, and the technology stacks they are working with, will help you home in on where to focus your coverage efforts so that all of your users have the best experience.
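To see where your own suite stands against the five-combination benchmark, a small script can count the distinct platforms each test runs on. The following is a minimal sketch rather than anything taken from the report: the `runs` records, the test names, and the `platform_coverage` helper are all hypothetical, and you would substitute an export from your own test results.

```python
from collections import defaultdict

# Hypothetical run records: (test name, browser, OS, device type).
runs = [
    ("checkout", "Chrome", "Windows 10", "desktop"),
    ("checkout", "Firefox", "Windows 10", "desktop"),
    ("checkout", "Safari", "macOS", "desktop"),
    ("checkout", "Chrome", "Android", "phone"),
    ("checkout", "Safari", "iOS", "phone"),
    ("checkout", "Chrome", "Windows 10", "desktop"),  # repeat run, same platform
    ("login", "Chrome", "Windows 10", "desktop"),
    ("login", "Safari", "macOS", "desktop"),
]

# Benchmark from the CTB: at least five browser/OS/device combinations.
COVERAGE_BENCHMARK = 5


def platform_coverage(runs):
    """Count the distinct browser/OS/device combinations each test ran on."""
    combos = defaultdict(set)
    for test, browser, os_name, device in runs:
        combos[test].add((browser, os_name, device))
    return {test: len(platforms) for test, platforms in combos.items()}


coverage = platform_coverage(runs)
meets_benchmark = {test: n >= COVERAGE_BENCHMARK for test, n in coverage.items()}

for test, n in coverage.items():
    status = "meets" if meets_benchmark[test] else "below"
    print(f"{test}: {n} platform combinations ({status} benchmark)")
```

Using a set means repeated runs on the same platform are counted once, so the number reflects breadth of coverage rather than run volume.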

Efficiency

This metric is all about speed, because what is the point of implementing modern development methodologies if testing can’t keep up? To stay on pace with the release velocity that Agile or CI/CD demands, development teams are now asked to run hundreds, or even thousands, of tests per day. To avoid the bottlenecks these tests can create, running them in parallel is essential.

Sauce Labs measures testing efficiency by calculating the percentage of provisioned test capacity an organization utilizes when its suite is running during peak testing periods, with the benchmark for excellence set at 75%. The good news is that this number is up as well, with almost 78% of customers meeting that 75% threshold. However, one thing we know about development practices is that they are only going to get faster (especially with the advent of AI and machine learning technologies). It’s critical that every team make the most of its test capacity and continually monitor the performance of not just the development pipeline, but also the testing pipeline. Failure to do so leads to bottlenecks that slow down release velocity and threaten digital confidence in your organization.
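If you want a rough sense of your own utilization, one approach is to compare the peak number of tests running at once against the concurrency you have provisioned. The sketch below is an approximation under stated assumptions, not the exact calculation used in the CTB: the session timestamps, the provisioned limit of four, and the helper names are all hypothetical.

```python
# Hypothetical test sessions as (start, end) times in seconds from the start of a build.
sessions = [(0, 60), (0, 90), (10, 70), (60, 120), (95, 130)]

# Hypothetical number of parallel sessions the team has provisioned.
PROVISIONED_CONCURRENCY = 4

# Benchmark from the CTB: 75% of provisioned capacity used at peak.
EFFICIENCY_BENCHMARK = 75.0


def peak_concurrency(sessions):
    """Return the largest number of sessions running at the same moment."""
    events = []
    for start, end in sessions:
        events.append((start, 1))   # a session begins
        events.append((end, -1))    # a session finishes
    # At equal timestamps, process finishes (-1) before starts (+1) so a
    # session ending at t and another starting at t do not overlap.
    events.sort(key=lambda event: (event[0], event[1]))
    running = peak = 0
    for _, delta in events:
        running += delta
        peak = max(peak, running)
    return peak


def capacity_utilization(sessions, provisioned):
    """Peak concurrency as a percentage of provisioned concurrency."""
    return 100.0 * peak_concurrency(sessions) / provisioned


utilization = capacity_utilization(sessions, PROVISIONED_CONCURRENCY)
print(f"Peak utilization: {utilization:.1f}%")
print("Meets benchmark:", utilization >= EFFICIENCY_BENCHMARK)
```

A low number here usually means tests are being queued serially even though parallel capacity is sitting idle.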

Test Run Time

Closely related to our efficiency metric, test run time is also an indicator of speed and of how effective quality is at accelerating release velocity. But it’s not just about going fast; it is also an indicator of quality best practices. In the 2019 Continuous Testing Benchmark, our data showed that tests that are longer than two minutes are 1.5 times more likely to fail, and those longer than seven minutes are twice as likely. Flaky tests that fail again and again can quickly kill trust in automation in your organization, as they lead to stalled pipeline velocity and increased workload as teams struggle to find out what went wrong in a given test.

Because of these stats, Sauce Labs set the benchmark for excellence in Test Run Time at less than two minutes (note that this is calculated as the average run time across any given organization). While there was a promising 10-point increase from last year's data, fewer than 50% of Sauce Labs customers have an average test run time of two minutes or less (across both desktop and mobile). For quality to truly find a foothold in your digital strategy, it’s important to build trust that testing can speed up release cycles rather than create costly bottlenecks. Following tried-and-true best practices for test run time is key: this means ensuring tests are short, atomic, and autonomous. Continuing to implement those best practices in your organization will ensure testing excellence and engender digital confidence.
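Because the benchmark is an average, it is worth looking at both the overall number and the individual tests that run long. Here is a minimal sketch, assuming you can export per-test durations in seconds; the `durations` data and test names are hypothetical.

```python
from statistics import mean

# Hypothetical per-test run times in seconds.
durations = {"login": 45, "search": 80, "checkout": 210, "profile": 95}

# Benchmark from the CTB: an average run time of about two minutes.
# Treating exactly 120 seconds as passing is this sketch's choice.
RUN_TIME_BENCHMARK_SECONDS = 120

average_run_time = mean(durations.values())
meets_benchmark = average_run_time <= RUN_TIME_BENCHMARK_SECONDS

# Individual tests over the threshold are candidates for splitting into
# shorter, atomic tests, even when the overall average looks healthy.
slow_tests = sorted(
    name for name, seconds in durations.items()
    if seconds > RUN_TIME_BENCHMARK_SECONDS
)

print(f"Average run time: {average_run_time:.1f}s (meets benchmark: {meets_benchmark})")
print("Tests over two minutes:", slow_tests)
```

In this made-up suite the average clears the bar, yet one test still runs well over two minutes, which is exactly the kind of outlier an average can hide.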

Test Quality

You could make the argument that test quality is the most important metric for continuous testing excellence. The previous three metrics, while important in their own right, can all be traced back to the idea that the tests themselves must be high quality in order to be considered successful. At the end of the day, the goal is for every test to pass before code is released into production. Failing tests create roadblocks in the pipeline and frustration among development teams. So to build confidence in testing, quality is key.

The CTB defines excellence in test quality as an average test pass rate of 90% or above across an organization. While there was a small uptick in desktop testing pass rates, there is still a lot of room for improvement. For both desktop and mobile, about 75% of Sauce Labs customers have a pass rate of less than 90%. This is a huge area for improvement for many teams that want to succeed in their testing initiatives and build digital confidence. However, it can also be one of the most difficult things to fix, as failures are common and take time to debug. Sauce Labs understands the ongoing struggle of maintaining test quality, which is why we recently announced early access to our Failure Analysis capability. Using machine learning, developers can for the first time get rapid insight into how often the same type of failure repeats itself across a given test suite. By finding and aggregating the most common causes of test failures in this manner, developers can quickly move to address the most pervasive issues and see real improvements in test quality. To learn more, check out our Failure Analysis demo.
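You can get a first read on your own pass rate, and on which failures repeat most often, with a few lines of scripting. The sketch below is a deliberately naive stand-in: it simply counts identical error labels and is not how the Failure Analysis capability works. The `results` records and error names are hypothetical.

```python
from collections import Counter

# Hypothetical results: (test name, passed, error label or None).
results = [
    ("login", True, None),
    ("search", True, None),
    ("checkout", False, "TimeoutException"),
    ("profile", True, None),
    ("cart", False, "NoSuchElementException"),
    ("signup", True, None),
    ("logout", True, None),
    ("payment", False, "TimeoutException"),
    ("filters", True, None),
    ("wishlist", True, None),
]

# Benchmark from the CTB: an average pass rate of 90% or above.
PASS_RATE_BENCHMARK = 90.0

passed_count = sum(1 for _, passed, _ in results if passed)
pass_rate = 100.0 * passed_count / len(results)

# Naive grouping: count how often each error label appears among failures.
failure_counts = Counter(error for _, passed, error in results if not passed)

print(f"Pass rate: {pass_rate:.1f}% (meets benchmark: {pass_rate >= PASS_RATE_BENCHMARK})")
for error, count in failure_counts.most_common():
    print(f"{error}: {count} failure(s)")
```

Sorting failures by frequency gives a simple priority list: fixing the most common cause first tends to move the pass rate the most.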

A sound testing strategy is the key to creating digital confidence in your organization, which empowers teams to continually deliver flawless digital experiences to customers. But achieving testing excellence can be a long and winding road filled with successes and pitfalls. Even if you’re not a Sauce Labs customer, the Continuous Testing Benchmark gives you access to data that allows you to start setting realistic goals for your team in areas that will have an immediate impact on your testing efforts. Start small with one of the above success metrics, understand where your team is today, and start identifying ways that you can effect change to see positive momentum. What’s important to remember is that the eventual goal isn’t simply improving test quality or increasing coverage. Rather, it’s about creating the confidence in your organization that the applications you are building are providing the best experience for your customers and helping your business grow.

To learn more about how Sauce Labs helps some of the largest digital organizations achieve the true benefits of digital confidence, visit our website.

Written by

Alissa Lydon

Topics

Automated testing, Continuous testing
