Browser failure rates

Like any good software company, we track all kinds of data. We shared some of it in this blog post. In this article we share some more: another kind of data that we’ve been tracking indirectly, and one that turned out to be quite surprising.

Sauce Labs provides tools for automating real browsers. We have metadata about millions of browser sessions our customers have used to test their actual websites. As everyone knows, sometimes your software doesn’t work. Maybe it crashes or maybe you had a bug. Selenium testing on Sauce is no different – almost 100% of the time, nothing goes wrong; our reliability in the last few months is at least 99.94%. In fact, as you’ll see later, we’re now more reliable than modern browsers. But sometimes there’s an error that we think may have been our fault. When there is, we refund the customer and work to fix it. We also record that there was an error.

 

The browsers that cause the most errors

Sometimes job errors were caused by connectivity problems or by bugs in our code. But some of the time, it was simply the browser itself crashing. For each error, it would take real investigation to figure out what caused it, and we have thousands of those errors. But our code and our customers’ code are independent of the browser being tested (which is the whole point of both Sauce OnDemand and Selenium), so if we only look at relative error rates broken down by browser, we can see which browsers are least reliable. Nobody else has this data. As far as we can tell, this is the *only* statistically significant study of browser reliability on real webpages. Check for yourself – it’s not out there.
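To make the "relative error rates broken down by browser" idea concrete, here is a minimal sketch of the aggregation. It is not our production code: the record layout (browser, version, errored) and the function name are assumptions made for illustration, and the sample records are invented placeholders rather than real measurements.

```python
from collections import defaultdict

def error_rates_by_browser(jobs):
    """Return {(browser, version): fraction of jobs that errored}.

    `jobs` is an iterable of (browser, version, errored) tuples.
    This record layout is hypothetical, chosen only for the example.
    """
    totals = defaultdict(int)
    errors = defaultdict(int)
    for browser, version, errored in jobs:
        key = (browser, version)
        totals[key] += 1
        if errored:
            errors[key] += 1
    # Every key in totals has at least one job, so the division is safe.
    return {key: errors[key] / totals[key] for key in totals}

# Made-up placeholder records, for illustration only.
sample_jobs = [
    ("browser-a", "1", False),
    ("browser-a", "1", True),
    ("browser-b", "2", False),
    ("browser-b", "2", False),
]
rates = error_rates_by_browser(sample_jobs)
```

Because every browser is run through the same infrastructure and the same kind of customer tests, errors that are not the browser's fault should inflate all groups roughly equally, which is why only the relative differences between groups are meaningful.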

Browser fight!

Error rates (percent) by browser and version*

The numbers in the graph above are misleading for some of the browsers. Here’s why:

  • We stopped supporting Firefox 3.5 a while ago. That means all FF3.5 jobs are from a long time ago, before we’d had much time to streamline and error-proof our system. All browsers would have a higher error rate if we only looked at jobs from a long time ago.

  • IE9, FF4, FF5, and Opera 11 are new. This is the opposite of the FF3.5 issue. All jobs run recently have lower error rates because our service continues to become more reliable as we fix bugs that our users discover. These browsers’ jobs were all run recently.

Fair browser fight

This is the same graph as above, with the unfairly advantaged or disadvantaged browsers removed.

Error rates (percent) by browser and version*

IE6 is worst

If you’re tech savvy, most of these results aren’t very surprising.

  • IE6 is one of the worst browsers. Each newer IE is slightly more stable but still not good.

  • Firefox is solid.

  • Google Chrome is the big winner overall. They force you to update to their newest version every session, so their error rate is an average across all their versions, but it’s still significantly better than even the newest versions of the other browsers.

  • Opera is fine.

The shocker is the Safari story. Safari 3 – the oldest version of Safari – is extremely reliable. Safari 4 is a good deal above average. And then there’s Safari 5.

The Safari 5 surprise

Safari 5, the latest in browser technology from the most valuable company in the world, is by far the worst on the market. Go have another look at that number – it’s almost twice as bad as the second-worst, the oft-maligned IE6. And that comparison is unfair to IE6. See, Safari 5 was released recently, like Opera 11, IE9, and FF5. Those are the browsers whose error rates were unfairly good. Like them, all Safari 5 jobs were run on our newer, ultra-stable OnDemand infrastructure. We should expect it to have an extremely *low* error rate, like they do, but instead its error rate is ten times worse.

At first we thought the high error rate could be because we always run Safari on Windows, while Safari is made by Apple. That’s easy to dismiss, because earlier versions of Safari were fine. Then we explored the possibility that the errors were caused by always running Safari 5 in proxy mode (for arcane Selenium reasons). So we looked at the average error rate for non-Safari jobs run in proxy mode, and it put things back into perspective.

Proxy mode browser fight

Error rates (percent) by browser and version*

These are the error rates for browsers running in proxy mode (we don’t have enough data for Opera). Notice the new scale on the y-axis.

Surprise! The worst browser wasn’t a surprise after all

As you can see, Firefox actually seems to perform better in proxy mode, so we can’t say that proxy mode is always worse. Chrome is unaffected. Safari 5 is still the worst Safari, but it’s no longer a huge outlier overall. IE7, the best IE in proxy mode, is on par with the worst Firefox, which is Firefox 3.6. IE8 is, surprisingly, much worse than IE7. And the king of being a bad browser, once again, is IE6.

*Percentages are 1 minus the lower bound of the 95% confidence Wilson score interval for the success rate.
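For readers who want to reproduce the metric in the footnote above, here is a minimal sketch using the standard Wilson score interval formula. The function names are ours for this example, and z = 1.96 (the usual two-sided 95% value) is an assumption; the footnote does not say whether a one-sided or two-sided bound was used.

```python
import math

def wilson_lower_bound(successes: int, total: int, z: float = 1.96) -> float:
    """Lower bound of the Wilson score interval for a success proportion.

    z = 1.96 corresponds to a two-sided 95% interval (an assumption here).
    """
    if total <= 0:
        raise ValueError("total must be positive")
    p_hat = successes / total
    z2 = z * z
    centre = p_hat + z2 / (2 * total)
    margin = z * math.sqrt(p_hat * (1 - p_hat) / total + z2 / (4 * total * total))
    return (centre - margin) / (1 + z2 / total)

def pessimistic_error_percent(errors: int, total: int, z: float = 1.96) -> float:
    """Error rate in percent, computed as 1 minus the Wilson lower bound
    on the success rate, as described in the footnote."""
    successes = total - errors
    return 100.0 * (1.0 - wilson_lower_bound(successes, total, z))
```

The point of using the lower bound rather than the raw failure fraction is that it penalizes small samples: a browser with 1 error in 10 jobs gets a much higher reported error rate than one with 100 errors in 1,000 jobs, even though the raw rate is 10% in both cases.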

Note: The data examined in this article was derived from logs of tests our clients ran, where the browser crashed, the test aborted prematurely or something else went “wrong”. We aren’t testing individual browsers against features or rendering or compliance with standards; only aggregating statistics about which browsers didn’t complete their tasks for some reason, across our millions of tests.

Our data is about stability of individual browsers, given they’re being automated with Selenium. For data about feature tests for individual versions of individual browsers, see ACID and its kin.