This is what our build looked like. He was not happy. Notice how his eyes are red. That’s because of failing tests, which as we know turn builds red. These failing tests weren’t normal failing tests, which are actually pretty happy, because they are telling you how to fix your broken stuff. No, these tests were intermittent. That’s largely the fault of the fact that they were selenium tests – go figure, the selenium shop that sells selenium as a service is having selenium issues, lol (but sadface).
Well, to get out of sadfaceland, we dug into one of the sadtests and saw something surprising. It was a case of selenium looking for an element before it appeared. Wait, what? We already know how to fix that. In fact, we already told the whole world how to fix it: with spinning! Basically, the way a spinassert works is by trying the assertion over and over until it passes. This is a big deal with selenium because trying to interact with an element on a page before the page has loaded is a frequent source of flakey tests. We even went so far as to write Test::Right (source on github), a ruby test framework which goes out of its way to make it impossible for you to do write selenium tests without spin asserts. So what happened?
Well, we don’t use Test::Right for all our own tests. Test::Right is ruby and our tests are written in the same language as our back-end; python. And while we were using spin-style asserts for many of our tests, some of them were written before we discovered the magic of spinning, and some were written when we just didn’t remember to use spinning. Also, there were some things that were not assertions but still required spinning; like, if you want to click on an element in a page, it makes sense to spin-wait for that element to be present before clicking on it. Test::Right does the right thing in all these cases; we just had to be more like Test::Right in-house, to prevent sadness from creeping back into the codebase again.
So the fix was clear: modify our own test framework to always spin-wait all the time for every DOM element to show up, every time anyone tried to interact with a DOM element, automatically. We did that, and it made the butler happy. All of a sudden we could trust the build again, and it didn’t take hours to wait for a nonflakey build so that we could deploy new changes. This in turn meant that we never had to hotpatch production when there was an issue, which meant we didn’t have to spend time cleaning up hotpatches when a deploy happened. It also unblocked people who wanted to push new changes to trunk. Basically the whole dev process got better.
So here’s the thing – our customers are having the same issue. Not everyone uses Test::Right. We know this because we have a support team and customers contact us all the time asking for help with flakey tests. In our over 3,000 support requests, race-condition flakey tests are the #1 cause of pain. The shocker is that except for Test::Right (and now our own internal python nose framework), nobody has this universal fix.
So we’re hoping to make test frameworks for all the common languages – PHP, Java, Python, at least – that do what Test::Right did. We already have our own PHPUnit extension (thanks Jan Sorgalla!) which will probably be the first to get the Test::Right makeover.
Sadly, we are a focused team, and we have many tasks to accomplish, so this fix to selenium client libraries will likely not be imminently released by us. Thus we decided to at least tell you all how to Fix Everything on your own side. Now that I’ve done that, back to programming! Good night and good luck.