A Brief History-In-Progress of Running Tests in Parallel with PHPUnit

December 1st, 2011 by joe
  • PHPUnit is great.

One of the great things about PHPUnit is that it does its job so well that there’s only one notable testing framework for PHP. Other languages have messy competition between multiple frameworks, and PHP gets to have just one.

  • Like most testing frameworks, PHPUnit runs tests one-at-a-time.

This makes sense because unit tests are usually fast and so you don’t have to wait for them to finish. But these days, many PHP shops also use PHPUnit to kick off their Selenium tests. And because Selenium tests use real browsers and exercise the entire end-to-end system, they are by nature much slower than unit tests. And herein lies one of the biggest speed bumps preventing PHP shops from delivering new code to their users quickly. Here comes the metaphor!

Your tests are people, and running them is feeding them. They all want to eat at the same time (lunch time is build time.) They get in line at the hot dog stand, run by the hot dog vendor, PHPUnit. PHPUnit quickly hands each of them a preheated hot dog, one at a time. Everybody gets fed.

Then you find out about this new kind of person, the Selenium test. They’re more powerful, but their meal takes a long time to finish. No problem! You give PHPUnit the facilities and training to make Selenium food by downloading PHPUnit_Selenium. But now, each time it’s a Selenium test’s turn in line, everyone has to wait for it to finish.

So maybe you heard about this restaurant across the street called Sauce Labs that specializes in Selenium food. They have hundreds of chefs waiting to cook for you. So you tell PHPUnit to use Sauce Labs.

But PHPUnit was brought up in a world of hot dogs, without Selenium or Sauce Labs. It doesn’t know how to multitask. It leads each Selenium test over to Sauce Labs alone, then sits next to it and watches while one Sauce chef cooks and 499 Sauce chefs anxiously twiddle their thumbs. PHPUnit needs to send all the Selenium tests to Sauce Labs at the same time, to take advantage of the idle Sauce chefs and get everyone fed faster.

Many people, including several of our customers, have already realized how valuable PHPUnit parallelism would be.

They built their own parallelism support in-house. In particular, PBWorks’ OMGUnit, built on top of PHPUnit, launches their test files simultaneously to take advantage of Sauce’s cloud capacity.

That’s great for you if and only if you’re PBWorks, or one of the many PHPUnit users who have cooked up your own system that runs on top of PHPUnit. We don’t want all our customers to have to suffer those indignities. We wanted a parallelism solution to hand to all our customers, that works out of the box, regardless of setup.

I talked to PBWorks, and they graciously open sourced OMGUnit. The original plan was to make OMGUnit general enough to hand to our customers. I played with it until I felt like I had a handle on how parallelism works, but I realized there was a bigger opportunity here. The demand for PHPUnit to support parallel testing was risking PHPUnit’s future as the One And Only PHP test framework, and it didn’t have to.

  • So we contributed parallel test execution to PHPUnit core.

It’s fully cross-platform. As of Nov 30, 2011, it has not yet been released, and I probably have more work to do before it gets released. But you can get a working preview of it today! More on that later.

Once it’s released, you can run PHPUnit tests in parallel with a command line parameter:

-j|--jobs <count>

or with the following attribute on your TestCase class:

/**
 * @runTestsInParallel <count>
 */

where <count> is the maximum number of processes you want PHPUnit to use in parallel.
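To make the idea of a jobs limit concrete, here is a minimal sketch in Python. It is not PHPUnit's implementation, only an illustration of running independent commands with a cap on how many run at once. The function name `run_in_parallel` and the stand-in commands are assumptions made for this example.

```python
import subprocess
import sys
from concurrent.futures import ThreadPoolExecutor


def run_in_parallel(commands, jobs):
    """Run each command as a subprocess, with at most `jobs` running at once.

    Returns the exit codes in the same order as `commands`.
    """
    def run_one(command):
        # Each worker thread blocks on its own child process.
        return subprocess.run(command, capture_output=True).returncode

    with ThreadPoolExecutor(max_workers=jobs) as pool:
        return list(pool.map(run_one, commands))


if __name__ == "__main__":
    # Stand-ins for slow test files: three commands that each sleep briefly.
    slow_test = [sys.executable, "-c", "import time; time.sleep(0.2)"]
    print(run_in_parallel([slow_test] * 3, jobs=3))
```

With `jobs=1` the commands run one at a time, which is the hot dog line from the metaphor above. Raising `jobs` lets slow tests wait on their remote browsers at the same time.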

  • It’s due for release in PHPUnit 3.7

In the meantime you can get a preview version of it from the Sauce pear channel, but it’s built off of some old versions of PHPUnit’s supporting libraries. There are steps for getting it working in our docs.

You’ll simply want to replace:

pear install -a saucelabs/PHPUnit_Selenium_SauceOnDemand

with:

pear install -a saucelabs/PHPUnit

Announcing Selenium 2.8.0 support

October 20th, 2011 by Santiago Suarez Ordoñez

Selenium 2.8.0 is Selenium’s latest release. It includes some important bug fixes as well as great features people have been waiting for:

  • setFileDetector and support for file uploads in RemoteDriver for Java
  • Native event support in Firefox 7

Here’s the changelog and the official announcement in Selenium’s blog for more information.

The current default version for our service is 2.6.0. But once thoroughly tested, we’ll be announcing the move to 2.8.0 as the default version for all of our users’ tests to run. In the meantime, you can start using this new version right now by adding the following Desired Capabilities/JSON key-value:

"selenium-version": "2.8.0"

If you see any issues after moving your tests to this new release, we definitely want to hear about it. And remember, once we move everyone over, you’ll still be able to test with previous versions using the “selenium-version” capability outlined above, in case you notice any issues with the default version.

For more information about the “selenium-version” flag, you can check our docs on Sauce OnDemand additional configuration.
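As a rough illustration of where that key goes, the sketch below builds a capabilities dictionary in Python and prints it as JSON. Only the "selenium-version" key comes from this post; the helper name `with_selenium_version` and the other capability values are assumptions chosen for the example.

```python
import json


def with_selenium_version(capabilities, version):
    """Return a copy of `capabilities` that requests a specific Selenium version."""
    updated = dict(capabilities)
    updated["selenium-version"] = version
    return updated


# Hypothetical base capabilities; only "selenium-version" comes from the post.
base = {"browserName": "firefox", "platform": "XP"}
caps = with_selenium_version(base, "2.8.0")
print(json.dumps(caps, indent=2))
```

Keeping the version in one helper makes it easy to pin an older release later if the new default causes trouble.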


Announcing Selenium 2.7.0 support

September 26th, 2011 by Santiago Suarez Ordoñez

Selenium 2.7.0 is Selenium’s latest release. Released just Friday, it includes some important bug fixes to make your tests in the cloud even more reliable than before. Here’s the official changelog for more information.

Once thoroughly tested, we’ll be announcing the move to 2.7.0 as the default version for all of our users’ tests to run. In the meantime, you can start using this new version right now by adding the following Desired Capabilities/JSON key-value:

"selenium-version": "2.7.0"

If you see any issues after moving your tests to this new release, we definitely want to hear about it. And remember, once we move everyone over, you’ll still be able to test with previous versions using the “selenium-version” capability outlined above, in case you notice any issues with the default version.

For more information about the “selenium-version” flag, you can check our docs:
http://saucelabs.com/docs/sauce-ondemand#selenium-version.

Happy testing!


The Surprising Worst Browser

August 17th, 2011 by joe

We have lots of data

Just like any good software company, we track all kinds of data. Last time we shared some of it with you, you loved it, and that data wasn’t even the good stuff. We thought it was, but it dawned on us recently that we’ve indirectly been tracking something better. And when we realized where to look, we found something unexpected.

In case you haven’t heard, Sauce OnDemand is a tool for automating real browsers (try it out! it’s awesome). We have metadata about millions of browser sessions our customers have used to test their actual websites. As everyone knows, sometimes your software doesn’t work. Maybe it crashes or maybe you had a bug. Sauce OnDemand is no different – almost 100% of the time, nothing goes wrong; our reliability in the last few months is at least 99.94%. In fact, as you’ll see later, we’re now more reliable than modern browsers. But sometimes there’s an error that we think may have been our fault. When there is, we refund the customer and work to fix it. *We also record that there was an error.* Does that seem significant enough to be italicized? If it does, you’re smarter than we were.


We know which browsers cause the most errors

See, sometimes job errors were caused by connectivity, or bugs in our code, or maybe neutrinos from outer space. But some of the time, they were the browser itself crashing. For each error, it would take real investigation to figure out what caused it, and we have thousands of them. But our code and our customers’ code are independent of the browser being tested (which is the whole point of both Sauce OnDemand and Selenium), so if we only look at relative error rates broken down by browser, we can see which browsers are least reliable. Nobody else has this data. This is the *only* statistically significant study of browser reliability on real webpages. Check for yourself – it’s not out there.

Browser fight!

Error rates (percent) by browser and version*

The numbers in the graph above are misleading for some of the browsers. Here’s why:

  • We stopped supporting Firefox 3.5 a while ago. That means all FF3.5 jobs are from a long time ago, before we’d had much time to streamline and errorproof our system. All browsers would have a higher error rate if we only look at jobs from a long time ago.
  • IE9, FF4, FF5, and Opera 11 are new. This is the opposite of the FF3.5 issue. All jobs run recently have lower error rates because our service continues to become more reliable as we fix bugs our users discover. These browsers’ jobs were all run recently.

Fair browser fight!

This is the same graph as above, with the unfairly advantaged or disadvantaged browsers removed.


Error rates (percent) by browser and version*

Yeah, IE6 sucks

If you’re tech savvy, most of these results aren’t very surprising.

  • IE6 is one of the worst browsers. Each newer IE is slightly more stable but still not good.
  • Firefox is solid.
  • Google Chrome is the big winner overall. They force you to update to their newest version every session, so their error rate is an average across all their versions, but it’s still significantly better than even the newest versions of the other browsers.
  • Opera is fine.

The shocker is the Safari story. Safari 3 – the oldest version of Safari – is extremely reliable. Safari 4 is a good deal above average. And then there’s Safari 5.

The Safari 5 surprise

Safari 5, the latest in browser technology from the most valuable company in the world, is by far the worst on the market. Go have another look at that number – it’s almost twice as bad as second-worst, the oft-maligned IE6. And that comparison is unfair to IE6. See, Safari 5 was released recently, like Opera 11, IE9, and FF5. Those are the browsers whose error rates were unfairly good. Like them, all Safari 5 jobs were run on our newer, ultra-stable OnDemand infrastructure. We should expect it to have an extremely *low* error rate, like they do, but instead its error rate is ten times worse.

At first we thought the high error rate could be the result of the fact that we always run Safari on Windows, while it’s made by Apple.  That’s easy to dismiss, because earlier versions of Safari were fine. Then we explored the possibility that the errors were caused by always running Safari 5 in proxy mode (for arcane Selenium reasons). So we looked at the average error rate for non-Safari jobs run in proxy mode, and it put things back into perspective.

Proxy mode browser fight

Error rates (percent) by browser and version*

These are the error rates for browsers running in proxy mode (we don’t have enough data for Opera). Notice the new scale on the Y axis.

Surprise! The worst browser wasn’t a surprise after all

As you can see, Firefox actually seems to perform better in proxy mode, so we can’t say that proxy mode is always worse.  Chrome is unaffected. Safari 5 is still the worst Safari, but it’s no longer a huge outlier overall. IE7, the best IE in proxy mode, is on par with the worst Firefox, 3.6. IE8 is surprisingly much worse than IE7. And the king of being a bad browser, once again, is IE6. Hail to the king, baby.

*Percentages are 1 minus the lower bound of the 95% confidence Wilson score interval for the success rate.
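For readers who want to reproduce that kind of figure on their own data, here is a minimal Python sketch of the calculation described in the footnote. The function names and the sample counts are assumptions for illustration; they are not Sauce Labs' code or data.

```python
import math


def wilson_lower_bound(successes, total, z=1.96):
    """Lower bound of the Wilson score interval for a success proportion.

    z=1.96 corresponds to a two-sided 95% confidence level.
    """
    if total == 0:
        return 0.0
    p = successes / total
    z2 = z * z
    centre = p + z2 / (2 * total)
    margin = z * math.sqrt((p * (1 - p) + z2 / (4 * total)) / total)
    return (centre - margin) / (1 + z2 / total)


def reported_error_rate(errors, total):
    """Error rate in percent: 1 minus the Wilson lower bound on the success rate."""
    return 100 * (1 - wilson_lower_bound(total - errors, total))


# Made-up job counts: the same raw error rate (1%) at two sample sizes.
print(reported_error_rate(errors=1, total=100))
print(reported_error_rate(errors=100, total=10000))
```

The point of using the lower bound rather than the raw success rate is that a browser with only a handful of jobs does not look artificially reliable: with fewer samples the bound is looser, so the reported error rate is higher.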


Lessons Learned: Migrating from Selenium 1 to Selenium 2

July 21st, 2011 by Roger Hu

This guest blog post was written by Roger Hu, Software Engineer at Hearsay Social.

At Hearsay Social, we’ve upgraded our testing environment to use Selenium 2. We made the switch because there was enough evidence to suggest a huge 2-4x performance increase. Having learned a few lessons along the way, we thought it would be helpful to share what we found, especially for those who are considering making the transition.

  • Since Selenium 2 is redesigned to leverage what works best for the browser, whether it’s an NPAPI plugin in Firefox or a DLL module for IE, we’ve discovered a huge performance gain, especially in Internet Explorer (IE) browsers that have much slower JavaScript engines. The new approach seems to allow us to run Selenium more conveniently on Internet Explorer browsers without the hassle of changing the security options because of all the exceptions that were thrown as a result of the older JavaScript-based architecture.
  • Selenium 2 gets closer to simulating the behavior of a user on a browser.  In Selenium 2, the DOM element that is actually clicked is determined by the X/Y coordinates of the mouse event. Therefore, if you attempt to search for a DOM element that is hidden or obstructed by another element, the top element will always be fired and you might encounter ElementNotVisibleException errors from the Selenium server. You need to keep this issue in mind when rewriting your tests, since Selenium 1 versions may not have had this restriction. (We use the Django web framework and the popular django-debug-toolbar, which adds a popup overlay in our web application that has to be disabled in our application during Selenium tests.)
  • We’ve found that the new Selenium 2 WebDriver-based API is easier to train our developers to use. The documentation for Selenium 2 is still somewhat sparse, especially for the updated Python bindings, so digging into the source code (in our case, remote/webdriver.py and remote/webelement.py code) is still the best way to learn what API commands are available. While Java developers may have access to the WebDriverBackedSelenium class, which can run existing Selenium 1 code while leveraging the WebDriver-based API, we didn’t find any similar support for Python. So we took the plunge and refactored most of our tests.

    webdriver/remote/webelement.py:

        @property
        def tag_name(self):
            """Gets this element's tagName property."""
            return self._execute(Command.GET_ELEMENT_TAG_NAME)['value']
    
        @property
        def text(self):
            """Gets the text of the element."""
            return self._execute(Command.GET_ELEMENT_TEXT)['value']
    
        def click(self):
            """Clicks the element."""
            self._execute(Command.CLICK_ELEMENT)
    
        def submit(self):
            """Submits a form."""
            self._execute(Command.SUBMIT_ELEMENT)
    
        def clear(self):
            """Clears the text if it's a text entry element."""
            self._execute(Command.CLEAR_ELEMENT)

On the server-end, it’s important to study how the client API is sending remote commands by reviewing the JsonWireProtocol document posted on the Selenium Wiki, especially since Sauce Labs provides you with the raw logs to see what commands are actually being issued by the client.

  • While experimenting with Selenium 2, we found it much easier to test out the new WebDriver API by downloading and running the Selenium server locally. This way, your connection won’t constantly time out as a result of using your Sauce Labs account, giving you more freedom to experiment with all the various commands. If you need to run browser tests against an external site while using your own machine to drive the browser interactions, you can set up a reverse SSH tunnel and then experiment with the Selenium 2 API by setting debugger breakpoints and testing out the API bindings. In the long term, though, you definitely want to use Sauce Labs for hosting all the virtual machines in the cloud for running your browser tests!
  • If you’re interested in using Firebug to help debug your application, Selenium 2 also provides a way to inject Firefox profiles. You can create a Firefox profile with this plug-in extension, and Selenium 2 includes an API that will base-64, zip-encode the profile that will be downloaded by the remote host. Note that this approach works best if you’re running the Selenium server locally, since using it over a Sauce Labs instance only gives you access to view the video.
  • Selenium 2 continues to be a moving target with its API, so you’ll want to keep up to date with any release notes posted on the Selenium HQ blog. Most recently, we found that the toggle() and select() commands have not only been deprecated but removed completely from the implementation. If you try to issue these commands, the Selenium server simply doesn’t recognize them and WebDriverExceptions are raised. The best thing to do is look at the Selenium version number. In this particular example, version 2.0.0 (three decimal places) is used to represent the release candidate of the latest Selenium build. You may also start your .jar files with the -debug flag to watch how your client bindings send API commands to the Selenium server.
20:38:02.687 INFO - Java: Sun Microsystems Inc. 20.1-b02
20:38:02.687 INFO - OS: Windows XP 5.1 x86
20:38:02.703 INFO - v2.0.0, with Core v2.0.0. Built from revision 12817
  • Selenium 1 users will find that the is_text_present() and wait_for_condition() commands no longer exist; they are replaced by a more DOM-driven approach of selecting the element first before firing click() events or retrieving attribute values through get_attribute(). You no longer need wait_for_condition() for page loads. Instead, you call implicitly_wait() with a timeout so that find_element_by_id() waits for DOM elements to appear between page loads.
  • Lastly, we’ve noticed in the Selenium discussion groups that there are often questions about how to deal with concurrent Ajax requests during your tests. In many test frameworks, there’s the concept of setup and tear down of the database between each set of tests. One issue that we encountered is that if your browser is issuing multiple requests, you’re better off waiting for the Ajax requests to complete in your tear down function, since the requests could arrive when the database is in an unknown state. If this happens, then your Selenium tests will fail and you’re going to spend extra time trying to track down these race conditions. If you’re using jQuery, you can check the ajax.global state to determine whether to proceed between pages (i.e. execute_script("return jQuery.active === 0")). You’ll want to keep looping until this condition is satisfied (for an example of implementing your own wait_for_condition() command, click here.)
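The polling loop from the last tip can be sketched roughly as follows. This is a minimal illustration, not the implementation linked above: the function name `wait_for_ajax_idle`, the timeout defaults, and the choice of exception are assumptions. It only relies on the driver having an execute_script() method and on the page using jQuery.

```python
import time


def wait_for_ajax_idle(driver, timeout=10.0, poll_interval=0.1):
    """Poll until jQuery reports no in-flight Ajax requests, or raise on timeout.

    `driver` is any object with an execute_script(script) method, such as a
    Selenium 2 WebDriver instance.
    """
    deadline = time.time() + timeout
    while True:
        if driver.execute_script("return jQuery.active === 0"):
            return
        if time.time() >= deadline:
            raise RuntimeError("Ajax requests still pending after %.1fs" % timeout)
        time.sleep(poll_interval)
```

Calling this at the start of your tear down function, before the database is reset, is one way to keep late-arriving requests from hitting a half-cleaned database.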

Hope you find these tips helpful for migrating over to Selenium 2. Happy testing!


New Bamboo Sauce Release (Version 1.3.1)

June 24th, 2011 by Jason Smiley

Last summer, we introduced the Bamboo Sauce plugin at Atlas Camp, making it push-button simple for Bamboo users to run their Selenium tests on the Sauce OnDemand cloud. This post outlines enhancements to Bamboo Sauce that bring all the advances in Sauce OnDemand to the growing Bamboo community.

We’ve made a few enhancements to the Atlassian Bamboo Sauce OnDemand plugin over the last couple of weeks. The plugin allows you to set the Selenium configuration (e.g. browser, operating system) at the Bamboo build configuration level, making it a painless exercise to run your integration tests against a variety of different browsers using Sauce OnDemand without having to make coding changes.

Selenium 2 Support
The Bamboo Sauce OnDemand plugin now supports Selenium v1 or Selenium v2. If Selenium v2 support is selected, the plugin will set some additional environment variables that reference the selected Selenium configuration, which can be used as part of your tests, e.g.

String seleniumUrl = System.getenv("SELENIUM_URL");
String browser = System.getenv("SELENIUM_BROWSER");
String version = System.getenv("SELENIUM_VERSION");
String platform = System.getenv("SELENIUM_PLATFORM");
DesiredCapabilities capabilities = new DesiredCapabilities(browser, version, platform);
capabilities.setCapability("name", "Your Selenium 2 Test");
WebDriver driver = new RemoteWebDriver(new URL(seleniumUrl),capabilities);

Sauce Job Results in Bamboo UI
The Bamboo Sauce OnDemand plugin now supports embedding the Sauce Job information within the Bamboo user interface.

In order for the Job results to appear in the Bamboo UI, Sauce OnDemand needs to be told what the Bamboo build number is. The easiest way to do this is to use the selenium-client-factory to set the build number by running the following:

Selenium selenium = SeleniumFactory.create();
//cast the Selenium instance to SauceOnDemandSelenium
SauceOnDemandSelenium sauce = (SauceOnDemandSelenium) selenium;
//set the build number
sauce.setBuildNumber(System.getenv("BAMBOO_BUILDNUMBER"));

If you’re not using the selenium-client-factory, you can configure your tests to tell Sauce OnDemand the Bamboo build number by invoking the Sauce REST API.

Selenium-Client-Factory Usage Notes
We have also updated the selenium-client-factory library, which makes running your Selenium tests against either a local Selenium server or Sauce OnDemand a cinch.

Selenium example:

Selenium selenium = SeleniumFactory.create();

WebDriver example:

WebDriver selenium = SeleniumFactory.createWebDriver();

Instantiating your Selenium tests using the above mechanism will automatically use the configuration items specified on the Bamboo plan. That is, you won’t need to explicitly reference the environment variables that Bamboo sets – they will be used by the factory logic.


Full-on Cloud Dev with Turnkey Integration Testing

June 21st, 2011 by John Dunham

Three key software trends motivated us to form Sauce Labs:

  • The rise of agile practice
  • The domination of open source in the development and deployment process
  • The rise of PaaS

Those three flows suggested that mainstream approaches to software were in for major change and we wanted to help with that. On the rise of PaaS, we asked ourselves “if we move all our dev tools to the cloud, how are users going to test that stuff?” “With machines under their desk?” “No, we don’t think so” we answered ourselves.

And so today is a red-letter day at Sauce Labs.  We proudly stand beside CloudBees supporting their newly announced  CloudBees Ecosystem.  CloudBees shares our vision that manual and automated integration testing must be an integral part of any cloud development system.  We believe in their vision that as the line between development and operations continues to blur, and deployment is in the cloud, the full development process must therefore live in the cloud.

Paraphrasing Damon Runyon, the race is not always to the swift, but that’s where the smart money is.  The software teams that deliver new value to customers the fastest are the ones most likely to win.  Embracing a thoughtful balance of automated and manual testing, dev teams take on appropriate levels of risk as they accelerate deploying new features and value to their users.  By inviting Sauce Labs to participate in their Ecosystem, CloudBees showed their keen understanding of this, and their view of the puzzle pieces necessary to make seamless dev-to-deploy in the cloud the game-changer it’s destined to be.

So, right on, CloudBees.  Congrats on your launch!

 


Sauce Scout supports JIRA

June 9th, 2011 by The Sauce Labs Team

At Sauce Labs, we believe a video is worth 10,000 words. You should watch below to learn about our new Sauce Scout + JIRA integration.

*with apologies to our friends at Atlassian for the missing “s” in the demo


We’ve been watching you (and we have graphs)

June 7th, 2011 by joe

Lots of teams run tests with Sauce, and we’ve been collecting data on them. Who doesn’t like data diving? Not us. We love it. So, like with any data dive, we have started to notice some patterns with the way people run their tests. Here, okcupid-style, are all the things we know about you just from how many minutes you use testing and when. The graphs you’ll see are minutes used by day for each day in an example customer’s tenure at Sauce, starting with day 0 as the first day they ran a test with us. We’ve broken them down by archetype. These are all actual customer data being shown to the world for the first time, but the names have been removed to protect the innocent (which is us (from lawsuits)).


The Addict
These are scatter plots that increase over time. They represent on-demand use by everyone in a company as it either gets adopted by more people in the company or as more tests get added to the build. The average minutes per day seen in rampups has great range, from 25 per day to 5000. Larger companies tend to fall into this category, as their dev infrastructure slowly switches over to using us. The ones that run many tests usually have high degrees of parallelism, from 20 to 50 tests at a time. They sometimes start out never parallelizing and then one day start running all their tests in parallel, and when they do that, their usage tends to increase more sharply in the immediately following days. They tend to be companies that aren’t software-as-a-service, like a travel agency, a sports site, a hospital finder, or a business intelligence consulting agency.

 


The Agile Shop
Yes, I said “agile,” and I realize it cost me 20 hipster points. These are the folks who, after a brief warmup period, start using us random amounts within a somewhat fixed range. These tend to be on the high usage side, averaging 1000 to 3000 minutes per day, and with high parallelism, topping out between 60 and 100 tests in parallel. They increase their use slowly over time, but it’s hard to tell with all the noise and the high volume. They’re companies that enter the game with a lot of their own in-house Selenium tests that they switch over to using Sauce, usually with a software-as-a-service model, like an online gaming site or ecommerce. This category of companies also includes ones that use us as a platform and sell a service that leverages ours.

 


The Daily Builder
These are characterized by flat lines of dots that indicate the same number of minutes being used each day. The number of minutes jiggles as latency affects the total number of tests, and on some days, possibly when a build fails and has to be restarted, the line jumps up. The line sometimes changes height when the company decides to run more or fewer tests in its daily build, or to change which browsers they run their tests in. The total number of minutes used in a daily build is usually not very much – between 80 and 200, with varied parallelism. Some don’t parallelize at all, while some get up to 25 or 30 tests in parallel. They tend to be very plugged-in companies in the tech community, like social media startups or very famous tech companies who won’t let us mention them by name. In this example, the customer ran one test 140 days before they really adopted us, and I removed it from the graph as an outlier; that’s why the day axis starts at 140 instead of at 0. This company is on the extreme low end of usage for a daily builder, but is the best example of a very flat line.

 


The Zombie
These are users who signed up and paid for a subscription but never ran any tests. They pay us anyway. Maybe they plan to adopt us soon! There are not many of these and they don’t have much in common. Their graphs are boring, so we plan to send them all a nice note saying we’re canceling their subscriptions and they can re-enter their credit card when they’re ready to provide more entertaining graphs.

 


The Abandoner
These are users who ran few minutes for a little while, with no parallelism, and then left. If you are one of these, you are a scarce resource and we want to hear from you! Please tell us why you left. We want to fix it.

For this graph I’ve added a day of 0 minutes where today would be, to make the graph wider. All the other graphs end at today.

 


The Contractor
These guys run lots of tests with huge gaps in between testing cycles. Note the vertical gaps in the graph around days 30, 120, 180, and 300. Some of them are companies that are obviously consultants and some aren’t. We think they’re using us to test the webpages they’re building for someone else while they have a contract. There aren’t enough of them to generalize the number of tests they run or how parallel they get. They’re usually design agencies.

 


The Boomerang
These are guys who used us for a little while, left for many days, and then came back in force. They were probably people with some weird dev infrastructure that took work to integrate us. They might have done an exploratory sprint to see how we worked without integrating us into anything, then had to drop us for a while before they could invest the time to integrate. If you are one of these, we’re sorry we were hard to adopt. Please tell us what the difficulty was so we can smooth it out!

 


The Test Czar
These are characterized by dots that seem to form dotted lines, not scatter plots. Unlike the daily-build companies, these have sloped lines, not flat lines. Our best guess is that they also have a daily build, but have some person or group who manually curates which tests are run in the build, with the job of keeping build times down. This would explain why the build seems to spike up and then hold flat for a while before linearly decreasing over a few days as they prune or tune tests. They top out at around 200 minutes per day, and they don’t run many tests in parallel. There are very few of these, and their verticals aren’t similar; a fashion company, a major university, etc. These appear to be a fluke of internal management decisions.

 


Test::Right: Browser testing done right

May 5th, 2011 by Steven Hazel

Okay, honesty time: browser testing sucks. It’s stuck in the stone age. Browser tests are great when they’re working well, but they’re way too hard to get right, way too expensive to maintain, and just overall not a lot of fun. That’s because existing tools lead us to build brittle, sequential tests that can be clobbered by tiny application tweaks. What starts as straightforward test creation ends up with data, actions, and selectors thrown together in a big mess. We paper over the mess with abstractions like Cucumber and Capybara, but we haven’t addressed the fundamental issues.

We’re the cavemen of browser tests.

Fundamentally, these are the problems with browser tests as they’re written today:

  • Getting started is hard. It’s important to structure your tests in a way that will work out well long-term, but guidance is scarce.
  • Browser tests are brittle. Small changes in the DOM, in application performance, or in other tests often generate a lot of menial labor to repair tests. A single DOM change in your application can easily cascade into hundreds of broken tests — and cost staff days to fix them.
  • Browser tests are slow. The solution is to run many tests in parallel, but current testing frameworks lead us to write tests that don’t parallelize well.

All this makes writing and maintaining tests an expensive and, more importantly, unpleasant activity.

We can do better. It’s time for a new way. As an industry, we know how to do this a lot better than we’re managing to in most cases. It’s time to accept a standard, move on, and get things done.

Test::Right aims to be that standard.

The ultimate goal of Test::Right is to make it impossible to write bad tests. That’s right, read it again. Impossible to write bad tests. That’s a serious goal, but we’re serious about it.

Are we there yet? No. But we’re a whole hell of a lot closer than anything popular out there today.

Our philosophy in creating Test::Right is to directly address the problems with browser testing from the start:

  • To make it easy to get started, test authors need an opinionated framework that chooses a thoughtful structure for tests based on our extensive experience working with large test suites.
  • To make browser tests less brittle, we have decoupled intentions from mechanics. Stable intentions like “log in” and “check account status” are disentangled from volatile mechanisms like “click the DOM element identified by this CSS selector”. We isolate the parts of tests that change frequently in a single place instead of scattering and duplicating them throughout tests, making tests robust and maintenance easy.
  • To make browser test suites run faster, we parallelize your tests from day one. Your builds will always be fast, and because every build checks that your tests run in parallel, you’ll never develop performance problems that will bite you later.

Test::Right is an idea whose time has come, born of dozens of on-site integrations of our service in household-name companies and some not-so-household name places. We as a small team have seen more browser test badness than any sane person should have to, and the tools are at fault. Test::Right is designed to address the mainstream case we’ve had the privilege to experience over the past 16 months.

We’re releasing it as a work in progress, but it already has a ton of useful functionality. We’ve put a stake in the ground with a whole slew of opinions that are sure to get a discussion started. And that’s really where we need you.

We built Test::Right by borrowing concepts from the Page Object model, but went even further by enforcing clean separation of concerns. With Test::Right, your test cases are defined in terms of actions and properties on widgets. A widget is a piece of functionality present on one or more pages of your application. A single page can have many widgets, and multiple copies of a widget may appear on the same page. Tests are grouped by feature, and tests don’t have direct access to the underlying Selenium object.

  • Widgets bring the Page Object concept into the Ajax age and decouple tests from DOM, making it far easier to adapt to frontend tweaks. They give your tests an application-specific vocabulary that maps directly to what users do with your app, and let tests make assertions about high-level aspects of your app, instead of low-level DOM state.
  • Automatic spin asserts ensure that your assertions work as you expect them to, even with Ajax.
  • Out-of-the-box parallelization gives you speed and scalability from day one.
  • Random test ordering prevents you from writing bad tests that serially build up state, depend heavily on one another, and can’t be parallelized.
  • Data factory generates isolated test data so that tests don’t clobber each other.
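To show what decoupling intentions from mechanics can look like, here is a small Python sketch of the widget idea. It is not Test::Right's API (Test::Right is a Ruby library); the class names, method names, and CSS selectors are all hypothetical, and a fake browser stands in for Selenium so the example runs on its own.

```python
class LoginWidget:
    """Hypothetical widget: selectors live here, tests only call actions."""

    # The only place that knows about the DOM; update here when markup changes.
    selectors = {
        "username": "#login input[name=username]",
        "password": "#login input[name=password]",
        "submit": "#login button[type=submit]",
    }

    def __init__(self, browser):
        # `browser` is a stand-in for whatever drives the page; it only needs
        # type_into(selector, text) and click(selector) methods.
        self.browser = browser

    def log_in(self, username, password):
        """Intention-level action that a test would call."""
        self.browser.type_into(self.selectors["username"], username)
        self.browser.type_into(self.selectors["password"], password)
        self.browser.click(self.selectors["submit"])


class RecordingBrowser:
    """Fake browser that records calls, so the sketch runs without Selenium."""

    def __init__(self):
        self.calls = []

    def type_into(self, selector, text):
        self.calls.append(("type", selector, text))

    def click(self, selector):
        self.calls.append(("click", selector))


browser = RecordingBrowser()
LoginWidget(browser).log_in("alice", "s3cret")
print(browser.calls)
```

A test written against `log_in` keeps working when the login form's markup changes; only the `selectors` table needs an update.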

Test::Right is in Ruby because it seemed like a good place to start. We believe these ideas make as much sense in any other language.

Whether or not you use Test::Right, we want you to come away from it with ideas and opinions. The browser testing community needs to wake up and start writing good tests, and with Test::Right we’re going to make that happen.

And I, personally, look forward to hashing out the future of browser automation with you.

Check out the Test::Right source code on GitHub to get started. Let us know what you think, and enjoy!
