Improving Your Web Applications with Selenium WebDriver

Posted by Erik Dietrich

As you build a web application, your testing efforts start off simply enough. First, add a text box and button to a page. Then test the page by adding text to the box and clicking the button. But somehow between that initial test of your first functionality and testing a polished application across many browsers and devices, things can get really complicated. Selenium WebDriver can handle a lot of that complexity for you, however, if you leverage it correctly.

Historically, multi-platform deployment and robust testing have existed as natural enemies. Throughout the mid-2000s, companies deployed countless internal web applications guaranteed only to work with Internet Explorer 6 on Windows XP. If you wanted to use Internet Explorer 7 when it came out, they made no guarantees that things would work. And something exotic like upstart Firefox? Forget it.

Organizations didn't do this to annoy their users; they had a very pragmatic piece of reasoning. Testing the application across multiple operating systems, browsers, and browser versions demanded so much effort that it was easier and less expensive to completely homogenize their user base. That speaks to just how difficult and time-consuming testing web applications can become.

Of course, the world adapted, eventually. Internet Explorer's dominance receded and users demanded functionality in their browser of choice. And they wouldn't accept serially broken functionality. So, something had to give.

The History of Selenium WebDriver

Forward-thinking developers did not just sit idle during this period of IE6 dominance. Even as the web application testing conundrum seriously hampered businesses, clever people sought ways to address it.

Developers at Thoughtworks, specifically, had some ideas. While working on a web version of a time and expenses application, Thoughtworker (and later, Sauce Labs cofounder) Jason Huggins built a tool that could obey encoded scripts. And so Selenium was born. Another Thoughtworker, Paul Hammant, built on this automation idea. He introduced a second mode of operation for Selenium that allowed remote "steering" of the functionality over TCP/IP. This meant that users could drive the functionality using the programming language of their choice. Selenium thus had two operating modes: core and remote-controlled (RC). This had powerful implications for testing. Selenium users could now script interactions with browsers.

Thoughtworkers and eventually others would continue to evolve Selenium. By 2007, a competing design emerged that would become known as WebDriver. Whereas Selenium 1.0's RC mode worked via JavaScript that ran in all browsers, the new design operated via idiomatic, close-to-the-metal plugins for each individual browser. The interested parties all heralded this as an evolutionary improvement. And so, they eventually merged the original Selenium with the new WebDriver and released the merger as Selenium 2.0. WebDriver had replaced RC mode.

(You can read a more detailed history of Selenium WebDriver here).

Selenium WebDriver as a De Facto Standard

As all of this development took place, Selenium WebDriver made its way beyond the purview of Thoughtworks. Some of the original developers had moved on and Thoughtworks open sourced the technology, so the world (via a Selenium committee) owned it.

As Selenium acquired functionality, the world continued to struggle with the problem of testing web applications. And this problem only grew worse with the rise of competitive browsers, mobile technologies, and competing operating systems. Web application authors saw their conceptual binders of test cases grow exponentially.

Against this backdrop, Selenium WebDriver emerged as a de facto standard. As browsers multiplied, contributors could write plugins to support them. This allowed test automation to expand to new browsers with a fraction of the effort. And, on top of that, Selenium users could automate with their language of choice. This lowered barriers to entry even further.

As many-browser support became the new reality, serious browser automation meant Selenium WebDriver.

Selenium WebDriver Architecture

Let's look in a bit more detail at just how this works for such a broad user base. Selenium WebDriver boasts an architecture that makes it incredibly extensible and flexible.

If you've done any work in an object-oriented language, you've probably heard of the so-called "gang of four" design patterns. One of these is called the bridge pattern. Its description often sounds intimidating: decouple an abstraction from its implementation. But this is actually a fairly simple, if powerful, concept. Think of light switches in your house. The switch is your abstraction and the light turning on or off is the implementation.

You want to be able to change light bulbs without caring whether you have a push-button or rocker switch. And you also want to be able to change from a push-button switch to a rocker without caring whether you have a yellow or white light bulb. You want to do this to turn the light bulb/light switch combination problem into an additive one instead of a multiplicative one. And Selenium WebDriver's architecture works exactly the same way.

First, you have a collection of bindings representing your language choices for scripting (e.g., C#, Java). Then you have a collection of drivers representing the different browsers. In the middle sits the WebDriver API. You can now add bindings and browser drivers independently, making for incredible breadth of support. If some new browser comes out and someone writes a driver for it, then people using any binding can make use of it. Likewise, if someone adds a binding for a new programming language, they'll immediately have use of all available drivers.
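To make the bridge idea a little more tangible, here is a minimal sketch in plain Java. It does not use Selenium at all. Every name in it (BrowserDriver, FirefoxLikeDriver, ChromeLikeDriver, Binding, VerboseBinding, TerseBinding) is a hypothetical stand-in I made up for illustration, with BrowserDriver playing the role of the API in the middle. Treat it as one way to picture the additive-versus-multiplicative point, not as a description of how Selenium is implemented.

```java
import java.util.ArrayList;
import java.util.List;

// A toy bridge: "bindings" on one side, "drivers" on the other.
// All types here are hypothetical stand-ins, not Selenium classes.
class BridgeSketch {

    // The contract in the middle that both sides agree on.
    interface BrowserDriver {
        String open(String url);
    }

    // Implementation side: one class per browser.
    static class FirefoxLikeDriver implements BrowserDriver {
        public String open(String url) { return "firefox opened " + url; }
    }

    static class ChromeLikeDriver implements BrowserDriver {
        public String open(String url) { return "chrome opened " + url; }
    }

    // Abstraction side: one class per scripting style,
    // standing in for a language binding.
    static abstract class Binding {
        protected final BrowserDriver driver;
        Binding(BrowserDriver driver) { this.driver = driver; }
        abstract String visit(String url);
    }

    static class VerboseBinding extends Binding {
        VerboseBinding(BrowserDriver driver) { super(driver); }
        String visit(String url) { return "[verbose] " + driver.open(url); }
    }

    static class TerseBinding extends Binding {
        TerseBinding(BrowserDriver driver) { super(driver); }
        String visit(String url) { return driver.open(url); }
    }

    // Two bindings plus two drivers (four classes) cover all four pairings.
    static List<String> runAll(String url) {
        List<BrowserDriver> drivers =
            List.of(new FirefoxLikeDriver(), new ChromeLikeDriver());
        List<String> results = new ArrayList<>();
        for (BrowserDriver d : drivers) {
            results.add(new VerboseBinding(d).visit(url));
            results.add(new TerseBinding(d).visit(url));
        }
        return results;
    }

    public static void main(String[] args) {
        runAll("http://example.test").forEach(System.out::println);
    }
}
```

Adding a third driver class to this sketch would give both bindings access to it without touching either of them, which is the property the architecture above relies on.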

Consider an Example

Let's see what this actually looks like using a bit of code. For example purposes, I'll use Java, but you can easily extrapolate the idea here to your preferred language. The idea is just to showcase how one particular binding makes use of the API to allow easy use across different drivers.

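 // Assumes the org.openqa.selenium imports (WebDriver, WebElement, By) and the three browser driver classes.
 public void SeeWhatHappens() {
     GoogleYourselfAcrossBrowsers("Joe Smith");
 }
 public void GoogleYourselfAcrossBrowsers(String name) {
     GoogleYourself(name, new FirefoxDriver());
     GoogleYourself(name, new InternetExplorerDriver());
     GoogleYourself(name, new ChromeDriver());
 }
 public void GoogleYourself(String name, WebDriver driver) {
     driver.get("https://www.google.com");
     WebElement searchText = driver.findElement(By.name("q"));
     searchText.sendKeys(name);
     searchText.submit();
     driver.quit();
 }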


The code above consists of three different methods, with the first a trivial one to kick things off. In that method, I call GoogleYourselfAcrossBrowsers and supply a pretty generic name. GoogleYourselfAcrossBrowsers then invokes GoogleYourself using three different drivers that it instantiates.

For its part, GoogleYourself does exactly that. And it allows you to see a bit of the API in action. Retrieve the homepage of Google and then locate the "q" element (for query). Then send the keystrokes for the name to that element and perform a submit, before quitting.

Taken as a whole, this code will Google the name in question using Firefox, Internet Explorer, and Chrome in sequence. Now, imagine a future in which some new browser comes out. You could add it to your test strategy by adding the appropriate driver dependency and then adding a single line of code.

Rethinking Your Test Strategy

If you've lived without this sort of test automation, hopefully you're starting to understand its power. You might have tests at the unit level, and perhaps you've got automated system and integration tests as well. But, without Selenium WebDriver, there's a good chance you're not automating at the top of the testing pyramid.

Instead, you're probably doing this manually. And, in the world of web applications and multiple versions of multiple browsers across multiple devices, you're probably doing a lot of highly repetitive testing. Or, if you're not, then you're probably taking an awful lot on faith.

With Selenium WebDriver, you can really address a serious blind spot in your testing strategy. Handing QA a gigantic binder full of repetitive test cases and asking them to verify with every release creates mind-numbing work and eats up time better spent on other things. And it wastes money.

Consider leveraging Selenium WebDriver to automate this part of your testing approach, particularly since it lends itself quite well to automation. Free QA up to focus more on exploratory testing and other approaches that require more human judgment.

Selenium WebDriver Use Cases

Let's now get a bit more concrete about actual use cases for Selenium WebDriver. Hopefully, you understand now that it can give your testing strategy a makeover, while making it much more comprehensive. But let's look, in general, at what you can do with a web automation framework.

  • Propagate all forms of testing across a multitude of browsers.
  • Create much more comprehensive automated regression testing.
  • Generate a very visible display of functional or acceptance testing for your application.
  • Learn an awful lot about the document and page object models.
  • Help you perform powerful demos of your application's functionality.

As you can see, beefing up your testing strategy features most prominently. This form of automation can help with regression, functional, and acceptance testing. But you can also realize a couple of additional perks. In order to get good at automating manipulation of the GUI, you need to develop an in-depth understanding of the GUI's elements. And, you can use the automation to demo functionality both to a product owner and to end-users or other stakeholders.

Improving Your Web Applications

You can do a lot with Selenium WebDriver. Much of it centers on testing, but it has additional use cases as well. In fact, you can probably come up with some for your unique situation that I haven't listed here. This is a powerful tool.

Overall, this adds up to improving your web applications and making your shop better at what it does. We've come a long way since the days of Internet Explorer 6 dominating the market. Browsers have proliferated and people have more choices than ever, while having higher expectations than ever before. You can differentiate yourself by keeping up with that proliferation while delivering high-quality web applications. But you can't do that without help. You need tools in your tool belt, and Selenium WebDriver is one of the most powerful ones out there.
