BiDirectional APIs in Selenium 4

One of the most interesting and eagerly awaited features in Selenium 4 is the set of bidirectional APIs. They allow us to intercept network requests, mock backends, perform basic authentication, and view the console logs. They are often referred to as the Chrome DevTools Protocol APIs.

The term “DevTools” is ambiguous, as many browsers provide “DevTools”: a set of tools integrated with the browser. Those tools allow users to do things such as explore a website's performance or debug web applications.

The Chrome DevTools Protocol (CDP) is a wrapper provided by Google to interact with the integrated set of tools. Given its nature, it is not really designed for testing and it does not have a stable API, which means that its functionality can break between different browser versions.

The Selenium project is working with the browser vendors to create the WebDriver BiDirectional Protocol, meant to provide a stable, cross-browser API that offers bidirectional functionality useful for both browser automation in general and testing in particular. Since that protocol is not available yet, Selenium provides access to CDP in the browsers that implement it (Chrome, Edge, and Firefox).

The new bidirectional APIs will let you perform different activities, such as:

  • Check the browser console logs

  • Intercept network requests, useful to mock backend APIs

  • Perform Basic Authentication

  • Inspect and observe elements in the DOM

  • Execute bootstrap scripts to improve test execution time

  • Mock geolocation

  • Throttle network performance to simulate real world conditions

Here are code examples for some use cases:

Basic authentication

If a website uses basic or digest authentication, the browser shows a dialog that cannot be handled through Selenium. To work around that, it is possible to register credentials up front so the test can access the protected content.

@Test
public void authenticate() {
    WebDriver webDriver = new ChromeDriver();
    // Register the credentials before navigating; HasAuthentication and
    // UsernameAndPassword live in the org.openqa.selenium package.
    ((HasAuthentication) webDriver)
        .register(() -> new UsernameAndPassword("admin", "admin"));

    webDriver.get("https://the-internet.herokuapp.com");
    webDriver.findElement(By.linkText("Digest Authentication")).click();

    String body = webDriver.findElement(By.tagName("body")).getText();
    assertThat(body, containsString("Congratulations!"));

    webDriver.quit();
}
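
For context, Basic authentication itself is simple: the client sends an Authorization header whose value is "Basic " followed by the Base64 encoding of username:password. The following is a minimal sketch of that encoding using only the JDK. The class and method names are made up for illustration, and it does not claim to show how Selenium answers the challenge internally.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Illustrative helper (hypothetical name, not part of Selenium).
class BasicAuthHeader {

    // Builds the value of the HTTP Authorization header for Basic authentication.
    static String build(String username, String password) {
        String credentials = username + ":" + password;
        String encoded = Base64.getEncoder()
            .encodeToString(credentials.getBytes(StandardCharsets.UTF_8));
        return "Basic " + encoded;
    }

    public static void main(String[] args) {
        // Same credentials as the Selenium example above.
        System.out.println("Authorization: " + build("admin", "admin"));
    }
}
```

Because the credentials are only encoded, not encrypted, this scheme should be used over HTTPS.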

Observe changes in the DOM

Mutation observation is the ability to capture DOM events you are interested in. For example, you might want to know if an element has changed its property value. Previously, the approach was to poll the element until the desired change occurred. Now you can observe the changes in the DOM and assert over them when an incoming event notifies you about the expected change.

This example registers a listener for DOM mutations, alters a span element via JavaScript (for demonstration purposes), and finally asserts the expected state. In a real test, the simulated JavaScript alteration would be a real event making the web application change.

@Test
public void mutationObservation() throws InterruptedException {
    ChromeDriver driver = new ChromeDriver();

    // Capture the DOM mutation event and release the latch when it arrives.
    AtomicReference<DomMutationEvent> seen = new AtomicReference<>();
    CountDownLatch latch = new CountDownLatch(1);
    ((HasLogEvents) driver).onLogEvent(domMutation(mutation -> {
        seen.set(mutation);
        latch.countDown();
    }));

    driver.get("https://www.google.com");
    WebElement span = driver.findElement(By.cssSelector("span"));

    // Simulate the application changing the element.
    ((JavascriptExecutor) driver).executeScript(
        "arguments[0].setAttribute('cheese', 'gouda');", span);

    assertThat(latch.await(10, SECONDS), is(true));
    assertThat(seen.get().getAttributeName(), is("cheese"));
    assertThat(seen.get().getCurrentValue(), is("gouda"));
    driver.quit();
}

Listen to events from console.log

Logs are vital in testing, and getting logs from the browser console has not always been the most straightforward operation with Selenium. With the bidirectional APIs, it is possible to get the console logs to better understand how our application under test is performing. With that, we can now make assertions over the console log messages.

The following code registers a listener that prints to the terminal every message written to the browser console. It loads http://the-internet.herokuapp.com/broken_images, which has a few broken resources, so messages about them are written to the console.

@Test
public void consoleLogTest() {
    ChromeDriver driver = new ChromeDriver();
    DevTools devTools = driver.getDevTools();
    devTools.createSession();
    devTools.send(Log.enable());
    devTools.addListener(Log.entryAdded(),
        logEntry -> {
            System.out.println("log: " + logEntry.getText());
            System.out.println("level: " + logEntry.getLevel());
        });
    driver.get("http://the-internet.herokuapp.com/broken_images");
    // Check the terminal output for the browser console messages.
    driver.quit();
}
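
To assert on console messages rather than just print them, the listener can hand each entry to a small collector that the test inspects afterwards. The following is a minimal sketch of such a collector in plain Java. ConsoleErrorCollector and its methods are hypothetical names, not part of Selenium, and the messages in main are made up.

```java
import java.util.List;
import java.util.Locale;
import java.util.concurrent.CopyOnWriteArrayList;

// Hypothetical helper, not part of Selenium: it remembers console messages
// reported at error level so a test can assert on them afterwards.
class ConsoleErrorCollector {

    // Thread-safe list, since browser events may arrive on another thread.
    private final List<String> errors = new CopyOnWriteArrayList<>();

    // Intended to be called from the log listener with the entry's level and text.
    void record(String level, String text) {
        if (level != null && level.toLowerCase(Locale.ROOT).equals("error")) {
            errors.add(text);
        }
    }

    // Returns an immutable snapshot of the error messages seen so far.
    List<String> errors() {
        return List.copyOf(errors);
    }

    public static void main(String[] args) {
        ConsoleErrorCollector collector = new ConsoleErrorCollector();
        // Made-up messages standing in for real browser console output.
        collector.record("info", "page loaded");
        collector.record("error", "Failed to load resource: 404");
        System.out.println(collector.errors());
    }
}
```

Inside a listener like the one above, the call would look something like collector.record(String.valueOf(logEntry.getLevel()), logEntry.getText()). Whether the level's string form is exactly "error" is an assumption worth checking against your Selenium version.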

Simulate a mobile device

This feature is ideal for responsive websites, as it enables us to quickly verify how the website is displayed on simulated mobile device dimensions. This type of check is very helpful for early testing. It is possible to achieve this by overriding the device metrics (width, height, etc.), which sets the dimensions of the simulated device where the website will be rendered.

@Test
public void overrideDeviceMode() {
    ChromeDriver driver = new ChromeDriver();
    DevTools devTools = driver.getDevTools();
    devTools.createSession();
    // iPhone 11 Pro: width, height, device scale factor (pixel ratio 3), mobile flag.
    devTools.send(Emulation.setDeviceMetricsOverride(375,
                                                     812,
                                                     3,
                                                     true,
                                                     Optional.empty(),
                                                     Optional.empty(),
                                                     Optional.empty(),
                                                     Optional.empty(),
                                                     Optional.empty(),
                                                     Optional.empty(),
                                                     Optional.empty(),
                                                     Optional.empty(),
                                                     Optional.empty()));
    driver.get("https://opensource.saucelabs.com/");
    driver.quit();
}

Emulate GeoLocation

When a website needs to behave differently based on the user’s location, we need to find ways to set the environment to test the website’s behaviour. To do that, we can rely on the geolocation override provided by Selenium 4. In the following code, we set the coordinates of the Sauce Labs office in Berlin, and use the website https://my-location.org/ to verify the emulated location.

@Test
public void emulateGeoLocation() {
    ChromeDriver driver = new ChromeDriver();
    DevTools devTools = driver.getDevTools();
    devTools.createSession();
    devTools.send(Emulation.setGeolocationOverride(Optional.of(52.5043),
                                                   Optional.of(13.4501),
                                                   Optional.of(1)));
    driver.get("https://my-location.org/");
    driver.quit();
}

Collect Performance Metrics

Performance is key in today’s websites. Customers won’t tolerate sites that load slowly, use lots of resources, and in general, perform poorly. In Selenium 4, Performance.enable() and Performance.getMetrics() enable us to retrieve and validate the performance metrics for the website under test. The following code collects metrics for the Sauce Labs Open Source Program Office website, and prints them. With that, you can analyse the available metrics and assert on the relevant ones for your use case.

@Test
public void performanceMetrics() {
    ChromeDriver driver = new ChromeDriver();
    DevTools devTools = driver.getDevTools();
    devTools.createSession();
    devTools.send(Performance.enable(Optional.empty()));
    // Load the page first so that the metrics describe it, then read them.
    driver.get("https://opensource.saucelabs.com/");
    List<Metric> metricList = devTools.send(Performance.getMetrics());
    driver.quit();
    for (Metric m : metricList) {
        System.out.println(m.getName() + " = " + m.getValue());
    }
}
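
Asserting on the relevant metrics can be as simple as comparing each one with a budget. The following is a minimal sketch in plain Java, assuming the metrics have already been copied into a map of name to value. MetricBudget is a hypothetical helper, the numbers are made up, and the metric names are only examples, so check the names your browser actually reports.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper, not part of Selenium.
class MetricBudget {

    // Returns the metrics whose value exceeds its budget.
    // Metrics without a budget, and budgets without a metric, are ignored.
    static Map<String, Double> overBudget(Map<String, Double> metrics,
                                          Map<String, Double> budgets) {
        Map<String, Double> violations = new TreeMap<>();
        for (Map.Entry<String, Double> budget : budgets.entrySet()) {
            Double actual = metrics.get(budget.getKey());
            if (actual != null && actual > budget.getValue()) {
                violations.put(budget.getKey(), actual);
            }
        }
        return violations;
    }

    public static void main(String[] args) {
        // Made-up sample values standing in for a real metrics list.
        Map<String, Double> metrics = new LinkedHashMap<>();
        metrics.put("Nodes", 1500.0);
        metrics.put("JSHeapUsedSize", 4_000_000.0);

        Map<String, Double> budgets = new LinkedHashMap<>();
        budgets.put("Nodes", 1000.0);
        budgets.put("JSHeapUsedSize", 5_000_000.0);

        System.out.println("Over budget: " + overBudget(metrics, budgets));
    }
}
```

A test would then assert that the returned map is empty, which keeps the failure message informative when a budget is exceeded.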

Intercepting Network Traffic

Manipulating network traffic is extremely useful for testing web applications because it gives us many ways to configure the environment in which the web application runs: blocking network traffic, mocking network connections, simulating the network speed, adding headers, and so on. The following code shows how to add a header to the HTTP request, which is helpful when our application under test filters requests or exposes specific features depending on the received headers.

@Test
public void addExtraHeaders() {
    ChromeDriver driver = new ChromeDriver();
    DevTools devTools = driver.getDevTools();
    devTools.createSession();
    // First load: the page lists the request headers without the extra one.
    driver.get("https://manytools.org/http-html-text/http-request-headers/");

    devTools.send(Network.enable(Optional.empty(), Optional.empty(), Optional.empty()));

    // Every request from now on carries the extra "testing: selenium" header.
    Headers headers = new Headers(Collections.singletonMap("testing", "selenium"));
    devTools.send(Network.setExtraHTTPHeaders(headers));

    // Second load: the extra header should now appear in the list.
    driver.get("https://manytools.org/http-html-text/http-request-headers/");
    driver.quit();
}

Summary

Testing with Selenium just got more powerful thanks to the endless possibilities offered by the bidirectional APIs. It is important to note that this is only supported on Chrome, Edge, and Firefox. However, when WebDriver BiDi is ready, other browsers like Safari will be supported too.

While the examples shown above use devTools.send(Command<X> command) to interact with CDP, the driver also exposes executeCdpCommand, which allows you to send raw commands to CDP and explore even more functionality. Nevertheless, using executeCdpCommand extensively will tie your tests to a specific browser version, making them harder to maintain and prone to errors given the unstable nature of the CDP APIs.
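
To make "raw commands" concrete: on the wire, a CDP command is a JSON object with an id, a method name, and a params object. The following plain-Java sketch builds such a message for the geolocation override used earlier. RawCdpMessage is a hypothetical helper written for illustration, and its hand-rolled serialisation only covers numeric and boolean parameter values.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.stream.Collectors;

// Hypothetical helper, not part of Selenium.
class RawCdpMessage {

    // Serialises a command into the JSON envelope used by CDP: id, method, params.
    // Only numeric and boolean parameter values are supported in this sketch.
    static String toJson(int id, String method, Map<String, Object> params) {
        String body = params.entrySet().stream()
            .map(e -> "\"" + e.getKey() + "\":" + e.getValue())
            .collect(Collectors.joining(","));
        return "{\"id\":" + id
            + ",\"method\":\"" + method + "\""
            + ",\"params\":{" + body + "}}";
    }

    public static void main(String[] args) {
        // Same coordinates as the geolocation example above.
        Map<String, Object> params = new LinkedHashMap<>();
        params.put("latitude", 52.5043);
        params.put("longitude", 13.4501);
        params.put("accuracy", 1);
        System.out.println(toJson(1, "Emulation.setGeolocationOverride", params));
    }
}
```

With executeCdpCommand you pass only the method name and the parameter map, for example driver.executeCdpCommand("Emulation.setGeolocationOverride", params), and the driver takes care of the envelope. Method names and parameters are plain strings here, so nothing flags them at compile time if a browser release renames or removes them.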

While it is tempting to use the CDP methods directly, the Selenium team recommends using the helper classes provided in Selenium 4 whenever possible. That way, you will future-proof your tests, and they will be ready when WebDriver BiDi is available through Selenium. More information about the bidirectional APIs can be found in the official documentation.

At launch, Sauce Labs will support all features except for the bidirectional APIs. While we are working hard to add full support, we do provide similar functionality through our Extended Debugging feature.


Written by

Diego Molina and Titus Fortner