Capture Network Traffic with Automation Scripts

When I learned that I could capture network traffic with my existing Selenium scripts, or with scripts for the headless test framework PhantomJS, I was excited: a whole new set of tests could be added to the continuous integration (CI) pipeline. We often come across requirements to capture and analyze browser network traffic in real time: to find the HTTP status of a page, examine headers, validate parameters, analyze performance, and more. It is one more testing strategy for protecting the experience of end users as they use your web application.

What is network traffic?

It includes all the network requests the browser makes when retrieving and loading the JavaScript, CSS, image files, and other resources for a single web page. In lay terms, it is the communication between the browser and the server while a web page loads.

Why do we need to inspect web application network traffic?

The inspection of network traffic paints a picture of a web page’s condition. The painting starts with a visit to a web page, which triggers all the HTTP requests and responses that need to be collected in real time. To finish, analyze and measure that activity across all identified pages of your application.

What to look for when capturing network traffic

Number of HTTP requests - every time someone visits a web page, the browser communicates with the server that hosts the site. Identify the pages in the application that make too many HTTP requests and could hurt the end-user experience by making users wait too long for the page to finish loading. Test Strategy Rule: Decrease the number of requests per page by using browser caching.
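
As a minimal sketch of the request-count check above, the snippet below tallies the requests a page triggers and compares the total with a budget. It assumes each entry looks like the object PhantomJS passes to page.onResourceRequested (with id, method, and url fields); countRequests, exceedsRequestBudget, the budget value, and the sample data are all hypothetical.

```javascript
// Tally the resource requests recorded for one page.
// Entries are assumed to mimic PhantomJS requestData objects (id, method, url).
function countRequests(requests) {
    var byMethod = {};
    requests.forEach(function (req) {
        byMethod[req.method] = (byMethod[req.method] || 0) + 1;
    });
    return { total: requests.length, byMethod: byMethod };
}

// Hypothetical rule: a page is over budget when it makes more than maxRequests.
function exceedsRequestBudget(requests, maxRequests) {
    return requests.length > maxRequests;
}

// Made-up sample data for illustration only.
var sampleRequests = [
    { id: 1, method: 'GET', url: 'https://example.com/' },
    { id: 2, method: 'GET', url: 'https://example.com/app.js' },
    { id: 3, method: 'POST', url: 'https://example.com/api/track' }
];

var summary = countRequests(sampleRequests);
console.log(JSON.stringify(summary));
console.log('over budget: ' + exceedsRequestBudget(sampleRequests, 2));
```

The budget itself is a team decision; the point is only that the count is cheap to collect once the requests are recorded.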

Load time of web application page - behind the scenes, the browser fetches and renders multiple resources needed to compose the web page. The resources include text, JavaScript (JS) code, Cascading Style Sheets (CSS), and more. Test Strategy Rule: To optimize rendering speed, use more efficient CSS selectors, a better page layout, and refactored JS code.
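
One rough way to measure page load time in a script is to record the clock before opening the page and again when the load callback fires. The sketch below assumes a page object with a PhantomJS-style open(url, callback) method; timeLoad and the stand-in fakePage are hypothetical names, used so the snippet also runs outside PhantomJS.

```javascript
// Measure wall-clock time from calling page.open() until its callback fires.
// `page` is assumed to expose a PhantomJS-style open(url, callback) method.
function timeLoad(page, address, done) {
    var startedAt = Date.now();
    page.open(address, function (status) {
        done({ address: address, status: status, ms: Date.now() - startedAt });
    });
}

// Stand-in page so the sketch runs without PhantomJS; it "loads" immediately.
var fakePage = {
    open: function (url, callback) { callback('success'); }
};

var loadResult;
timeLoad(fakePage, 'https://example.com/', function (result) {
    loadResult = result;
    console.log(result.address + ' -> ' + result.status + ' in ' + result.ms + ' ms');
});
```

Inside PhantomJS, passing require('webpage').create() in place of fakePage should behave the same way, although the measured time then includes everything the browser does before the callback runs.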

HTTP response time - the duration from the start of the request to the final byte of the response. Test Strategy Rule: Report the resources that take more than 'x' milliseconds to finish the response.
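
A minimal sketch of the response-time rule above, assuming request entries carry id and time fields and response entries carry id, stage, time, and url, as in PhantomJS callback data. It treats the response entry with stage "end" as the final byte; responseTimes, slowResources, the threshold, and the sample timestamps are hypothetical.

```javascript
// Pair each request with its "end"-stage response and compute the duration.
function responseTimes(requests, responses) {
    var startedAt = {};
    requests.forEach(function (req) {
        startedAt[req.id] = new Date(req.time).getTime();
    });

    var durations = [];
    responses.forEach(function (res) {
        // Only the "end" stage marks the final byte of the response.
        if (res.stage === 'end' && startedAt.hasOwnProperty(res.id)) {
            durations.push({
                id: res.id,
                url: res.url,
                ms: new Date(res.time).getTime() - startedAt[res.id]
            });
        }
    });
    return durations;
}

// Report the resources that took at least thresholdMs to finish.
function slowResources(durations, thresholdMs) {
    return durations.filter(function (d) { return d.ms >= thresholdMs; });
}

// Made-up sample data for illustration only.
var requests = [
    { id: 1, time: '2016-05-04T01:18:35.909Z', url: 'https://example.com/' },
    { id: 2, time: '2016-05-04T01:18:36.310Z', url: 'https://example.com/app.js' }
];
var responses = [
    { id: 1, stage: 'start', time: '2016-05-04T01:18:36.264Z', url: 'https://example.com/' },
    { id: 1, stage: 'end', time: '2016-05-04T01:18:36.300Z', url: 'https://example.com/' },
    { id: 2, stage: 'end', time: '2016-05-04T01:18:36.350Z', url: 'https://example.com/app.js' }
];

var durations = responseTimes(requests, responses);
var slow = slowResources(durations, 300);
console.log(JSON.stringify(slow));
```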

Status - the HTTP status code for each request triggered on the web page. What is an HTTP status code? Status codes confirm success (2xx) or help identify the cause of the problem when a web page or other resource does not load correctly. Test Strategy Rule: Report the two major groups of HTTP error status codes: 4xx client errors and 5xx server errors.
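
The status rule above could be sketched as follows, assuming response entries with status, stage, and url fields like those PhantomJS passes to page.onResourceReceived. Because each resource is reported at more than one stage, the sketch only looks at "end" entries to avoid counting a resource twice; groupErrors and the sample data are hypothetical.

```javascript
// Sort completed responses into 4xx client errors and 5xx server errors.
function groupErrors(responses) {
    var report = { clientErrors: [], serverErrors: [] };
    responses.forEach(function (res) {
        if (res.stage !== 'end') {
            return; // skip "start" entries so each resource is counted once
        }
        var entry = { status: res.status, url: res.url };
        if (res.status >= 400 && res.status < 500) {
            report.clientErrors.push(entry);
        } else if (res.status >= 500 && res.status < 600) {
            report.serverErrors.push(entry);
        }
    });
    return report;
}

// Made-up sample data for illustration only.
var sampleResponses = [
    { id: 1, stage: 'end', status: 200, url: 'https://example.com/' },
    { id: 2, stage: 'start', status: 404, url: 'https://example.com/missing.png' },
    { id: 2, stage: 'end', status: 404, url: 'https://example.com/missing.png' },
    { id: 3, stage: 'end', status: 503, url: 'https://example.com/api/data' }
];

var errorReport = groupErrors(sampleResponses);
console.log(JSON.stringify(errorReport, undefined, 4));
```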

Request and response size - combines the request header, response body, and file being delivered by the server. Test Strategy Rule: No image files greater than 250 KB.
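
A minimal sketch of the image-size rule above. It assumes response entries carry contentType, bodySize, and stage fields as in PhantomJS callback data, that the "start" entry's bodySize approximates the resource size (which may not hold for chunked or compressed responses), and that the limit means 250 * 1024 bytes. oversizedImages and the sample data are hypothetical.

```javascript
// Assumption: the 250 KB limit is interpreted as 250 * 1024 bytes.
var MAX_IMAGE_BYTES = 250 * 1024;

// Return the image responses whose reported body size exceeds maxBytes.
function oversizedImages(responses, maxBytes) {
    return responses.filter(function (res) {
        var type = res.contentType || ''; // contentType can be missing
        return res.stage === 'start' &&
            type.indexOf('image/') === 0 &&
            res.bodySize > maxBytes;
    });
}

// Made-up sample data for illustration only.
var sampleResponses = [
    { id: 1, stage: 'start', contentType: 'text/html;charset=utf-8', bodySize: 400000, url: 'https://example.com/' },
    { id: 2, stage: 'start', contentType: 'image/png', bodySize: 300000, url: 'https://example.com/hero.png' },
    { id: 3, stage: 'start', contentType: 'image/jpeg', bodySize: 1024, url: 'https://example.com/icon.jpg' }
];

var tooBig = oversizedImages(sampleResponses, MAX_IMAGE_BYTES);
tooBig.forEach(function (res) {
    console.log('image over limit: ' + res.url + ' (' + res.bodySize + ' bytes)');
});
```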

Analytics - can be found within resource response content. Test Strategy Rule: Check that comScore analytics are firing on every page of your web application.

Focus on areas that can be used to improve your web application performance to protect the end-user experience.
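
One way to approximate the analytics rule is to check that a page made at least one request to the analytics vendor's domain. The sketch below assumes request entries with a url field; hostOf and firesAnalytics are hypothetical helpers, and the domain scorecardresearch.com is an assumption about where comScore beacons are sent, so substitute whatever host your own tags call.

```javascript
// Extract the host name from an http(s) URL, or null if it does not match.
function hostOf(url) {
    var match = /^https?:\/\/([^\/:?#]+)/i.exec(url);
    return match ? match[1].toLowerCase() : null;
}

// True when any request targets `domain` or one of its subdomains.
function firesAnalytics(requests, domain) {
    var suffix = '.' + domain;
    return requests.some(function (req) {
        var host = hostOf(req.url);
        return host !== null &&
            (host === domain || host.slice(-suffix.length) === suffix);
    });
}

// Assumed beacon domain; not taken from the article.
var ANALYTICS_DOMAIN = 'scorecardresearch.com';

// Made-up sample data for illustration only.
var pageWithBeacon = [
    { id: 1, url: 'https://example.com/' },
    { id: 2, url: 'https://b.scorecardresearch.com/b?c1=2' }
];
var pageWithoutBeacon = [
    { id: 1, url: 'https://example.com/about' },
    { id: 2, url: 'https://notscorecardresearch.com/pixel' }
];

console.log('beacon page fires analytics: ' + firesAnalytics(pageWithBeacon, ANALYTICS_DOMAIN));
console.log('plain page fires analytics: ' + firesAnalytics(pageWithoutBeacon, ANALYTICS_DOMAIN));
```

Matching on the host rather than on a substring of the whole URL avoids false positives from look-alike domains or query strings that merely mention the vendor.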

Source: Browser Network Monitoring - https://saucelabs.com

The technologies behind capturing network traffic

You can use Selenium scripts and headless WebKit tools to collect network traffic.

BrowserMob Proxy (BMP) is a great utility for easily capturing performance data from browsers using existing Selenium scripts. Reasons for choosing BMP over other testing technologies include simulating various bandwidth and latency conditions, blacklisting and whitelisting certain URL patterns, controlling DNS, and collecting more details about request and response content.

PhantomJS is a headless WebKit scriptable with a JavaScript API. It supports various web testing features such as headless testing (used to launch the tests via a suitable test runner), screen capture, page automation (DOM handling and CSS selectors), and network monitoring.

CasperJS is built on top of PhantomJS and SlimerJS and comes with testing utilities such as assertions, test organization, and an easier way to track failed tests.

YSlow analyzes web pages and explains why they are slow, based on Yahoo!'s rules for high-performance web sites. (Source: "YSlow - Official Open Source Project Website." 2009. Accessed 3 May 2016. http://yslow.org/)

Example: PhantomJS script for network traffic

All resource requests and responses can be collected with the onResourceRequested and onResourceReceived callbacks of the PhantomJS API. I recommend checking out the PhantomJS cookbook examples for capturing network traffic:

  • netlog.js dumps all network requests and responses
  • netsniff.js captures network traffic in HAR format
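
As a complement to those examples, the two callbacks can also collect entries in memory instead of printing them, which makes it easier to run checks once the page has loaded. This is a minimal sketch, not part of the PhantomJS examples: attachCollector is a hypothetical helper, and fakePage stands in for a real page object so the snippet runs outside PhantomJS.

```javascript
// Attach PhantomJS-style callbacks that store every request and response.
function attachCollector(page) {
    var log = { requests: [], responses: [] };
    page.onResourceRequested = function (requestData) {
        log.requests.push(requestData);
    };
    page.onResourceReceived = function (response) {
        log.responses.push(response);
    };
    return log;
}

// Stand-in for require('webpage').create(); the events are fired by hand.
var fakePage = {};
var trafficLog = attachCollector(fakePage);

fakePage.onResourceRequested({ id: 1, method: 'GET', url: 'https://example.com/' });
fakePage.onResourceReceived({ id: 1, stage: 'start', status: 200, url: 'https://example.com/' });
fakePage.onResourceReceived({ id: 1, stage: 'end', status: 200, url: 'https://example.com/' });

console.log(trafficLog.requests.length + ' request(s), ' +
    trafficLog.responses.length + ' response event(s)');
```

In a real script the collector would be attached before page.open() is called, and the collected arrays inspected inside the open callback.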

FILE: netlog.js

"use strict";
var page = require('webpage').create(),
    system = require('system'),
    address;

// With no URL argument, print usage and exit with an error code.
if (system.args.length === 1) {
    console.log('Usage: netlog.js <some URL>');
    phantom.exit(1);
} else {
    address = system.args[1];

    // Fires for every resource the page requests.
    page.onResourceRequested = function (requestData, networkRequest) {
        console.log('Request (#' + requestData.id + '): ' + JSON.stringify(requestData));
    };

    // Fires as each response arrives; the "stage" field marks its progress.
    page.onResourceReceived = function (response) {
        console.log('received: ' + JSON.stringify(response, undefined, 4));
    };

    page.open(address, function (status) {
        if (status !== 'success') {
            console.log('FAIL to load the address');
        }
        phantom.exit();
    });
}

CONSOLE LOG: Snippet of Request #1

Request (#1): {"headers":[{"name":"User-Agent","value":"Mozilla/5.0 (Macintosh; PPC Mac OS X)
AppleWebKit/534.34 (KHTML, like Gecko) PhantomJS/1.9.8 Safari/534.34"},{"name":"Accept","value":
"text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8"}],
"id":1,"method":"GET","time":"2016-05-04T01:18:35.909Z","url":"https://saucelabs.com/"}
received: {
    "bodySize": 32050,
    "contentType": "text/html;charset=utf-8",
    "headers": [
        {
            "name": "Server",
            "value": "nginx"
        },
        {
            "name": "Date",
            "value": "Wed, 04 May 2016 01:18:35 GMT"
        },
        {
            "name": "Content-Type",
            "value": "text/html;charset=utf-8"
        },
        {
            "name": "Connection",
            "value": "keep-alive"
        },
        {
            "name": "X-Cache-Operation",
            "value": "plone.app.caching.moderateCaching"
        },
        {
            "name": "Content-Language",
            "value": "en"
        },
        {
            "name": "Expires",
            "value": "Sun, 07 May 2006 01:00:57 GMT"
        },
        {
            "name": "Vary",
            "value": "Last-Modified,Accept-Encoding"
        },
        {
            "name": "Last-Modified",
            "value": "Tue, 03 May 2016 20:55:19 GMT"
        },
        {
            "name": "X-Ua-Compatible",
            "value": "IE=edge,chrome=1"
        },
        {
            "name": "Cache-Control",
            "value": "max-age=0, s-maxage=86400, must-revalidate"
        },
        {
            "name": "X-Cache-Rule",
            "value": "plone.content.itemView"
        },
        {
            "name": "X-Frame-Options",
            "value": "SAMEORIGIN"
        },
        {
            "name": "X-Varnish",
            "value": "1874945977 1874944052"
        },
        {
            "name": "Age",
            "value": "1024"
        },
        {
            "name": "Via",
            "value": "1.1 varnish"
        },
        {
            "name": "X-Varnish-Cache",
            "value": "HIT"
        },
        {
            "name": "Content-Encoding",
            "value": "gzip"
        },
        {
            "name": "X-Backend-Server",
            "value": "nginx3"
        },
        {
            "name": "Strict-Transport-Security",
            "value": "max-age=31536000"
        },
        {
            "name": "Accept-Ranges",
            "value": "bytes"
        }
    ],
    "id": 1,
    "redirectURL": null,
    "stage": "start",
    "status": 200,
    "statusText": "OK",
    "time": "2016-05-04T01:18:36.264Z",
    "url": "https://saucelabs.com/"
}

Conclusion

Developing scripts to capture network traffic gives insight into the requests and downloads that cross the network in real time. Adding network monitoring tests to the pipeline alerts the team to potential problems before they ship to production and affect end users.

Greg Sypolt (@gregsypolt) is a Senior Engineer at Gannett - USA Today Network and co-founder of Quality Element. He is a passionate automation engineer seeking to optimize software development quality, while coaching team members on how to write great automation scripts and helping the testing community become better testers. Greg has spent most of his career working on software quality, concentrating on web browsers, APIs, and mobile. For the past five years, he has focused on the creation and deployment of automated test strategies, frameworks, tools, and platforms.

Written by

Greg Sypolt
