Bleacher Report’s Continuous Integration & Delivery Methodology: Test Analytics

Posted Jun 24th, 2014

This is the final post in a three-part series highlighting Bleacher Report's continuous integration and delivery methodology by Felix Rodriguez. Read the first post here and the second here.

Last week we discussed setting up an integration testing server that kicks off a suite of tests whenever we POST to it. Now that we are storing all of our suite runs and individual tests in a Postgres database, we can do some interesting things - like track trends over time. At Bleacher Report we like to use a tool named Librato to store our metrics, create sweet graphs, and display pretty dashboards. One of the metrics we record on every test run is our PageSpeed Insights score.

PageSpeed Insights

PageSpeed Insights is a tool provided by Google developers that analyzes your web or mobile page and gives you an overall rating. You can use the website to get a score manually, but instead we hooked into their API in order to submit our page visit score to Librato. Each staging environment is recorded separately, so that if any of them returns measurements that are off, we can attribute this to a server issue.

[Graph: average page speeds across staging environments]

Any server that shows an extremely high rating is probably only loading a 500 error page. A server that shows an extremely low rating is probably running some new, untested JS/CSS code. Below is an example of how we submit a metric using Cukebot:


require_relative 'lib/pagespeed'

Given(/^I navigate to "(.*?)"$/) do |path|
  visit path
  # `host` comes from our environment config; the constructor call was
  # truncated in the original post, so this line is a best-effort reconstruction
  pagespeed = PageSpeed.new(host + path)
  ps = pagespeed.get_results
  score = ps["score"]
  puts "Page Speed Score is: #{score}"
  metric = host.gsub(/http\:\/\//i, "").gsub(/\.com\//, "") + "_speed"
  begin
    pagespeed.submit(metric, score)
  rescue StandardError
    puts "Could not send metric"
  end
end


require 'net/https'
require 'json'
require 'uri'
require 'librato/metrics'

class PageSpeed
  def initialize(domain, strategy = 'desktop', key = ENV['PAGESPEED_API_TOKEN'])
    @domain = domain
    @strategy = strategy
    @key = key
    # v1 endpoint of the PageSpeed Insights API; the URL was lost from the
    # original post, so this is a best-effort reconstruction
    @url = "https://www.googleapis.com/pagespeedonline/v1/runPagespeed?url=" +
      URI.encode(@domain) +
      "&strategy=#{@strategy}&key=#{@key}"
  end

  def get_results
    uri = URI.parse(@url)
    http = Net::HTTP.new(uri.host, uri.port)
    http.use_ssl = true
    http.verify_mode = OpenSSL::SSL::VERIFY_NONE
    request = Net::HTTP::Get.new(uri.request_uri)
    response = http.request(request)
    JSON.parse(response.body)
  end

  def submit(name, value)
    # the account email was elided in the original post; an env var stands in here
    Librato::Metrics.authenticate ENV['LIBRATO_EMAIL'], ENV['LIBRATO_TOKEN']
    Librato::Metrics.submit name.to_sym => { :type => :gauge, :value => value, :source => 'cukebot' }
  end
end

Google's PageSpeed Insights API returns relatively fast, but as you start recording more metrics on each visit - for example, fetching results for both desktop and mobile - we suggest building a separate service that runs the desired performance tests after the suite, or at least running them in their own thread. This keeps them from blocking the test or making it appear to run long. Which brings us to our next topic.
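To illustrate the own-thread approach, here is a minimal sketch. The `record_pagespeed` helper is hypothetical - it stands in for the `PageSpeed#get_results` and `#submit` calls shown above - and the step returns immediately while the metrics work happens in the background:

```ruby
# Offload a slow metrics call to a background thread so the test step
# can continue immediately. `record_pagespeed` is a placeholder for the
# real PageSpeed fetch + Librato submit.
def record_pagespeed(url, results)
  results << "recorded #{url}"  # stand-in for the actual API calls
end

def visit_with_background_metrics(url, results, threads)
  threads << Thread.new { record_pagespeed(url, results) }
  "visited #{url}"  # the step returns right away; metrics happen off-thread
end

threads = []
results = Queue.new
status  = visit_with_background_metrics("http://staging.example.com/", results, threads)
threads.each(&:join)  # in a real suite, join in an at_exit hook instead
```

In a real suite you would collect the spawned threads and join them once at the end of the run, so a slow PageSpeed response never inflates an individual test's runtime.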

Tracking Run Time

With Sauce Labs, you are able to quickly spot a test that takes a long time to run. But when you're running hundreds of tests in parallel, all the time, it's hard to keep track of the ones that normally take a long time versus the ones that have only recently started taking an abnormally long time. This is why our Cukebot service is so important to us. Now that each test run is stored in our database, we grab the run time length Sauce stores and save it with the rest of the details from that test. We then submit that metric to Librato and track it over time in an instrument. Once again, if all of our tests take substantially longer to run on a specific environment, we can use that data to investigate issues with that server. To do this, we take advantage of Cucumber's before/after hooks to grab the time it took for the test to run in Sauce (or track it ourselves) and submit it to Librato. We use the on_exit hook to record the total time of the suite and submit that as well.
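A minimal sketch of the track-it-ourselves path (not Bleacher Report's actual hook code): time the scenario with a monotonic clock and build the gauge payload in the shape `Librato::Metrics.submit` expects. The metric name and the commented hook wiring are illustrative:

```ruby
# Time a block and return a Librato-style gauge payload for it.
def time_scenario(name)
  started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  yield
  elapsed = Process.clock_gettime(Process::CLOCK_MONOTONIC) - started
  { "#{name}_runtime".to_sym => { :type => :gauge, :value => elapsed, :source => 'cukebot' } }
end

# In a real suite this wraps each scenario via Cucumber's Around hook:
#   Around do |scenario, block|
#     payload = time_scenario(scenario.name) { block.call }
#     Librato::Metrics.submit payload
#   end
payload = time_scenario("homepage_login") { sleep 0.01 }
```

The same helper, called once around the whole run from an at_exit block, covers the suite-total metric mentioned above.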

Test Pass/Fail Analytics

To see trends over time, we'd also like to measure our pass/fail percentage for each individual test on each separate staging environment as well as our entire suite pass/fail percentage. This would allow us to notify Ops about any servers that need to get "beefed up" if we run into a lot of timeout issues on that particular setup. This would also allow us to quickly make a decision about whether we should proceed with a deploy or not when there are failed tests that pass over 90% of the time and are currently failing. The easiest way to achieve this is to use the Cucumber after-hook to query the postgres database for total passed test runs on the current environment in the last X amount of days, and divide that by the total test runs on the current environment in the same period to generate a percentage, store it, then track it over time to analyze trends.
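The pass-rate query described above could be sketched as follows. The table and column names (`test_runs`, `status`, `environment`, `created_at`) are assumptions, not the actual Cukebot schema; the percentage math itself is straightforward:

```ruby
# Hypothetical Postgres query: pass percentage for one environment over
# the last N days ($1 = environment, $2 = number of days).
PASS_RATE_SQL = <<~SQL
  SELECT 100.0 * SUM(CASE WHEN status = 'passed' THEN 1 ELSE 0 END) / COUNT(*)
  FROM test_runs
  WHERE environment = $1
    AND created_at > now() - ($2 || ' days')::interval
SQL

# The same calculation in Ruby, given the two counts.
def pass_rate(passed, total)
  return 0.0 if total.zero?
  (passed.to_f / total) * 100.0
end
```

A test that passes 90 times out of 100 runs scores 90.0, which is the kind of number you would compare against a "safe to deploy" threshold in the after hook.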


Adding tools like these will allow you to look at a dashboard after each build and give your team the confidence to know that your code is ready to be released into the wild. Running integration tests continuously used to be our biggest challenge. Now that we've finally arrived at the party, we've noticed that there are many other things we can automate. As our company strives for better product quality, this pushes our team's standards with regard to what we choose to ship. There is also one tool we have been experimenting with and would like to add to our arsenal of automation. So far we have seen great things from it, and it has caught a lot of traffic-related issues we would have missed otherwise. Most of what I've talked about in this series has been done, but some is right around the corner from completion. If you believe we can enhance this process in any way, I would greatly appreciate any constructive criticism via my Twitter handle @feelobot. As Sauce says, "Automate all the Things!"

Written by

Bill McGee