Appium + Sauce Labs Bootcamp: Chapter 1, Language Bindings


Welcome to the first in our new series, Appium + Sauce Labs Bootcamp. This first chapter gives an overview of Appium and its commands, demonstrated with detailed examples in the Java and Python language bindings. Later we will follow up with examples in Ruby. The series goes from fundamental concepts to advanced techniques using Appium and Sauce Labs, so the difficulty ranges from beginner to advanced. Chapter 2 discusses Touch Actions; Chapter 3 covers Testing Hybrid Apps & Mobile Web; and Chapter 4 is about Advanced Desired Capabilities.

The Sauce <-> Client Relationship

Sauce Labs provides a service consisting of two APIs which operate at different conceptual levels: the Sauce API and the Webdriver API. Both operate over HTTP (secured with SSL/TLS) and encode their data as simple JSON objects:

  • Commands which control an individual Mobile Device (such as an iOS simulator or an Android emulator) are sent through the Webdriver API (urls which begin with
  • Commands which interact with how Sauce Labs stores Tests and Builds eg: “Passing/Failing” are sent through the Sauce API (urls which begin with

When running an automated test script, every command, such as tapping on the screen or typing into an input field, is sent as an HTTP request to an Appium server through the Sauce Labs Webdriver API. Since commands are sent as standard HTTP requests and every language has methods for communicating over the Internet, automated tests can be written and run using any programming language. Automated test scripts are descriptions of scenarios that users enact when using an app, and it would be cumbersome to fill them with code which assembles HTTP requests and sends them off to specific URLs. For this reason, there exist a number of language bindings (sometimes called Appium clients or drivers, since they “drive” a mobile device) which provide a set of methods and functions that handle communication with the Appium servers running in the Sauce cloud. The Appium Java client may provide a method for pressing the HOME button:


And the Appium Python client may provide a slightly different function:


But both functions generate exactly the same HTTP request, which is sent to: [session Id number]/appium/device/keyevent

{
  "keycode": 3
}
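To make the shared request concrete, here is a minimal sketch of what a language binding might assemble for the HOME key. It only builds the request and never touches the network. `BASE_URL` is a placeholder assumption (it is not the real Sauce Labs endpoint), the `/session/` path segment is assumed from the WebDriver wire protocol convention, and `build_keyevent_request` is a hypothetical helper rather than part of any Appium client.

```python
import json

# Placeholder assumption: stands in for the real Webdriver API endpoint.
BASE_URL = "http://example.invalid/wd/hub"

HOME_KEYCODE = 3  # the keycode shown in the request body above


def build_keyevent_request(session_id, keycode):
    """Return (method, url, body) for a key event; hypothetical helper."""
    url = "%s/session/%s/appium/device/keyevent" % (BASE_URL, session_id)
    body = json.dumps({"keycode": keycode})
    return "POST", url, body


method, url, body = build_keyevent_request("abc123", HOME_KEYCODE)
print(method, url)
print(body)
```

Whichever binding is used, the only parts that vary between calls are the session identifier and the keycode, which is why the Java and Python clients can expose different method names over the same request.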

Developers and QA specialists who write automated test scripts need never concern themselves with long URLs like the one in the example above; this is the advantage provided by the language bindings. The commands that Appium accepts are a superset of the standard protocol used by Selenium servers for automating web browsers. In order to mitigate the effort of learning something completely new, the Appium language bindings all extend and modify existing Selenium language bindings. Though in some cases the methods are very different, the documentation for the various Selenium clients is usually applicable to the Appium clients.

Test Configuration

In order to specify the environment that Sauce Labs will set up and use for running your app, you must populate a set of "desired capabilities". The entire list of capabilities that Appium can understand is specified here, and the set of capabilities that Sauce Labs understands is here.

The three most important capabilities for mobile testing on Sauce with Appium are: platformName, which specifies whether you are testing an "Android" or "iOS" app; platformVersion, which selects a particular version of the platform (e.g., iOS "7.1" or "8.0", or Android "4.3" or "4.4"); and deviceName, which on iOS distinguishes "iPhone" from "iPad" and on Android specifies the device itself (e.g., "Samsung Galaxy S3" or "Nexus 7").

Further, there are two mutually exclusive capabilities for specifying what is under test: app identifies the native app, and must be either an app you've already placed in Sauce Storage or the URL of the app somewhere on the internet; browserName is used for mobile web testing, allowing you to get a specific browser running on the mobile device (e.g., "Safari" or "Chrome"). If you are testing a native app, do not use browserName, and vice versa.

For example, to tell Sauce Labs to use the application we have uploaded with the name my_app.apk, and to run it on an Android emulator running Android 4.4 and emulating a Samsung Galaxy S3, we use the following capabilities:

{
  "app": "sauce-storage:my_app.apk",
  "platformName": "Android",
  "platformVersion": "4.4",
  "deviceName": "Samsung Galaxy S3 Emulator",
  "appiumVersion": "1.3.4"
}

For an iOS test, with the following capabilities we will be telling Sauce Labs to use the application we have uploaded with the name, and run it on the iPhone simulator running iOS 8.1:

{
  "app": "",
  "platformName": "iOS",
  "platformVersion": "8.1",
  "deviceName": "iPhone Simulator",
  "appiumVersion": "1.3.4"
}

The final capability in both cases pertains to Appium itself. With appiumVersion you can use a specific build of the Appium server (in this case, 1.3.4, the latest as of this writing). Since the configuration of your environment can be complex and getting it right is very important, Sauce Labs provides a tool, the Platform Configurator, which allows you to visually configure the test environment you want, and generates code in your desired language. Through the desired capabilities you will send to Sauce Labs, you get the environment in which you would like your test to run. From there, you can automate your test scenarios!
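The rule that app and browserName are mutually exclusive is easy to get wrong when capabilities are assembled by hand. The following is a minimal sketch of one way to guard against that in Python. `build_capabilities` is a hypothetical helper written for this illustration, not something provided by Appium or Sauce Labs, and the values passed to it are the Android example from above.

```python
def build_capabilities(platform_name, platform_version, device_name,
                       app=None, browser_name=None, appium_version=None):
    """Assemble a desired-capabilities dict; hypothetical helper."""
    # app and browserName are mutually exclusive: exactly one must be given
    if (app is None) == (browser_name is None):
        raise ValueError("specify exactly one of app or browser_name")

    capabilities = {
        "platformName": platform_name,
        "platformVersion": platform_version,
        "deviceName": device_name,
    }
    if app is not None:
        capabilities["app"] = app
    else:
        capabilities["browserName"] = browser_name
    if appium_version is not None:
        capabilities["appiumVersion"] = appium_version
    return capabilities


android_caps = build_capabilities(
    "Android", "4.4", "Samsung Galaxy S3 Emulator",
    app="sauce-storage:my_app.apk", appium_version="1.3.4")
print(android_caps)
```

Failing early with a ValueError keeps a misconfigured test from ever requesting a session, which is usually cheaper than discovering the mistake after the environment has been provisioned.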

Instantiating a Driver

The various Appium language bindings share the concept of providing a driver object. When desired capabilities are sent to Sauce Labs, a new environment is set up and assigned a session identifier. Each driver is associated with a single session identifier and therefore each driver is associated with a single mobile device for the duration of a test. The different language bindings have slightly different conventions, but they all need to do the following at the beginning of each test:

  • Store the SAUCE_USERNAME and SAUCE_ACCESS_KEY for accessing the Sauce Cloud
  • Set as the endpoint to send Appium commands to
  • Specify a set of desired capabilities for this test session
  • Send the desired capabilities to Sauce Labs and begin a new session
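The first two steps in this list can be sketched with the standard library alone. This is only an illustration under stated assumptions: `ENDPOINT_HOST` is a placeholder and not the real Sauce Labs host, the `/wd/hub` path is assumed from common WebDriver server conventions, and `load_credentials` and `build_endpoint` are hypothetical helper names.

```python
import os
from urllib.parse import quote

# Placeholder assumption: not the real Sauce Labs host.
ENDPOINT_HOST = "sauce.example.invalid"


def load_credentials(environ=os.environ):
    """Read the Sauce credentials, failing early if either is missing."""
    username = environ.get("SAUCE_USERNAME")
    access_key = environ.get("SAUCE_ACCESS_KEY")
    if not username or not access_key:
        raise RuntimeError("set SAUCE_USERNAME and SAUCE_ACCESS_KEY first")
    return username, access_key


def build_endpoint(username, access_key, host=ENDPOINT_HOST):
    """Embed the credentials in the URL the driver will send commands to."""
    return "http://%s:%s@%s/wd/hub" % (
        quote(username, safe=""), quote(access_key, safe=""), host)


# A plain dict stands in for the process environment in this demo.
user, key = load_credentials(
    {"SAUCE_USERNAME": "demo", "SAUCE_ACCESS_KEY": "secret"})
print(build_endpoint(user, key))
```

Reading the credentials from the environment keeps them out of the test source, and percent-encoding them avoids a malformed URL if either value contains reserved characters.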

The following are samples of instantiating Appium drivers in Java and Python:


[code language="java"]
import io.appium.java_client.ios.IOSDriver;
import io.appium.java_client.remote.MobileCapabilityType;
import org.openqa.selenium.remote.DesiredCapabilities;

import java.net.MalformedURLException;
import java.net.URL;

public class Main {

    public static void main(String[] args) throws MalformedURLException {

        // Describe the environment Sauce Labs should set up for this session
        DesiredCapabilities desiredCapabilities = new DesiredCapabilities();
        desiredCapabilities.setCapability("name", "iOS test - Java");
        desiredCapabilities.setCapability(MobileCapabilityType.PLATFORM_NAME, "iOS");
        desiredCapabilities.setCapability(MobileCapabilityType.PLATFORM_VERSION, "7.1");
        desiredCapabilities.setCapability(MobileCapabilityType.DEVICE_NAME, "iPhone Simulator");
        desiredCapabilities.setCapability(MobileCapabilityType.APP, "");
        desiredCapabilities.setCapability("appiumVersion", "1.3.4");

        URL sauceUrl = new URL("http://[SAUCE_USERNAME]:[SAUCE_ACCESS_KEY]");

        // Sending the capabilities to Sauce Labs begins a new session
        IOSDriver driver = new IOSDriver(sauceUrl, desiredCapabilities);
    }
}
[/code]



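[code language="python"]
import os
from appium import webdriver

# Credentials are read from the environment
SAUCE_USERNAME = os.environ.get('SAUCE_USERNAME')
SAUCE_ACCESS_KEY = os.environ.get('SAUCE_ACCESS_KEY')

desired_capabilities = {
    'name': 'iOS test - Python',
    'platformName': 'iOS',
    'platformVersion': '7.1',
    'deviceName': 'iPhone Simulator',
    'app': '',
    'appiumVersion': '1.3.4'
}

sauce_url = ""
# Sending the capabilities to Sauce Labs begins a new session
driver = webdriver.Remote(
    command_executor=sauce_url % (SAUCE_USERNAME, SAUCE_ACCESS_KEY),
    desired_capabilities=desired_capabilities
)
[/code]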

Simple Commands

There are many commands available for inspecting the elements present on the UI of a device and interacting with them. So many, in fact, that it can be overwhelming to learn them all at once. (The complete list is a combination of all the API endpoints described in the Selenium Documentation and the Appium Documentation.) The first commands to learn are the following:

Finding Elements

  • find element
  • find elements

Inspecting Elements

  • text
  • location
  • size

Interacting with Elements

  • click
  • send_keys

Finding Elements

In order to perform any meaningful command, one needs a UI element to work with. Appium allows for finding UI elements by a number of means. The preferred method is to find elements by their Accessibility Id. These are identifiers which app developers manually attach to important elements so that assistive technologies can meaningfully interpret the UI. The Android and iOS platforms both have Accessibility programs (iOS, Android).

MobileElement button = (MobileElement) driver.findElementByAccessibilityId("play-button");
button = self.driver.find_element_by_accessibility_id('play-button')

Elements can also be found by using the name of their class. On Android devices, these names start with “android.widget.”, e.g., “android.widget.TextView” and “android.widget.LinearLayout”. On iOS, class names start with “UIA”, e.g., “UIATextField” and “UIATableView”.

MobileElement button = (MobileElement) driver.findElementByClassName("UIAButton");
button = self.driver.find_element_by_class_name('UIAButton')

If multiple elements are found by these commands, only the first is returned. For finding multiple elements a pluralized version of each command exists. These commands return arrays of elements.

List<MobileElement> buttons = (List<MobileElement>)(List<?>) driver.findElementsByAccessibilityId("play-button");
List<MobileElement> buttons = (List<MobileElement>)(List<?>) driver.findElementsByClassName("UIAButton");
buttons = self.driver.find_elements_by_accessibility_id('play-button')
buttons = self.driver.find_elements_by_class_name('UIAButton')

One can find elements contained within another element:

List<MobileElement> tableViews = (List<MobileElement>)(List<?>) driver.findElementsByClassName("UIATableView");
MobileElement button = (MobileElement) tableViews.get(2).findElementByAccessibilityId("play-button");
tableViews = self.driver.find_elements_by_class_name('UIATableView')
button = tableViews[2].find_element_by_accessibility_id('play-button')
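The difference between the singular, plural, and nested lookups can be seen without a device by using a toy in-memory tree. The sketch below is only a model of the semantics described above. `FakeElement` is a hypothetical class and is not part of the Appium client, and it raises a plain LookupError where a real client would raise its own exception type.

```python
class FakeElement:
    """Toy stand-in for a UI element; not part of any Appium client."""

    def __init__(self, class_name, accessibility_id=None, children=()):
        self.class_name = class_name
        self.accessibility_id = accessibility_id
        self.children = list(children)

    def _descendants(self):
        # depth-first walk in document order, excluding the element itself
        for child in self.children:
            yield child
            yield from child._descendants()

    def find_elements_by_class_name(self, name):
        # plural form: every matching descendant, possibly an empty list
        return [e for e in self._descendants() if e.class_name == name]

    def find_element_by_class_name(self, name):
        # singular form: only the first match, or an error if there is none
        matches = self.find_elements_by_class_name(name)
        if not matches:
            raise LookupError("no element with class %r" % name)
        return matches[0]


play = FakeElement("UIAButton", "play-button")
stop = FakeElement("UIAButton", "stop-button")
table = FakeElement("UIATableView", children=[stop])
root = FakeElement("UIAWindow", children=[play, table])

print(root.find_element_by_class_name("UIAButton").accessibility_id)
print(len(root.find_elements_by_class_name("UIAButton")))
print(table.find_element_by_class_name("UIAButton").accessibility_id)
```

Searching from `table` instead of `root` narrows the scope to that element's descendants, which is the same idea as calling a find command on an element rather than on the driver.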

The Java client has an alternative method of finding elements which behaves the same but has a slightly different syntax:

MobileElement button = (MobileElement) driver.findElement(MobileBy.AccessibilityId("play-button"));
List<MobileElement> buttons = (List<MobileElement>)(List<?>) driver.findElements(By.className("UIAButton"));

These different approaches to finding elements (by class name or by accessibility ID) are called locator strategies. Appium has additional locator strategies for finding elements by id, xpath, and platform-specific locators like iOS UIAutomation commands and Android UIAutomator selectors. These will be discussed in a later chapter.

Inspecting Elements

By inspecting the properties of elements visible on the UI, we can detect whether or not the app behaves as expected. We can test for the presence of a popup, look for a user’s name when logged in, check that lists are populated, that images are in the right place, etc. Whenever a UI element is “found” through Appium, the server returns an id, not an object populated with UI properties. Additional functions need to be called in order to get the specific properties of an element.

The “text” command returns the textual contents of the element.

String buttonText = button.getText();
buttonText = button.text

The “location” command returns the current location of the element on the screen, measured in pixels.

Point location = button.getLocation();
location = button.location

The “size” command returns the size of the element on the screen, measured in pixels.

Dimension dimension = button.getSize();
dimension = button.size
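Location and size become useful once they are compared. The following minimal sketch works on plain dictionaries shaped like the values the Python client returns ('x' and 'y' for location, 'width' and 'height' for size). The helper names `is_above` and `center` and the sample numbers are assumptions made for this illustration.

```python
def is_above(location_a, location_b):
    """True if the element at location_a starts higher on the screen."""
    # y grows downward from the top of the screen, so smaller y is higher
    return location_a["y"] < location_b["y"]


def center(location, size):
    """Return the (x, y) midpoint of an element's bounding box."""
    return (location["x"] + size["width"] / 2,
            location["y"] + size["height"] / 2)


# Sample values standing in for element.location and element.size
field_one = {"location": {"x": 20, "y": 100},
             "size": {"width": 280, "height": 40}}
field_two = {"location": {"x": 20, "y": 160},
             "size": {"width": 280, "height": 40}}

print(is_above(field_one["location"], field_two["location"]))
print(field_one["size"] == field_two["size"])
print(center(field_one["location"], field_one["size"]))
```

Checks like these let a test assert on layout (one field above another, two fields of equal size) without depending on exact pixel positions.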

Since these properties are calculated when the command is called, if the element is no longer visible on the UI the command will fail.

Interacting with Elements

By interacting with elements, we simulate the actions of a user: typing into fields, pressing buttons, tapping the screen, and performing touch gestures. Use the “click” command to simulate tapping on an element:

button.click();
button.click()

Use the “send keys” command to type into a text field.

textField.sendKeys("Hi, my name is");
textField.send_keys('Hi, my name is')

Touch gestures will be discussed in a later chapter. Discussed above are the basic commands for finding, inspecting, and interacting with UI elements. The examples are in Java and Python. Each language binding follows conventions particular to its developer culture, but they all encompass the same set of commands. When in doubt, check the documentation for a particular language binding.

Python Example

[code language="python"]
import unittest
import os
import sys
from appium import webdriver
from sauceclient import SauceClient

USERNAME = os.environ.get('SAUCE_USERNAME')
ACCESS_KEY = os.environ.get('SAUCE_ACCESS_KEY')
sauce = SauceClient(USERNAME, ACCESS_KEY)


class SimpleIOSSauceTests(unittest.TestCase):

    def setUp(self):
        self.desired_capabilities = {
            'platformName': 'iOS',
            'platformVersion': '7.1',
            'deviceName': 'iPhone Simulator',
            'app': '',
            'appiumVersion': '1.3.4',
        }

        sauce_url = ""
        # sending the capabilities to Sauce Labs begins a new session
        self.driver = webdriver.Remote(
            command_executor=sauce_url % (USERNAME, ACCESS_KEY),
            desired_capabilities=self.desired_capabilities
        )

    def tearDown(self):
        print("Link to your job: %s" % self.driver.session_id)
        # report the result to Sauce Labs, then end the session
        if sys.exc_info() == (None, None, None):
            sauce.jobs.update_job(self.driver.session_id, passed=True)
        else:
            sauce.jobs.update_job(self.driver.session_id, passed=False)
        self.driver.quit()

    def test_ui_computation(self):
        # populate text fields with values
        field_one = self.driver.find_element_by_accessibility_id('TextField1')

        field_two = self.driver.find_elements_by_class_name('UIATextField')[1]

        # they should be the same size, and the first should be above the second
        self.assertLess(field_one.location['y'], field_two.location['y'])
        self.assertEqual(field_one.size, field_two.size)

        # trigger computation by using the button
[/code]

Written by

Isaac Murchie

