Non-ASCII Characters in Appium, Part 1

Posted Mar 10, 2014

English is not the only language in the world. This should not be news. It should also not be news that people would like to test their apps with input from different languages. The good news is that both main mobile environments, iOS and Android, and their emulators, can handle Unicode text internally.

[Figure: Non-ASCII text in Android Emulator]

[Figure: Non-ASCII text in iOS Simulator]

Unfortunately, the mobile device emulators generally don't play well with input of characters outside of the ASCII range. Which is to say, they have a hard time dealing with anything that is not English and a very small number of other characters (!#%...). The emulators can use the text; they just have trouble getting it into the system. Below is a brief discussion of testing non-ASCII characters on each platform.

iOS

iOS, generally, plays better with text than Android. When driven manually, the simulator can handle the full range of Unicode characters, whether through a different keyboard or by pasting text into the appropriate field. Programmatically, however, the system cannot deal with characters that are not available on the default iOS keyboard. Sending the character é, for instance, will result in the error: target.frontMostApp().keyboard() failed to locate key 'é'. This is not good for automated testing.

Appium gets around this by sending the non-ASCII text directly to the text element, thus bypassing the keyboard altogether. Unfortunately, this can interact badly with applications that expect input character by character (e.g., predictive autocomplete).

One main caveat to this behavior is the encoding of the strings sent and received. Many strings will work without problems, such as sending "परीक्षण" ("testing" in Sanskrit). But Unicode also has the concept of "combining characters", which are diacritical modifications of other characters. Rather than a single character representing what is seen, two (or more, in the case of heavily accented characters) separate characters are used to represent one, with the system overlaying them. Thus, while in Unicode the letter é (called "LATIN SMALL LETTER E WITH ACUTE") can be encoded as a single character, the iOS simulator will return the equally valid representation of the letter e followed by the accent, ́, which the Unicode Consortium calls a "COMBINING ACUTE ACCENT".

Therefore, to test equality you may need to account for that difference by normalizing both the input and the output (for an example in JavaScript, see accent_specs.js). The symptom of this kind of problem is generally a failed assertion in which the expected and actual text look exactly the same.

[Figure: Non-normalized combining characters failing]

Normalizing the text ought to make this seemingly spurious error go away.

Android

The Android Emulator, on the other hand, does not work so easily. Manually, one can change keyboards and then type with the new one. When automating tests, however, this is more difficult. Theoretically it is possible to install the keyboard for your target language and then send the text to it. The problem is that your tests will not be able to set the text in the element directly; they need to simulate the keystrokes, meta-keys and all. The text input method will only accept ASCII characters, no matter what.

The Appium team is working on an experimental solution to the situation on Android, marshaling an ASCII encoding of the Unicode text into the Android emulator. We are currently testing the solution.

We look forward to making Appium even more useful for your mobile testing. Stay tuned for Part 2, coming soon, with details on making Android testing through Appium multilingual!
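
As a footnote to the iOS discussion of combining characters, here is a minimal sketch of a normalized comparison. It assumes a JavaScript runtime that supports String.prototype.normalize (ES2015 and later); the helper name sameText is hypothetical and is not part of Appium or of accent_specs.js.

```javascript
// Hypothetical helper (not an Appium API): compare two strings after
// normalizing both to Unicode Normalization Form C (NFC).
function sameText(expected, actual) {
  // NFC composes "e" + U+0301 into the single code point U+00E9,
  // so both spellings of é end up identical.
  return expected.normalize('NFC') === actual.normalize('NFC');
}

const composed = '\u00e9';    // é as LATIN SMALL LETTER E WITH ACUTE
const decomposed = 'e\u0301'; // e followed by COMBINING ACUTE ACCENT

console.log(composed === decomposed);        // false
console.log(sameText(composed, decomposed)); // true
```

In a test, the same idea would apply to the text you send and the text read back from the element: normalize both sides before asserting equality.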

Written by

Isaac Murchie