Selenium: XPath marks the spot

January 6th, 2011 by joe

Selenium doesn’t speak English

A web page looks different to a computer than to a human (and if you are a computer, you can skip this blog post). Most humans don’t care how computers look at web pages, but it may matter to you. That’s because if you are writing a Selenium test, you have to know how to see a web page the way a computer does.

A web page has things on it. The text you’re reading now isn’t a blog post to a computer; it’s a thing in a web page. So are all the links and buttons on this page. If you were writing a Selenium test that came to this page and checked for text, that would be easy: you just need to know the URL for the page and the text to look for. If you wanted to write a test that came to this page and clicked on a button, that would be more difficult. You’d need to know how to tell the computer which button to click on. That’s what XPath is for. It stands for “XML Path Language”; XML is (loosely) the language your page is written in, and path means the path to an element (in this case a button) in the page.

XPath leads to the spot

Everything in the page is an element. Most of them live inside another element. The code for an element looks like this: <type>content</type>. “Type” is what kind of element it is: input, paragraph, division (of the page), etc. Where you see <type> is where the element’s code begins, and </type> is where it ends. The opening tag can also carry attributes, such as an id or a name, that describe the element. The content in the middle is some combination of text that is going to appear in the web page and other elements, which can contain further elements in turn. When you’re looking at a page, those elements are visible as rectangles, with backgrounds or images or text in them. They live inside bigger rectangles that partition up the page. Those rectangles probably live inside even bigger rectangles.
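To make the rectangles-inside-rectangles idea concrete, here is a minimal sketch that prints the nesting of a tiny page. The page is made up for illustration, and it uses Python’s standard xml.etree.ElementTree module rather than Selenium, so it only works on well-formed markup.

```python
import xml.etree.ElementTree as ET

# A tiny, made-up page; real pages are HTML and far messier.
PAGE = """
<html>
  <body>
    <div>
      <p>Some text <input name="q" /></p>
    </div>
  </body>
</html>
"""

def outline(element, depth=0):
    """Return one indented line per element, mirroring the nesting."""
    lines = ["  " * depth + element.tag]
    for child in element:
        lines.extend(outline(child, depth + 1))
    return lines

root = ET.fromstring(PAGE)
print("\n".join(outline(root)))
```

Each extra level of indentation in the output is one more element that the inner element lives inside.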

The “path” to an element is the list of elements it lives inside. To construct one, you list all the elements it lives inside, starting with the top one. You can also do fancier things, like scan the whole page for a certain kind of element, or for one kind of element only when it’s inside another kind. It wouldn’t make sense for me to explain how to write an XPath here, because w3schools has totally got that covered.
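As a rough sketch of what “listing the elements it lives inside” means, the snippet below builds that list for an input on a made-up page, then runs the two fancier kinds of search. The helper path_to is hypothetical, and the standard library’s ElementTree understands only a subset of XPath, so treat this as an illustration rather than what Selenium does internally.

```python
import xml.etree.ElementTree as ET

# Made-up page with two inputs: one inside a paragraph, one not.
PAGE = (
    "<html><body><div>"
    "<p><input name='q'/></p>"
    "<input name='go'/>"
    "</div></body></html>"
)

def path_to(root, target):
    """List the tags from the top element down to target (hypothetical helper)."""
    # ElementTree elements do not know their parent, so build a lookup first.
    parents = {child: parent for parent in root.iter() for child in parent}
    tags = []
    node = target
    while node is not None:
        tags.append(node.tag)
        node = parents.get(node)
    return "/" + "/".join(reversed(tags))

root = ET.fromstring(PAGE)
first_input = root.find(".//input")            # first input in document order
print(path_to(root, first_input))              # /html/body/div/p/input

all_inputs = root.findall(".//input")          # every input on the page
inputs_in_p = root.findall(".//p//input")      # inputs only when inside a p
print(len(all_inputs), len(inputs_in_p))       # 2 1
```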

Don’t use XPath

There are two important things to say about XPath with regard to Selenium. First, XPath is only one of three ways to tell Selenium how to locate an element, and it’s the worst one. Second, if you must use XPath locators, there are some steps you should take to keep them from breaking or slowing the script down.

XPaths are the worst kind of locator to use because they are brittle and slow. If you can, you should put unique id attributes on the elements you’re trying to interact with through Selenium and locate them by id. If that fails, you should try to use CSS selectors.
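That order of preference can be written down as a small helper. This is only a sketch: choose_locator is a hypothetical function, not part of Selenium, and it assumes the Selenium 1 convention of passing locators as strings with an id=, css= or xpath= prefix.

```python
def choose_locator(element_id=None, css=None, xpath=None):
    """Pick a locator string in the order the post recommends (hypothetical)."""
    if element_id:
        return "id=" + element_id      # best: a unique id
    if css:
        return "css=" + css            # next best: a CSS selector
    if xpath:
        return "xpath=" + xpath        # last resort
    raise ValueError("no way to locate the element")

print(choose_locator(element_id="search-button"))
print(choose_locator(css="form.search input", xpath="//form//input"))
```

The second call has both a CSS selector and an XPath available, and the helper still falls back to XPath only when nothing else is given.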

If you have to use XPath, be careful

If you must use XPaths, you have to make a tradeoff between reliability and speed. A fast XPath might say “click on the third input element in the second div in the fourth paragraph in the body element of the html element.” It’s like having a treasure map that says “Turn left at the first fork in the path, then right, then your third left.” That’s easy to follow, but if someone adds another fork to the trail anywhere along that path, those directions will lead somewhere else. An XPath like that will be interpreted quickly by Selenium, but it will stop working as soon as someone adds another div along the path through the page.
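Here is a small sketch of that failure, using a made-up page and Python’s standard ElementTree module instead of a real browser. The path is written relative to the html element because that is what ElementTree expects; in a Selenium XPath it would start with /html.

```python
import xml.etree.ElementTree as ET

# Made-up page: the input we want is the third input in the second div.
BEFORE = (
    "<html><body>"
    "<div><p>intro</p></div>"
    "<div><p><input name='a'/><input name='b'/><input name='blah'/></p></div>"
    "</body></html>"
)

# The same page after someone adds one more div ahead of ours.
AFTER = BEFORE.replace(
    "<div><p>intro</p></div>",
    "<div><p>intro</p></div>"
    "<div><p><input name='x'/><input name='y'/><input name='z'/></p></div>",
)

# Turn-by-turn directions, relative to the html element at the top.
POSITIONAL = "./body/div[2]/p/input[3]"

def name_found(page, path):
    """Return the name attribute of the element the path leads to, or None."""
    element = ET.fromstring(page).find(path)
    return None if element is None else element.get("name")

print(name_found(BEFORE, POSITIONAL))  # blah
print(name_found(AFTER, POSITIONAL))   # z (the wrong input)
```

Nothing raises an error on the changed page. The path still leads somewhere, just not to the element the test meant to click.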

A more reliable XPath might say “I want you to find an element that is an input element inside any paragraph inside any element that is inside any div, as long as it has a ‘name’ attribute of ‘blah’.” It’s like a treasure map that says “You want the spot between two pine trees that are 100 feet to the west of an oak tree that’s just after a left turn in the trail.” Those directions will stand the test of time, but when you need to follow them, you’ll have to check every left turn, every oak tree, and every pine tree on the island to be sure. This is less brittle, but it will force the browser to check a lot more things on the page before it can be sure it has found what you want.
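A minimal sketch of the more reliable style, again with a made-up page and Python’s standard ElementTree module rather than a browser. The path is a close approximation of the description above, limited to the XPath subset that ElementTree supports.

```python
import xml.etree.ElementTree as ET

# Made-up page, before and after an extra div is added in front of ours.
BEFORE = (
    "<html><body>"
    "<div><p>intro</p></div>"
    "<div><p><input name='a'/><input name='b'/><input name='blah'/></p></div>"
    "</body></html>"
)
AFTER = BEFORE.replace(
    "<div><p>intro</p></div>",
    "<div><p>intro</p></div>"
    "<div><p><input name='x'/><input name='y'/><input name='z'/></p></div>",
)

# "Any input named 'blah' inside any p inside any div", wherever that is.
BY_ATTRIBUTE = ".//div//p//input[@name='blah']"

def names_found(page, path):
    """Return the name attribute of every element the path matches."""
    return [element.get("name") for element in ET.fromstring(page).findall(path)]

print(names_found(BEFORE, BY_ATTRIBUTE))  # ['blah']
print(names_found(AFTER, BY_ATTRIBUTE))   # ['blah']
```

The price is the one described above: every div, every paragraph and every input has to be examined before the match is known.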


Comments

  1. Adam Goucher says:

    To expand upon and clarify a few points, as this is an important topic.

    First, there are [at least] seven, not three, ways to locate things in Selenium which can be broadly categorized as:

    By Attribute – id, name, identifier
    By Displayed Value – link
    By Structure – XPath, CSS
    Direct Access – DOM

    But since this post is about XPath, let’s talk about it some.

    “Don’t use XPath” is a phrase bandied around the Selenium community so much that it has now taken its place in the spheres of self-fulfilling myth. Please stop saying it. I’ll suggest instead saying “Don’t use XPath when you don’t need to, and when you do, don’t use bad XPath”, though I admit that doesn’t really roll off the tongue like the other.

    The new, more accurate phrase has two important parts:

    When you have to – There are certain lookups that can only be accomplished through XPath, specifically those that traverse up the page content from the current location. CSS is down only, which is why there is a table > tr selector but not a table < tr one. XPath does that easily with axes such as parent::, ancestor:: and preceding-sibling::.

    Another place that you have to use XPath is for the massively useful getXpathCount function, which, not surprisingly, requires an XPath. Prior to about two weeks ago there was no equivalent getCssCount, but I got sick of that obvious hole in the API and added it. Look for it in 2.0b2.

    Don’t use bad XPath – XPath, as you allude, is a means of interrogating the structure of the document, as is CSS. I would be willing to bet that the vast majority of ‘brittle’ XPath was actually generated by Selenium IDE, and guess what: Selenium IDE is known to produce brittle XPath. The brittleness is because it tries to be correct, but not clever. We, as the people doing the automation, can actually be clever and fix the generated XPath. In fact, you really should modify what Selenium IDE produces. By using starts-with, ends-with, contains, preceding-sibling, following-sibling, etc. you can turn brittle XPath into rock solid XPath.

    Speed for XPath is indeed an issue that should be considered when creating locator strings. The speed issue is primarily for execution on Internet Explorer, because it does not have native XPath interrogation for the DOM, so Selenium needs to do it in JavaScript, which is doomed in the realm of performance from the start. But if you don’t care about Internet Explorer or speed, then XPath is a perfectly valid choice of strategy.

    (I’m also not convinced that all XPath is slow on IE and suspect that it is only certain operations on DOM trees of a certain complexity that trigger the deathly performance, but I have no data to prove or disprove that hunch.)

    Your recommendation at the end of the post is spot on, except it doesn’t go quite far enough. In either structural locator, be it XPath or CSS, it is important to make the locator as specific as possible. The implication in the Selenium community that using CSS will immediately solve all your brittle locators is a complete myth. It is just as easy to make a brittle CSS locator as it is a brittle XPath one. As a consultant, this is one of the things I constantly need to explain, even to advanced users who are using CSS because “everyone knows XPath is brittle”. Actually, it’s only your use of it that is brittle.

    -adam
    Selenium IDE Maintainer
