Selenium 4: Simplified Guide to New Features

Selenium is a powerful tool for automating web browsers, and Selenium 4 introduced several features that make automation tasks easier and more efficient. In this blog, we’ll walk through relative locators, opening multiple windows or tabs, capturing web element screenshots, and retrieving element dimensions (height and width). The examples assume Selenium 4; relative locators and the newWindow API in particular are not available in Selenium 3.x. By the end of this blog, you’ll have a clear understanding of how to use these features with practical, real-world examples.

Introduction to Selenium 4 Features

Selenium 4 brings a host of new capabilities that make it easier to locate elements, handle multiple browser windows, and verify the appearance of web elements. The key features we’ll cover include:

Relative Locators: A friendly way to locate elements based on their position relative to other elements (e.g., above, below, left, or right).
Invoking Multiple Windows or Tabs: Opening and switching between multiple browser windows or tabs to work with multiple applications simultaneously.
Capturing Web Element Screenshots: Taking screenshots of specific web elements (not the entire page) to verify their appearance.
Getting Element Dimensions: Retrieving the height and width of web elements to ensure they meet design specifications.

Let’s explore each feature step-by-step, using a practice application as an example.

1. Understanding Relative Locators in Selenium 4

Relative locators (also called "friendly locators") are a game-changer introduced in Selenium 4. They allow you to locate a web element based on its position relative to another element, such as above, below, to the left of, or to the right of. This is particularly useful when an element lacks unique attributes, and traditional locators like XPath or CSS selectors are complex to build.

Why Use Relative Locators?

Imagine you’re working on a form in an e-commerce application, and you need to extract the label text (e.g., "Name") for a field like the "Name" input box. The label might not have unique attributes, making it hard to identify directly. Traditionally, you’d traverse the DOM (Document Object Model) using parent-to-child or sibling relationships with XPath. Relative locators simplify this by letting you describe the element’s position relative to a known, unique element.

Let’s try some examples!

Example: Extracting the "Name" Label

Let’s walk through a practical example, where we extract the "Name" label above the "Name" input box.

Step 1: Identify the Unique Element

First, we identify the "Name" input box, which has a unique attribute, name="name". Using a CSS selector, we can locate it:

driver.get("https://rahulshettyacademy.com/angularpractice/");
WebElement nameEditBox = driver.findElement(By.cssSelector("input[name='name']"));

This input box is unique and easy to locate. Our goal is to find the <label> tag that says "Name", which is positioned above this input box.

Step 2: Use a Relative Locator

The <label> tag has no unique attributes, but we know it’s directly above the "Name" input box.

Here’s how to use a relative locator to find it:

import org.openqa.selenium.support.locators.RelativeLocator;

WebElement label = driver.findElement(RelativeLocator.with(By.tagName("label")).above(nameEditBox));
String labelText = label.getText();
System.out.println(labelText); // Output: Name

Explanation:

Import the RelativeLocator class: You need to import org.openqa.selenium.support.locators.RelativeLocator to use relative locators.
with(By.tagName("label")): Specifies that we’re looking for an element with the tag name <label>.
above(nameEditBox): Filters the <label> tags to find the one directly above the nameEditBox element.
getText(): Retrieves the text of the located label, which is "Name."

Key Points:

The examples here pass By.tagName to with(), but in the released Selenium 4 API with() accepts any By locator, such as By.id or By.cssSelector. Only early alpha builds were limited to tag names.
If multiple elements match the tag name, the relative locator filters them based on the specified position (e.g., above).
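To make the filtering idea concrete, here is a simplified model in plain Java. It is not Selenium's implementation and does not use the Selenium API: Box and nearestAbove are hypothetical names, and the coordinates are made up. It only illustrates the idea of comparing rendered rectangles and choosing the closest candidate above an anchor.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Optional;

// Simplified model only: this is NOT Selenium's implementation.
// Box is a hypothetical stand-in for an element's rendered rectangle.
class RelativeModel {
    static final class Box {
        final String name;
        final int x, y, width, height; // y grows downward, as on a web page

        Box(String name, int x, int y, int width, int height) {
            this.name = name;
            this.x = x;
            this.y = y;
            this.width = width;
            this.height = height;
        }

        int bottom() {
            return y + height;
        }
    }

    // Nearest candidate whose bottom edge is at or above the anchor's top edge.
    static Optional<Box> nearestAbove(List<Box> candidates, Box anchor) {
        return candidates.stream()
                .filter(c -> c.bottom() <= anchor.y)
                .min(Comparator.comparingInt(c -> anchor.y - c.bottom()));
    }

    public static void main(String[] args) {
        Box nameInput = new Box("nameInput", 20, 60, 300, 38);
        List<Box> labels = new ArrayList<>();
        labels.add(new Box("Name", 20, 30, 60, 20));   // just above the input
        labels.add(new Box("Email", 20, 120, 60, 20)); // below the input
        labels.add(new Box("Header", 20, 0, 200, 20)); // above, but farther away
        System.out.println(nearestAbove(labels, nameInput).map(b -> b.name).orElse("none"));
    }
}
```

In this model the "Email" box is filtered out because it sits below the anchor, and "Name" wins over "Header" because it is closer. Real pages add complications such as overlapping or side-by-side elements, so treat this as intuition rather than a specification.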

Example: Clicking a Button Below the "Date of Birth" Label

Now, let’s try locating an element below another element. Suppose we want to click the "Submit" button below the "Date of Birth" label, which has a unique attribute for="dateofbirth".

Step 1: Capture the "Date of Birth" Label

WebElement dateOfBirth = driver.findElement(By.cssSelector("label[for='dateofbirth']"));

Step 2: Locate the Submit Button Below

The "Submit" button has a tag name <input>. We use a relative locator to find it:

import static org.openqa.selenium.support.locators.RelativeLocator.with;

WebElement submitButton = driver.findElement(with(By.tagName("input")).below(dateOfBirth));
submitButton.click();

Note: In this example, the input box directly below the "Date of Birth" label is a flex element, and the relative locator does not match it on this page. Selenium therefore returns the next <input> tag below the label, which is the "Submit" button. When executed, this submits the form, and a "Success! Form has been submitted successfully" message appears.

Key Points:

Relative locators may skip flex elements, so test carefully.
The below locator finds the next matching tag in the specified direction.

Example: Selecting a Checkbox to the Left of a Label

Let’s select a checkbox to the left of the "Love IceCreams" label, which we will locate using an XPath based on its text:

WebElement iceCreamLabel = driver.findElement(By.xpath("//label[text()='Love IceCreams']"));
WebElement checkbox = driver.findElement(with(By.tagName("input")).toLeftOf(iceCreamLabel));
checkbox.click();

Explanation:

The iceCreamLabel is located using XPath.
The toLeftOf locator finds the <input> tag (the checkbox) to the left of the label.
Clicking the checkbox selects it.

When executed, the checkbox is checked, demonstrating how a relative locator can reach an element by anchoring on a nearby label instead of relying on the checkbox’s own attributes.

Example: Extracting a Label to the Right of a Radio Button

Finally, let’s find the label to the right of a radio button with the ID inlineRadio1:

WebElement radioButton = driver.findElement(By.id("inlineRadio1"));
WebElement label = driver.findElement(with(By.tagName("label")).toRightOf(radioButton));
String labelText = label.getText();
System.out.println(labelText); // Output: student

Explanation:

The radio button is located using its unique ID.
The toRightOf locator finds the <label> tag to the right, which says "student."
The label text is printed to the console.

2. Invoking Multiple Windows or Tabs in Selenium 4

Selenium 4 adds switchTo().newWindow(), which opens a new browser window or tab directly from your test. Earlier versions could switch between windows that were already open but had no built-in call to create one. This is useful when your test case requires data from one webpage to be used on another.

Scenario: Filling a Form with Data from Another Page

In our example, we need to:

Navigate to the form page.
Fill the "Name" field with the title of the first course from another page.
Open a new tab to fetch the course title, then switch back to the form to enter it.

Step 1: Set Up the Chrome Browser

WebDriver driver = new ChromeDriver();
driver.get("https://rahulshettyacademy.com/angularpractice/");

This opens the form page in the parent window.

Step 2: Open a New Tab

To open a new tab:

driver.switchTo().newWindow(WindowType.TAB);

Explanation:

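switchTo().newWindow(WindowType.TAB) opens a blank tab and moves the driver’s focus to it.
WindowType.TAB specifies a tab; use WindowType.WINDOW for a new browser window.

When executed, a blank tab appears and the driver is already focused on it.

Step 3: Switch to the New Tab

newWindow() has already switched to the new tab, but we still need both window handles (IDs) so that we can return to the parent window later:

import java.util.Iterator;
import java.util.Set;

Set<String> handles = driver.getWindowHandles();
Iterator<String> it = handles.iterator();
String parentWindowID = it.next(); // first handle: the original window
String childWindowID = it.next();  // second handle: the new tab
driver.switchTo().window(childWindowID); // explicit switch; harmless if already focused

Explanation:

getWindowHandles() returns a Set of all open window IDs.
iterator() allows us to iterate through the IDs.
The first it.next() retrieves the parent window ID; the second retrieves the child window ID. This assumes the handles come back in the order the windows were opened, which is typical but not guaranteed, so calling driver.getWindowHandle() before opening the tab is a safer way to record the parent.
switchTo().window(childWindowID) makes sure the driver’s focus is on the new tab.

Now, the driver can interact with the new tab.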
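If you would rather not depend on the order in which handles are returned, one option is to compare the set of handles before and after the new tab opens. The sketch below shows only that comparison in plain Java; newlyOpened is a hypothetical helper, and the handle strings are placeholders for the values that driver.getWindowHandles() would return.

```java
import java.util.LinkedHashSet;
import java.util.Optional;
import java.util.Set;

class WindowHandles {
    // Returns a handle that is present in 'after' but not in 'before', if any.
    static Optional<String> newlyOpened(Set<String> before, Set<String> after) {
        Set<String> diff = new LinkedHashSet<>(after);
        diff.removeAll(before);
        return diff.stream().findFirst();
    }

    public static void main(String[] args) {
        // Placeholder handles; in a test these would come from driver.getWindowHandles().
        Set<String> before = new LinkedHashSet<>();
        before.add("parent-handle");

        Set<String> after = new LinkedHashSet<>(before);
        after.add("child-handle");

        System.out.println(newlyOpened(before, after).orElse("no new window"));
    }
}
```

In a test you would take the first snapshot before the tab is opened and the second one afterwards, then pass the result to driver.switchTo().window(...).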

Step 4: Navigate to the Course Page and Extract the Course Title

In the new tab, navigate to the Course Page and locate the first course title from the Featured Courses:

driver.get("https://rahulshettyacademy.com");
List<WebElement> courses = driver.findElements(By.cssSelector("a[href*='/p']"));
String courseName = courses.get(1).getText();

Explanation:

findElements retrieves all elements matching the CSS selector a[href*='/p'], which targets course links (24 elements when this was written).
get(1) selects the element at index 1, as it corresponds to the first visible course (index 0 is invisible).
getText() extracts the course title.
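A hard-coded index such as get(1) can break if the page gains or loses a hidden link. Since getText() returns an empty string for an element that is not displayed, one alternative is to take the first non-blank text instead. The sketch below shows that selection step on plain strings; firstNonBlank is a hypothetical helper and the course titles are invented sample values.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Optional;

class CourseTitles {
    // Returns the first entry that contains visible text, trimmed.
    static Optional<String> firstNonBlank(List<String> texts) {
        return texts.stream()
                .filter(t -> t != null && !t.trim().isEmpty())
                .map(String::trim)
                .findFirst();
    }

    public static void main(String[] args) {
        // Invented sample data standing in for the getText() result of each course link.
        List<String> texts = Arrays.asList("", "  ", "Sample Course A", "Sample Course B");
        System.out.println(firstNonBlank(texts).orElse("no visible course"));
    }
}
```

In the test itself, you would collect getText() from each element in courses and pass the resulting list to the helper.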

Step 5: Switch Back to the Parent Window and Enter the Course Name

Switch back to the parent window and fill the "Name" field:

driver.switchTo().window(parentWindowID);
WebElement nameField = driver.findElement(By.cssSelector("input[name='name']"));
nameField.sendKeys(courseName);

Explanation:

switchTo().window(parentWindowID) returns focus to the parent window.
The nameField is located using a CSS selector.
sendKeys(courseName) enters the course title into the field.

Step 6: Clean Up

Close all browsers:

driver.quit();

When executed, the program:

Opens the form page.
Opens a new tab and navigates to the Course Page.
Extracts the first course title.
Switches back to the form page and enters the title.
Closes all browsers.

3. Capturing Web Element Screenshots in Selenium 4

Selenium 4 allows you to capture screenshots of specific web elements, not just the entire page, which is useful for verifying the appearance of individual fields or components.

Scenario: Verify the "Name" Field Content

After entering the course name into the "Name" field, we want to capture a screenshot of just that field to confirm it displays correctly.

Step 1: Locate the Web Element

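We reuse the "Name" field from the previous section. If nameField is already in scope and filled in, skip these two lines; if you run this snippet on its own, locate the field and enter the course name first: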

WebElement nameField = driver.findElement(By.cssSelector("input[name='name']"));
nameField.sendKeys(courseName);

Step 2: Capture the Screenshot

Use the getScreenshotAs method:

File screenshot = nameField.getScreenshotAs(OutputType.FILE);

Explanation:

getScreenshotAs(OutputType.FILE) captures the screenshot of nameField as a file object.
This feature is new in Selenium 4.

Step 3: Save the Screenshot as a Physical File

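getScreenshotAs(OutputType.FILE) writes the image to a temporary file. Copy it to a permanent PNG file using FileHandler (from org.openqa.selenium.io):

FileHandler.copy(screenshot, new File("nameInput.png"));

Explanation:

FileHandler.copy(source, destination) saves the screenshot as nameInput.png.
FileHandler.copy can throw an IOException, so add a throws IOException declaration to the enclosing method or wrap the call in try-catch.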

When executed, a nameInput.png file appears in your project directory, showing only the "Name" field with the entered course name.
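If you prefer to keep screenshots in a dedicated folder, the JDK’s own java.nio.file.Files can do the copy as well. The sketch below is one way to do that; ScreenshotSaver and save are hypothetical names, and a small temporary file stands in for the File that getScreenshotAs(OutputType.FILE) returns.

```java
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

class ScreenshotSaver {
    // Copies 'source' into 'targetDir' under 'fileName', creating the directory if needed.
    static Path save(File source, Path targetDir, String fileName) throws IOException {
        Files.createDirectories(targetDir);
        Path target = targetDir.resolve(fileName);
        return Files.copy(source.toPath(), target, StandardCopyOption.REPLACE_EXISTING);
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for the temporary file returned by getScreenshotAs(OutputType.FILE).
        File fakeScreenshot = File.createTempFile("element", ".png");
        Files.write(fakeScreenshot.toPath(), new byte[] {1, 2, 3});

        Path dir = Files.createTempDirectory("shots").resolve("run1");
        Path saved = save(fakeScreenshot, dir, "nameInput.png");
        System.out.println(saved.getFileName() + " " + Files.size(saved) + " bytes");
    }
}
```

REPLACE_EXISTING lets repeated test runs overwrite the previous image instead of failing because the file already exists.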

4. Getting Element Dimensions (Height and Width) in Selenium 4

Selenium 4 lets you retrieve the height and width of web elements, which is crucial for verifying responsive web designs that adapt to different screen resolutions.

Scenario: Verify the "Name" Field Dimensions

We want to check the height and width of the "Name" field to ensure it matches the design specifications provided by the product owner or business analyst.

Step 1: Locate the Web Element

We reuse the "Name" field (skip this line if nameField is already in scope):

WebElement nameField = driver.findElement(By.cssSelector("input[name='name']"));

Step 2: Get the Dimensions

Use the getRect() method to retrieve the element’s dimensions:

Rectangle rect = nameField.getRect(); // org.openqa.selenium.Rectangle; one call instead of two
int height = rect.getHeight();
int width = rect.getWidth();
System.out.println(height); // Output: 38
System.out.println(width);  // Output: 930

Explanation:

getRect() returns a Rectangle object containing the element’s position and size.
getHeight() and getWidth() extract the height and width in pixels.
Print the values to compare with expected dimensions.

When executed, the program outputs 38 for height and 930 for width in this run; the exact values depend on the browser window size. You can use assertions to compare these with the required values and fail the test if they don’t match.

Key Points:

Use getRect().getHeight() and getRect().getWidth() for dimensions.
Essential for testing responsive designs across devices (e.g., iPad, iPhone).
Part of UX (User Experience) testing, which includes height, width, and pixel color.
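getRect() returns position and size together; code written for older Selenium versions typically used getSize() and getLocation() instead.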
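Rendered sizes can differ by a pixel or two between browsers, so an exact comparison may be too strict. A minimal sketch of a tolerance-based check follows. DimensionCheck and its methods are hypothetical helpers, the actual values are the ones printed above, and the expected values and tolerance are invented for illustration.

```java
class DimensionCheck {
    // True when 'actual' is within 'tolerancePx' pixels of 'expected'.
    static boolean within(int actual, int expected, int tolerancePx) {
        return Math.abs(actual - expected) <= tolerancePx;
    }

    // Throws AssertionError with a readable message when the value is out of range.
    static void assertDimension(String label, int actual, int expected, int tolerancePx) {
        if (!within(actual, expected, tolerancePx)) {
            throw new AssertionError(label + ": expected " + expected + "px (+/-"
                    + tolerancePx + ") but was " + actual + "px");
        }
    }

    public static void main(String[] args) {
        int height = 38;  // value printed in the example above
        int width = 930;  // value printed in the example above

        // Expected values and tolerance are invented; take them from your design spec.
        assertDimension("height", height, 38, 1);
        assertDimension("width", width, 932, 2);
        System.out.println("dimensions ok");
    }
}
```

In a real suite you would more likely use your test framework’s assertions (for example TestNG or JUnit), with the tolerance agreed with whoever owns the design specification.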

Why These Features Matter

The features introduced in Selenium 4 make automation more intuitive and powerful:

Relative Locators simplify locating elements in complex layouts, reducing reliance on cumbersome XPath traversals.
Multiple Windows/Tabs enable multitasking, allowing tests to interact with multiple URLs seamlessly.
Element Screenshots provide targeted visual verification, ideal for debugging in non-visual environments like virtual machines.
Element Dimensions ensure compliance with responsive design requirements, enhancing UX testing.

Relative locators and newWindow() require Selenium 4, so if you’re using Selenium 3.x, you’ll need to upgrade to follow all of the examples.

Tips for Beginners

Set Up Selenium 4: Ensure you have the latest Selenium WebDriver and a compatible browser driver (e.g., ChromeDriver). From Selenium 4.6 onward, Selenium Manager can download a matching driver for you.
Practice with Real Applications: Use practice sites to experiment with these features.
Debugging: Use tools like SelectorsHub to inspect elements and validate locators.
Handle Exceptions: Add error handling (e.g., try-catch) for robust tests.
Explore Documentation: Refer to Selenium’s official documentation for detailed syntax and examples.

Check out the code repository below for examples discussed in this blog:

https://github.com/samikshakute/SeleniumLearning/tree/main/Selenium4

Conclusion

Selenium 4’s new features (relative locators, multiple windows/tabs, element screenshots, and element dimensions) empower beginners and experienced testers alike to write more efficient and precise automation scripts. By following the examples in this blog, you can start using these features to handle dynamic web elements, work across browser tabs and windows, verify visual outputs, and check responsive designs. Practice these concepts on a test application, and you’ll be well on your way to mastering Selenium 4!

Happy testing!

Samiksha Kute
