Actions, Frames, and Child Window Handling in Selenium WebDriver


Selenium WebDriver is a powerful tool for automating web applications, and it offers advanced features to simulate user interactions like mouse movements, keyboard inputs, and handling complex web elements such as frames and child windows. In this blog, we’ll dive into the Actions class, frames, and child window handling in Selenium WebDriver with practical examples. Let’s get started!
Introduction to the Actions Class
The Actions class in Selenium WebDriver is designed to simulate complex user interactions, such as mouse movements, keyboard inputs, and gestures, that go beyond simple clicks or text inputs. Unlike basic Selenium methods (e.g., click() or sendKeys()), the Actions class allows you to perform advanced actions like:
Mouse hover: Moving the mouse over an element without clicking.
Right-click (context click): Simulating a right-click on an element.
Double-click: Performing a double-click action.
Drag-and-drop: Dragging an element and dropping it elsewhere.
Keyboard actions: Holding down keys (e.g., Shift) to type in capital letters.
These actions mimic how a real user interacts with a web application, making the Actions class essential for testing dynamic and interactive UI elements.
Why Use the Actions Class?
Imagine you’re testing a website like Amazon.com, where moving your mouse over a menu (e.g., “Hello, Sign in”) displays a dropdown without clicking. The Actions class helps you automate such scenarios, ensuring your tests validate real-world user behavior.
Using the Actions Class for Mouse and Keyboard Interactions
Let’s explore how to use the Actions class with practical examples, focusing on a test scenario on Amazon.com.
Setting Up the Actions Class
To use the Actions class, you need to:
Create an instance of the Actions class and pass the WebDriver object to it.
Use methods like moveToElement(), click(), or sendKeys() to define actions.
Execute the actions by calling perform(), usually written as build().perform().
Here’s a basic setup in Java:
import org.openqa.selenium.WebDriver;
import org.openqa.selenium.chrome.ChromeDriver;
import org.openqa.selenium.interactions.Actions;

public class ActionsDemo {
    public static void main(String[] args) {
        // Point Selenium at the ChromeDriver binary.
        // Optional on Selenium 4.6+, where Selenium Manager resolves the driver automatically.
        System.setProperty("webdriver.chrome.driver", "path/to/chromedriver.exe");
        WebDriver driver = new ChromeDriver();
        driver.get("https://www.amazon.com");

        // Create Actions class instance
        Actions actions = new Actions(driver);
    }
}
Moving to an Element (Mouse Hover)
Let’s automate a scenario where hovering over the “Hello, Sign in” link on Amazon displays a popup.
Identify the element: Inspect the “Hello, Sign in” link to find its locator. For example, its ID is nav-link-accountList.
Use moveToElement(): Move the mouse to the element.
Build and perform the action: Combine the action and execute it.
WebElement signInLink = driver.findElement(By.id("nav-link-accountList"));
actions.moveToElement(signInLink).build().perform();
moveToElement(signInLink): Moves the mouse to the “Hello, Sign in” link.
build(): Compiles the chained steps into a single action. It is optional, because perform() calls build() internally.
perform(): Executes the action, simulating the hover effect.
When you run this script, the mouse hovers over the link, and the popup appears, just like a user interaction.
Performing Composite Actions
The Actions class allows you to chain multiple actions (called composite actions) in a single sequence. For example, let’s automate entering text in capital letters in Amazon’s search bar.
Scenario: Move to the search bar, click it, hold the Shift key, type “hello” (which appears as “HELLO”), and double-click to select the text.
Locate the search bar: Its ID is “twotabsearchtextbox”.
Chain actions:
Move to the search bar.
Click to focus.
Hold the Shift key (keyDown(Keys.SHIFT)).
Type “hello” (sendKeys("hello")).
Release the Shift key (keyUp(Keys.SHIFT)).
Double-click to select the text (doubleClick()).
Build and perform the composite action.
WebElement searchBox = driver.findElement(By.id("twotabsearchtextbox"));
actions.moveToElement(searchBox)
    .click()
    .keyDown(Keys.SHIFT)
    .sendKeys("hello")
    .keyUp(Keys.SHIFT)
    .doubleClick()
    .build()
    .perform();
click(): Focuses on the search bar.
keyDown(Keys.SHIFT): Holds the Shift key to type in capitals.
sendKeys("hello"): Types “hello”, which appears as “HELLO” due to Shift.
keyUp(Keys.SHIFT): Releases the Shift key.
doubleClick(): Selects the entered text.
build().perform(): Executes the entire sequence.
When you run this, the search bar will contain “HELLO”, and the text will be selected.
Right-Clicking (Context Click)
To simulate a right-click, use the contextClick() method. For example, right-clicking the “Hello, Sign in” link might open a context menu.
actions.moveToElement(signInLink)
    .contextClick()
    .build()
    .perform();
This moves to the link, performs a right-click, and displays the context menu.
Why build() and perform()?
build(): Combines all chained actions into a single executable unit.
perform(): Executes the actions.
Nothing runs until perform() is called. Because perform() calls build() internally, build() is optional; writing build().perform() simply makes the two stages explicit.
Understanding Frames in Selenium
What Are Frames?
A frame (or iframe) is a container within a webpage that displays content independent of the main page’s HTML. It’s like a separate webpage embedded inside the main page. For example, a webpage might host a frame to display an advertisement or a draggable element.
Why Are Frames Tricky?
Selenium cannot directly interact with elements inside a frame because they’re in a separate context. You must explicitly tell Selenium to switch to the frame before interacting with its elements.
Switching to Frames
To work with elements inside a frame, use the switchTo().frame() method. You can switch to a frame using:
Frame name or ID: If the frame has a name or ID attribute.
Frame Index: Frames are indexed starting from 0 (first frame is 0).
WebElement: Locate the frame using its attributes (e.g., class or tag).
Example Scenario: On the demo website, there’s a draggable box inside a frame. Let’s click it and then drag-and-drop it.
Inspect the frame: Right-click and inspect the draggable box. You’ll notice it’s inside an <iframe> tag with a class “demo-frame”.
Switch to the frame using its class:
driver.switchTo().frame(driver.findElement(By.className("demo-frame")));
Interact with the element:
WebElement draggable = driver.findElement(By.id("draggable"));
draggable.click();
switchTo().frame(): Moves Selenium’s focus to the frame.
After switching, Selenium can locate and interact with the draggable element.
Handling Drag-and-Drop in Frames
Let’s drag the draggable element to a droppable target inside the same frame using the Actions class.
Switch to the frame (as above).
Locate source and target elements:
Source: draggable (ID: draggable).
Target: droppable (ID: droppable).
Perform drag-and-drop:
WebElement source = driver.findElement(By.id("draggable"));
WebElement target = driver.findElement(By.id("droppable"));
Actions actions = new Actions(driver);
actions.dragAndDrop(source, target)
    .build()
    .perform();
dragAndDrop(source, target): Drags the source element to the target.
After switching to the frame, Selenium can interact with both elements.
Switching Back to the Main Page
After working inside a frame, you must switch back to the main page (default content) to interact with elements outside the frame:
driver.switchTo().defaultContent();
Example: After dragging and dropping, click a link outside the frame (e.g., “Accept” link):
driver.switchTo().defaultContent();
driver.findElement(By.linkText("Accept")).click();
Finding the Number of Frames
To determine how many frames are on a page, search for <iframe> tags:
int frameCount = driver.findElements(By.tagName("iframe")).size();
System.out.println("Number of frames: " + frameCount);
If there’s only one frame, you can switch to it by index:
driver.switchTo().frame(0); // Switches to the first frame
Note: Using indexes is risky, as adding new frames can break your script. Prefer WebElement-based switching when possible.
Handling Child Windows in Selenium
What Are Child Windows?
A child window is a new browser tab or window opened by clicking a link or button on the parent page. For example, clicking a link on a login page might open a new tab with an email ID that you need to copy and paste back into the parent page.
In Selenium, both tabs and windows are treated as windows, and each has a unique window handle (ID).
Example Scenario: On the Practice Website, clicking a link opens a child window with an email ID. We’ll grab the email and enter it into the parent page’s username field.
Switching Between Parent and Child Windows
To switch between windows, use the switchTo().window() method and window handles.
Get all window handles:
Set<String> windows = driver.getWindowHandles();
This returns a set of window IDs (e.g., parent ID and child ID).
Iterate through handles to find the child window:
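// Handles usually come back in the order the windows were opened, but the WebDriver spec does not guarantee it
Iterator<String> it = windows.iterator();
String parentId = it.next(); // Assumed to be the parent
String childId = it.next(); // Assumed to be the child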
Switch to the child window:
driver.switchTo().window(childId);
Interact with the child window:
Locate the paragraph containing the email ID (e.g., class: “im-para red”).
Extract the text:
String text = driver.findElement(By.cssSelector(".im-para.red")).getText();
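The handle lookup above assumes the parent is always the first handle in the set. A more defensive option is to remember the parent handle before the click and then pick whichever handle is different. The following is a minimal sketch of that idea in plain Java; the class name WindowHandles and the method findChildHandle are hypothetical helpers for illustration, not part of Selenium.

```java
import java.util.Set;

// Hypothetical helper (not a Selenium class): picks the child window handle
// without relying on the iteration order of the handle set.
final class WindowHandles {
    private WindowHandles() {}

    /** Returns the first handle in allHandles that differs from parentHandle. */
    static String findChildHandle(String parentHandle, Set<String> allHandles) {
        for (String handle : allHandles) {
            if (!handle.equals(parentHandle)) {
                return handle;
            }
        }
        // No second window was found, e.g. the link did not open a new tab
        throw new IllegalStateException("No window other than the parent is open");
    }
}
```

In a test, parentHandle would come from driver.getWindowHandle() called before the click that opens the new tab, and allHandles from driver.getWindowHandles() called afterwards. The returned value can then be passed to driver.switchTo().window().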
Extracting Data from a Child Window
The paragraph might contain extra text, so we need to parse it to extract only the email ID.
Steps:
Split by “at”: Split the text at “at” to separate the email part.
Trim spaces: Remove leading/trailing spaces.
Split by space: Further split to isolate the email.
String[] splitByAt = text.split("at");
String emailPart = splitByAt[1].trim(); // Get part after "at" and trim spaces
String[] splitBySpace = emailPart.split(" ");
String email = splitBySpace[0]; // Get email before space
Debugging Tip:
Use debugging to inspect the text at runtime.
Set a breakpoint, run in debug mode, and use the “Watch” feature to test string manipulations (e.g., split, trim).
This ensures you extract the correct email ID.
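Splitting on “at” works for this page’s text, but it is fragile: if the address itself contained “at” (say, a hypothetical natasha@example.com), the split would cut the email in half. As an alternative, here is a minimal sketch that pulls out the first email-like token with a regular expression. The class name EmailExtractor and the pattern are assumptions made for illustration; the pattern is deliberately simple and is not a full email validator.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration: extracts the first email-like token from text.
final class EmailExtractor {
    // Simple pattern: local part, '@', domain, dot, top-level domain of 2+ letters.
    // Good enough for picking an address out of a sentence; not an RFC-grade validator.
    private static final Pattern EMAIL =
            Pattern.compile("[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\\.[A-Za-z]{2,}");

    private EmailExtractor() {}

    /** Returns the first email-like token in text, or an empty Optional if none is found. */
    static Optional<String> firstEmail(String text) {
        Matcher matcher = EMAIL.matcher(text);
        return matcher.find() ? Optional.of(matcher.group()) : Optional.empty();
    }
}
```

With this helper, the text returned by getText() could be passed straight to EmailExtractor.firstEmail(text), and the test can fail with a clear message when the Optional is empty instead of throwing an ArrayIndexOutOfBoundsException from the split.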
Switching Back to the Parent Window
After extracting the email, switch back to the parent window:
driver.switchTo().window(parentId);
Then, enter the email into the username field:
driver.findElement(By.id("username")).sendKeys(email);
Why Switch Back?
Selenium’s focus remains on the child window after switching. Attempting to interact with the parent page without switching back will cause a “NoSuchElementException”.
Complete Code:
driver.get("https://rahulshettyacademy.com/loginpagepractice");
driver.findElement(By.cssSelector(".blinkingText")).click();
Set<String> windows = driver.getWindowHandles();
Iterator<String> it = windows.iterator();
String parentId = it.next();
String childId = it.next();
driver.switchTo().window(childId);
String text = driver.findElement(By.cssSelector(".im-para.red")).getText();
String[] splitByAt = text.split("at");
String emailPart = splitByAt[1].trim();
String[] splitBySpace = emailPart.split(" ");
String email = splitBySpace[0];
driver.switchTo().window(parentId);
driver.findElement(By.id("username")).sendKeys(email);
This script clicks the link, switches to the child window, extracts the email, switches back, and enters it into the username field.
Key Takeaways and Best Practices
Actions Class:
Use for complex interactions like hover, right-click, double-click, and drag-and-drop.
Always call perform() to execute actions; build() is optional because perform() invokes it internally.
Chain multiple actions for composite behaviors (e.g., typing in capitals).
Frames:
Switch to frames using switchTo().frame() before interacting with elements inside.
Use WebElement or frame ID for reliable switching; avoid indexes if possible.
Return to the main page with switchTo().defaultContent() after frame operations.
Count frames using findElements(By.tagName("iframe")).size().
Child Windows:
Use getWindowHandles() to get window IDs and switchTo().window() to switch.
Iterate through handles to access parent and child windows.
Parse text carefully using string methods like split and trim.
Always switch back to the parent window before interacting with its elements.
General Tips:
Avoid relying on tools like SelectorsHub or ChroPath; practice writing locators manually using HTML and browser Developer Tools (like Chrome DevTools).
Use debugging to validate string manipulations or locator accuracy.
Maximize the browser window (driver.manage().window().maximize()) for better visibility during testing.
Conclusion
The Actions class, frames, and child window handling are advanced Selenium WebDriver features that enable you to automate complex user interactions and dynamic web elements. By mastering these concepts, you can create robust automation scripts that mimic real-world user behavior, such as hovering over menus, dragging elements, or switching between windows.
Written by

Samiksha Kute