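Automate browsing on Debian 12 (KDE Plasma 5.27) using image recognition, keyboard, and mouse automation.

This guide covers several ways to automate a browser on Debian 12 with KDE Plasma 5.27: image recognition, simulated keyboard and mouse input, and browser-level scripting. The desktop tools used here (xdotool, wmctrl, scrot, PyAutoGUI, AutoKey) depend on X11, so log in to an X11 session rather than Wayland. The approach is structured as follows: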
0. LLM API
You can use an LLM, such as GPT-4, to generate a script that uses the tools below: xdotool, AutoKey, wmctrl, scrot, PyAutoGUI, and OpenCV.
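One way to make this step concrete is to assemble the request as a plain string before sending it to whichever LLM API you use. The sketch below only builds the prompt. The function name `build_prompt` and the wording of the prompt are illustrative assumptions, and no particular API client is assumed.

```python
# Hypothetical helper: builds a prompt asking an LLM for an automation script.
TOOLS = ["xdotool", "AutoKey", "wmctrl", "scrot", "PyAutoGUI", "OpenCV"]


def build_prompt(task, tools=TOOLS):
    """Return a prompt that names the task, the platform, and the allowed tools."""
    tool_list = ", ".join(tools)
    return (
        f"Write a script for Debian 12 (KDE, X11 session) that does the following: {task}\n"
        f"Use only these tools: {tool_list}.\n"
        "Comment every step and avoid hard-coded screen coordinates where possible."
    )


if __name__ == "__main__":
    print(build_prompt("open Firefox and load https://www.debian.org"))
```

Listing the tools explicitly keeps the generated script within what the install step below provides.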
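1. Install Required Tools
Install the system packages (python3-tk and python3-dev are the extras the PyAutoGUI documentation lists for Linux):
sudo apt update && sudo apt install -y xdotool autokey-gtk autokey-qt wmctrl qdbus scrot python3-pip python3-venv python3-tk python3-dev
Debian 12 marks the system Python as externally managed, so a bare pip install is refused. Install the Python packages in a virtual environment instead, and run the Python scripts below with ~/automation-venv/bin/python:
python3 -m venv ~/automation-venv
~/automation-venv/bin/pip install pyautogui opencv-python numpy pillow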
xdotool → Simulate keyboard & mouse input.
AutoKey → Macro automation.
wmctrl → Control windows.
scrot → Screenshot tool for image recognition.
PyAutoGUI → GUI automation with image recognition.
OpenCV → Advanced image processing.
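Before running any of the scripts, it can help to confirm that the command-line tools are actually on your PATH. The following is a minimal sketch using only the standard library. The names `REQUIRED_COMMANDS` and `missing_commands` are made up for this example.

```python
import shutil

# Command names as they appear on PATH after the apt install step.
# AutoKey is left out because autokey-gtk and autokey-qt are alternative front ends.
REQUIRED_COMMANDS = ["xdotool", "wmctrl", "scrot"]


def missing_commands(commands=REQUIRED_COMMANDS, which=shutil.which):
    """Return the commands that cannot be found on PATH."""
    return [cmd for cmd in commands if which(cmd) is None]


if __name__ == "__main__":
    missing = missing_commands()
    if missing:
        print("Missing:", ", ".join(missing))
    else:
        print("All command-line tools found.")
```

The `which` parameter exists only so the lookup can be swapped out when testing.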
2. Automate Browser Actions Using Image Recognition
Using PyAutoGUI
(a) Open Browser & Navigate
import pyautogui
import time
# Open Firefox
pyautogui.hotkey('ctrl', 'alt', 't') # Open terminal
time.sleep(1)
pyautogui.write('firefox\n') # Launch Firefox
time.sleep(3)
# Focus the address bar
pyautogui.hotkey('ctrl', 'l')
pyautogui.write('https://www.debian.org\n')
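Fixed sleeps are fragile, because Firefox may take more or less than three seconds to start. A possible alternative is to poll wmctrl until a Firefox window shows up. This is a sketch under a few assumptions: the helper names are invented here, and the match on the word "Firefox" in the window title may need adjusting for your locale or browser build.

```python
import subprocess
import time


def window_titles(wmctrl_output):
    """Extract window titles from `wmctrl -l` output.

    Each line has four columns: window id, desktop number, host name, title.
    """
    titles = []
    for line in wmctrl_output.splitlines():
        parts = line.split(None, 3)
        if len(parts) == 4:
            titles.append(parts[3].rstrip())
    return titles


def wait_until(check, timeout=15.0, interval=0.5):
    """Call check() repeatedly until it returns a true value or the timeout expires."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if check():
            return True
        time.sleep(interval)
    return False


def firefox_is_open():
    """Return True if any window title mentions Firefox (requires wmctrl)."""
    result = subprocess.run(["wmctrl", "-l"], capture_output=True, text=True)
    return any("Firefox" in title for title in window_titles(result.stdout))
```

In the script above, `wait_until(firefox_is_open)` could stand in for `time.sleep(3)`. It returns `False` if no window appears within the timeout, so the script can stop instead of typing into the wrong window.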
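(b) Locate & Click Elements via Screenshots
Take a Screenshot of the Button
The -s flag lets you drag a rectangle around the button, so only the button is saved rather than the whole screen:
scrot -s ~/button.png
Find & Click the Button
import os
import pyautogui
try:
    # expanduser is needed because Python does not expand "~" itself
    button = pyautogui.locateCenterOnScreen(os.path.expanduser('~/button.png'))
except pyautogui.ImageNotFoundException:
    button = None  # newer PyAutoGUI versions raise instead of returning None
if button:
    pyautogui.click(button)
else:
    print("Button not found!")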
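OpenCV is listed among the tools but not shown in action, so here is a rough sketch of how template matching could locate the same button with a tunable confidence threshold. The function names, the 0.8 threshold, and the file names in the usage note are assumptions for illustration. `cv2.matchTemplate` and `cv2.minMaxLoc` are the standard OpenCV calls for this technique.

```python
def match_center(top_left, template_shape):
    """Convert a match's top-left corner into the centre point to click.

    template_shape is (height, width), the order NumPy and OpenCV use.
    """
    x, y = top_left
    height, width = template_shape[:2]
    return (x + width // 2, y + height // 2)


def find_template(screenshot_path, template_path, threshold=0.8):
    """Return the centre of the best match, or None if it scores below threshold."""
    import cv2  # imported here so match_center works without OpenCV installed

    screen = cv2.imread(screenshot_path, cv2.IMREAD_GRAYSCALE)
    template = cv2.imread(template_path, cv2.IMREAD_GRAYSCALE)
    if screen is None or template is None:
        raise FileNotFoundError("could not read screenshot or template image")

    # Slide the template over the screenshot and score every position.
    scores = cv2.matchTemplate(screen, template, cv2.TM_CCOEFF_NORMED)
    _, best_score, _, best_top_left = cv2.minMaxLoc(scores)
    if best_score < threshold:
        return None
    return match_center(best_top_left, template.shape)
```

With a full-screen capture saved as `screen.png` and the cropped button as `button.png` (both hypothetical names), `find_template("screen.png", "button.png")` gives an `(x, y)` pair that can be passed to `pyautogui.click`. Lowering the threshold tolerates small rendering differences at the cost of more false matches.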
3. Automate Browser with xdotool
(a) Open a Website
xdotool search --onlyvisible --class "firefox" windowactivate key ctrl+l type "https://www.google.com" key Return
(b) Click a Button via Coordinates
xdotool mousemove 500 300 click 1
(c) Scroll & Navigate
xdotool key Down Down Down
xdotool key ctrl+Tab # Switch tab
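If the rest of your automation lives in Python, the xdotool commands can be driven from there with `subprocess`. The sketch below builds the argument lists separately from running them, which makes them easy to inspect. The function names are hypothetical, and the commands assume a visible window whose class matches "firefox".

```python
import subprocess


def open_url_commands(url, window_class="firefox"):
    """Build the xdotool invocations that focus a browser window and load url."""
    return [
        # Activate the first visible window of the class, then focus the address bar.
        ["xdotool", "search", "--onlyvisible", "--class", window_class,
         "windowactivate", "--sync", "key", "--delay", "100", "ctrl+l"],
        # Type the URL, then press Return to load it.
        ["xdotool", "type", url],
        ["xdotool", "key", "Return"],
    ]


def open_url(url, window_class="firefox"):
    """Run each command in order; raises CalledProcessError if one fails."""
    for argv in open_url_commands(url, window_class):
        subprocess.run(argv, check=True)
```

Passing the arguments as a list avoids shell quoting problems when a URL contains characters such as `&` or `?`.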
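4. Automate Browser Using Selenium (Headless)
If you do not need to drive the visible desktop, Selenium WebDriver controls the browser directly and can run it headless.
(a) Install Selenium & Chrome WebDriver
pip install selenium webdriver-manager
On Debian 12, run this inside a virtual environment, because the system Python rejects a bare pip install. A Chrome or Chromium browser must also be installed. Selenium 4.6 and later locate a matching driver on their own, so webdriver-manager is only needed with older releases.
(b) Automate Google Search
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.common.keys import Keys
# Run Chrome without a visible window
options = webdriver.ChromeOptions()
options.add_argument("--headless=new")
driver = webdriver.Chrome(options=options)
driver.get("https://www.google.com")
search_box = driver.find_element(By.NAME, "q")
search_box.send_keys("Debian 12 Automation")
search_box.send_keys(Keys.RETURN)
print(driver.title)
driver.quit() # Close the browser when done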
5. Automate KDE Desktop & Browser With AutoKey
Open AutoKey (autokey-gtk, or autokey-qt for the Qt interface).
Create a New Script:
import time
keyboard.send_keys("<ctrl>+t") # Open a new tab
time.sleep(1)
keyboard.send_keys("https://www.debian.org\n") # Type URL and press Enter
Assign a Hotkey (Example: Ctrl+Shift+X).
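6. Automate Browser Using QDBus
Open a Firefox Tab:
qdbus org.mozilla.firefox /browser org.mozilla.firefox.LoadURI "https://www.debian.org"
The service, object path, and method names above depend on what your Firefox build registers on the session bus. Run qdbus with no arguments to list the available services and adjust the command to match. If Firefox exposes nothing suitable, firefox --new-tab "https://www.debian.org" opens the tab from the command line instead.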
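Service names on the session bus differ between applications and builds, so it can help to list them before calling one. The sketch below filters the output of `qdbus` (run with no arguments, it prints every name on the session bus) for a keyword. The function names are invented for this example.

```python
import subprocess


def matching_services(qdbus_output, needle):
    """Return well-known bus names from `qdbus` output that contain needle."""
    names = []
    for line in qdbus_output.splitlines():
        name = line.strip()
        # Skip blank lines and unique connection names such as ":1.42".
        if not name or name.startswith(":"):
            continue
        if needle.lower() in name.lower():
            names.append(name)
    return names


def list_session_services():
    """Run qdbus with no arguments and return its raw output."""
    result = subprocess.run(["qdbus"], capture_output=True, text=True, check=True)
    return result.stdout
```

For example, `matching_services(list_session_services(), "firefox")` shows whether a running Firefox has registered anything. Running `qdbus` followed by one of the returned names then lists that service's object paths.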
Summary
Method | Purpose |
PyAutoGUI | Image-based mouse & keyboard automation |
xdotool | Simulate keyboard & mouse |
AutoKey | Automate text input & macros |
Selenium | Web automation (headless) |
QDBus | Control KDE applications |
Example 1: Open a website in Firefox with xdotool, waiting for the window to become active before typing
xdotool search --onlyvisible --class "firefox" windowactivate --sync key --delay 100 ctrl+l
xdotool type "https://www.google.com"
xdotool key Return
