When Beautiful Soup Wasn’t Enough: My Selenium Web Scraping Adventure


After mastering Beautiful Soup, I discovered a shocking truth: not all websites surrender their data easily! Some hide behind JavaScript. That's when I met Selenium, the Joker of web scraping.
Why Selenium? (When Beautiful Soup Isn't Enough)
Feature | Beautiful Soup | Selenium |
JavaScript Content | ❌ Fails | ✅ Works |
Clicking Buttons | ❌ Impossible | ✅ Easy |
Real Interactions | ❌ No | ✅ Yes |
Speed | ⚡ Fast | 🐢 Slow |
Since Selenium is widely used by data scientists and developers to interact with JavaScript-heavy sites, I decided to give it a try and share my experience learning it.
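To make the "JavaScript Content" row concrete, here is a minimal sketch of why a static parser comes up empty. It uses Python's built-in `html.parser` as a stand-in for any static parser (Beautiful Soup included), and the tiny `PAGE` document is made up for illustration; it is not Audible's markup.

```python
from html.parser import HTMLParser

# Hypothetical page: the list is empty in the raw HTML and only gets
# filled in when a browser runs the script.
PAGE = """
<html><body>
  <ul id="books"></ul>
  <script>
    document.getElementById("books").innerHTML = "<li>Some Audiobook</li>";
  </script>
</body></html>
"""


class ListItemCounter(HTMLParser):
    """Counts the <li> tags that exist in the raw markup."""

    def __init__(self):
        super().__init__()
        self.li_count = 0

    def handle_starttag(self, tag, attrs):
        if tag == "li":
            self.li_count += 1


parser = ListItemCounter()
parser.feed(PAGE)

# The script body is treated as plain text and never executed,
# so the parser sees no list items at all.
static_li_count = parser.li_count
print(static_li_count)
```

A real browser driven by Selenium would run the script first, so the same page would contain one list item by the time you query it.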
My goal was to scrape a list of audiobooks from Audible.com, including useful details like:
📘 Title
✍️ Author
🌐 Language
⏱️ Duration
⭐ Rating
I used selenium to interact with the browser, Edge WebDriver for the automation, and pandas to store and export the data to CSV.
Enter Selenium
- Set up Selenium and the Edge driver
from selenium import webdriver
from selenium.webdriver.edge.service import Service
import pandas as pd
# EdgeDriver drama 🎭
path = "C:\\Users\\hp\\Downloads\\edgedriver_win64\\msedgedriver.exe"
service = Service(executable_path=path)
driver = webdriver.Edge(service=service)
driver.get('https://www.audible.com/search')
- Get the books' information
products = driver.find_elements(by='xpath', value='//li[contains(@class,"productListItem")]')
book_titles = []
book_authors = []
# ... (other lists)
for product in products:
    # XPaths starting with './/' search inside the current product only
    book_titles.append(product.find_element(by='xpath', value='.//h3[contains(@class,"bc-heading")]').text)
    book_authors.append(product.find_element(by='xpath', value='.//li[contains(@class,"authorLabel")]').text)
    # ... (other attributes)
driver.quit() # Always close the browser!
By the way: XPath is like GPS for HTML elements.
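One thing to watch for: `find_element` raises an exception when nothing matches, so a single product with a missing field would stop the whole loop. Below is a minimal sketch of a more forgiving approach, assuming some products may lack a field (I have not verified that on Audible). The helper names `text_or_none` and `scrape_product` are hypothetical; `find_elements` (plural) is the real Selenium call, and it returns an empty list instead of raising.

```python
def text_or_none(element, xpath):
    """Return the text of the first match under `element`, or None if nothing matches."""
    # find_elements (plural) returns an empty list rather than raising
    matches = element.find_elements(by="xpath", value=xpath)
    return matches[0].text if matches else None


def scrape_product(product, field_xpaths):
    """Build one row (a dict) per product so fields cannot drift out of alignment."""
    return {field: text_or_none(product, xpath) for field, xpath in field_xpaths.items()}


# XPaths copied from the loop above; the remaining fields are left out on purpose.
FIELD_XPATHS = {
    "title": './/h3[contains(@class,"bc-heading")]',
    "author": './/li[contains(@class,"authorLabel")]',
}

# Usage with a live driver (not executed here):
# rows = [scrape_product(p, FIELD_XPATHS) for p in products]
```

Collecting one dict per product keeps each book's fields together, and a missing value shows up as `None` instead of crashing the run.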
- Saving the Loot (as CSV)
df_books = pd.DataFrame({
    'title': book_titles,
    'author': book_authors,
    'language': book_languages,
    'duration': book_durations,
    'rating': book_ratings
})
df_books.to_csv('audio_books.csv', index=False)
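A small caveat: building a DataFrame from a dict of lists only works when every list has the same length; otherwise pandas raises a `ValueError`. Here is a minimal sketch of a sanity check you could run just before creating the DataFrame. The helper name `check_column_lengths` is hypothetical and is not part of pandas.

```python
def check_column_lengths(columns):
    """Return {name: length} for each column, or raise if the lengths differ."""
    lengths = {name: len(values) for name, values in columns.items()}
    if len(set(lengths.values())) > 1:
        # Tell the reader exactly which list fell behind
        raise ValueError(f"Column lengths differ: {lengths}")
    return lengths


# Usage with the lists from the scraping loop (not executed here):
# check_column_lengths({'title': book_titles, 'author': book_authors})
```

The error message lists every column with its length, which makes it quick to spot which field was skipped for some product.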
Lessons Learned
Selenium is slow but powerful. It literally opens a browser—prepare for delays.
Always quit the driver. Or you’ll have 10 Edge windows laughing at you.
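The "always quit" lesson is easy to forget when the script crashes halfway through. Here is a minimal sketch of one way to make it automatic, using a small context manager from the standard library. The name `managed_driver` is hypothetical, and `factory` is assumed to be any function that returns an object with a `quit()` method, such as a Selenium driver.

```python
from contextlib import contextmanager


@contextmanager
def managed_driver(factory):
    """Create a driver with `factory()` and guarantee quit() runs afterwards."""
    driver = factory()
    try:
        yield driver
    finally:
        # Runs on normal exit and also when the scraping code raises
        driver.quit()


# Usage with the setup from earlier (not executed here):
# with managed_driver(lambda: webdriver.Edge(service=service)) as driver:
#     driver.get('https://www.audible.com/search')
```

With this pattern the browser window closes even if an XPath lookup fails in the middle of the loop.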
😇 Remember me! I’ll scrape something way more complex, like auto-logging into a site, scraping behind auth walls, or even automating purchases (for research, obviously).
🔗 Full Code: GitHub Repo
🐦 Follow Me: @BoussanniEl
#Python #WebScraping #Automation #LearningToCode #selenium
Written by
ASSIA EL BOUSSANNI