My Second Web Scraping Project Today: Scraping All Quotes from Quotes.toscrape.com with Pagination


Hey everyone! 👋

I’ve been diving into web scraping recently, and today I completed my second scraping project — scraping quotes from the awesome site quotes.toscrape.com. This project was a great way to practice handling pagination and extracting multiple fields like quotes, authors, and tags.


What I Wanted to Build

The goal was to:

  • Scrape all quotes from every page on the site (there are 10 pages)

  • Extract the quote text, author, and tags

  • Save everything neatly into a CSV file for further analysis or fun


Tools I Used

  • requests — to fetch the HTML content

  • BeautifulSoup — to parse the HTML and extract data

  • csv — to save the data into a CSV file

  • urljoin from urllib.parse — to handle the pagination URLs
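
requests and BeautifulSoup are third-party packages (csv and urljoin ship with Python's standard library), so if you're following along you'll probably need to install them first:

pip install requests beautifulsoup4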


Here’s the Full Code I Used

import requests
from bs4 import BeautifulSoup
import csv
from urllib.parse import urljoin

base_url = "https://quotes.toscrape.com"
current_url = base_url

with open("scrapped.csv", 'w', newline="", encoding="utf-8") as file:
    writer = csv.writer(file)
    writer.writerow(["Quote", "Author", "Tags"])

    while current_url:
        response = requests.get(current_url)
        soup = BeautifulSoup(response.text, "html.parser")

        quotes = soup.find_all("div", class_="quote")

        for quote in quotes:
            text = quote.find("span", class_="text").get_text()
            author = quote.find("small", class_="author").get_text()
            tag_elements = quote.find("div", class_="tags").find_all("a", class_="tag")
            tags = [tag.text for tag in tag_elements]
            tag_string = ", ".join(tags)

            writer.writerow([text, author, tag_string])

        print(f"Scraped: {current_url}")

        next_btn = soup.find("li", class_="next")

        if next_btn:
            next_btn_url = next_btn.a["href"]
            current_url = urljoin(current_url, next_btn_url)
        else:
            print("No next page found. Scraping complete.")
            current_url = None
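
When I ran it, the console output looked roughly like this (one "Scraped:" line per page, and the site has 10 pages):

Scraped: https://quotes.toscrape.com
Scraped: https://quotes.toscrape.com/page/2/
...
Scraped: https://quotes.toscrape.com/page/10/
No next page found. Scraping complete.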

How This Works

  • Start scraping from the homepage (base_url)

  • For each page, find all quote blocks and extract the quote, author, and tags (there's a small extraction sketch after this list)

  • Write each quote to a CSV file

  • Look for the “Next” button — if it exists, update the URL and continue scraping

  • Stop when there are no more pages
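
To make the extraction step concrete, here's a minimal sketch that runs the same find/find_all calls against a hard-coded snippet shaped like the site's quote blocks (the HTML below is a simplified stand-in I wrote for illustration, not the site's exact markup):

from bs4 import BeautifulSoup

# A simplified stand-in for one quote block on the page
html = """
<div class="quote">
    <span class="text">"An example quote."</span>
    <small class="author">Example Author</small>
    <div class="tags">
        <a class="tag" href="/tag/example/">example</a>
        <a class="tag" href="/tag/demo/">demo</a>
    </div>
</div>
"""

quote = BeautifulSoup(html, "html.parser").find("div", class_="quote")

text = quote.find("span", class_="text").get_text()       # the quote text
author = quote.find("small", class_="author").get_text()  # "Example Author"
tags = [a.text for a in quote.find("div", class_="tags").find_all("a", class_="tag")]

print(text, author, tags)  # tags == ['example', 'demo']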


What I Learned

  • How to handle pagination properly by updating URLs

  • Extracting multiple data points from nested HTML elements

  • Using urljoin to safely join relative URLs with the base URL (see the quick demo after this list)

  • Writing to CSVs cleanly to preserve all data
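
To see why urljoin matters: the "Next" button's href is relative (something like /page/2/), so it has to be resolved against the page you're currently on. A quick sketch of the behavior:

from urllib.parse import urljoin

print(urljoin("https://quotes.toscrape.com", "/page/2/"))
# https://quotes.toscrape.com/page/2/
print(urljoin("https://quotes.toscrape.com/page/2/", "/page/3/"))
# https://quotes.toscrape.com/page/3/

And since the whole point of the CSV is further analysis, here's one way you might read it back with csv.DictReader (assuming the scraped.csv written by the script above):

import csv

with open("scraped.csv", newline="", encoding="utf-8") as file:
    for row in csv.DictReader(file):  # keys come from the header row
        print(row["Author"], "->", row["Tags"])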



Final Thoughts

This was a fun and practical project to deepen my web scraping skills. If you’re new to scraping, definitely give this a try — the site is perfect for beginners!

Happy scraping! 🚀
