Introducing Housefly: A Playground for Web Scraping

Johannes Naylor
1 min read

Web scraping is an essential skill for developers, but learning it can be tricky. That’s why I created Housefly, a hands-on project designed to teach web scraping through interactive exercises. Inspired by Google Gruyere, Housefly provides a series of small tutorials with dedicated companion websites built to be scraped. The goal? To give you a safe, structured environment to practice and refine your scraping skills.

Why Did I Make This?

I’ve seen countless tutorials that explain web scraping in theory, but very few offer real, controlled environments to experiment in. Housefly solves that by providing self-contained challenges where you scrape provided websites and verify your solutions against expected outputs. It’s built for hands-on learners who want to do rather than just read.

How to Get Started

The first few chapters are live, and you can try them out today!

  1. Clone the GitHub repo:

     git clone https://github.com/jonaylor89/housefly.git
     cd housefly
    
  2. Navigate to Chapter 1 and explore the provided website.

  3. Write your scraper in apps/solution1/.

  4. Run the checker to validate your solution:

     npm run ca 1
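
To make the "scrape, then verify" loop above concrete, here is a minimal sketch in TypeScript. Everything in it is an assumption made for illustration: the sample markup, the function names `extractHeadings` and `matchesExpected`, and the shape of the expected output are hypothetical and do not reflect Housefly's actual chapter pages or checker format. It also uses a regular expression in place of a real HTML parser to stay dependency-free, which is only reasonable for very simple, well-formed pages.

```typescript
// Hypothetical sketch: the markup, function names, and expected-output shape
// are illustrative assumptions, not Housefly's actual format.

// Collect the visible text of every <h1>..<h6> heading in an HTML string.
function extractHeadings(html: string): string[] {
  const headingPattern = /<h([1-6])\b[^>]*>([\s\S]*?)<\/h\1>/gi;
  const headings: string[] = [];
  let match: RegExpExecArray | null;
  while ((match = headingPattern.exec(html)) !== null) {
    // Drop nested tags, then collapse runs of whitespace.
    const text = match[2].replace(/<[^>]+>/g, "").replace(/\s+/g, " ").trim();
    headings.push(text);
  }
  return headings;
}

// Compare scraped values with an expected result, element by element.
function matchesExpected(actual: string[], expected: string[]): boolean {
  return (
    actual.length === expected.length &&
    actual.every((value, index) => value === expected[index])
  );
}

// A stand-in for a chapter's companion page (made up for this example).
const samplePage = `
<html><body>
  <h1>Welcome to the <em>practice</em> site</h1>
  <p>Some text.</p>
  <h2 class="section">First section</h2>
</body></html>`;

const scraped = extractHeadings(samplePage);
console.log(scraped);
console.log(matchesExpected(scraped, ["Welcome to the practice site", "First section"]));
```

In a real solution you would fetch the chapter's page first and would likely reach for a proper HTML parser, since regular expressions break down quickly on nested or malformed markup.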
    

More chapters are on the way, but for now, dive into the first challenge and start scraping!

👉 Try Housefly on GitHub
