Introducing Housefly: A Playground for Web Scraping


Web scraping is an essential skill for developers, but learning it can be tricky. That’s why I created Housefly, a hands-on project designed to teach web scraping through interactive exercises. Inspired by Google Gruyere, Housefly provides a series of small tutorials with dedicated companion websites built to be scraped. The goal? To give you a safe, structured environment to practice and refine your scraping skills.
Why Did I Make This?
I’ve seen countless tutorials that explain web scraping in theory, but very few offer real, controlled environments to experiment in. Housefly solves that by providing self-contained challenges where you scrape provided websites and verify your solutions against expected outputs. It’s built for hands-on learners who want to do rather than just read.
How to Get Started
The first few chapters are live, and you can try it out today!
Clone the GitHub repo:
git clone https://github.com/jonaylor89/housefly.git
cd housefly
Navigate to Chapter 1 and explore the provided website.
Write your scraper in apps/solution1/.
Run the checker to validate your solution:
npm run ca 1
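To make the "write your scraper" step more concrete, here is a minimal sketch of what a first scraper could look like. It is not Housefly's actual solution or checker format: the sample HTML, the `scrape` function, and the `ScrapeResult` shape are all hypothetical, chosen only to show the idea of turning a page into structured output that can be compared against an expected result. It parses an inline string rather than fetching a live page, so it runs without any network access or extra packages.

```typescript
// Hypothetical sketch: the markup and output shape below are assumptions,
// not Housefly's real Chapter 1 site or its expected-output format.

interface ScrapeResult {
  title: string | null;
  links: string[];
}

// Pull the <title> text and every double-quoted href out of an HTML string.
// Regexes are enough for this tiny, well-formed sample; for real pages a
// proper HTML parser is usually the safer choice.
function scrape(html: string): ScrapeResult {
  const titleMatch = html.match(/<title>([^<]*)<\/title>/i);
  const links: string[] = [];
  const hrefPattern = /<a\s[^>]*href="([^"]*)"/gi;
  let m: RegExpExecArray | null;
  while ((m = hrefPattern.exec(html)) !== null) {
    links.push(m[1]);
  }
  return { title: titleMatch ? titleMatch[1].trim() : null, links };
}

// Made-up stand-in for the HTML a practice site might serve.
const sampleHtml = `
<html>
  <head><title>Sample Page</title></head>
  <body>
    <a href="/page/1">First</a>
    <a class="nav" href="/page/2">Second</a>
  </body>
</html>`;

console.log(JSON.stringify(scrape(sampleHtml), null, 2));
```

Returning a plain object keeps the result easy to serialize and diff, which is the kind of output a checker that compares against expected results tends to want. In a real solution you would replace `sampleHtml` with the page content you fetched from the chapter's site.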
More chapters are on the way, but for now, dive into the first challenge and start scraping!
Written by Johannes Naylor
