wget is a powerful command-line utility for downloading files from the web. It supports various protocols such as HTTP, HTTPS, FTP, and FTPS.

Basic File Download

This command downloads the file named file.zip from the specified URL.

wget https://example.com/file.zip

Example:

wget https://sample-videos.com/zip/10mb.zip

Download to a specific directory

wget -P /path/to/directory https://example.com/file.zip

Example:

wget -P Downloads https://sample-videos.com/zip/10mb.zip

Download with a different name

wget -O newname.zip https://example.com/file.zip

This downloads the file and saves it as newname.zip.

Example:

wget -O newname.zip https://sample-videos.com/zip/10mb.zip

Download multiple files

wget https://example.com/file1.zip https://example.com/file2.zip

Example:

wget https://sample-videos.com/zip/10mb.zip https://sample-videos.com/zip/20mb.zip
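When the list of URLs grows, it can be easier to keep them in a text file and pass that file to wget with -i. A minimal sketch; the function name download_from_list and the file name urls.txt are made up for illustration and are not part of wget:

```shell
# download_from_list LIST_FILE URL...
# Hypothetical helper: writes each URL on its own line into LIST_FILE,
# then asks wget to fetch every URL listed in that file.
download_from_list() {
    list_file="$1"
    shift
    printf '%s\n' "$@" > "$list_file"
    wget -i "$list_file"
}
```

Usage: download_from_list urls.txt https://sample-videos.com/zip/10mb.zip https://sample-videos.com/zip/20mb.zip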

Download in Background

wget -b https://example.com/largefile.zip

Example:

wget -b -O 20mbfile.zip https://sample-videos.com/zip/20mb.zip

wget writes its log output to a file named wget-log in the current directory.
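To choose the log file name yourself, you can combine -b with -o and then follow the log with tail. A small sketch under that assumption; background_download and download.log are illustrative names, not anything wget defines:

```shell
# background_download URL [LOG_FILE]
# Hypothetical helper: starts wget in the background and writes its log to
# LOG_FILE (default: wget-log) via -o, then prints how to follow the log.
background_download() {
    url="$1"
    log_file="${2:-wget-log}"
    wget -b -o "$log_file" "$url" || return 1
    echo "Follow progress with: tail -f $log_file"
}
```

Usage: background_download https://sample-videos.com/zip/20mb.zip download.log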

Rate Limiting Download

wget --limit-rate=200k https://example.com/largefile.zip

This limits the download rate to 200 KB/s (kilobytes per second).

Example:

wget --limit-rate=200k https://sample-videos.com/zip/20mb.zip

Resume Interrupted Download

wget -c https://example.com/largefile.zip

Example:

wget -c https://sample-videos.com/zip/20mb.zip
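On a flaky connection, wget itself may give up and exit before the file is complete. One way to handle that is to rerun wget -c in a loop until it succeeds. This is only a sketch: resume_until_done, the default of 5 attempts, and the RETRY_DELAY variable are assumptions made for this example.

```shell
# resume_until_done URL [MAX_ATTEMPTS]
# Hypothetical helper: reruns "wget -c" until it exits successfully or
# MAX_ATTEMPTS (default 5) is reached. RETRY_DELAY (seconds, default 5)
# is an illustrative variable, not a wget setting.
resume_until_done() {
    url="$1"
    max_attempts="${2:-5}"
    delay="${RETRY_DELAY:-5}"
    attempt=1
    while [ "$attempt" -le "$max_attempts" ]; do
        if wget -c "$url"; then
            return 0
        fi
        echo "Attempt $attempt of $max_attempts failed" >&2
        attempt=$((attempt + 1))
        # Wait before the next try, but not after the last one.
        if [ "$attempt" -le "$max_attempts" ]; then
            sleep "$delay"
        fi
    done
    return 1
}
```

Usage: resume_until_done https://sample-videos.com/zip/20mb.zip 10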

Downloading Entire Website

wget --recursive --no-clobber --page-requisites --html-extension --convert-links --domains example.com --no-parent https://example.com

--recursive: Enables recursive retrieval, meaning wget will download not only the specified URL but also follow and download links within that page, continuing recursively.
--no-clobber: This option prevents wget from overwriting existing files. If a file with the same name already exists in the local directory, wget will not download it again.
--page-requisites: Downloads all the elements needed to properly display the page offline. This includes inline images, stylesheets, and other resources referenced by the HTML.
--html-extension: Appends the .html extension to downloaded HTML files that lack it, which keeps the saved pages usable for offline browsing. wget 1.12 and later call this option --adjust-extension; the old name is deprecated.
--convert-links: After downloading, converts the links in the downloaded documents to point to the local files, enabling offline browsing. This is important when you want to view the downloaded content without an internet connection.
--domains example.com: Restricts the download to files under the specified domain (example.com). This ensures that wget doesn't follow links to external domains, focusing only on the specified domain.
--no-parent: Prevents wget from ascending to the parent directory while recursively downloading. It ensures that only content within the specified URL and its subdirectories is downloaded.
https://example.com: The URL from which wget starts the recursive download.

Example:

wget --recursive --no-clobber --page-requisites --html-extension --convert-links --domains hashnode.dev --no-parent https://redterminal.hashnode.dev

Mirror an entire website

wget --mirror --convert-links --adjust-extension --page-requisites --no-parent https://example.com

--mirror: Turns on the options suited to mirroring: recursion with unlimited depth plus time-stamping, so a repeat run fetches only files that have changed.
--convert-links: Converts the links in the downloaded documents to point to the local files for proper offline browsing.
--adjust-extension: Appends .html or .css to downloaded HTML and CSS files whose names lack that suffix.
--page-requisites: Downloads all the elements needed to properly display the page offline, such as inline images and stylesheets.
--no-parent: Prevents wget from ascending to the parent directory while recursively downloading.

Download with a user-agent

Some websites block requests that do not appear to come from a browser. In those cases, send a browser-like User-Agent header with --user-agent.

wget --user-agent="Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.3029.110 Safari/537.3" https://example.com/file.zip

Example:

wget --user-agent="Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.3029.110 Safari/537.3" https://sample-videos.com/zip/20mb.zip

Download with proxy

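wget -e use_proxy=yes -e https_proxy=http://proxy.example.com:8080 https://example.com/file.zip

wget has no command-line option that takes a proxy URL directly. The -e option applies the use_proxy and https_proxy settings for this run only; use http_proxy instead for plain HTTP URLs.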
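wget reads the proxy address from the http_proxy and https_proxy environment variables, so a wrapper can set them for a single download. A minimal sketch; wget_via_proxy is a hypothetical helper name, and the proxy address is the placeholder used above:

```shell
# wget_via_proxy PROXY_URL WGET_ARGS...
# Hypothetical helper: exports the proxy variables inside a subshell only,
# so the caller's environment stays unchanged after the download.
wget_via_proxy() {
    proxy_url="$1"
    shift
    (
        export http_proxy="$proxy_url" https_proxy="$proxy_url"
        wget "$@"
    )
}
```

Usage: wget_via_proxy http://proxy.example.com:8080 https://example.com/file.zip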

Download files matching a pattern

wget -r -l1 -np -nd -A "*.jpg" https://example.com/images/

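-r -> Recursive download
-l1 -> Recursion depth of 1 level
-np -> Do not ascend to the parent directory
-nd -> Do not create a directory hierarchy; save all files in the current directory
-A "*.jpg" -> Accept only files whose names match *.jpg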
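The -A option also accepts a comma-separated list of suffixes or patterns, so several file types can be fetched in one run. A sketch of that idea; download_by_suffix is an illustrative name, not a wget feature:

```shell
# download_by_suffix URL SUFFIX...
# Hypothetical helper: joins the suffixes with commas for -A and fetches
# matching files one level deep, without recreating directories locally.
download_by_suffix() {
    url="$1"
    shift
    # "$*" joins the remaining arguments with the first character of IFS.
    accept_list=$(IFS=,; echo "$*")
    wget -r -l1 -np -nd -A "$accept_list" "$url"
}
```

Usage: download_by_suffix https://example.com/images/ jpg png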

Test whether a download URL exists before downloading

wget --spider https://example.com

Example:

wget --spider https://sample-videos.com/zip/10mb.zip
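Because wget --spider exits with a non-zero status when the URL cannot be fetched, a script can use it as a guard before the real download. A minimal sketch, assuming a hypothetical helper named download_if_exists:

```shell
# download_if_exists URL
# Hypothetical helper: checks the URL with --spider first and starts the
# real download only when that check succeeds.
download_if_exists() {
    url="$1"
    if wget --spider -q "$url"; then
        wget "$url"
    else
        echo "URL not reachable: $url" >&2
        return 1
    fi
}
```

Usage: download_if_exists https://sample-videos.com/zip/10mb.zip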

Quit download when it exceeds a certain size

wget -Q5m -i FILE-WHICH-HAS-URLS

Example:

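wget -Q5m -i urls.txt

Here urls.txt stands for a text file that lists the two sample-videos.com URLs used earlier, one per line. The -i option expects a file of URLs, not the URLs themselves.

Note: The quota has no effect when you download a single URL, or several URLs given directly on the command line: everything is downloaded regardless of the quota size. The quota applies only to recursive downloads and to URLs read from an input file. wget checks it after each file finishes, so the file that crosses the limit is still downloaded completely.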

Let's try it with a recursive download:

wget --recursive -Q5m --no-clobber --page-requisites --html-extension --convert-links --domains hashnode.dev --no-parent https://redterminal.hashnode.dev

Increase total number of retries

wget --tries=75 DOWNLOAD-URL

Top 15 WGET commands you need to know !

Syed Jafer K