Today I learned: Compression and Archiving


Intro
Pentesters/hackers often need to share files and scripts. Most of the time we need to send more than one file, and sometimes those files can be BIG. That’s where archiving and compression come in handy.
On Windows machines it’s common to see .zip files. These are archives.
What is Compression?
Basically, compression makes data smaller, so it requires less storage capacity and is easier to send.
There are two types of compression:
Lossy
Lossless
Lossy
Lossy compression is very good when we need smaller file sizes. It’s efficient and effective, but it also has a drawback: some of the original information is discarded and cannot be recovered.
Some of the most popular file formats that typically use lossy compression are:
.mp3
.mp4
.jpg
If we compare an .mp3 file with the same recording stored as a .flac (lossless) file, we may be able to hear the difference. The same goes for .nef (RAW) and .jpeg photos.
For example, a .nef (RAW) photo would be ~40MB, and .jpeg would compress the same photograph to around 11MB.
We cannot use lossy compression when sending software, because data integrity is crucial: every byte has to arrive unchanged.
In this article we’re focusing on lossless compression.
There are several ways to compress files. They use different compression algorithms and have different compression ratios.
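To make "lossless" concrete, here is a minimal sketch that compresses a file with gzip, decompresses it again, and checks that the result is byte-for-byte identical to the original. The file names sample.txt and restored.txt are just placeholders for this illustration.

```shell
#!/bin/sh
# Work in a throwaway directory so nothing else is touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# A small sample file (the name is only an illustration).
printf 'Example file 1\nExample file 1\nExample file 1\n' > sample.txt

# -c writes the compressed data to stdout and keeps the original file.
gzip -c sample.txt > sample.txt.gz

# Decompress to a new name, then compare byte for byte with the original.
gunzip -c sample.txt.gz > restored.txt
if cmp -s sample.txt restored.txt; then
    result="identical"
else
    result="different"
fi
echo "original and restored files are $result"
```

With a lossy format there is no equivalent check that could succeed, because the discarded information is gone for good.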
Tarring the files
The first thing we need to do before compressing files is to combine them into one file.
For that we’ll use tar.
Tar stands for “tape archive”, because in the prehistoric days systems used tapes to store data.
A single file created with tar is called an archive, a tar file, or a tarball.
Let’s create example files to combine them into one file:
echo "Example file 1" > example1
echo "Example file 2" > example2
echo "Example file 3" > example3
Let’s see the files that we’ve created (use long listing to see the file size):
└─# ls -l
total 12
-rw-r--r-- 1 root root 10 Dec 30 05:35 example1
-rw-r--r-- 1 root root 19 Dec 30 05:35 example2
-rw-r--r-- 1 root root 23 Dec 30 05:35 example3
Now we can combine the files into one file:
tar -cvf Example.tar example1 example2 example3
example1
example2
example3
c - stands for create
f - write to the following file
v - optional parameter which lists files that tar is dealing with
You can check the options using tar --help
Now when we long list our files we can see that there’s a new file called Example.tar:
ls -l
-rw-r--r-- 1 root root 10 Dec 30 05:35 example1
-rw-r--r-- 1 root root 15 Dec 30 05:46 example2
-rw-r--r-- 1 root root 15 Dec 30 05:46 example3
-rw-r--r-- 1 root root 10240 Dec 30 05:46 Example.tar
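You can see that the tarball is much bigger than the combined size of the files. tar stores a 512-byte header for every file, pads each file's data to a multiple of 512 bytes, and GNU tar by default pads the whole archive to a multiple of 10240 bytes. This overhead becomes less significant with larger files.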
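The size overhead comes from how tar lays out an archive: it works in 512-byte blocks, and GNU tar pads the finished archive to a multiple of 10240 bytes by default. The sketch below recreates the example files in a temporary directory and compares the combined content size with the tarball size. The exact tarball size may differ with other tar implementations, so the script only checks the 512-byte alignment.

```shell
#!/bin/sh
# Recreate the example files in a throwaway directory.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
echo "Example file 1" > example1
echo "Example file 2" > example2
echo "Example file 3" > example3

tar -cf Example.tar example1 example2 example3

# Total bytes of actual content versus bytes in the tarball.
content_size=$(( $(cat example1 example2 example3 | wc -c) ))
tar_size=$(( $(wc -c < Example.tar) ))

echo "content: $content_size bytes"
echo "tarball: $tar_size bytes"
# A remainder of 0 means the archive is made of whole 512-byte blocks.
echo "tarball size modulo 512: $((tar_size % 512))"
```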
Display files without extracting them
We can see which files are inside the tarball without extracting it by using the t switch:
tar -tvf Example.tar
-rw-r--r-- root/root 10 2024-12-30 05:35 example1
-rw-r--r-- root/root 15 2024-12-30 05:46 example2
-rw-r--r-- root/root 15 2024-12-30 05:46 example3
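Listing is also handy in scripts, for example to check whether a particular file is in an archive before extracting anything. This is a small sketch: without v, tar -t prints only the member names, one per line, so the output can be passed to grep. The variable name found is just an illustration.

```shell
#!/bin/sh
# Build a small archive in a throwaway directory.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
echo "Example file 1" > example1
echo "Example file 2" > example2
tar -cf Example.tar example1 example2

# -t lists members; grep -x matches a whole line, -q keeps it quiet.
if tar -tf Example.tar | grep -qx "example2"; then
    found="yes"
else
    found="no"
fi
echo "example2 in archive: $found"
```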
Extracting files
To extract files we can use the x switch:
tar -xvf Example.tar
example1
example2
example3
Keep in mind that, by default, if a file with the same name already exists, tar will overwrite it with the extracted file.
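One way to avoid overwriting files in the current directory is to extract into a separate directory with the -C option. The sketch below edits a file after archiving it and then extracts the archive elsewhere, so both versions survive. The directory name extracted is an arbitrary choice for this example.

```shell
#!/bin/sh
workdir=$(mktemp -d)
cd "$workdir" || exit 1
echo "Example file 1" > example1
tar -cf Example.tar example1

# Change the file after archiving so the two versions differ.
echo "Edited after archiving" > example1

# -C makes tar switch to the given directory before extracting.
mkdir extracted
tar -xf Example.tar -C extracted

current=$(cat example1)
restored=$(cat extracted/example1)
echo "current copy:   $current"
echo "extracted copy: $restored"
```

Running tar -xf Example.tar directly in the same directory would instead have replaced the edited example1 with the archived version.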
Compressing files
Ways to compress files
gzip (GNU zip) - it uses the extension .tar.gz or .tgz
- It is the most commonly used compression utility on Linux.
- To decompress files we can use gunzip
bzip2 - it uses the extension .tar.bz2
- It works similarly to gzip but usually achieves better compression ratios, which means the compressed files will be smaller. It is typically slower, though.
- To decompress use bunzip2
compress - it uses the extension .tar.Z
- This is the least commonly used tool, but it’s easy to remember.
- To decompress use uncompress
- You can also use gunzip to decompress files which were compressed using compress
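To get a feel for the different compression ratios, the following sketch compresses the same file with gzip and bzip2 and prints the resulting sizes. It assumes a repetitive text file generated on the spot, so real-world ratios will differ. The output names sample.txt.gzip and sample.txt.bzip2 are made up for the comparison and are not the usual extensions. compress is left out because it is often not installed, and any tool that is missing is skipped.

```shell
#!/bin/sh
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Generate a repetitive sample file (sample.txt is an illustrative name).
i=0
while [ "$i" -lt 2000 ]; do
    echo "Example line $i of a fairly repetitive sample file"
    i=$((i + 1))
done > sample.txt

original_size=$(( $(wc -c < sample.txt) ))
echo "original: $original_size bytes"

for tool in gzip bzip2; do
    if command -v "$tool" >/dev/null 2>&1; then
        # -c writes compressed data to stdout and leaves sample.txt in place.
        "$tool" -c sample.txt > "sample.txt.$tool"
        echo "$tool: $(( $(wc -c < "sample.txt.$tool") )) bytes"
    else
        echo "$tool: not installed, skipped"
    fi
done
```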
Compression example using gzip
Let’s compress our Example.tar using gzip
gzip Example.tar
ls -l
-rw-r--r-- 1 root root 10 Dec 30 05:35 example1
-rw-r--r-- 1 root root 15 Dec 30 05:46 example2
-rw-r--r-- 1 root root 15 Dec 30 05:46 example3
-rw-r--r-- 1 root root 199 Dec 30 05:46 Example.tar.gz
You can see that gzip replaced Example.tar with Example.tar.gz, and the compressed file is only 199 bytes. Before it was 10240 bytes!
Let’s decompress it:
gunzip Example.tar.gz
ls -l
-rw-r--r-- 1 root root 10 Dec 30 05:35 example1
-rw-r--r-- 1 root root 15 Dec 30 05:46 example2
-rw-r--r-- 1 root root 15 Dec 30 05:46 example3
-rw-r--r-- 1 root root 10240 Dec 30 05:46 Example.tar
As you can see, our file size is the same as before: 10240 bytes.
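The two steps (tar, then gzip) can also be combined. tar accepts a z switch that pipes the archive through gzip while creating or extracting it. The sketch below creates a compressed tarball in one command, extracts it into a separate directory, and checks that every file comes back unchanged. The directory name restored is an arbitrary choice.

```shell
#!/bin/sh
workdir=$(mktemp -d)
cd "$workdir" || exit 1
echo "Example file 1" > example1
echo "Example file 2" > example2
echo "Example file 3" > example3

# c = create, z = compress with gzip, f = write to the following file.
tar -czf Example.tar.gz example1 example2 example3

# x = extract, z = decompress with gzip, -C = extract into that directory.
mkdir restored
tar -xzf Example.tar.gz -C restored

# Compare each restored file with its original.
status="match"
for f in example1 example2 example3; do
    cmp -s "$f" "restored/$f" || status="mismatch"
done
echo "restored files: $status"
```

For bzip2 the equivalent switch is j, as in tar -cjf Example.tar.bz2.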
Credits
I’m learning using this book:
Linux Basics For Hackers by OCCUPYTHEWEB (MASTER OTP)
Written by Jonas Satkauskas