Today I've learned: Compression and Archiving

Intro

Pentesters and hackers often need to share files and scripts. Most of the time we need to send more than one file, and sometimes those files can be big. That’s where archiving and compression come in handy.

On Windows machines it’s common to see .zip files. These are compressed archives.

What is Compression?

Basically, compression makes data smaller, so it requires less storage capacity and is easier to send.

There are two types of compression:

  1. Lossy

  2. Lossless

Lossy

Lossy compression is very good when we need smaller file sizes. It’s efficient and effective, but it also has drawbacks: some of the original information is discarded and cannot be recovered.

Some of the most popular file formats that use lossy compression are:

  1. .mp3

  2. .mp4

  3. .jpg

If we compare the sound of an .mp3 file with the same recording stored as a .flac (lossless) file, we might hear the difference. The same goes for .nef (RAW) and .jpeg photos.

A .nef (RAW) file might be ~40MB, while .jpeg could compress the same photograph to around 11MB.

We cannot use lossy compression when sending software, because data integrity is crucial.

In this article we’re focusing on lossless compression.

There are several ways to compress files. They use different compression algorithms and have different compression ratios.
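To make "lossless" concrete, here is a minimal sketch that compresses a file with gzip, decompresses it again and checks that the result is byte-for-byte identical. The file names (payload.txt, restored.txt) and the temporary working directory are assumptions made up for this illustration.

```shell
#!/bin/sh
set -eu

# Work in a throwaway directory so no existing files are touched.
workdir=$(mktemp -d)
cd "$workdir"

# Hypothetical sample file with three short lines.
printf 'Example file %s\n' 1 2 3 > payload.txt

# -c writes the result to standard output, so the original file is kept.
gzip -c payload.txt > payload.txt.gz
gunzip -c payload.txt.gz > restored.txt

# cmp -s exits with status 0 only when both files are identical.
if cmp -s payload.txt restored.txt; then
    roundtrip="identical"
else
    roundtrip="different"
fi
echo "round trip: $roundtrip"
```

With a lossy format the same comparison would fail, which is why lossy compression is not an option for scripts and binaries.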

Tarring the files

The first thing we need to do before compressing files is to combine them into one file.

For that we’ll use tar.

Tar stands for “tape archive”, because in the prehistoric days systems used tapes to store data.

When we create a single file with tar, it is called an archive, a tar file, or a tarball.

Let’s create example files to combine them into one file:

echo "Example file 1" > example1
echo "Example file 2" > example2
echo "Example file 3" > example3

Let’s see the files that we’ve created (use long listing to see the file size):

ls -l
total 12
-rw-r--r-- 1 root root 10 Dec 30 05:35 example1
-rw-r--r-- 1 root root 19 Dec 30 05:35 example2
-rw-r--r-- 1 root root 23 Dec 30 05:35 example3

Now we can combine the files into one file:

tar -cvf Example.tar example1 example2 example3  
example1
example2
example3

c - stands for create

f - write to the following file

v - optional parameter which lists files that tar is dealing with

You can check the options using tar --help

Now when we long list our files we can see that there’s a new file called Example.tar:

ls -l
-rw-r--r-- 1 root root    10 Dec 30 05:35 example1
-rw-r--r-- 1 root root    15 Dec 30 05:46 example2
-rw-r--r-- 1 root root    15 Dec 30 05:46 example3
-rw-r--r-- 1 root root 10240 Dec 30 05:46 Example.tar

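You can see that the tarball is way bigger than the combined size of the files. That is because tar stores a 512-byte header for each file, pads each file’s data to a multiple of 512 bytes, and (with GNU tar’s defaults) pads the whole archive to a multiple of 10240 bytes. This overhead becomes less significant with larger files.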
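One way to look at this overhead ourselves is to compare the number of bytes in the files with the number of bytes in the archive. This is only a sketch: it recreates the example files in a temporary directory, and the exact archive size depends on which tar implementation is installed.

```shell
#!/bin/sh
set -eu

# Throwaway directory so the real example files are left alone.
workdir=$(mktemp -d)
cd "$workdir"

echo "Example file 1" > example1
echo "Example file 2" > example2
echo "Example file 3" > example3

tar -cf Example.tar example1 example2 example3

# wc -c counts bytes; reading from standard input keeps file names out of the output.
payload_bytes=$(cat example1 example2 example3 | wc -c)
archive_bytes=$(wc -c < Example.tar)

echo "files combined: $payload_bytes bytes"
echo "tarball:        $archive_bytes bytes"

# Tar archives are made of 512-byte blocks, so this remainder should be 0.
echo "remainder after dividing by 512: $((archive_bytes % 512))"
```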

Display files without extracting them

We can see which files are inside the tarball without extracting it by using the t switch:

tar -tvf Example.tar
-rw-r--r-- root/root        10 2024-12-30 05:35 example1
-rw-r--r-- root/root        15 2024-12-30 05:46 example2
-rw-r--r-- root/root        15 2024-12-30 05:46 example3
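The listing can also be used in a script, for example to check whether a tarball contains a particular file before we bother extracting it. A small sketch, assuming a freshly created Example.tar in a temporary directory (example3 is left out on purpose):

```shell
#!/bin/sh
set -eu

workdir=$(mktemp -d)
cd "$workdir"

echo "Example file 1" > example1
echo "Example file 2" > example2
tar -cf Example.tar example1 example2

# tar -tf prints one member name per line.
# grep: -q is quiet, -x matches whole lines only, -F treats the name as plain text.
if tar -tf Example.tar | grep -qxF "example2"; then
    has_example2="yes"
else
    has_example2="no"
fi

if tar -tf Example.tar | grep -qxF "example3"; then
    has_example3="yes"
else
    has_example3="no"
fi

echo "example2 in archive: $has_example2"
echo "example3 in archive: $has_example3"
```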

Extracting files

To extract files we can use the x switch:

tar -xvf Example.tar
example1
example2
example3

Keep in mind that, by default, if a file being extracted already exists, tar will replace it with the extracted version.
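If we don't want to risk overwriting anything, one option is to extract into a separate directory with the -C switch. The sketch below assumes a temporary directory and a made-up target directory called extracted; it edits example1 after archiving to show that the local copy survives.

```shell
#!/bin/sh
set -eu

workdir=$(mktemp -d)
cd "$workdir"

echo "Example file 1" > example1
tar -cf Example.tar example1

# Change the local copy after the archive was created.
echo "edited after archiving" > example1

# -C makes tar change into the given directory before extracting,
# so the edited example1 in the current directory is not replaced.
mkdir extracted
tar -xf Example.tar -C extracted

local_copy=$(cat example1)
archived_copy=$(cat extracted/example1)

echo "current directory: $local_copy"
echo "extracted copy:    $archived_copy"
```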

Compressing files

Ways to compress files

  1. gzip (GNU zip) - it uses the extension .tar.gz or .tgz

    • It is the most commonly used compression utility in Linux.

    • To decompress files we can use gunzip

  2. bzip2 - it uses the extension .tar.bz2

    • It works similarly to gzip but usually achieves better compression ratios, which means that compressed files will be smaller

    • To decompress use bunzip2

  3. compress - it uses the extension .tar.Z

    • This is the least commonly used tool, but it’s easy to remember.

    • To decompress use uncompress

    • You can also use gunzip to decompress files which were compressed using compress
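To compare the tools on our own machine, we can compress the same file with each of them and look at the resulting sizes. This is a rough sketch: sample.txt and the output names are made up for the illustration, compress is left out because it is often not installed, and any tool that is missing is skipped. The numbers will differ depending on the data.

```shell
#!/bin/sh
set -eu

workdir=$(mktemp -d)
cd "$workdir"

# Build a repetitive sample file, which compresses well.
i=0
while [ "$i" -lt 2000 ]; do
    echo "Example line $i"
    i=$((i + 1))
done > sample.txt

original_bytes=$(wc -c < sample.txt)
echo "original: $original_bytes bytes"

for tool in gzip bzip2; do
    if command -v "$tool" >/dev/null 2>&1; then
        # -c writes to standard output, so sample.txt itself is kept.
        # The output name simply reuses the tool name for this demo.
        "$tool" -c sample.txt > "sample.txt.$tool"
        echo "$tool: $(wc -c < "sample.txt.$tool") bytes"
    else
        echo "$tool: not installed, skipped"
    fi
done

gzip_bytes=$(wc -c < sample.txt.gzip)
```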

Compression example using gzip

Let’s compress our Example.tar using gzip:

gzip Example.tar
ls -l
-rw-r--r-- 1 root root  10 Dec 30 05:35 example1
-rw-r--r-- 1 root root  15 Dec 30 05:46 example2
-rw-r--r-- 1 root root  15 Dec 30 05:46 example3
-rw-r--r-- 1 root root 199 Dec 30 05:46 Example.tar.gz

You can see that our compressed file is now only 199 bytes. Before it was 10240! Notice also that gzip replaced Example.tar with Example.tar.gz.

Let’s decompress it:

gunzip Example.tar.gz
ls -l
-rw-r--r-- 1 root root    10 Dec 30 05:35 example1
-rw-r--r-- 1 root root    15 Dec 30 05:46 example2
-rw-r--r-- 1 root root    15 Dec 30 05:46 example3
-rw-r--r-- 1 root root 10240 Dec 30 05:46 Example.tar

As you can see, the file size is the same as before: 10240 bytes.
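The two steps (tar, then gzip) can also be done in one go, because tar can call gzip itself when we add the z switch. A minimal sketch, again in a temporary directory, with restored as a made-up directory name for the extracted files:

```shell
#!/bin/sh
set -eu

workdir=$(mktemp -d)
cd "$workdir"

echo "Example file 1" > example1
echo "Example file 2" > example2
echo "Example file 3" > example3

# c = create, z = filter the archive through gzip, f = write to this file.
tar -czf Example.tar.gz example1 example2 example3

# x = extract, z = decompress with gzip first, -C = extract into this directory.
mkdir restored
tar -xzf Example.tar.gz -C restored

restored_third=$(cat restored/example3)
echo "restored/example3 contains: $restored_third"
```

GNU tar offers the j switch in the same way for bzip2 (for example tar -cjf Example.tar.bz2).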

Credits

I’m learning using this book:

Linux Basics for Hackers by OccupyTheWeb (Master OTW)

Written by

Jonas Satkauskas