12.0 Introduction
Linux distributions provide several different sets of commands for compressing and archiving files and directories. Archiving combines multiple files into one, which eliminates the overhead in individual files and makes it easier to transmit. Compression makes the files smaller by removing redundant information.
12.1 Step 1
The gzip and gunzip commands are used to compress and uncompress a file, respectively. The gzip command replaces the original file with the compressed .gz file.
gzip [OPTION]... [FILE]...
In the following example, the longfile.txt file is replaced with the compressed longfile.txt.gz file after using the gzip command:
cd Documents/
ls longfile.txt
gzip longfile.txt
ls longfile*
sysadmin@localhost:~$ cd Documents/
sysadmin@localhost:~/Documents$ ls longfile.txt
longfile.txt
sysadmin@localhost:~/Documents$ gzip longfile.txt
sysadmin@localhost:~/Documents$ ls longfile*
longfile.txt.gz
The gzip command should be used with caution since its default behavior is to replace the original file specified with a compressed version.
12.2 Step 2
The gunzip command reverses the action of gzip, so the .gz file is uncompressed and replaced by the original file.
gunzip [OPTION]... [FILE]...
Use the -l option with gunzip to list the amount of compression of an existing file and then use the gunzip command alone to decompress the longfile.txt.gz file:
gunzip -l longfile.txt.gz
gunzip longfile.txt.gz
ls longfile*
sysadmin@localhost:~/Documents$ gunzip -l longfile.txt.gz
compressed uncompressed ratio uncompressed_name
341 66540 99.5% longfile.txt
sysadmin@localhost:~/Documents$ gunzip longfile.txt.gz
sysadmin@localhost:~/Documents$ ls longfile*
longfile.txt
12.3 Step 3
To retain the original file while compressing using the gzip command, use the -c option, which writes the compressed data to standard output so that it can be redirected to a new file. To do this with the animals.txt file, execute the following commands:
gzip -c animals.txt > animals.txt.gz
ls -l animals*
sysadmin@localhost:~/Documents$ gzip -c animals.txt > animals.txt.gz
sysadmin@localhost:~/Documents$ ls -l animals*
-rw-r--r-- 1 sysadmin sysadmin 42 Apr 24 16:24 animals.txt
-rw-rw-r-- 1 sysadmin sysadmin 71 May 21 03:14 animals.txt.gz
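The redirect pattern from this step can be checked end to end. The following is a minimal sketch, assuming a scratch directory created with mktemp and a hypothetical file named sample.txt; neither is part of the lab environment.

```shell
#!/usr/bin/env bash
set -eu

# Work in a scratch directory so no lab files are touched (assumption: mktemp is available).
workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT
cd "$workdir"

# Hypothetical sample file, standing in for animals.txt.
printf '1 retriever\n2 badger\n3 bat\n' > sample.txt

# -c writes the compressed data to standard output, so the original stays in place.
gzip -c sample.txt > sample.txt.gz

# Both files now exist side by side.
ls -l sample.txt sample.txt.gz

# Decompress to standard output and compare with the original; cmp is silent on a match.
gunzip -c sample.txt.gz | cmp - sample.txt && echo "round trip OK"
```

For very small files the .gz file can be larger than the original, as the listing above shows for animals.txt (42 bytes versus 71 bytes).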
12.4 Step 4
The zcat command is used to display the contents of a compressed file without actually uncompressing it. Use the following command to view the contents of the compressed animals.txt.gz file:
zcat animals.txt.gz
sysadmin@localhost:~/Documents$ zcat animals.txt.gz
1 retriever
2 badger
3 bat
4 wolf
5 eagle
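Because zcat writes to standard output, its output can also be piped into other commands. The sketch below assumes a hypothetical sample.txt.gz created in a scratch directory; the file name and contents are illustrative only.

```shell
#!/usr/bin/env bash
set -eu

workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT

# Hypothetical numbered list, similar in shape to animals.txt, compressed on the fly.
printf '1 retriever\n2 badger\n3 bat\n4 wolf\n5 eagle\n' | gzip -c > "$workdir/sample.txt.gz"

# Count the lines in the compressed file; the .gz file itself is not modified.
line_count=$(zcat "$workdir/sample.txt.gz" | wc -l)
echo "lines: $line_count"

# Search the decompressed stream for a word.
match=$(zcat "$workdir/sample.txt.gz" | grep 'wolf')
echo "match: $match"
```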
12.5 Step 5
The gzip and gunzip commands support recursion with the -r option. In order to be able to compress files with the gzip command recursively, a user needs to have the correct permissions on the directories the files are in. Typically, this is limited to directories within the user's own home directory.
Create a directory called mydirectory, which contains three files: file1, file2, and file3.
To avoid having to repeatedly type the same file or directory name, type the first few characters of the file name and press the Tab key. Alternatively, you can use the Esc+. (the Escape Key and the period . character) shortcut to recall the last file name used.
mkdir mydirectory
touch mydirectory/file1 mydirectory/file2 mydirectory/file3
ls mydirectory/
sysadmin@localhost:~/Documents$ mkdir mydirectory
sysadmin@localhost:~/Documents$ touch mydirectory/file1 mydirectory/file2 mydirectory/file3
sysadmin@localhost:~/Documents$ ls mydirectory/
file1 file2 file3
Now, use the gzip command recursively on the mydirectory directory by executing the following commands:
gzip -r mydirectory/
ls mydirectory/
sysadmin@localhost:~/Documents$ gzip -r mydirectory/
sysadmin@localhost:~/Documents$ ls mydirectory/
file1.gz file2.gz file3.gz
Use the gunzip command to recursively uncompress the files in the mydirectory directory:
gunzip -r mydirectory/
ls mydirectory/
sysadmin@localhost:~/Documents$ gunzip -r mydirectory/
sysadmin@localhost:~/Documents$ ls mydirectory/
file1 file2 file3
Permissions can have an impact on file management commands, such as the gzip and gunzip commands. To gzip or gunzip a file within a directory, a user must have the write and execute permission on a directory as well as the read permission on the file.
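The recursion described in this step also reaches files in subdirectories. The following sketch assumes a hypothetical two-level directory tree built in a scratch directory, where the user has the required permissions.

```shell
#!/usr/bin/env bash
set -eu

workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT

# Hypothetical nested layout: files at two directory levels.
mkdir -p "$workdir/top/sub"
echo "alpha" > "$workdir/top/file1"
echo "beta"  > "$workdir/top/sub/file2"

# -r descends into subdirectories and compresses every regular file it finds.
gzip -r "$workdir/top"
find "$workdir/top" -type f | sort

# gunzip -r restores the original files the same way.
gunzip -r "$workdir/top"
find "$workdir/top" -type f | sort
```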
12.6 Step 6
The bzip2 and bunzip2 commands work in a nearly identical fashion to the gzip and gunzip commands:
bzip2 [OPTION]... [FILE]...
bunzip2 [OPTION]... [FILE]...
To compress the longfile.txt file in the current directory using the bzip2 command, execute the following commands:
bzip2 longfile.txt
ls -l longfile*
sysadmin@localhost:~/Documents$ bzip2 longfile.txt
sysadmin@localhost:~/Documents$ ls -l longfile*
-rw-r--r-- 1 sysadmin sysadmin 188 Apr 24 16:24 longfile.txt.bz2
Similar to the gzip and gunzip commands, the bzip2 and bunzip2 commands are also used to compress and uncompress a file. The bzip2 command uses a different compression algorithm than gzip, but the usage is very similar. The extension of the files created by the bzip2 command is .bz2.
12.7 Step 7
The bzcat command prints the content of specified files compressed with the bzip2 command to the standard output.
bzcat [OPTION]... [FILE]...
To view the compressed longfile.txt.bz2 file, use the following command:
sysadmin@localhost:~/Documents$ bzcat longfile.txt.bz2
hello
inkling
jogger
apple
banana
cat
dog
elephant
flower
grapes
hello
inkling
jogger
apple
banana
cat
dog
elephant
flower
grapes
hello
inkling
jogger
sysadmin@localhost:~/Documents$
12.8 Step 8
To extract the compressed longfile.txt.bz2 file created in the example above, execute the following commands:
bunzip2 longfile.txt.bz2
ls -l longfile*
sysadmin@localhost:~/Documents$ bunzip2 longfile.txt.bz2
sysadmin@localhost:~/Documents$ ls -l longfile*
-rw-r--r-- 1 sysadmin sysadmin 66540 Apr 24 16:24 longfile.txt
While the gzip command supports recursion with the -r option, the bzip2 command does not provide an option for recursion. So, bzip2 on its own cannot be used to compress the files in a nested directory structure.
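Since bzip2 has no recursive option, one possible workaround, which is not part of this lab, is to let the find command walk the directory tree and pass each file to bzip2. The sketch below assumes a hypothetical nested directory in a scratch location and skips itself if bzip2 is not installed.

```shell
#!/usr/bin/env bash
set -eu

# Assumption: bzip2 is installed; exit quietly otherwise.
command -v bzip2 >/dev/null || { echo "bzip2 is not installed"; exit 0; }

workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT

# Hypothetical nested layout: files at two directory levels.
mkdir -p "$workdir/top/sub"
echo "alpha" > "$workdir/top/file1"
echo "beta"  > "$workdir/top/sub/file2"

# find locates every regular file under top/ and passes the names to bzip2.
find "$workdir/top" -type f -exec bzip2 {} +

# Each file has been replaced by a .bz2 version in its own directory.
find "$workdir/top" -type f | sort
```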
12.9 Step 9
Another compression tool similar to gzip and bzip2 is the xz command.
xz [OPTION]... [FILE]...
To compress the longfile.txt file in the current directory using the xz command, execute the following commands:
sysadmin@localhost:~/Documents$ xz longfile.txt
sysadmin@localhost:~/Documents$ ls -l longfile*
-rw-r--r-- 1 sysadmin sysadmin 220 Apr 24 16:24 longfile.txt.xz
As demonstrated in the output above, when a new compressed file is created, the .xz extension is added to the file name.
12.10 Step 10
The xzcat command is used to print the contents of files compressed with the xz command to standard output on the terminal without uncompressing the target file.
xzcat [FILE]...
To view the compressed longfile.txt.xz file, use the following command:
xzcat longfile.txt.xz
sysadmin@localhost:~/Documents$ xzcat longfile.txt.xz
hello
inkling
jogger
apple
banana
cat
dog
elephant
flower
grapes
hello
inkling
jogger
apple
banana
cat
dog
elephant
flower
grapes
hello
inkling
jogger
sysadmin@localhost:~/Documents$
12.11 Step 11
Next, use the unxz command to uncompress the longfile.txt.xz file:
sysadmin@localhost:~/Documents$ unxz longfile.txt.xz
sysadmin@localhost:~/Documents$ ls -l longfile*
-rw-r--r-- 1 sysadmin sysadmin 66540 Apr 24 16:24 longfile.txt
12.12 Step 12
An archive is a single file, which consists of many files, though not necessarily compressed. The tar command is typically used to make archives within Linux. The tar command provides three main functions: creating, viewing, and extracting archives:
- Create: Make a new archive out of a series of files.
- Extract: Pull one or more files out of an archive.
- List: Show the contents of the archive without extracting.
Creating an archive with the tar command requires two named options:
-c | Create an archive.
-f ARCHIVE | Use the ARCHIVE file. The argument ARCHIVE will be the name of the resulting archive file.
To create a tar archive of the /etc/vim directory and place the tar file in the current directory, execute the following commands:
tar -cf vim.tar /etc/vim
ls -l *.tar
sysadmin@localhost:~$ tar -cf vim.tar /etc/vim
tar: Removing leading `/' from member names
sysadmin@localhost:~$ ls -l *.tar
-rw-rw-r-- 1 sysadmin sysadmin 10240 Apr 18 17:26 vim.tar
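A tar archive created this way is not compressed. As a sketch that combines this step with the earlier gzip steps, the example below archives a hypothetical project directory in a scratch location and then compresses a copy of the archive with gzip -c; the directory and file names are assumptions for illustration.

```shell
#!/usr/bin/env bash
set -eu

workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT
cd "$workdir"

# Hypothetical directory standing in for /etc/vim.
mkdir -p project/conf
echo "set number" > project/conf/settings
echo "notes" > project/readme

# -c creates the archive, -f names the archive file.
tar -cf project.tar project

# The archive is a single uncompressed file; gzip -c writes a compressed copy next to it.
gzip -c project.tar > project.tar.gz

ls -l project.tar project.tar.gz
```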
12.13 Step 13
Use the -t option to the tar command to view a list (table of contents) of a tar file.
-t | List the files in the archive.
-f ARCHIVE | Operate on the given archive.
To view the contents of the vim.tar file, execute the following command:
tar -tf vim.tar
sysadmin@localhost:~$ tar -tf vim.tar
etc/vim/
etc/vim/vimrc
etc/vim/vimrc.tiny
12.14 Step 14
The verbose -v option can be combined with the -t option of the tar command to view a detailed table of contents of the archive. To view the detailed listing of the contents of the vim.tar file, execute the following command:
tar -tvf vim.tar
sysadmin@localhost:~$ tar -tvf vim.tar
drwxr-xr-x root/root 0 2019-03-22 17:41 etc/vim/
-rw-r--r-- root/root 2469 2018-04-10 21:31 etc/vim/vimrc
-rw-r--r-- root/root 662 2018-04-10 21:31 etc/vim/vimrc.tiny
12.15 Step 15
To extract the files from the tar file, use the -x option.
-x | Extract files from an archive.
-f ARCHIVE | Operate on the given archive.
To extract the files from the vim.tar file into the current directory, execute the following commands:
tar -xf vim.tar
ls etc/vim
sysadmin@localhost:~$ tar -xf vim.tar
sysadmin@localhost:~$ ls etc/vim
vimrc vimrc.tiny
12.16 Step 16
To extract the files from the vim.tar file into another directory, use the -C option to the tar command. For example, execute the following commands:
tar -xvf vim.tar -C /tmp
ls -R /tmp/etc
sysadmin@localhost:~$ tar -xvf vim.tar -C /tmp
etc/vim/
etc/vim/vimrc
etc/vim/vimrc.tiny
sysadmin@localhost:~$ ls -R /tmp/etc
/tmp/etc:
vim
/tmp/etc/vim:
vimrc vimrc.tiny
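The create, list, and extract operations from the last few steps can be tried together without touching /etc or /tmp/etc. This is a minimal sketch, assuming hypothetical src and dest directories inside a scratch directory.

```shell
#!/usr/bin/env bash
set -eu

workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT

# Hypothetical source tree and an empty destination directory.
mkdir -p "$workdir/src/docs" "$workdir/dest"
echo "hello" > "$workdir/src/docs/a.txt"

# Create the archive; -C makes tar change into src first, so member names start at docs/.
tar -cf "$workdir/docs.tar" -C "$workdir/src" docs

# List the contents, then extract into dest instead of the current directory.
tar -tf "$workdir/docs.tar"
tar -xf "$workdir/docs.tar" -C "$workdir/dest"

ls -R "$workdir/dest"
```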
12.17 Step 17
Archiving files is an efficient way of making backups and transferring large files. Two widely used archiving and compression utilities are zip and unzip. The zip command is very useful for creating archives that can easily be shared across multiple operating systems.
zip [OPTIONS]... [FILE]...
The -r option allows the zip command to compress multiple directories into a single file recursively. To archive all the files and directories in the /etc/perl directory to the myperl.zip file, execute the following command:
zip -r myperl.zip /etc/perl
sysadmin@localhost:~$ zip -r myperl.zip /etc/perl
adding: etc/perl/ (stored 0%)
adding: etc/perl/CPAN/ (stored 0%)
adding: etc/perl/Net/ (stored 0%)
adding: etc/perl/Net/libnet.cfg (deflated 52%)
The zip command will add the .zip extension to files by default, as you can see by executing the following command:
ls -l myperl*
sysadmin@localhost:~/Documents$ ls -l myperl*
-rw-rw-r-- 1 sysadmin sysadmin 947 May 21 03:34 myperl.zip
12.18 Step 18
To view the contents of a zip file without unpacking it, use the unzip command with the list -l option. To list the contents of the myperl.zip file without unzipping it, execute the following command:
unzip -l myperl.zip
sysadmin@localhost:~$ unzip -l myperl.zip
Archive: myperl.zip
Length Date Time Name
--------- ---------- ----- ----
0 2019-04-17 20:09 etc/perl/
0 2018-11-19 15:54 etc/perl/CPAN/
0 2019-04-17 20:09 etc/perl/Net/
611 2018-11-19 15:54 etc/perl/Net/libnet.cfg
--------- -------
611 4 files
12.19 Step 19
The unzip command is used to extract the files from the zip archive file. To extract the contents of the myperl.zip file, execute the following command:
unzip myperl.zip
sysadmin@localhost:~$ unzip myperl.zip
Archive: myperl.zip
creating: etc/perl/
creating: etc/perl/CPAN/
creating: etc/perl/Net/
inflating: etc/perl/Net/libnet.cfg
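The zip steps above can be rehearsed on a throwaway directory. The sketch below assumes that both zip and unzip are installed (it exits quietly otherwise) and uses a hypothetical site directory. It also uses the -q (quiet) option of both commands and the -d option of unzip to extract into a separate directory, none of which are covered in this lab.

```shell
#!/usr/bin/env bash
set -eu

# Assumption: zip and unzip are installed; exit quietly otherwise.
command -v zip >/dev/null && command -v unzip >/dev/null || { echo "zip/unzip not installed"; exit 0; }

workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT
cd "$workdir"

# Hypothetical directory tree to archive.
mkdir -p site/pages
echo "<p>hi</p>" > site/pages/index.html

# -r recurses into site/; -q keeps the output quiet.
zip -rq site.zip site

# List the contents, then extract into a separate directory named restored.
unzip -l site.zip
unzip -q site.zip -d restored
ls -R restored
```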
12.20 Step 20
The cpio command is another archival command, which can merge multiple files into a single file. The cpio command works with the original POSIX specification and should be available on most Linux and Unix systems. It is considered a legacy application, and although administrators need to be aware of it, most systems provide better alternatives for archiving directories.
The cpio command operates in several modes; this lab uses two of them: copy-in mode and copy-out mode.
The copy-out mode is used to create a new archive file. The -o option puts the cpio command into copy-out mode. The names of the files to archive are read from standard input, either typed directly or piped from the output of another command, and the resulting archive is written to standard output.
To archive all the *.conf files in the current directory, first copy some configuration files into it and then use the following commands:
cp /etc/*.conf .
ls *.conf | cpio -ov > conf.cpio
ls *.cpio
sysadmin@localhost:~$ cp /etc/*.conf .
sysadmin@localhost:~$ ls *.conf | cpio -ov > conf.cpio
adduser.conf
ca-certificates.conf
debconf.conf
deluser.conf
fuse.conf
gai.conf
hdparm.conf
host.conf
ld.so.conf
libaudit.conf
logrotate.conf
ltrace.conf
mke2fs.conf
nsswitch.conf
pam.conf
popularity-contest.conf
resolv.conf
rsyslog.conf
sysctl.conf
ucf.conf
updatedb.conf
88 blocks
sysadmin@localhost:~$ ls *.cpio
conf.cpio
In the example above, the verbose -v option is used to list the files that the cpio command processes.
12.21 Step 21
The copy-in mode is used to extract files from a cpio archive file. The -i option enables copy-in mode. In addition, the -u option can be used to overwrite existing files and the -d option is used to indicate that directories should be created. To extract the files from the conf.cpio file into the current directory, first delete the original files and then use the cat command to send the data into the cpio command:
rm *.conf
cat conf.cpio | cpio -iud
ls *.conf
sysadmin@localhost:~$ rm *.conf
sysadmin@localhost:~$ cat conf.cpio | cpio -iud
88 blocks
sysadmin@localhost:~$ ls *.conf
adduser.conf hdparm.conf mke2fs.conf sysctl.conf
ca-certificates.conf host.conf nsswitch.conf ucf.conf
debconf.conf ld.so.conf pam.conf updatedb.conf
deluser.conf libaudit.conf popularity-contest.conf
fuse.conf logrotate.conf resolv.conf
gai.conf ltrace.conf rsyslog.conf
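The copy-out and copy-in modes can be combined into a round trip that leaves the real configuration files alone. This is a minimal sketch, assuming cpio is installed (it exits quietly otherwise) and using two hypothetical .conf files in a scratch directory. It feeds the archive to cpio with input redirection, which has the same effect as piping it in with cat.

```shell
#!/usr/bin/env bash
set -eu

# Assumption: cpio is installed; exit quietly otherwise.
command -v cpio >/dev/null || { echo "cpio is not installed"; exit 0; }

workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT
mkdir -p "$workdir/src" "$workdir/dest"

# Hypothetical configuration files.
cd "$workdir/src"
echo "a=1" > one.conf
echo "b=2" > two.conf

# Copy-out: cpio reads file names from standard input and writes the archive to standard output.
ls *.conf | cpio -o > "$workdir/conf.cpio"

# Copy-in: cpio reads the archive from standard input and recreates the files.
cd "$workdir/dest"
cpio -id < "$workdir/conf.cpio"
ls
```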
12.22 Step 22
The dd command is a utility for copying files or entire partitions at the bit level. It can be used to clone or erase entire disks or partitions, create large "empty" files to be used as swap files, and copy raw data to removable devices. The dd command uses special arguments to specify how it will work. The following illustrates some of the more commonly used arguments:
if=FILE | The input file to be read.
of=FILE | The output file to be written.
bs=SIZE | The block size to be used. By default, the value is considered to be in bytes. Use the following suffixes to specify other units: K, M, G, and T for kilobytes, megabytes, gigabytes, and terabytes.
count=NUMBER | The number of blocks to read from the input file.
To create a file named /tmp/swapex with 500 "one megabyte" size blocks of zeroes, execute the following command:
dd if=/dev/zero of=/tmp/swapex bs=1M count=500
sysadmin@localhost:~$ dd if=/dev/zero of=/tmp/swapex bs=1M count=500
500+0 records in
500+0 records out
524288000 bytes (524 MB, 500 MiB) copied, 8.80912 s, 59.5 MB/s
To verify that the swapex file was created in the /tmp directory, execute the following command:
ls -lh /tmp
sysadmin@localhost:~$ ls -lh /tmp
total 501M
drwxrwxr-x 3 sysadmin sysadmin 4.0K Apr 18 17:31 etc
-rw-rw-r-- 1 root root 2.2K Apr 17 22:31 inside_setup.sh
-rw-rw-r-- 1 sysadmin sysadmin 500M Apr 18 17:39 swapex
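The size of the file that dd writes is the block size multiplied by the count. The sketch below shows this on a much smaller scale, assuming a hypothetical zeros.img file in a scratch directory; the block size is given in plain bytes, which in GNU dd is the same as bs=1K.

```shell
#!/usr/bin/env bash
set -eu

workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT

# Write 4 blocks of 1024 bytes each from /dev/zero.
dd if=/dev/zero of="$workdir/zeros.img" bs=1024 count=4

# bs multiplied by count gives the file size: 1024 * 4 = 4096 bytes.
size=$(wc -c < "$workdir/zeros.img")
echo "size in bytes: $size"
```

The same arithmetic explains the output above: 500 blocks of 1M (1048576 bytes) give the 524288000 bytes that dd reports.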