103.3 Lesson 2

Certificate: LPIC-1

Version: 5.0

Topic: 103 GNU and Unix Commands

Objective: 103.3 Perform basic file management

Lesson: 2 of 2

Introduction

How to Find Files

As you use your machine, files progressively grow in number and size. Sometimes it becomes difficult to locate a particular file. Fortunately, Linux provides find to quickly search and locate files. find uses the following syntax:

find STARTING_PATH OPTIONS EXPRESSION

STARTING_PATH

defines the directory where the search begins.

OPTIONS

control the behavior and add specific criteria to optimize the search process.

EXPRESSION

defines the search query.

$ find . -name "myfile.txt"
./myfile.txt

The starting path in this case is the current directory. The option -name specifies that the search is based on the name of the file. myfile.txt is the name of the file to search for. When using file globbing, be sure to enclose the expression in quotation marks so that the shell passes the pattern to find unchanged instead of expanding it first:

$ find /home/frank -name "*.png"
/home/frank/Pictures/logo.png
/home/frank/screenshot.png

This command finds all files whose names end with .png in the /home/frank/ directory and below. If you are not familiar with the usage of the asterisk (*), it is covered in the previous lesson.
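The quoting rule above can be tried out safely in a throwaway directory. The following is a minimal sketch, not part of the original lesson; the sandbox variable and all file names are made up for illustration. Without the quotes, a shell that finds matching .png files in the current directory would replace the pattern with those names before find ever runs.

```shell
# Build a small, disposable directory tree (all names are illustrative).
sandbox=$(mktemp -d)
mkdir -p "$sandbox/Pictures"
touch "$sandbox/screenshot.png" "$sandbox/Pictures/logo.png" "$sandbox/notes.txt"

# Quoted pattern: find receives "*.png" literally and matches it
# against the name of every file it visits.
find "$sandbox" -name "*.png"

# Count the matches (tr strips padding that some wc versions add).
png_count=$(find "$sandbox" -name "*.png" | wc -l | tr -d ' ')
echo "PNG files found: $png_count"

# Clean up the sandbox.
rm -rf "$sandbox"
```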

Using Criteria to Speed Search

Use find to locate files based on type, size or time. By specifying one or more options, the desired results are obtained in less time.

Switches for finding files based on type include:

-type f

searches for regular files.

-type d

searches for directories.

-type l

searches for symbolic links.

$ find . -type d -name "example"

This command finds all directories named example in the current directory and below.

Other criteria which could be used with find include:

-name

performs a search based on the given name.

-iname

searches based on the name but ignores case (i.e. the test case myFile matches MYFILE).

-not

returns those results that do not match the test case.

-maxdepth N

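descends at most N levels of directories below the starting path. With -maxdepth 0 only the starting path itself is tested, and with -maxdepth 1 the entries directly inside it are tested as well.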
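As a rough illustration of how these criteria behave, the sketch below builds a small sandbox and counts the results of each test. The directory layout and file names are invented for this example and are not part of the original lesson.

```shell
# Disposable tree: one file at the top, one in docs/, one in docs/archive/.
sandbox=$(mktemp -d)
mkdir -p "$sandbox/docs/archive"
touch "$sandbox/README.txt" "$sandbox/docs/readme.TXT" "$sandbox/docs/archive/old.log"

# -iname ignores case, so README.txt and readme.TXT both match.
iname_count=$(find "$sandbox" -type f -iname "readme.txt" | wc -l | tr -d ' ')

# -not inverts the test that follows it: regular files that are not *.txt.
not_txt_count=$(find "$sandbox" -type f -not -iname "*.txt" | wc -l | tr -d ' ')

# -maxdepth 1 stops find from entering docs/, so only README.txt is seen.
top_level_count=$(find "$sandbox" -maxdepth 1 -type f | wc -l | tr -d ' ')

# -type d lists directories: the sandbox itself, docs and docs/archive.
dir_count=$(find "$sandbox" -type d | wc -l | tr -d ' ')

echo "iname=$iname_count not_txt=$not_txt_count top=$top_level_count dirs=$dir_count"
rm -rf "$sandbox"
```

Note that -maxdepth is placed directly after the starting path; GNU find prints a warning when it follows other tests.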

Locating Files by Modification Time

find can also filter a directory hierarchy based on when a file was last modified:

$ sudo find / -name "*.conf" -mtime -7
/etc/logrotate.conf

This command searches the entire file system (the starting path is the root directory, i.e. /) for files that end with the characters .conf and have been modified within the last seven days. Elevated privileges are required to access every directory below the root of the system’s directory structure, hence the use of sudo here. The argument passed to -mtime is a number of days since the file was last modified: -7 means less than seven days ago, +7 means more than seven days ago, and 7 without a sign means exactly seven days ago.
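The effect of the sign in front of the -mtime argument can be checked with backdated files. This is a sketch under two assumptions: the file names are invented, and touch -d with a relative date such as "3 days ago" is a GNU coreutils feature that may not exist on other systems. A leading minus selects files modified less than that many days ago, and a leading plus selects files modified more than that many days ago.

```shell
sandbox=$(mktemp -d)

# Create three files with different modification times (GNU touch syntax).
touch "$sandbox/fresh.conf"                      # modified just now
touch -d "3 days ago"  "$sandbox/recent.conf"    # modified three days ago
touch -d "30 days ago" "$sandbox/stale.conf"     # modified a month ago

# -mtime -7: modified less than seven days ago (fresh.conf, recent.conf).
recent_count=$(find "$sandbox" -name "*.conf" -mtime -7 | wc -l | tr -d ' ')

# -mtime +7: modified more than seven days ago (stale.conf).
old_count=$(find "$sandbox" -name "*.conf" -mtime +7 | wc -l | tr -d ' ')

echo "modified within 7 days: $recent_count, older than 7 days: $old_count"
rm -rf "$sandbox"
```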

Locating Files by Size

find can also locate files by size. For example, searching for files larger than 2G in /var:

$ sudo find /var -size +2G
/var/lib/libvirt/images/debian10.qcow2
/var/lib/libvirt/images/rhel8.qcow2

The -size option displays files of sizes corresponding to the argument passed. Some example arguments include:

-size 100c

files which are exactly 100 bytes (the c suffix stands for bytes, whereas the b suffix stands for 512-byte blocks).

-size +100k

files larger than 100 kilobytes.

-size -20M

files smaller than 20 megabytes.

-size +2G

files larger than 2 gigabytes.

Note

To find empty files we can use: find . -size 0b  or  find . -empty. Note that -empty also matches empty directories.
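The size tests can be explored with files of known size. The sketch below is illustrative only (the file names and sizes are invented) and assumes GNU find, which rounds a file's size up to the next whole unit before comparing. Because of that rounding, -size -1M matches only empty files, not every file smaller than one megabyte.

```shell
sandbox=$(mktemp -d)

# Files of known size: 0 bytes, 100 bytes and exactly 2 MiB.
: > "$sandbox/empty.txt"
head -c 100 /dev/zero > "$sandbox/small.txt"
head -c 2097152 /dev/zero > "$sandbox/big.bin"

# The c suffix counts bytes: exactly 100 bytes matches small.txt only.
exact_100=$(find "$sandbox" -type f -size 100c | wc -l | tr -d ' ')

# More than 1 MiB: only big.bin, since small.txt rounds up to 1 unit.
over_1m=$(find "$sandbox" -type f -size +1M | wc -l | tr -d ' ')

# -empty restricted to regular files matches empty.txt.
empty_count=$(find "$sandbox" -type f -empty | wc -l | tr -d ' ')

# Rounding pitfall: small.txt counts as 1 unit, so it is not "less than 1M".
under_1m=$(find "$sandbox" -type f -size -1M)

echo "100c=$exact_100 +1M=$over_1m empty=$empty_count -1M=$under_1m"
expected_under_1m="$sandbox/empty.txt"
rm -rf "$sandbox"
```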

Acting on the Result Set

Once a search is done, it is possible to perform an action on the resulting set by using -exec:

$ find . -name "*.conf" -exec chmod 644 '{}' \;

This filters every object in the current directory (.) and below for file names ending with .conf and then executes the chmod 644 command to modify file permissions on the results.

For now, do not worry about the meaning of '{}' \; as it is explained in the next section.
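To see the effect of such an -exec action without touching real configuration files, the command can be run in a sandbox. This is a minimal sketch with invented file names; the added -type f test, which limits the action to regular files, is an assumption of this example and not part of the command above.

```shell
sandbox=$(mktemp -d)
mkdir -p "$sandbox/etc"
touch "$sandbox/app.conf" "$sandbox/etc/db.conf" "$sandbox/readme.txt"

# Start with restrictive permissions on every file.
chmod 600 "$sandbox/app.conf" "$sandbox/etc/db.conf" "$sandbox/readme.txt"

# Run chmod 644 once for each .conf file that find locates.
find "$sandbox" -type f -name "*.conf" -exec chmod 644 '{}' \;

# -perm 644 matches files whose permission bits are exactly 644.
conf_644=$(find "$sandbox" -type f -name "*.conf" -perm 644 | wc -l | tr -d ' ')
# The .txt file was not selected, so it keeps mode 600.
txt_600=$(find "$sandbox" -type f -name "*.txt" -perm 600 | wc -l | tr -d ' ')

echo "conf files now 644: $conf_644, txt files still 600: $txt_600"
rm -rf "$sandbox"
```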

Using grep to Filter for Files Based on Content

grep searches text for lines that contain a given keyword or pattern.

Consider a situation where we are to find files based on content:

$ find . -type f -exec grep "lpi" '{}' \; -print
./.bash_history
Alpine/M
helping/M

This searches every object in the current directory hierarchy (.) that is a file (-type f) and executes the command grep "lpi" for every file that satisfies the conditions. The names of the files in which grep finds a match are printed on the screen (-print). The curly braces ({}) are a placeholder that find replaces with the name of each file it has matched. The {} are enclosed in single quotes (') to protect them from interpretation by the shell. The -exec command is terminated with a semicolon (;), which should be escaped (\;) for the same reason.
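The sketch below repeats the content search in a sandbox with invented files. It also shows a variant that the lesson itself does not cover: POSIX find accepts + in place of \; as the terminator, in which case many file names are passed to a single grep process. The -q option (quiet) and -l option (list file names only) of grep are used here so that only file names are printed.

```shell
sandbox=$(mktemp -d)
printf 'contact: lpi.org\n' > "$sandbox/a.txt"
printf 'nothing here\n'     > "$sandbox/b.txt"
printf 'helping hands\n'    > "$sandbox/c.txt"   # "helping" contains "lpi"

# One grep process per file. -q keeps grep silent, and -print runs
# only for the files where grep reported a match.
per_file=$(find "$sandbox" -type f -exec grep -q "lpi" '{}' \; -print | sort)

# Batched variant: {} + hands many names to one grep, and -l makes
# grep itself print the names of the files that match.
batched=$(find "$sandbox" -type f -exec grep -l "lpi" '{}' + | sort)

match_count=$(printf '%s\n' "$per_file" | wc -l | tr -d ' ')
echo "$per_file"
rm -rf "$sandbox"
```

Both forms report the same files here; the batched form usually starts far fewer processes on large trees.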

Adding the -delete action to the end of an expression deletes every file that matches. Use it only when you are certain that the results match exactly the files that you wish to delete.

In the example below, find searches the hierarchy starting at the current directory and deletes all files whose names end with the characters .bak:

$ find . -name "*.bak" -delete
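One cautious way to work with -delete is to run the expression first without it, review the list, and only then append -delete. The sketch below shows this two-step pattern in a sandbox; the file names are invented. The position of -delete matters: find evaluates the expression from left to right, so placing -delete before the tests would remove files before they are filtered.

```shell
sandbox=$(mktemp -d)
mkdir -p "$sandbox/project"
touch "$sandbox/project/main.c" "$sandbox/project/main.c.bak" "$sandbox/old.bak"

# Step 1: dry run. Print the files that the expression selects.
find "$sandbox" -type f -name "*.bak" -print

# Step 2: the same expression, now with -delete at the very end.
find "$sandbox" -type f -name "*.bak" -delete

# Verify: no .bak files are left, and main.c is untouched.
remaining_bak=$(find "$sandbox" -name "*.bak" | wc -l | tr -d ' ')
remaining_files=$(find "$sandbox" -type f | wc -l | tr -d ' ')

echo "bak files left: $remaining_bak, other files left: $remaining_files"
rm -rf "$sandbox"
```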

Archiving Files

The tar Command (Archiving and Compression)

The tar command, short for “tape archive(r)”, creates tar archives by combining a group of files into a single archive file. Archives make it easy to move or back up a group of files. Think of tar as a tool that bundles files together so that they can be grouped and moved as one unit.

tar also has the ability to extract tar archives, display a list of the files included in the archive as well as add additional files to an existing archive.

The tar command syntax is as follows:

tar [OPERATION_AND_OPTIONS] [ARCHIVE_NAME] [FILE_NAME(S)]

OPERATION

Exactly one operation argument is required. The most frequently used operations are:

--create (-c)

Create a new tar archive.

--extract (-x)

Extract the entire archive or one or more files from an archive.

--list (-t)

Display a list of the files included in the archive.

OPTIONS

The most frequently used options are:

--verbose (-v)

Show the files being processed by the tar command.

--file=archive-name (-f archive-name)

Specifies the archive file name.

ARCHIVE_NAME

The name of the archive.

FILE_NAME(S)

A space-separated list of the files to be archived or extracted. When extracting, the entire archive is extracted if no file names are provided.

Creating an Archive

Let’s say we have a directory named stuff in the current directory and we want to save it to a file named archive.tar. We would run the following command:

$ tar -cvf archive.tar stuff
stuff/
stuff/service.conf

Here’s what those switches actually mean:

-c

Create an archive.

-v

Display progress in the terminal while creating the archive, also known as “verbose” mode. The -v is always optional in these commands, but it is helpful.

-f

Specifies the file name of the archive.

In general, to archive a single directory or a single file on Linux, we use:

tar -cvf NAME-OF-ARCHIVE.tar /PATH/TO/DIRECTORY-OR-FILE

Note

tar works recursively. It will perform the required action on every subdirectory inside the directory specified.

To archive multiple directories at once, we list all of them, separated by spaces, in place of /PATH/TO/DIRECTORY-OR-FILE:

$ tar -cvf archive.tar stuff1 stuff2

This would produce an archive of stuff1 and stuff2 in archive.tar.

Extracting an Archive

We can extract an archive using tar:

$ tar -xvf archive.tar
stuff/
stuff/service.conf

This will extract the contents of archive.tar to the current directory.

This command is the same as the archive creation command used above, except that the -x switch replaces the -c switch.

To extract the contents of the archive to a specific directory we use -C:

$ tar -xvf archive.tar -C /tmp

This will extract the contents of archive.tar to the /tmp directory.

$ ls /tmp
stuff
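The create, list and extract operations can be combined into a small round trip. The following sketch uses an invented sandbox with a stuff directory like the one in the examples above. One assumption goes beyond the lesson: -C is also used while creating the archive, where it makes tar change into the given directory first so that the archive stores the relative path stuff/ rather than a full path.

```shell
sandbox=$(mktemp -d)
mkdir -p "$sandbox/stuff" "$sandbox/restore"
echo "port=8080" > "$sandbox/stuff/service.conf"

# Create: change into the sandbox first, then archive the stuff directory.
tar -cf "$sandbox/archive.tar" -C "$sandbox" stuff

# List: -t shows the archive contents without extracting anything.
listing=$(tar -tf "$sandbox/archive.tar")
echo "$listing"

# Extract: -C selects the directory that receives the files.
tar -xf "$sandbox/archive.tar" -C "$sandbox/restore"

# The restored file has the same content as the original.
restored=$(cat "$sandbox/restore/stuff/service.conf")
echo "restored content: $restored"
rm -rf "$sandbox"
```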

Compressing with tar

The GNU tar command included with Linux distributions can create a .tar archive and then compress it with gzip or bzip2 compression in a single command:

$ tar -czvf name-of-archive.tar.gz stuff

This command would create a compressed file using the gzip algorithm (-z).

While gzip compression is most frequently used to create .tar.gz or .tgz files, tar also supports bzip2 compression. This allows the creation of bzip2 compressed files, often named .tar.bz2, .tar.bz or .tbz files.

To do so, we replace -z for gzip with -j for bzip2:

$ tar -cjvf name-of-archive.tar.bz stuff

To decompress the file, we replace -c with -x, where x stands for “extract”:

$ tar -xzvf archive.tar.gz

gzip is faster, but it generally compresses a bit less, so you get a somewhat larger file. bzip2 is slower, but it compresses a bit more, so you get a somewhat smaller file. In practice, both are used with tar in the same way, and only the option letter and the file extension differ.
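The effect of compression can be observed by comparing an uncompressed and a gzip-compressed archive of the same data. This is a sketch with invented names and deliberately repetitive sample text, which compresses very well; real data will show smaller savings. The bzip2 step runs only if the bzip2 program is installed.

```shell
sandbox=$(mktemp -d)
mkdir -p "$sandbox/stuff" "$sandbox/restore"

# Generate repetitive sample text.
i=0
while [ "$i" -lt 5000 ]; do
  echo "the quick brown fox jumps over the lazy dog"
  i=$((i + 1))
done > "$sandbox/stuff/sample.txt"

# The same directory, archived without and with gzip compression.
tar -cf  "$sandbox/plain.tar"   -C "$sandbox" stuff
tar -czf "$sandbox/gzip.tar.gz" -C "$sandbox" stuff

plain_size=$(wc -c < "$sandbox/plain.tar" | tr -d ' ')
gzip_size=$(wc -c < "$sandbox/gzip.tar.gz" | tr -d ' ')
echo "plain: $plain_size bytes, gzip: $gzip_size bytes"

# Optional bzip2 variant, skipped when bzip2 is not available.
if command -v bzip2 >/dev/null 2>&1; then
  tar -cjf "$sandbox/bzip2.tar.bz2" -C "$sandbox" stuff
  echo "bzip2: $(wc -c < "$sandbox/bzip2.tar.bz2" | tr -d ' ') bytes"
fi

# Extract the gzip archive and confirm that the data survived unchanged.
tar -xzf "$sandbox/gzip.tar.gz" -C "$sandbox/restore"
if cmp -s "$sandbox/stuff/sample.txt" "$sandbox/restore/stuff/sample.txt"; then
  roundtrip=ok
else
  roundtrip=failed
fi
rm -rf "$sandbox"
```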

Alternatively, we may compress files directly, using the gzip command for gzip compression and the bzip2 command for bzip2 compression. For example, to apply gzip compression, use:

gzip FILE-TO-COMPRESS

gzip creates the compressed file with the same name but with a .gz ending, and removes the original file after creating the compressed file.

The bzip2 command works in a similar fashion.

To uncompress the files we use either gunzip or bunzip2, depending on the algorithm used to compress the file.
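The behavior of gzip and gunzip on a single file can be observed as follows. The file name and content are invented for this sketch.

```shell
sandbox=$(mktemp -d)
echo "hello archive" > "$sandbox/notes.txt"

# gzip replaces notes.txt with notes.txt.gz.
gzip "$sandbox/notes.txt"

if [ -e "$sandbox/notes.txt" ]; then original_present=yes; else original_present=no; fi
if [ -e "$sandbox/notes.txt.gz" ]; then gz_present=yes; else gz_present=no; fi
echo "original present: $original_present, .gz present: $gz_present"

# gunzip reverses the operation and restores notes.txt.
gunzip "$sandbox/notes.txt.gz"
restored=$(cat "$sandbox/notes.txt")
echo "restored content: $restored"
rm -rf "$sandbox"
```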

The cpio Command

cpio stands for “copy in, copy out”. It is used to process archive files such as *.cpio or *.tar files.

cpio performs the following operations:

  • Copying files to an archive.

  • Extracting files from an archive.

It takes the list of files from standard input (typically the output of ls or find).

To create a cpio archive, we use:

$ ls | cpio -o > archive.cpio

The -o option instructs cpio to create an archive and write it to standard output, which is redirected here to the file archive.cpio. The ls command lists the contents of the current directory, which are to be archived.

To extract the archive we use:

$ cpio -id < archive.cpio

The -i option performs the extraction. The -d option creates any leading directories that are needed during extraction. The character < redirects standard input, so cpio reads the archive to be extracted from the file archive.cpio.
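A round trip with cpio might look like the sketch below. It assumes that cpio is installed (the whole example is skipped otherwise) and uses find rather than ls to produce the file list, so that files in subdirectories are included. All names are invented. Since only regular files are listed, the conf directory itself is not stored in the archive, which is why -d is needed on extraction.

```shell
if command -v cpio >/dev/null 2>&1; then
  sandbox=$(mktemp -d)
  mkdir -p "$sandbox/src/conf" "$sandbox/dest"
  echo "alpha" > "$sandbox/src/a.txt"
  echo "beta"  > "$sandbox/src/conf/b.txt"

  # Copy-out: find writes relative file names to standard input of cpio,
  # and cpio writes the archive to standard output.
  (cd "$sandbox/src" && find . -type f | cpio -o > "$sandbox/archive.cpio") 2>/dev/null

  # Copy-in: -i extracts, -d creates the missing conf/ directory.
  (cd "$sandbox/dest" && cpio -id < "$sandbox/archive.cpio") 2>/dev/null

  cpio_restored=$(cat "$sandbox/dest/conf/b.txt")
  echo "restored content: $cpio_restored"
  rm -rf "$sandbox"
else
  echo "cpio is not installed; skipping the example"
fi
```

The 2>/dev/null redirections only hide the block count that cpio reports on standard error.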

The dd Command

dd copies data from one location to another. The command line syntax of dd differs from many other Unix programs: it uses the syntax option=value for its command line options rather than the GNU standard -option value or --option=value formats:

$ dd if=oldfile of=newfile

This command would copy the content of oldfile into newfile, where if= is the input file and of= refers to the output file.

Note

The dd command typically will not output anything to the screen until the command has finished. By providing the status=progress option, the console will display the amount of work getting done by the command. For example: dd status=progress if=oldfile of=newfile.

dd can also convert data to upper or lower case, or write directly to block devices such as /dev/sdb:

$ dd if=oldfile of=newfile conv=ucase

This would copy all the contents of oldfile into newfile and capitalise all of the text.

The following command will back up the whole hard disk located at /dev/sda to a file named backup.dd. Reading a block device directly usually requires root privileges:

$ dd if=/dev/sda of=backup.dd bs=4096
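The dd operands shown above can be tried on ordinary files instead of real disks. The sketch below uses invented file names and reads from /dev/zero in place of a block device. The count= operand, which is not covered above, limits the number of blocks that are copied, so bs=4096 count=2 produces a file of 8192 bytes.

```shell
sandbox=$(mktemp -d)
printf 'hello dd\n' > "$sandbox/oldfile"

# Plain copy: newfile becomes a byte-for-byte copy of oldfile.
dd if="$sandbox/oldfile" of="$sandbox/newfile" 2>/dev/null
if cmp -s "$sandbox/oldfile" "$sandbox/newfile"; then copy_ok=yes; else copy_ok=no; fi

# conv=ucase converts lower-case letters to upper case while copying.
dd if="$sandbox/oldfile" of="$sandbox/upper" conv=ucase 2>/dev/null
upper=$(cat "$sandbox/upper")

# bs sets the block size and count the number of blocks to copy.
dd if=/dev/zero of="$sandbox/disk.img" bs=4096 count=2 2>/dev/null
image_size=$(wc -c < "$sandbox/disk.img" | tr -d ' ')

echo "copy identical: $copy_ok, upper-case copy: $upper, image size: $image_size"
rm -rf "$sandbox"
```

The 2>/dev/null redirections hide the transfer statistics that dd prints on standard error when it finishes.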

Guided Exercises

  1. Consider the following listing:

    $ find /home/frank/Documents/ -type d
    /home/frank/Documents/
    /home/frank/Documents/animal
    /home/frank/Documents/animal/domestic
    /home/frank/Documents/animal/wild
    • What kind of files would this command output?

    • In which directory does the search begin?

  2. A user wishes to compress his backup folder. He uses the following command:

    $ tar cvf /home/frank/backup.tar.gz /home/frank/dir1

    Which option is lacking to compress the backup using the gzip algorithm?

Explorational Exercises

  1. As system administrator, you are required to perform regular checks in order to remove voluminous files. These voluminous files are located in /var and end with a .backup extension.

    • Write down the command, using find, to locate these files:

    • An analysis of the sizes of these files reveals that they range from 100M to 1000M. Complete the previous command with this new information, so that you may locate those backup files ranging from 100M to 1000M:

    • Finally, complete this command, with the delete action so that these files will be removed:

  2. In the /var directory, there exist four backup files:

    db-jan-2018.backup
    db-feb-2018.backup
    db-march-2018.backup
    db-apr-2018.backup
    • Using tar, specify the command that would create an archive file with the name db-first-quarter-2018.backup.tar:

    • Using tar, specify the command that would create the archive and compress it using gzip. Take note that the resulting file name should end with .gz:

Summary

In this section, you learned:

  • How to find files with find.

  • How to add search criteria based on time, file type or size by supplying arguments to find.

  • How to act on a returned set.

  • How to archive, compress and decompress files using tar.

  • How to process archives with cpio.

  • How to copy files with dd.

Answers to Guided Exercises

  1. Consider the following listing:

    $ find /home/frank/Documents/ -type d
    /home/frank/Documents/
    /home/frank/Documents/animal
    /home/frank/Documents/animal/domestic
    /home/frank/Documents/animal/wild
    • What kind of files would this command output?

      Directories.

    • In which directory does the search begin?

      /home/frank/Documents

  2. A user wishes to compress his backup folder. He uses the following command:

    $ tar cvf /home/frank/backup.tar.gz /home/frank/dir1

    Which option is lacking to compress the backup using the gzip algorithm?

    Option -z.

Answers to Explorational Exercises

  1. As system administrator, you are required to perform regular checks in order to remove voluminous files. These voluminous files are located in /var and end with a .backup extension.

    • Write down the command, using find, to locate these files:

      $ find /var -name "*.backup"
    • An analysis of the sizes of these files reveals that they range from 100M to 1000M. Complete the previous command with this new information, so that you may locate those backup files ranging from 100M to 1000M:

      $ find /var -name "*.backup" -size +100M -size -1000M
    • Finally, complete this command, with the delete action so that these files will be removed:

      $ find /var -name "*.backup" -size +100M -size -1000M -delete
  2. In the /var directory, there exist four backup files:

    db-jan-2018.backup
    db-feb-2018.backup
    db-march-2018.backup
    db-apr-2018.backup
    • Using tar, specify the command that would create an archive file with the name db-first-quarter-2018.backup.tar:

      $ tar -cvf db-first-quarter-2018.backup.tar db-jan-2018.backup db-feb-2018.backup db-march-2018.backup db-apr-2018.backup
    • Using tar, specify the command that would create the archive and compress it using gzip. Take note that the resulting file name should end with .gz:

      $ tar -zcvf db-first-quarter-2018.backup.tar.gz db-jan-2018.backup db-feb-2018.backup db-march-2018.backup db-apr-2018.backup

Linux Professional Institute Inc. All rights reserved. Visit the Learning Materials website: https://learning.lpi.org
This work is licensed under the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.


