4.3 Lesson 1

Certificate: Linux Essentials

Version: 1.6

Topic: 4 The Linux Operating System

Objective: 4.3 Where Data is Stored

Lesson: 1 of 2

Introduction

For an operating system, everything is considered data. For Linux, everything is considered a file: programs, regular files, directories, block devices (hard disks, etc.), character devices (consoles, etc.), kernel processes, sockets, partitions, links, etc. The Linux directory structure, starting from the root /, is a collection of files containing data. The fact that everything is a file is a powerful feature of Linux as it allows for tweaking virtually every single aspect of the system.

In this lesson we will be discussing the different locations in which important data is stored as established by the Linux Filesystem Hierarchy Standard (FHS). Some of these locations are real directories which store data persistently on disks, whereas others are pseudo filesystems loaded in memory which give us access to kernel subsystem data such as running processes, use of memory, hardware configuration and so on. The data stored in these virtual directories is used by a number of commands that allow us to monitor and handle it.

Programs and their Configuration

Important data on a Linux system are — no doubt — its programs and their configuration files. The former are executable files containing sets of instructions to be run by the computer’s processor, whereas the latter are usually text documents that control the operation of a program. Executable files can be either binary files or text files. Executable text files are called scripts. Configuration data on Linux is traditionally stored in text files too, although there are various styles of representing configuration data.

Where Binary Files are Stored

Like any other file, executable files live in directories hanging ultimately from /. More specifically, programs are distributed across a three-tier structure: the first tier (/) includes programs that can be necessary in single-user mode, the second tier (/usr) contains most multi-user programs and the third tier (/usr/local) is used to store software that is not provided by the distribution and has been compiled locally.

Typical locations for programs include:

/sbin

It contains essential binaries for system administration such as parted or ip.

/bin

It contains essential binaries for all users such as ls, mv, or mkdir.

/usr/sbin

It stores binaries for system administration such as deluser or groupadd.

/usr/bin

It includes most executable files — such as free, pstree, sudo or man — that can be used by all users.

/usr/local/sbin

It is used to store locally installed programs for system administration that are not managed by the system’s package manager.

/usr/local/bin

It serves the same purpose as /usr/local/sbin but for regular user programs.

Recently some distributions started to replace /bin and /sbin with symbolic links to /usr/bin and /usr/sbin.
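
Whether a given system uses this merged layout can be checked from the shell. The following is a minimal sketch; the function name describe_dir is made up for this example and the result depends on the distribution:

```shell
# Report whether a path is a symbolic link, a real directory, or missing.
# describe_dir is a hypothetical helper name used only in this sketch.
describe_dir() {
    if [ -L "$1" ]; then
        echo "$1 -> $(readlink "$1")"
    elif [ -d "$1" ]; then
        echo "$1 is a real directory"
    else
        echo "$1 does not exist"
    fi
}

describe_dir /bin
describe_dir /sbin
```

On a distribution with the merged layout, both calls should report a link; on an older layout they should report real directories.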

Note

The /opt directory is sometimes used to store optional third-party applications.

Apart from these directories, regular users can have their own programs in either:

  • /home/$USER/bin

  • /home/$USER/.local/bin

Tip

You can find out what directories are available for you to run binaries from by referencing the PATH variable with echo $PATH. For more information on PATH, review the lessons on variables and shell customization.

We can find the location of programs with the which command:

$ which git
/usr/bin/git
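
As a small sketch of how PATH and program lookup fit together, the snippet below prints each directory in PATH on its own line and then asks the shell where it would find a command. It uses the POSIX builtin command -v as an alternative to which; the function names are hypothetical.

```shell
# Print each directory listed in PATH on its own line.
list_path_dirs() {
    printf '%s\n' "$PATH" | tr ':' '\n'
}

# Report where the shell would find a command, or say that it is not in PATH.
locate_cmd() {
    command -v "$1" || echo "$1: not found in PATH"
}

list_path_dirs
locate_cmd ls
```

The shell searches the PATH directories from left to right, so the first match wins when two directories contain a program with the same name.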

Where Configuration Files are Stored

The /etc Directory

In the early days of Unix there was a folder for each type of data, such as /bin for binaries and /boot for the kernel(s). However, /etc (meaning et cetera) was created as a catch-all directory to store any files that did not belong in the other categories. Most of these files were configuration files. With the passing of time more and more configuration files were added so /etc became the main folder for configuration files of programs. As said above, a configuration file usually is a local, plain text (as opposed to binary) file which controls the operation of a program.

In /etc we can find different patterns for config file names:

  • Files with an ad hoc extension or no extension at all, for example

    group

    System group database.

    hostname

    Name of the host computer.

    hosts

    List of IP addresses and their hostname translations.

    passwd

    System user database — made up of seven fields separated by colons providing information about the user.

    profile

    System-wide configuration file for Bash.

    shadow

    File storing the users’ hashed passwords; readable only by privileged users.

  • Initialization files ending in rc:

    bash.bashrc

    System-wide .bashrc file for interactive bash shells.

    nanorc

    Sample initialization file for GNU nano (a simple text editor that normally ships with any distribution).

  • Files ending in .conf:

    resolv.conf

    Config file for the resolver, which provides access to the Internet Domain Name System (DNS).

    sysctl.conf

    Config file to set system variables for the kernel.

  • Directories with the .d suffix:

    Some programs with a unique config file (*.conf or otherwise) have evolved to have a dedicated *.d directory which helps build modular, more robust configurations. For example, to configure logrotate, you will find logrotate.conf, but also the logrotate.d directory.

    This approach comes in handy in those cases where different applications need configurations for the same specific service. If, for example, a web server package contains a logrotate configuration, this configuration can now be placed in a dedicated file in the logrotate.d directory. This file can be updated by the webserver package without interfering with the remaining logrotate configuration. Likewise, packages can add specific tasks by placing files in the /etc/cron.d directory instead of modifying /etc/crontab.

    In Debian — and Debian derivatives — such an approach has been applied to the list of reliable sources read by the package management tool apt: apart from the classic /etc/apt/sources.list, now we find the /etc/apt/sources.list.d directory:

    $ ls /etc/apt/sources*
    /etc/apt/sources.list
    /etc/apt/sources.list.d:
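
The drop-in idea behind *.d directories can be imitated in a throwaway directory. This is only an illustrative sketch: demo.conf, demo.d and the settings inside them are hypothetical, and real programs decide for themselves how drop-in files are included (logrotate, for instance, uses an include directive in logrotate.conf).

```shell
# Build a temporary layout that mimics a main config file plus a drop-in directory.
# "demo.conf", "demo.d" and the settings below are made-up names for this sketch.
workdir=$(mktemp -d)
mkdir "$workdir/demo.d"
echo "rotate=4"    > "$workdir/demo.conf"
echo "compress=on" > "$workdir/demo.d/10-webserver.conf"
echo "weekly=on"   > "$workdir/demo.d/20-database.conf"

# Read the main file first, then every drop-in file in alphabetical order.
effective_config() {
    cat "$1/demo.conf" "$1"/demo.d/*.conf
}

effective_config "$workdir"
```

Each package owns its own file in the directory, so adding or removing one package's settings never requires editing a file that belongs to another.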

Configuration Files in HOME (Dotfiles)

At user level, programs store their configurations and settings in hidden files in the user’s home directory (also represented by ~). Remember, hidden files start with a dot (.), hence their name: dotfiles.

Some of these dotfiles belong to Bash. Most are scripts that customize the user’s shell session and are sourced when a shell starts or exits, while .bash_history is a plain record of past commands:

.bash_history

It stores the command line history.

.bash_logout

It includes commands to execute when leaving the login shell.

.bashrc

Bash’s initialization script for interactive non-login shells.

.profile

Initialization script for login shells; Bash reads it only if neither .bash_profile nor .bash_login exists.

Note

Refer to the lesson on “Command Line Basics” to learn more about Bash and its init files.

Config files and directories of other user-specific programs are read when their respective programs start: .gitconfig, .emacs.d, .ssh, etc.
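
To see which dotfiles exist in a directory, the hidden entries can be filtered out of a listing. A minimal sketch follows; list_dotfiles is a hypothetical helper name:

```shell
# List only the hidden entries (dotfiles and dot-directories) directly inside a directory.
# ls -A shows hidden entries but skips "." and ".."; grep keeps names starting with a dot.
list_dotfiles() {
    ls -A "$1" | grep '^\.'
}

list_dotfiles "$HOME" || echo "no dotfiles found in $HOME"
```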

The Linux Kernel

Before any process can run, the kernel must be loaded into a protected area of memory. After that, the process with PID 1 (more often than not systemd nowadays) sets off the chain of processes, that is to say, one process starts others and so on. Once the processes are active, the Linux kernel is in charge of allocating resources to them (keyboard, mouse, disks, memory, network interfaces, etc.).

Note

Prior to systemd, /sbin/init was always the first process in a Linux system as part of the System V Init system manager. In fact, you still find /sbin/init currently but linked to /lib/systemd/systemd.
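
Which program actually runs as PID 1 can be checked on a live system. The sketch below reads the process name from the /proc pseudo filesystem, which is covered later in this lesson; proc_name is a hypothetical helper name, and the result depends on the init system in use.

```shell
# Print the command name of a process by reading /proc/<PID>/comm.
proc_name() {
    cat "/proc/$1/comm"
}

echo "PID 1 is: $(proc_name 1)"

# If /sbin/init is a symbolic link, show the file it finally resolves to.
if [ -L /sbin/init ]; then
    echo "/sbin/init -> $(readlink -f /sbin/init)"
fi
```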

Where Kernels are Stored: /boot

The kernel resides in /boot — together with other boot-related files. Most of these files include the kernel version number components in their names (kernel version, major revision, minor revision and patch number).

The /boot directory includes the following types of files, with names corresponding with the respective kernel version:

config-4.9.0-9-amd64

Configuration settings for the kernel such as options and modules that were compiled along with the kernel.

initrd.img-4.9.0-9-amd64

Initial RAM disk image that helps in the startup process by loading a temporary root filesystem into memory.

System-map-4.9.0-9-amd64

The System-map (on some systems it will be named System.map) file contains memory address locations for kernel symbol names. Each time a kernel is rebuilt the file’s contents will change as the memory locations could be different. The kernel uses this file to look up memory address locations for a particular kernel symbol, or vice-versa.

vmlinuz-4.9.0-9-amd64

The kernel proper in a self-extracting, space-saving, compressed format (hence the z in vmlinuz; vm stands for virtual memory and started to be used when the kernel first got support for virtual memory).

grub

Configuration directory for the grub2 bootloader.

Tip

Because it is a critical feature of the operating system, more than one kernel and its associated files are kept in /boot in case the default one becomes faulty and we have to fall back on a previous version to — at least — be able to boot the system up and fix it.
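
When several kernels are installed, it helps to know which files in /boot belong to the running one. The following sketch extracts the release string from a file name and prints the release of the running kernel with uname -r for comparison. The function name is hypothetical, and the file name layout is assumed to follow the examples above.

```shell
# Extract the kernel release from a /boot file name such as vmlinuz-4.9.0-9-amd64.
# kernel_release_of is a made-up helper; it assumes the "<name>-<release>" layout shown above.
kernel_release_of() {
    # Remove everything up to and including the first hyphen.
    echo "${1#*-}"
}

echo "release in file name: $(kernel_release_of "vmlinuz-4.9.0-9-amd64")"
echo "running kernel:       $(uname -r)"
```

If both strings match, the file belongs to the kernel that is currently running.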

The /proc Directory

The /proc directory is one of the so-called virtual or pseudo filesystems since its contents are not written to disk, but loaded in memory. It is dynamically populated every time the computer boots up and constantly reflects the current state of the system. /proc includes information about:

  • Running processes

  • Kernel configuration

  • System hardware

Besides all the data concerning processes that we will see in the next lesson, this directory also stores files with information about the system’s hardware and the kernel’s configuration settings. Some of these files include:

/proc/cpuinfo

It stores information about the system’s CPU:

$ cat /proc/cpuinfo
processor	: 0
vendor_id	: GenuineIntel
cpu family	: 6
model		: 158
model name	: Intel(R) Core(TM) i7-8700K CPU @ 3.70GHz
stepping	: 10
cpu MHz		: 3696.000
cache size	: 12288 KB
(...)

/proc/cmdline

It stores the strings passed to the kernel on boot:

$ cat /proc/cmdline
BOOT_IMAGE=/boot/vmlinuz-4.9.0-9-amd64 root=UUID=5216e1e4-ae0e-441f-b8f5-8061c0034c74 ro quiet

/proc/modules

It shows the list of modules loaded into the kernel:

$ cat /proc/modules
nls_utf8 16384 1 - Live 0xffffffffc0644000
isofs 40960 1 - Live 0xffffffffc0635000
udf 90112 0 - Live 0xffffffffc061e000
crc_itu_t 16384 1 udf, Live 0xffffffffc04be000
fuse 98304 3 - Live 0xffffffffc0605000
vboxsf 45056 0 - Live 0xffffffffc05f9000 (O)
joydev 20480 0 - Live 0xffffffffc056e000
vboxguest 327680 5 vboxsf, Live 0xffffffffc05a8000 (O)
hid_generic 16384 0 - Live 0xffffffffc0569000
(...)
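
Because these are ordinary text files, standard text tools can summarize them. The sketch below counts logical CPUs and extracts module names; both function names are hypothetical, and each accepts an optional file argument so it can be tried on a saved copy.

```shell
# Count logical CPUs by counting the "processor" entries in a cpuinfo-style file.
count_cpus() {
    grep -c '^processor' "${1:-/proc/cpuinfo}"
}

# Print only the module names (first column) from a modules-style file.
module_names() {
    awk '{ print $1 }' "${1:-/proc/modules}"
}

echo "logical CPUs: $(count_cpus)"
```

Calling module_names without an argument should list the same names that appear in the first column of the /proc/modules output above.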

The /proc/sys Directory

This directory includes kernel configuration settings in files classified into categories per subdirectory:

$ ls /proc/sys
abi  debug  dev  fs  kernel  net  user  vm

Most of these files act like a switch and therefore contain only one of two possible values: 0 (“off”) or 1 (“on”). For instance:

/proc/sys/net/ipv4/ip_forward

The value that enables or disables packet forwarding, that is, whether our machine can act as a router:

$ cat /proc/sys/net/ipv4/ip_forward
0

There are some exceptions, though:

/proc/sys/kernel/pid_max

The maximum PID allowed:

$ cat /proc/sys/kernel/pid_max
32768

Warning

Be extra careful when changing the kernel settings as the wrong value may result in an unstable system.
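
The files below /proc/sys correspond to the dotted key names used by the sysctl tool and by /etc/sysctl.conf. As a read-only sketch, the helpers below translate a path into such a key and describe a switch-like value in words. Both function names are hypothetical, and the simple translation is only meant for typical keys like the ones shown above.

```shell
# Convert a path under /proc/sys into the dotted key name used by sysctl.
proc_path_to_key() {
    echo "${1#/proc/sys/}" | tr '/' '.'
}

# Read a setting file and describe it in words if it is a plain 0/1 switch.
describe_switch() {
    case "$(cat "$1")" in
        0) echo "off" ;;
        1) echo "on" ;;
        *) echo "not a plain 0/1 switch" ;;
    esac
}

proc_path_to_key /proc/sys/net/ipv4/ip_forward
```

For example, describe_switch /proc/sys/net/ipv4/ip_forward reports the current forwarding state without changing it.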

Hardware Devices

Remember, in Linux “everything is a file”. This implies that hardware device information as well as the kernel’s own configuration settings are all stored in special files that reside in virtual directories.

The /dev Directory

The device directory /dev contains device files (or nodes) for all connected hardware devices. These device files are used as an interface between the devices and the processes using them. Each device file falls into one of two categories:

Block devices

Are those in which data is read and written in blocks which can be individually addressed. Examples include hard disks (and their partitions, like /dev/sda1), USB flash drives, CDs, DVDs, etc.

Character devices

Are those in which data is read and written sequentially one character at a time. Examples include keyboards, the text console (/dev/console), serial ports (such as /dev/ttyS0 and so on), etc.

When listing device files, make sure you use ls with the -l switch to differentiate between the two. We can — for instance — check for hard disks and partitions:

# ls -l /dev/sd*
brw-rw---- 1 root disk 8, 0 may 25 17:02 /dev/sda
brw-rw---- 1 root disk 8, 1 may 25 17:02 /dev/sda1
brw-rw---- 1 root disk 8, 2 may 25 17:02 /dev/sda2
(...)

Or for terminals (tty is short for TeleTYpewriter):

# ls -l /dev/tty*
crw-rw-rw- 1 root tty     5,  0 may 25 17:26 /dev/tty
crw--w---- 1 root tty     4,  0 may 25 17:26 /dev/tty0
crw--w---- 1 root tty     4,  1 may 25 17:26 /dev/tty1
(...)

Notice how the first character is b for block devices and c for character devices.
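
The same distinction can be made in a script with the shell's file tests, -b for block devices and -c for character devices. A minimal sketch, with dev_type as a hypothetical helper name:

```shell
# Classify a path as a block device, a character device, or something else.
dev_type() {
    if [ -b "$1" ]; then
        echo "block"
    elif [ -c "$1" ]; then
        echo "character"
    else
        echo "other"
    fi
}

dev_type /dev/null
dev_type /dev/sda    # "block" if this disk exists on the system, "other" if it does not
```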

Tip

The asterisk (*) is a globbing character that means 0 or more characters. Hence its importance in the ls -l /dev/sd* and ls -l /dev/tty* commands above. To learn more about these special characters, refer to the lesson on globbing.

Furthermore, /dev includes some special files which are quite useful for different programming purposes:

/dev/zero

It provides as many null characters as requested.

/dev/null

Aka bit bucket. It discards all information sent to it.

/dev/urandom

It generates pseudo-random numbers.
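
A short sketch of how these three files behave when read from or written to. It relies only on the common tools head, od and wc; the byte counts are arbitrary choices for the example.

```shell
# /dev/zero: read 8 null bytes and display them as hexadecimal values.
head -c 8 /dev/zero | od -An -tx1

# /dev/null: anything written to it is discarded, and reading it returns nothing.
echo "this text disappears" > /dev/null
null_bytes=$(wc -c < /dev/null)
echo "bytes read from /dev/null: $null_bytes"

# /dev/urandom: read 8 pseudo-random bytes; the values differ on every run.
head -c 8 /dev/urandom | od -An -tx1
```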

The /sys Directory

The sys filesystem (sysfs) is mounted on /sys. It was introduced with the arrival of kernel 2.6 and meant a great improvement on /proc/sys.

Processes need to interact with the devices in /dev and so the kernel needs a directory which contains information about these hardware devices. This directory is /sys and its data is arranged in an orderly fashion into categories. For instance, to check on the MAC address of your network card (enp0s3), you would cat the following file:

$ cat /sys/class/net/enp0s3/address
08:00:27:02:b2:74
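
Interface names differ between machines, so a loop over /sys/class/net avoids guessing them. The following is a sketch; list_macs is a hypothetical helper that accepts an optional base directory:

```shell
# Print "<interface> <address>" for every interface directory below a sysfs-style base path.
list_macs() {
    base=${1:-/sys/class/net}
    for dir in "$base"/*; do
        # Skip entries without a readable address file.
        [ -r "$dir/address" ] || continue
        echo "$(basename "$dir") $(cat "$dir/address")"
    done
}

list_macs
```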

Memory and Memory Types

Basically, for a program to start running, it has to be loaded into memory. By and large, when we speak of memory we refer to Random Access Memory (RAM) and — when compared to mechanical hard disks — it has the advantage of being a lot faster. On the down side, it is volatile (i.e., once the computer shuts down, the data is gone).

When it comes to memory, we can differentiate two main types in a Linux system:

Physical memory

Also known as RAM, it comes in the form of chips made up of integrated circuits containing millions of transistors and capacitors. These, in turn, form memory cells (the basic building block of computer memory). Each of these cells has an associated hexadecimal code — a memory address — so that it can be referenced when needed.

Swap

Also known as swap space, it is the portion of virtual memory that lives on the hard disk and is used when there is no more RAM available.

On the other hand, there is the concept of virtual memory, which is an abstraction of the total amount of usable, addressable memory (RAM, but also disk space) as seen by applications.

free parses /proc/meminfo and displays the amount of free and used memory in the system in a very clear manner:

$ free
              total        used        free      shared  buff/cache   available
Mem:        4050960     1474960     1482260       96900     1093740     2246372
Swap:       4192252           0     4192252

Let us explain the different columns:

total

Total amount of physical and swap memory installed.

used

Amount of physical and swap memory currently in use.

free

Amount of physical and swap memory currently not in use.

shared

Amount of physical memory used — mostly — by tmpfs.

buff/cache

Amount of physical memory currently in use by kernel buffers, the page cache and slabs.

available

Estimate of how much physical memory is available for new processes.
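
Since free takes its numbers from /proc/meminfo, individual values can also be read directly from that file. A minimal sketch follows; meminfo_value is a hypothetical helper that accepts an optional file argument:

```shell
# Print the numeric value of one key from a meminfo-style file (unit as reported in the file).
meminfo_value() {
    awk -v key="$1:" '$1 == key { print $2 }' "${2:-/proc/meminfo}"
}

echo "MemTotal:     $(meminfo_value MemTotal)"
echo "MemAvailable: $(meminfo_value MemAvailable)"
```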

By default free shows values in kibibytes, but allows for a variety of switches to display its results in different units of measurement. Some of these options include:

-b

Bytes.

-m

Mebibytes.

-g

Gibibytes.

-h

Human-readable format.

-h is always comfortable to read:

$ free -h
              total        used        free      shared  buff/cache   available
Mem:           3,9G        1,4G        1,5G         75M        1,0G        2,2G
Swap:          4,0G          0B        4,0G

Note

A kibibyte (KiB) equals 1,024 bytes while a kilobyte (kB) equals 1,000 bytes. The same is respectively true for mebibytes, gibibytes, etc.
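
The difference between binary and decimal units becomes visible with larger numbers. As a worked sketch, the helper below converts the total value from the free output above (4050960 KiB) into gibibytes and gigabytes. The function name is hypothetical and awk is used for the floating-point arithmetic.

```shell
# Convert a kibibyte count into gibibytes (GiB) and gigabytes (GB), one decimal place each.
kib_to_units() {
    LC_ALL=C awk -v kib="$1" 'BEGIN {
        bytes = kib * 1024
        printf "%.1f GiB, %.1f GB\n", bytes / (1024 * 1024 * 1024), bytes / 1000000000
    }'
}

kib_to_units 4050960    # prints: 3.9 GiB, 4.1 GB
```

The first figure matches the 3,9G that free -h reports for the same amount of memory.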

Guided Exercises

  1. Use the which command to find out the location of the following programs and complete the table:

    Program   which command   Path to Executable (output)   User needs root privileges?
    swapon
    kill
    cut
    usermod
    cron
    ps

  2. Where are the following files to be found?

    File          /etc   ~
    .bashrc
    bash.bashrc
    passwd
    .profile
    resolv.conf
    sysctl.conf

  3. Explain the meaning of the number elements for kernel file vmlinuz-4.15.0-50-generic found in /boot:

    Number Element   Meaning
    4
    15
    0
    50

  4. What command would you use to list all hard drives and partitions in /dev?

Explorational Exercises

  1. Device files for hard drives are represented based on the controllers they use — we saw /dev/sd* for drives using SCSI (Small Computer System Interface) and SATA (Serial Advanced Technology Attachment), but

    • How were old IDE (Integrated Drive Electronics) drives represented?

    • And modern NVMe (Non-Volatile Memory Express) drives?

  2. Take a look at the file /proc/meminfo. Compare the contents of this file to the output of the command free and identify which keys from /proc/meminfo correspond to the following fields in the output of free:

    free output   /proc/meminfo field
    total
    free
    shared
    buff/cache
    available

Summary

In this lesson you have learned about the location of programs and their configuration files in a Linux system. Important facts to remember are:

  • Basically, programs are to be found across a three-level directory structure: /, /usr and /usr/local. Each of these levels may contain bin and sbin directories.

  • Configuration files are stored in /etc and ~.

  • Dotfiles are hidden files that start with a dot (.).

We have also discussed the Linux kernel. Important facts are:

  • For Linux, everything is a file.

  • The Linux kernel lives in /boot together with other boot-related files.

  • For processes to start executing, the kernel has to first be loaded into a protected area of memory.

  • The kernel’s job is to allocate system resources to processes.

  • The /proc virtual (or pseudo) filesystem stores important kernel and system data in a volatile way.

Likewise, we have explored hardware devices and learned the following:

  • The /dev directory stores special files (aka nodes) for all connected hardware devices: block devices or character devices. The former transfer data in blocks; the latter, one character at a time.

  • The /dev directory also contains other special files such as /dev/zero, /dev/null or /dev/urandom.

  • The /sys directory stores information about hardware devices arranged into categories.

Finally, we touched upon memory. We learned:

  • A program runs when it is loaded into memory.

  • What RAM (Random Access Memory) is.

  • What Swap is.

  • How to display the use of memory.

Commands used in this lesson:

cat

Concatenate/print file content.

free

Display amount of free and used memory in the system.

ls

List directory contents.

which

Show location of program.

Answers to Guided Exercises

  1. Use the which command to find out the location of the following programs and complete the table:

    Program   which command   Path to Binary (output)   User needs root privileges?
    swapon    which swapon    /sbin/swapon              Yes
    kill      which kill      /bin/kill                 No
    cut       which cut       /usr/bin/cut              No
    usermod   which usermod   /usr/sbin/usermod         Yes
    cron      which cron      /usr/sbin/cron            Yes
    ps        which ps        /bin/ps                   No

  2. Where are the following files to be found?

    File          /etc   ~
    .bashrc       No     Yes
    bash.bashrc   Yes    No
    passwd        Yes    No
    .profile      No     Yes
    resolv.conf   Yes    No
    sysctl.conf   Yes    No

  3. Explain the meaning of the number elements for kernel file vmlinuz-4.15.0-50-generic found in /boot:

    Number Element   Meaning
    4                Kernel version
    15               Major revision
    0                Minor revision
    50               Patch number

  4. What command would you use to list all hard drives and partitions in /dev?

    ls /dev/sd*

Answers to Explorational Exercises

  1. Device files for hard drives are represented based on the controllers they use — we saw /dev/sd* for drives using SCSI (Small Computer System Interface) and SATA (Serial Advanced Technology Attachment), but

    • How were old IDE (Integrated Drive Electronics) drives represented?

      /dev/hd*

    • And modern NVMe (Non-Volatile Memory Express) drives?

      /dev/nvme*

  2. Take a look at the file /proc/meminfo. Compare the contents of this file to the output of the command free and identify which keys from /proc/meminfo correspond to the following fields in the output of free:

    free output   /proc/meminfo field
    total         MemTotal / SwapTotal
    free          MemFree / SwapFree
    shared        Shmem
    buff/cache    Buffers, Cached and SReclaimable
    available     MemAvailable

Linux Professional Institute Inc. All rights reserved. Visit the Learning Materials website: https://learning.lpi.org
This work is licensed under the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
