Linux - File System

/              Root
-> /bin        User Binaries
-> /boot       Boot Loader Files
-> /dev        Device Files
-> /etc        Configuration Files
-> /home       Home Directories
-> /lib        System Libraries
-> /media      Removable Devices
-> /mnt        Mount Directory
-> /opt        Optional Add-On Applications
-> /proc       Process Information
-> /sbin       System Binaries
-> /srv        Service Data
-> /tmp        Temporary Files
-> /usr        User System Resources
-> /var        Variable Files


This page summarizes what files, directories, inodes, and hard links are.

NOTE: These notes are relevant to all UNIX-based systems, and not just Linux.


File systems

A File System contains names of directories and files.

Linux has a hierarchical file system consisting of directories, sub-directories, and data files.

Each object in the File System has a name in the file system tree.


Files

A file is a sequence of bytes, without any built-in structure.

Any necessary structure (e.g. for a database) is added by the programs that manipulate the data in the file.

Linux itself doesn't know about the internal structure of a database file. All it does is return bytes.
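
A quick way to see this is to dump a small file byte by byte. This is only a sketch: the scratch file and its contents are made up for the illustration.

```shell
#!/bin/sh
# Create a scratch file (hypothetical content) and remove it when the script exits.
tmp=$(mktemp)
trap 'rm -f "$tmp"' EXIT

# Write four bytes: 'a', 'b', 'c' and a newline.
printf 'abc\n' > "$tmp"

# The system hands back the same four bytes; it adds no structure of its own.
byte_count=$(( $(wc -c < "$tmp") ))
echo "bytes: $byte_count"

# od prints each byte as a character, one after another.
od -c "$tmp"
```

Any meaning beyond "four bytes in this order" has to come from the program that reads them.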

NOTE: Everything is a file.

The word file refers to anything in the file system, including directories, symbolic links, devices, etc.

The manual page for the find command says that it can “search for files”, but it really means that it can search for any kind of thing, not just strictly a “file”.


Hardware devices are Files

Even hardware devices have file names.

Linux treats every device attached to it as if it were a list of bytes.

Therefore everything, including hard drives, partitions, keyboards, printers, and plain files, is treated as a file-like object, and each has a name in the file system, for example:

  • Your computer memory is /dev/mem.
  • Your first hard disk is /dev/sda.
  • A terminal (keyboard and screen) is /dev/tty1.
ls -li /dev/mem /dev/sda /dev/tty1
 
  5 crw-r----- 1 root kmem 1, 1 Apr 29 10:53 /dev/mem
365 brw-rw---- 1 root disk 8, 0 Apr 29 10:53 /dev/sda
 20 crw--w---- 1 gdm  tty  4, 1 Apr 29 10:53 /dev/tty1

Most input and output devices and directories are treated as files in Linux.

If you have sufficient permissions, you can directly read all these devices using their file system names.
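
As a small illustration, the sketch below reads from a device by name. It uses /dev/zero, which is not one of the devices listed above; it is assumed here only because it is normally readable without special permissions.

```shell
#!/bin/sh
# /dev/zero is a character device that returns zero bytes when read.
ls -l /dev/zero

# Read 8 bytes from the device exactly as if it were a plain file,
# then print them in hexadecimal with the spacing removed.
zero_bytes=$(head -c 8 /dev/zero | od -An -tx1 | tr -d ' \n')
echo "$zero_bytes"
```

The same head and od commands work unchanged on a plain file, which is the point: the device is addressed through its file system name.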

NOTE: Some recent versions of Unix have evolved directories into non-readable (non-file) objects.


inodes

Items in the file system are not stored by name.

They are stored using a numbered data structure called an index node or inode.

Everything in a Unix file system (every file, directory, special file, device, etc.) has an inode with a unique inode number; the inode manages the storage and attributes for that thing.

Files and directories are both managed with inodes.


Directories map file system names to inode numbers

Each inode is identified by a unique inode number that can be shown using the -i option to the ls command:

ls -l -i /usr/bin/perl*
 
38928562 -rwxr-xr-x 2 root root 2.1M Nov 19  2018 /usr/bin/perl
38928562 -rwxr-xr-x 2 root root 2.1M Nov 19  2018 /usr/bin/perl5.26.1
38933064 -rwxr-xr-x 1 root root  10K Nov 19  2018 /usr/bin/perl5.26-x86_64-linux-gnu
38933074 -rwxr-xr-x 2 root root  45K Nov 19  2018 /usr/bin/perlbug
38933075 -rwxr-xr-x 1 root root  125 Nov 19  2018 /usr/bin/perldoc
38929284 -rwxr-xr-x 1 root root  53K Mar  3  2018 /usr/bin/perli11ndoc
38933076 -rwxr-xr-x 1 root root  11K Nov 19  2018 /usr/bin/perlivp
38933074 -rwxr-xr-x 2 root root  45K Nov 19  2018 /usr/bin/perlthanks
  • File name /usr/bin/perl points to inode number 38928562.
  • Another name perl5.26.1 points to the same inode number.

NOTE: Each inode may have many names!

As can be seen above, both perl and perl5.26.1 point to the same inode number 38928562.

Each file name is mapped to only one single inode number, but one file inode number may have many names that map to it.

IMPORTANT: The actual data of a file is not stored in the directory where the file name is shown; it is stored elsewhere on disk, in storage managed by the file's inode.

Names are separate from the things they name!

In this example:

  • The name of the file, i.e. perl, is stored in the directory /usr/bin/.
  • The data within the perl file is stored elsewhere, under inode 38928562.

Directories map file system names to inode numbers. In this example, file system name perl is mapped to inode number 38928562 that contains the actual data.

When you access the perl program by name, the system finds the perl name in a directory, paired with the inode number 38928562 that holds the actual data, and then the system has to go elsewhere on disk to that inode number 38928562 to access the data for the perl program.
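
One way to observe that a name is separate from the thing it names is to rename a file and check that the inode number does not change. This is a minimal sketch; the file names are hypothetical, and `stat -c` is assumed to be the GNU coreutils version found on most Linux systems.

```shell
#!/bin/sh
# Work in a scratch directory that is removed when the script exits.
tmp=$(mktemp -d)
trap 'rm -rf "$tmp"' EXIT

echo 'some data' > "$tmp/oldname"
inode_before=$(stat -c %i "$tmp/oldname")

# Renaming inside the same directory only rewrites the directory entry;
# the inode and its data are not touched.
mv "$tmp/oldname" "$tmp/newname"
inode_after=$(stat -c %i "$tmp/newname")

echo "before: $inode_before  after: $inode_after"
```

Both numbers should be identical, because only the name stored in the directory changed.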


Inodes contain pointers to disk blocks

An inode manages the disk storage space for a file or a directory.

The inode contains a list of pointers to the disk blocks that belong to that file or directory.

The disk blocks store the data for the inode.

The larger the file or directory, the more disk block pointers it needs in the inode.


Inodes contain attributes (owners, permissions, times, etc.)

The attributes of the file or directory are stored in the inode:

  • permissions
  • owner
  • group
  • size
  • access time
  • modify time
  • etc.

NOTE: The name of the file or directory is not stored with the inode.

Inodes have only numbers, attributes, and disk blocks – an inode does not contain its own name.

The names are kept separately, in directories.

A file name, or directory name, is stored with a directory elsewhere.
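
The attributes listed above can be printed straight from an inode. The sketch below assumes the GNU coreutils `stat` command and its `-c` format option; the scratch file is hypothetical.

```shell
#!/bin/sh
tmp=$(mktemp)
trap 'rm -f "$tmp"' EXIT
printf 'hello\n' > "$tmp"

# Every value printed here is read from the inode of the file.
stat -c 'inode=%i type=%F perms=%A owner=%U group=%G size=%s links=%h' "$tmp"
stat -c 'accessed=%x' "$tmp"
stat -c 'modified=%y' "$tmp"

# Keep two of the attributes in variables for later use.
file_size=$(stat -c %s "$tmp")
file_type=$(stat -c %F "$tmp")
echo "size=$file_size type=$file_type"
```

None of the format sequences used here asks for the name, because the name is not an inode attribute; it is only the path used to reach the inode.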

Directory inodes

  • The attributes of a directory inode apply only to that directory itself, not to the objects named *in* the directory, which have their own inodes.
  • The attributes of the objects named in the directory (files, sub-directories, etc.) are not stored in the directory; they are stored in the individual inodes of those objects.

Inodes are unique inside a file system

Inode numbers are specific to a file system inside a disk partition.

Each file system has its own set of inode numbers.

Numbering is done separately for each file system, so different disk partitions may have file system objects with the same inode numbers.
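
Because inode numbers repeat across file systems, an inode number is only meaningful together with the file system it belongs to. The sketch below prints a device number next to each inode number; the list of directories is an assumption, and which of them sit on separate file systems depends on the machine. GNU `stat -c` is assumed.

```shell
#!/bin/sh
# Print "device-number inode-number name" for a few directories.
for dir in / /proc /dev /tmp; do
    # Skip any directory that does not exist on this machine.
    [ -d "$dir" ] || continue
    stat -c '%d %i %n' "$dir"
done

# The pair device:inode identifies one object on the whole system.
root_id=$(stat -c '%d:%i' /)
echo "root is $root_id"
```

Directories that show different device numbers are on different file systems, so equal inode numbers among them would not mean the same object.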


Inodes are a fixed resource

Many Linux file systems (ext4, for example) are created with a large set of available inodes whose number is decided when the file system is made.

You can list the free inodes using df -i.

NOTE: Some types of Unix file systems can never make more inodes, even if there is lots of disk space available.

When all the inodes are used up, the file system can create no more files until some files are deleted to free some inodes.
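
A minimal sketch of checking inode usage for the file system that holds /, assuming the GNU `df` column layout (Filesystem, Inodes, IUsed, IFree, ...). File systems that allocate inodes on demand may report zeros or placeholder values here.

```shell
#!/bin/sh
# -i reports inodes instead of blocks; -P keeps each file system on one line.
inode_report=$(df -iP /)
printf '%s\n' "$inode_report"

# With the assumed column layout, the fourth field of line 2 is the free count.
free_inodes=$(printf '%s\n' "$inode_report" | awk 'NR == 2 { print $4 }')
echo "free inodes on /: $free_inodes"
```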


File System Diagrams are Wrong

Most diagrams showing file systems and links in Unix texts are wrong and range from confusing to seriously misleading.

Here's the truth, complete with an ASCII-art file system diagram below.

The names for inodes (names for files, directories, devices, etc.) are stored on disk in directories.

Only the *names* and the associated *inode numbers* are stored in the directory; the actual *disk space* for whatever data is being named is stored in the numbered inode, not in the directory.

The names and numbers are kept in the directory; the names are *not* kept with the data, which is in the inode.

In the directory, beside each name, is the index number (inode number) indicating where to find the disk space used to actually store the thing being named.

You can see this name-inode pairing using ls -i:

ls -i /usr/bin/perl*
 
38928562 /usr/bin/perl
38928562 /usr/bin/perl5.26.1
38933064 /usr/bin/perl5.26-x86_64-linux-gnu
38933074 /usr/bin/perlbug
38933075 /usr/bin/perldoc
38929284 /usr/bin/perli11ndoc
38933076 /usr/bin/perlivp
38933074 /usr/bin/perlthanks

The crucial thing to know is that the names and the actual storage for the things being named are in *separate places*.

Most texts make the error of writing Unix file system diagrams that put the names right *on* the things that are being named.

That is misleading and the cause of many misunderstandings about Unix files and directories.

Names exist one level *above* (separate from) the items that they name:

  WRONG - names on things      RIGHT - names above things
  =======================      ==========================
                                                        
      R O O T            --->         [etc,bin,home]   <-- ROOT directory
     /   |   \                         /    |      \
  etc   bin   home       --->  [passwd]  [ls,rm]  [abcd0001]
   |   /   \    \                 |      /    \       |
   |  ls   rm  abcd0001  --->     |  <data>  <data>  [.bashrc]
   |               |              |                   |
  passwd       .bashrc   --->  <data>                <data>

Directories are lists of names and inode numbers, as shown by the square-bracketed lists in the diagram on the right, above. (The actual inode numbers are omitted from this small diagram.)

The name of each thing (file, directory, special file, etc.) is kept in a directory, separate from the storage space for the thing it names.

This allows inodes to have multiple names and to have names in multiple directories; all the names can refer to the same storage space by simply using the same inode number.

In the correct diagram on the right, the directories are lists of names in square brackets.

Directories give names to the objects below them in the tree.

The top directory on the right is the ROOT directory inode, containing the list of names under it: `etc`, `bin`, `home`, and others.

The line leading downwards from the name `bin` in the ROOT directory indicates that the name `bin` is paired with an inode number that is another directory inode containing the list of names in the `bin` directory, including names `ls` and `rm` and others.

The line leading down from `ls` in the `bin` directory inode leads to the data inode for the file `/bin/ls`.

There is no name kept with the data inode – the name is up in the directory above it.

The ROOT directory inode has no name because there is no directory above it to give it one!

Every other directory except ROOT has a name because there is a directory inode above it that contains its name.


Directories hold only names and inode numbers

To make a hierarchical file system, file system names are stored in directories.

Each Unix directory is itself an inode.

Like all inodes, directory inodes contain pointers to disk blocks and attribute information about the inode (permissions, owner, etc.), but what is stored in the disk blocks of a directory inode is not file data but directory data.

That directory data is simply a list of names and inode numbers.

The directory inode contains attribute information about the directory itself, not about the things named in the directory. (Use ls -ld to see the attributes of the directory inode itself.)
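
The difference between ls -ld and ls -l can be seen on a scratch directory. This is only an illustration; the directory and the two file names are made up.

```shell
#!/bin/sh
tmp=$(mktemp -d)
trap 'rm -rf "$tmp"' EXIT
touch "$tmp/a" "$tmp/b"

# With -d: one line describing the directory inode itself (type 'd').
dir_line=$(ls -ld "$tmp")
echo "$dir_line"

# Without -d: one line per name in the directory, each describing
# the separate inode that the name is paired with.
ls -l "$tmp"
```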

NOTE: For each item in the file system, only the name and the inode number of the item is kept in the directory.

No data or attributes about the thing are kept in the directory, only the name of the item and its inode number.

The *name* of a file is kept in a directory, paired with its inode number.

The file's actual attributes and pointers to disk blocks are kept elsewhere, in the inode for the file.

This means that names are not kept in the same inodes with the things that they name.

Directories are what give names to inodes on Unix.

Directories can be thought of as “files containing lists of names and inode numbers”.

Files have disk blocks containing file data; directories also have disk blocks, but those blocks contain lists of names and inode numbers.

NOTE: If a directory is damaged in Unix, only the names are lost, not any of the file data blocks or the file attributes.


Attributes are stored with the inode, not the name

A Unix directory is only a list of pairs of names and associated inode numbers.

The attribute information about an item named in a directory – the type, permissions, owner, etc. of the thing – is kept with the inode associated with the thing, not in the directory itself.

Reading a Unix directory tells you only some names and inode numbers; you know nothing about the types, sizes, owners, or modify times of those inodes unless you actually go out to each separate inode on disk and access it to read its attributes.

Without actually accessing the inode, you can't know the attributes of the inode; you can't even know if the inode is a file inode or a directory inode.

NOTE: Some modern Unix file systems also cache a second copy of the inode type in the directory to speed up common file system browsing operations.

Attribute information about a file system object is stored in the inode of the object, not in the directory.

You must first use the inode number associated with the object to find the inode of the item and look at the item's attributes.

This is why ls or ls -i are much faster than ls -l on a huge directory:

  • ls or ls -i only need to read the names and inode numbers from the directory – no additional inode access is needed because no other attributes are being queried.
    • Reading the one directory inode is sufficient.
  • ls -l has to display attribute information for every object named in the directory, so it has to do a separate inode lookup to find out the inode attribute information for every inode in the directory.
    • A directory with 1000 names in it requires 1000 separate inode lookups to fetch the attributes!

No attribute information about the things named in the directory is kept in the directory (except on those modern file systems where caching of inode type is enabled).

The directory only contains pairs of names and inode numbers.

To find a thing by name, the system goes to a directory inode, looks up the name in the disk space allocated to that directory, finds the inode number associated with the name, then goes out to the disk a second time and finds that inode on the disk.

If that inode is another directory, the process repeats from left-to-right along the pathname until the inode of the last pathname component (on the far right in the pathname) is found.

Then the disk block pointers of that last inode can be used to find the data contents of the last pathname component.
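
The left-to-right lookup described above can be imitated from the shell by printing the inode number of every prefix of a pathname. This is a rough sketch, not how the kernel is implemented: the function name `trace_path` and the scratch tree are hypothetical, and GNU `stat` (`-L` to follow symbolic links, `-c` for the format) is assumed.

```shell
#!/bin/sh
# Build a small tree to trace.
tmp=$(mktemp -d)
trap 'rm -rf "$tmp"' EXIT
mkdir -p "$tmp/home/peter"
echo 'file data' > "$tmp/home/peter/test"

# Print "inode pathname" for the ROOT directory and then for each
# successive component of an absolute pathname, left to right.
trace_path() {
    printf '%s /\n' "$(stat -L -c %i /)"
    old_ifs=$IFS
    IFS=/
    set -f                 # no wildcard expansion while splitting
    set -- $1              # split the pathname on slashes
    set +f
    IFS=$old_ifs
    current=
    for name in "$@"; do
        [ -n "$name" ] || continue   # skip the empty name before the first slash
        current=$current/$name
        printf '%s %s\n' "$(stat -L -c %i "$current")" "$current"
    done
}

trace=$(trace_path "$tmp/home/peter/test")
printf '%s\n' "$trace"
```

Each output line corresponds to one directory lookup followed by one inode access; the last line is the inode of the final pathname component.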


Damaged directories create orphans

The name and inode number pairing in a Unix directory is the only connection between a name and the thing it names on disk.

The name is kept separate from the data belonging to the thing it names (the actual inode on disk).

If a disk error damages a directory inode or the directory disk blocks, file data is not usually lost, since the actual data for the things named in the directory is stored in inodes separate from the directory itself.

If a directory is damaged, only the names of the things are lost and the inodes become “orphan” inodes without names.

The storage used for the things themselves is elsewhere on disk and may be undamaged.

You can run a file system recovery program such as fsck to recover the data (but not the names).

The name of an item (file, directory, etc.) and its inode number are the only things kept in a directory.

The directory storage for that name and number is managed by its own inode that is separate from the inode of each thing in the directory.

The name and number are stored in the directory inode; the data for the item named is stored in its own inode somewhere else.


Because (1) data in a file is managed by an inode with a unique number, (2) the name of the file is not kept in that inode, and (3) directories pair names with inode numbers, a Unix file (inode) can be given multiple names by having multiple name-and-inode pairs in one or more directories.

Inode `123` may be paired with the name `cat` in one directory and the same `123` may be paired with the name `dog` in the same or a different directory.

Either name leads to the same `123` file inode and the same data and attributes.

Though there appear to be two different files `cat` and `dog` in the directory, the only thing different between the two is the name – both names lead to the same inode and therefore to the same data and attributes (permissions, owner, etc.).

NOTE: You can use ls -i to see the inode numbers paired with each name, and the find command has a useful -inum expression operator.
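
The `cat` and `dog` example can be reproduced in a scratch directory. This is a sketch under assumptions: the directory names `a` and `b` are made up, the scratch directory must be on a file system that supports hard links, and GNU `stat -c` is assumed.

```shell
#!/bin/sh
tmp=$(mktemp -d)
trap 'rm -rf "$tmp"' EXIT
mkdir "$tmp/a" "$tmp/b"

echo 'meow' > "$tmp/a/cat"
ln "$tmp/a/cat" "$tmp/b/dog"     # a second name for the same inode

# Both names are listed with the same inode number.
ls -i "$tmp/a/cat" "$tmp/b/dog"

# find -inum lists every name under $tmp that is paired with that inode.
inode=$(stat -c %i "$tmp/a/cat")
names=$(find "$tmp" -inum "$inode" | sort)
printf '%s\n' "$names"
```

Note that find only reports names below the starting directory; names for the same inode elsewhere on the file system would need a wider search.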


Multiple names for the same inode are called “hard links”.

The system keeps a “link count” in each inode that counts the number of names each inode has been given.

The ln command can create a new name (a new hard link) in a directory for an existing file inode, increasing the file's inode link count.

The rm command removes a name (a hard link) from a directory, decreasing the file's inode link count.

When the link count for an inode goes to zero, the inode has no names and the inode is freed and recycled and all the storage and data used by the item is released.

The rm command does not remove *files*; it removes *names* for files.

When all the names for a file inode are removed, the system removes the inode itself and releases all the disk space.

As long as an inode has at least one name in some directory (a non-zero link count), it cannot be freed up and released.
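
The effect of ln and rm on the link count can be watched directly. This is a minimal sketch with hypothetical file names, assuming GNU `stat -c %h` to print the link count.

```shell
#!/bin/sh
tmp=$(mktemp -d)
trap 'rm -rf "$tmp"' EXIT

echo 'payload' > "$tmp/first"
count_one=$(stat -c %h "$tmp/first")         # one name so far

ln "$tmp/first" "$tmp/second"
count_two=$(stat -c %h "$tmp/first")         # two names now

rm "$tmp/first"                              # removes a name, not the data
count_after_rm=$(stat -c %h "$tmp/second")   # back to one name
survivor=$(cat "$tmp/second")

echo "links: $count_one -> $count_two -> $count_after_rm, data: $survivor"
```

The data is still readable through the remaining name after rm, because the inode keeps a non-zero link count.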


Tracing Inodes in Pathnames

Slashes separate names in pathnames

When you look at a Unix pathname, remember that the slashes separate the names of the pathname components.

All the components to the left of the rightmost slash must be directories, including the “empty” ROOT directory name to the left of the leftmost slash.

For example:

/home/peter/test

In the above example, there are three slashes and therefore four pathname components:

  1. The nameless ROOT directory is the start of this absolute pathname.
  2. Inside the above ROOT directory is the name of the `home` directory.
  3. Inside the above `home` directory is the name of the `peter` directory.
  4. Inside the above `peter` directory is the name of the `test` file.

NOTE: Slashes separate names.

The “empty” name in front of the first slash is the name of the ROOT directory.

The ROOT directory doesn't have a name!

The last (rightmost) component of a pathname can be a file or a directory (or any other thing, such as a symbolic link); for this example, let's assume `test` is a name for a file inode.


Names reside above the things they name

Below is a file system diagram written correctly, with the names for things shown in the directory one level above the things to which the names actually refer.

Each box represents an inode; the inode numbers for the box are given beside the box, on the left.

Inside the directory inodes you can see the pairing of names and inode numbers. (These inode numbers are made up – see your actual Unix system for some real inode numbers.)

One of the inodes below, `#12`, is not a directory; it is an inode for a file and the inode contains the file data.

The downward arrows in the diagram trace two paths (hard links) to the same `#12` file data, `/home/peter/test` and `/home/peter/Downloads/afile`.

We will trace the inodes for two pathnames in the diagram below:

  1. `/home/peter/test`
  2. `/home/peter/Downloads/afile`

Follow the downward-pointing arrows:

      +----+-----+-----------------------------------------+
  #2  |. 2 |.. 2 | home 5 | usr 9 | tmp 11 | etc 23 | ...  |
      +----+-----+--v--------------------------------------+
                    |  The inode #2 above is the ROOT directory. 
                    |  It has the name "home" in it. 
                    |  The *directory* "home" is not here; only the *name* is here. 
                    |  The ROOT directory itself does not have a name, because there is no directory above it to give it a name!
                    V  
      +----+-----+--------------------------------------------------------+
  #5  |. 5 |.. 2 | peter 31 | virginia 36 | felix 39 | abcd0001 21 | ...  |
      +----+-----+--v-----------------------------------------------------+
                    |  The inode #5 above is the "home" directory. 
                    |  The name "home" isn't here; it's up in the ROOT directory, above.
                    |  This directory has the name "peter" in it.
                    V
      +----+-----+--------------------------------------------------------------+
  #31 |. 31|.. 5 | test 12 | temp 15 | Documents 8 | Downloads 7 | demo 6 | ... |
      +----+-----+--v-----------------------------------v-----------------------+
                    |  The inode #31 above is the       |
                    |  "peter" directory.               |
                    |  The name "peter" isn't here;     |
                    |  it's up in the "home" directory, |
                    |  above.                           |
                    |  This directory has the names     |
                    |  "test" and "Downloads" in it.    |
                    |                                   V
      +----+-----+--|------------------------------------------+
  #7  |. 7 |.. 31|  |  afile 12 | morestuf 123 | junk 99 | ... |
      +----+-----+--|-------v----------------------------------+
                    |       |  The inode #7 above is the "Downloads" directory.
                    |       |  The name "Downloads" isn't here; 
                    |       |  it's up in the "peter" directory.
                    |       |  This directory has the name "afile" in it.
                    V       V
                   *-----------*  This inode #12 on the left is a file inode.
                   | file data |  It contains pointers to the data blocks for the file.
               #12 | file data |  This file inode has two names, "test" and "afile", but those names are not here.
                   | file data |  Those two names are up in the two directories that point to this file, above.
                   *-----------*  Because this inode has two names, it has a link count of two.
                                  

The absolute pathname `/home/peter/test` starts at the nameless ROOT directory, inode `#2`.

It travels through two more directory inodes and stops at file inode `#12`.

Using all four inode numbers, `/home/peter/test` could be written as `#2→#5→#31→#12`.

The absolute pathname `/home/peter/Downloads/afile` starts at the ROOT inode and travels through three more directory inodes.

It stops at the same `#12` file inode as `/home/peter/test`.

Using all five inode numbers, `/home/peter/Downloads/afile` could be written as `#2→#5→#31→#7→#12`.

Thus, `/home/peter/test` and `/home/peter/Downloads/afile` are two absolute pathnames leading to the same inode `#12` file data.

The names `test` and `afile` are two names for the same file and are called “hard links”.

Because the file inode `#12` has two names, it has a “link count” of two.

Let's examine each of the two pathnames and their inodes in more detail.


Tracing Pathname 1: `/home/peter/test`

NOTE: Remember: Directories are chunks of storage that pair names with inode numbers.

That is all that is in a directory: names and inode numbers.

The box below represents the layout of names and inode numbers inside the actual disk space given to the nameless ROOT directory, inode `#2`:

      +----+-----+-----------------------------------------+
  #2  |. 2 |.. 2 | home 5 | usr 9 | tmp 11 | etc 23 | ...  |
      +----+-----+-----------------------------------------+

If you look at the ROOT directory above, you will see that both the name `.` and the name `..` in this ROOT directory are paired with inode `#2`, the inode number of the ROOT directory itself.

Following either name `.` or `..` will lead to inode `#2` and right back to this same ROOT inode.

The ROOT directory is the only directory that is its own parent.
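
This is easy to check on a real system. The sketch assumes GNU `stat -c`; the actual inode number of the ROOT directory depends on the file system and need not be 2.

```shell
#!/bin/sh
# The three pathnames below should all reach the same inode.
root_inode=$(stat -c %i /)
root_dot=$(stat -c %i /.)
root_dotdot=$(stat -c %i /..)
echo "/ = $root_inode, /. = $root_dot, /.. = $root_dotdot"
```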

The above ROOT directory has the name `home` in it, paired with inode `#5`.

The actual disk *space* of the directory `home` is not here; only the *name* `home` is here, along with its own inode number `#5`.

To read the actual contents of the `home` directory, find the disk space managed by inode `#5` somewhere else on disk and look there.

NOTE: In fact, until we look up inode `#5` and find out that it is a directory inode, we have no way of even knowing that the name `home` is a name of a directory!

The above ROOT directory pairing of `home` with inode `#5` is what gives the `home` directory inode its name.

The name `home` is separate from the disk space for `home`.

The ROOT directory itself does not have a name, because it has no parent directory to give it a name!

Let us move to the storage space for the `home` directory at inode `#5`.

The box below represents the layout of names and inode numbers inside the actual disk space given to the `home` directory, inode `#5`:

      +----+-----+--------------------------------------------------------+
  #5  |. 5 |.. 2 | peter 31 | virginia 36 | felix 39 | abcd0001 21 | ...  |
      +----+-----+--------------------------------------------------------+

The name `home` for this inode `#5` isn't found in this inode; the name `home` is given to inode `#5` up in the ROOT directory.

Names are separate from the things they name.

We see that the name `.` above leads back to this same `#5` inode, which is why `/home` and `/home/.` lead to the same `#5` inode.

We see that the name `..` above leads up to the parent `#2` inode (the ROOT inode), which is why `/home/..` leads us to `/` the ROOT.

The above `home` directory has the name `peter` in it, paired with inode `#31`.

The actual disk *space* of the directory `peter` is not here; only the *name* `peter` is here, along with its own inode number `#31`.

To read the actual contents of the `peter` directory, find the disk space managed by inode `#31` somewhere on disk and look there.

NOTE: In fact, until we look up inode `#31` and find out that it is a directory inode, we have no way of even knowing that the name `peter` is a name of a directory!

The above `home` directory pairing of `peter` with inode `#31` is what gives the `peter` directory inode its name.

The name `peter` is separate from the disk space for `peter`.

Let us move to the storage space for the `peter` directory at inode `#31`.

The box below represents the layout of names and inode numbers inside the actual disk space given to the `peter` directory, inode `#31`:

      +----+-----+--------------------------------------------------------------+
  #31 |. 31|.. 5 | test 12 | temp 15 | Documents 8 | Downloads 7 | demo 6 | ... |
      +----+-----+--------------------------------------------------------------+

The name `peter` for this inode isn't in this inode; the name `peter` is given to inode `#31` up in the `home` directory.

Names are separate from the things they name.

We see that the name `.` above leads back to this same `#31` inode, which is why `/home/peter` and `/home/peter/.` lead to the same `#31` inode.

We see that the name `..` above leads up to the parent `#5` inode (the `/home` inode), which is why `/home/peter/..` leads us to `/home`.
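
The `.` and `..` pairings can be confirmed on a scratch copy of this layout. The tree below is hypothetical and lives under a temporary directory rather than the real `/home`; GNU `stat -c` is assumed.

```shell
#!/bin/sh
tmp=$(mktemp -d)
trap 'rm -rf "$tmp"' EXIT
mkdir -p "$tmp/home/peter"

home_inode=$(stat -c %i "$tmp/home")
peter_inode=$(stat -c %i "$tmp/home/peter")
peter_dot=$(stat -c %i "$tmp/home/peter/.")      # should match peter
peter_dotdot=$(stat -c %i "$tmp/home/peter/..")  # should match home

echo "home=$home_inode peter=$peter_inode peter/.=$peter_dot peter/..=$peter_dotdot"
```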

The above `peter` directory has the name `test` in it, paired with inode `#12`.

The actual disk *space* of the file `test` is not here; only the *name* `test` is here, along with its own inode number `#12`.

To read the actual data of the file `test`, find the disk space managed by inode `#12` somewhere on disk and look there.

NOTE: In fact, until we look up inode `#12` and find out that it is a plain file inode, we have no way of even knowing that the name `test` is a name of a plain file!

The above `peter` directory pairing of `test` with inode `#12` is what gives the `test` file inode one of its two names.

The name `test` is separate from the disk space for `test`.

Let us move to the storage space for the `test` file at inode `#12`.

The box below represents the actual disk space given to the `test` file, inode `#12`:

      +-----------+
  #12 | file data |
      +-----------+

The name `test` for this inode isn't in this inode; the name `test` is up in the `peter` directory.

Names are separate from the things they name.

NOTE: This `test` inode is a file inode, not a directory inode, and the attributes of this inode will indicate that.

All the attributes of an inode – type, permissions, owner, group, modify date, etc. – are stored in the inode itself.

The only thing *not* stored in the inode is the *name* of the inode, which is always stored in the directory above the inode (the parent directory of the inode).

The inode for a file contains pointers to disk blocks that contain file data, not directory data.

There are no special directory names `.` and `..` in files.

There are no names here at all; the disk block pointers in this inode point to just file data (whatever is in the file).

This completes the inode trace for `/home/peter/test`: `#2→#5→#31→#12`

But `test` is just one of the names for inode `#12`; it has another name, too.


Tracing Pathname 2: `/home/peter/Downloads/afile`

NOTE: Remember: Directories are chunks of storage that pair names with inode numbers.

That is all that is in a directory: names and inode numbers.

Let's now trace the inode path for the name `/home/peter/Downloads/afile`.

This pathname is a “hard link” to `/home/peter/test`; both the `test` and `afile` names point to the same inode number.

Let's see how this is possible.

The trace from ROOT through `/home/peter` is the same as before.

Things change in our second trace because of `/home/peter/Downloads`.

If we look at the `peter` directory inode `#31` again, we see that the name `Downloads` is paired with inode `#7`:

      +----+-----+--------------------------------------------------------------+
  #31 |. 31|.. 5 | test 12 | temp 15 | Documents 8 | Downloads 7 | demo 6 | ... |
      +----+-----+--------------------------------------------------------------+

The above `peter` directory has the name `Downloads` in it, paired with inode `#7`.

The actual disk *space* of the directory `Downloads` is not here; only the *name* `Downloads` is here, along with its own inode number `#7`.

To read the actual contents of the directory `Downloads`, find the disk space managed by inode `#7` somewhere else on disk and look there.

NOTE: In fact, until we look up inode `#7` and find out that it is a directory inode, we have no way of even knowing that the name `Downloads` is a name of a directory!

Let us move to the storage space for the `Downloads` directory at inode `#7`.

The box below represents the layout of names and inode numbers inside the actual disk space given to the `Downloads` directory, inode `#7`:

      +----+-----+--------------------------------------------+
  #7  |. 7 |.. 31|    afile 12 | morestuf 123 | junk 99 | ... |
      +----+-----+--------------------------------------------+

The name `Downloads` for this inode isn't in this inode; the name `Downloads` is given to inode `#7` up in the `peter` directory.

Names are separate from the things they name.

We see that the name `.` above leads back to this same `#7` inode, which is why `/home/peter/Downloads` and `/home/peter/Downloads/.` lead to the same `#7` inode.

We see that the name `..` above leads up to the parent `#31` inode (the `/home/peter` inode), which is why `/home/peter/Downloads/..` leads us to `/home/peter`.

The above `Downloads` directory has the name `afile` in it, paired with inode `#12`.

The actual disk *space* of the file `afile` is not here; only the *name* `afile` is here, along with its own inode number `#12`.

To read the actual data of the file `afile`, find the disk space managed by inode `#12` somewhere on disk and look there.

NOTE: In fact, until we look up inode `#12` and find out that it is a plain file inode, we have no way of even knowing that the name `afile` is a name of a plain file!

The above `Downloads` directory pairing of `afile` with inode `#12` is what gives the `afile` file inode the other one of its two names.

The name `afile` is separate from the disk space for `afile`.

You will recall that we have seen inode `#12` in the previous trace.

Above, in the `peter` directory (inode `#31`), inode `#12` was also paired with the name `test`.

In the `Downloads` directory (inode `#7`), inode `#12` is paired with the name `afile`.

Inode `#12` therefore has two different names; the names `test` and `afile` are both hard links to the same inode `#12`, and the `ls` command can prove this:

ls -i /home/peter/test /home/peter/Downloads/afile
 
12 /home/peter/test   12 /home/peter/Downloads/afile

Having two names means the “link count” of inode `#12` is set to “`2`”.

Both names lead to the same `#12` inode and thus to the same data and same attributes.

This is *one* single file with *two* names.

A change to the file data using the name `test` changes the data in inode `#12`.

That changes the file data seen through the name `afile` too, because `test` and `afile` are two names for the same `#12` inode storage: both names point to the same storage inode.

All the inode attributes (everything about data inode `#12` except its name) are kept with the inode.

The only thing different in a long listing of `test` and `afile` will be the names; everything else (file type, permissions, owner, group, link count, size, modification times, etc.) is part of inode `#12` and must therefore be identical for the two names.

Neither name is more “original” than the other; both names have equal status.

To release the `#12` inode storage, you have to delete both names so that the link count of inode `#12` drops to zero.
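The behaviour described above can be reproduced in a scratch directory. This is a sketch under assumptions: the directory comes from `mktemp -d` rather than `/home/peter`, the file contents are made up, and the inode number you see will not be `#12`:

```shell
# Work in a throwaway directory; the names mirror the example above.
dir=$(mktemp -d)
mkdir "$dir/Downloads"
echo "first line" > "$dir/test"

# Give the same inode a second name (a hard link).
ln "$dir/test" "$dir/Downloads/afile"

# Both names show the same inode number; ls -l shows a link count of 2.
ls -i "$dir/test" "$dir/Downloads/afile"
ls -l "$dir/test" "$dir/Downloads/afile"

# Data written through one name is visible through the other.
echo "second line" >> "$dir/test"
via_afile=$(cat "$dir/Downloads/afile")

# Removing one name leaves the data reachable through the remaining name.
rm "$dir/test"
links_left=$(ls -l "$dir/Downloads/afile" | awk '{print $2}')
echo "links left: $links_left"
```

Only when the last remaining name is also removed does the link count reach zero and the storage become free.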


Summary Tracing Pathname 2: `/home/peter/Downloads/afile`

Let's summarize the inodes used in this pathname:

  • /home/peter/Downloads/afile

Start on the left and walk the tree of names and inodes left to right.

To be a valid Unix path, everything to the left of the rightmost slash must be a directory. (Thus, ROOT, `home`, `peter`, and `Downloads` must be directories, if this is a valid pathname.)

Start with the nameless ROOT directory in front of the first slash (ROOT doesn't have a name, since it does not appear in any parent directory) and look for the first pathname component (`home`) inside that ROOT directory (inside inode `#2`).

Let's trace the pathname:

  • Look in the ROOT directory (located in inode `#2`) for the name of the first pathname component: `home`.
  • We find the name `home` inside the ROOT directory, paired with inode `#5`.
  • Go back out to the disk to find inode `#5` that is the actual `home` directory.

NOTE: The names are separate from the things they name.

The actual directory inode `#5` of the `home` directory is not the same as the inode `#2` of the ROOT directory that contains the directory name `home`.

The name is stored in a different place (`#2`) than the thing it names (`#5`).

  • In inode `#5`, the directory inode that has the name `home`, look for the name `peter`.
  • We find `peter` paired with inode `#31`.
  • Go back out to the disk to find inode `#31` that is the actual `peter` directory inode.
  • Again, the name `peter` is contained in directory inode `#5` (`home`) and that name is stored separately from inode `#31` that is the actual `peter` directory itself.

  • In inode `#31`, the directory inode that has the name `peter`, look for the name `Downloads`.
  • We find `Downloads` paired with inode `#7`.
  • Go back out to the disk to find inode `#7` that is the actual `Downloads` directory inode.
  • Again, the name `Downloads` is contained in directory inode `#31` (`peter`) and that name is stored separately from the inode `#7` that is the actual `Downloads` directory itself.

  • In inode `#7`, the directory inode that has the name `Downloads`, look for the name `afile`.
  • We find `afile` paired with inode `#12`.
  • Go back out to the disk to find inode `#12` that is the actual data of the file `afile`.
  • Again, the name `afile` is contained in directory inode `#7` (`Downloads`) and that name is stored separately from the inode `#12` that is the actual data of the file.

The name of a file is not part of the inode that makes up the actual file data.

We have found the inode that is the file data: inode `#12`.

The name of this file, `afile`, is stored up in inode `#7` that is the `Downloads` directory.

The name is separate from the data it names.
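The left-to-right walk traced above can be imitated from the shell. In this rough sketch the variable `root` is a temporary directory standing in for the ROOT directory, so the inode numbers printed will differ from `#2`, `#5`, `#31`, `#7` and `#12`:

```shell
# Stand-in tree: "$root" plays the part of the ROOT directory.
root=$(mktemp -d)
mkdir -p "$root/home/peter/Downloads"
echo "data" > "$root/home/peter/Downloads/afile"

# Walk the pathname left to right, one component at a time.
path=$root
steps=0
for name in home peter Downloads afile; do
    path="$path/$name"
    ls -di "$path"          # inode number found for the pathname so far
    steps=$((steps + 1))
done
```

Each line of output pairs an inode number with the pathname walked so far; the name itself was found in the directory one step to the left.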


Every directory inode contains a name `..` (dot dot) paired with the number of the inode that is the unique parent directory of the inode.

That unique parent directory is the only one containing the name of this inode.

Because a directory can have only one parent, hard links are not permitted for directories. (The name `..` can only link to one parent directory.)

Directory inode `#5` above contains the name `..` paired with inode `#2` (the ROOT directory), and it is in that `#2` inode directory that we see that inode `#5` is paired with the name `home`.

The parent directory of inode `#5` is the directory that contains the name `home`.

Unlike directory inodes, file inodes contain no record of which parent directories give them their names.

The only thing that is recorded in the file inode is the number of names the inode has: the link count.

File inode `#12` has two names, so it has a link count of two.

The inode has no information about which directories hold the two names.

There is no easy way to know which directories give a file inode its one or more names.

In a file inode, there is no name `..` to point to a parent directory, because a file inode might have hundreds or thousands of parent directories (thousands of names).

NOTE: If you have a file inode with multiple names (a link count larger than one) and you want to find the other names for the inode, you have to do a brute-force search in every directory on the file system to see which directories might have names paired with this inode number.

Usually, the multiple names for an inode are in the same directory or in directories that are closely related to each other (parent directories, sub-directories, or sibling directories), but that isn't always the case.

In the worst case, finding all the names for a file inode may require searching for that inode number in *every* directory in the whole file system, something that could take hours on a very large file system.
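The brute-force search can be done with `find`. This is a minimal sketch on a small scratch tree (the directory names `a` and `b` are made up for illustration); on a real system you would start the search at the mount point of the file system instead of `"$dir"`:

```shell
# Scratch tree with one file inode that has two names.
dir=$(mktemp -d)
mkdir "$dir/a" "$dir/b"
echo "data" > "$dir/a/test"
ln "$dir/a/test" "$dir/b/afile"

# Look up the inode number behind one of the names.
ino=$(ls -i "$dir/a/test" | awk '{print $1}')

# Search every directory below "$dir" for names paired with that inode number.
# -xdev keeps find on one file system, because inode numbers are only
# unique within a single file system.
names=$(find "$dir" -xdev -inum "$ino" | sort)
echo "$names"
```

GNU `find` also accepts `-samefile NAME`, which saves looking up the inode number first.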

A file is deleted from disk only when its link count goes to zero, i.e. when all the names for the inode are removed.

Then the disk blocks for the inode are returned to the system for use by other files, and the inode (with a zero link count) is returned to the pool of free and available inodes.


Permissions on data vs. permissions on directories

Each Unix inode has a set of permissions that govern what a process can do to that inode.

Since names are stored in directory inodes, separate from the inodes of things they name, a process may have permissions to change the name of a thing in a directory inode without having permissions to change the data in the inode that is the thing itself, or vice-versa.

  • If a process has permission to change a directory inode, it can change the names of things in that directory inode, including adding names (e.g. ln), removing names (e.g. rm), or renaming names (e.g. mv).
    • These are operations on directory inodes.
  • If a process has permission to change a file inode, it can erase, append to, or change the content of the file itself.
    • The file inode is different from the inode containing the name of the file.

Names are stored in directory inodes, separate from the things they name, so a process may have permissions to change the content of a file (the file inode) without having permissions to change the name of the file (the directory inode containing the name), or vice-versa.

If file data inode `#12` above has appropriate permission attributes, a process could read or write the data in that file inode.

It is the permission attributes on the inode `#12` containing the file *data* that govern what a process can do with the *data* in the file.

The two names of the file, either `test` or `afile`, are stored up in directory inodes separate from the inode `#12`.

The permissions on the inodes of those two parent directories containing the *names* of the file do not control whether a process can modify the inode containing the *data* of the file.

It is the inode that contains the data (`#12`) that controls whether a process can read or write the data in that inode.

Directory inodes have permissions that control whether a process is allowed to pass through the directory to access the things named in the directory.

This is called *access* or *search* permission (`x`).

If any of the inodes of the directories containing the names leading down to the file at inode `#12` don't give the process *search* permission, the process won't be able to reach the file's data inode that way and won't be able to access the file's data using those directories; but perhaps some other directories may lead the process to the same inode `#12`, if the file has another name.

To access and read the data in a file path such as:

  • /home/peter/Downloads/afile

you need appropriate *search* permissions on the ROOT directory inode, the `home` directory inode, the `peter` directory inode, the `Downloads` directory inode, and finally *read* permissions on the `afile` file data inode `#12`.
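The effect of a missing *search* permission can be tried out safely in a scratch tree. This sketch makes some assumptions: the names `elsewhere` and `alias` are made up, the tree lives under `mktemp -d`, and the blocked result only appears for an ordinary user, since root bypasses permission checks:

```shell
top=$(mktemp -d)
mkdir -p "$top/peter/Downloads" "$top/elsewhere"
echo "data" > "$top/peter/Downloads/afile"
ln "$top/peter/Downloads/afile" "$top/elsewhere/alias"   # second name, other path

chmod a-x "$top/peter"          # remove search permission on one directory

# The data inode is unchanged, but this path to it is now blocked
# (root would still succeed).
if cat "$top/peter/Downloads/afile" >/dev/null 2>&1; then
    via_peter="readable"
else
    via_peter="blocked"
fi

# The other name reaches the same inode through directories that still
# grant search permission.
via_alias=$(cat "$top/elsewhere/alias")

chmod u+x "$top/peter"          # restore so the tree can be cleaned up
echo "via peter: $via_peter, via alias: $via_alias"
```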

It is the file data inode `#12` permissions that determine whether or not you can read or change the *data* of the file.

Reading or changing the data in the file requires permissions on the inode `#12` that contains the data blocks of the file itself.

It is the `Downloads` directory inode permissions (inode `#7`) that determine what you can do with the *name* `afile` of the file, because the `Downloads` directory (inode `#7`) is where the name `afile` is kept.

Changing, linking to, or removing the name of a file operates on the inode of the *directory* in which the file name appears; altering the name has nothing to do with reading or changing the inode that contains the data blocks of the file itself.

You can have no permissions on the inode that contains the data blocks of the file itself (it may even be owned by some other user) and still you may be able to rename or remove one of the names of the file from a directory on whose inode you do have permissions.

The names of a file are stored in directory inodes, separate from the inode that manages the data blocks of the file.

Names are separate from the things that they name.

The permissions of the names are also separate from the permissions of the data.

  • Changing a *name* in a directory inode requires write and execute permissions on the *directory* inode containing the name.
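    • No permissions are needed on the inode containing the *data* of the thing being renamed. NOTE: Some Linux kernels add restrictions that change this; for example, when the `fs.protected_hardlinks` sysctl is enabled, a user cannot create a hard link to a file unless they own it or have read and write access to it.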
  • Changing the *content* of a file only requires write permissions on the data inode of the *file* itself, not on the inode of any parent directory that holds one of the names of the file.

Names are separate from the things they name, so two sets of permissions (two inodes) are always involved when a process tries to access a thing.
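The two sets of permissions can be seen working independently. In this sketch the file names `locked` and `note` are made up and the directory comes from `mktemp -d`:

```shell
dir=$(mktemp -d)

# 1. No permissions on the data inode, but write and search permission on
#    the directory inode: the name can still be removed.
echo "secret" > "$dir/locked"
chmod 000 "$dir/locked"
rm -f "$dir/locked"
if [ -e "$dir/locked" ]; then removed="no"; else removed="yes"; fi

# 2. Write permission on the data inode, but no write permission on the
#    directory inode: the content can still be changed.
echo "hello" > "$dir/note"
chmod 555 "$dir"                # directory: read and search only
echo "more" >> "$dir/note"
content=$(cat "$dir/note")

chmod 755 "$dir"                # restore so the directory can be cleaned up
echo "removed: $removed"
echo "$content"
```

While the directory is mode `555`, an ordinary user could not remove or rename `note`, even though its content stays writable.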


Exercise Questions on Hard Links and Directories

  1. Normally when you do ls -l dir you see the permissions of the *contents* of the directory, not the directory itself.

     What command and options are needed to see the access permissions and link count of the directory inode itself, instead of the *contents* of a directory?

  2. When you are inside a directory, what is the name you use to refer to the directory itself? (This name works inside any directory.)

     What name always refers to the unique parent directory?

  3. How many links (names) does a brand new, empty directory have?

     Why isn't it just one link, as it is for a new file? (In other words, why does a new file have one link and a new directory have more than that?)

  4. Why does creating a sub-directory in a directory cause the directory's link (name) count to increase by one for every sub-directory created?

     (Recall that a link count is a count of names.)

  5. Why doesn't the link (name) count of the directory increase when you create files in the directory?

  6. Give the Unix command and its output that shows the inode number and owners of the following directories.

     Only show the given directory; do not show any other directories:

       a)  your current directory
       b)  your parent directory
       c)  your HOME directory
       d)  the directory named `/home`
       e)  the ROOT directory
       f)  the directory named `/root`

     Use a command (and options) that will show only the directory itself, not its contents.


linux/files_system.txt · Last modified: 2022/07/09 13:44 by 185.192.71.121
