Linux namespace: mount namespace

Understanding mount namespace

Users typically use the mount command to mount ordinary file systems, but mount can attach much more than that. Even a fully featured Linux system depends on mounting for its normal operation, starting with the root file system /. All mount operations and mount information are provided and maintained by the kernel; the mount command merely issues the mount() system call to ask the kernel to do the work.

A mount namespace isolates a running environment by giving it its own mount point information. The kernel maintains a separate mount point list for each namespace: the lists of different namespaces are independent, and mounts made in one namespace do not affect the others.

The kernel exposes the mount point information of each process in /proc/<PID>/{mountinfo,mounts,mountstats}:

$ ls -1 /proc/$$/mount*
/proc/26276/mountinfo
/proc/26276/mounts
/proc/26276/mountstats
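
Of the three files, mountinfo carries the most detail. As a minimal sketch (the function name summarize_mountinfo is made up for illustration), the following shell function prints the mount point, file system type, and mount source of each entry. It relies on the field layout of mountinfo: six fixed fields, then zero or more optional fields, then a "-" separator followed by the file system type and source.

```shell
# Print "mount_point fstype source" for each mountinfo line read from stdin.
# Fields before the "-" separator: 1 mount ID, 2 parent ID, 3 major:minor,
# 4 root, 5 mount point, 6 mount options, then zero or more optional fields.
# Fields after the separator: file system type, mount source, super options.
summarize_mountinfo() {
    awk '{
        sep = 0
        # The optional fields vary in number, so search for the separator.
        for (i = 7; i <= NF; i++) if ($i == "-") { sep = i; break }
        if (sep) print $5, $(sep + 1), $(sep + 2)
    }'
}

# Usage: summarize the mounts visible to the current process.
if [ -r /proc/self/mountinfo ]; then
    summarize_mountinfo < /proc/self/mountinfo
fi
```

Run inside and outside a mount namespace, the two listings should differ wherever the namespaces have diverged.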

Having independent mount point information means that each mnt namespace can have its own directory hierarchy. This matters a great deal for containers: a container can mount only its own file systems.

When a mount namespace is created, the kernel copies the mount point list of the current namespace into the new mnt namespace. After that, the two mnt namespaces are independent of each other (not entirely independent; see shared subtrees later).

A mount namespace is created with the --mount (-m) option of the unshare command:

# Create a mount+uts namespace
# -m or --mount creates a mount namespace
# Several namespace types can be created at the same time
unshare --mount --uts <program>
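
One way to confirm that a program really runs in a new mount namespace is to compare the links under /proc/<PID>/ns/mnt: two processes share a mount namespace when both links point to the same mnt:[inode] target. A minimal sketch, with same_mnt_ns as a hypothetical helper name:

```shell
# Return 0 if the two PIDs are in the same mount namespace, 1 if they are not,
# and 2 if either namespace link cannot be read (no such PID, no permission).
same_mnt_ns() {
    a=$(readlink "/proc/$1/ns/mnt" 2>/dev/null) || return 2
    b=$(readlink "/proc/$2/ns/mnt" 2>/dev/null) || return 2
    [ "$a" = "$b" ]
}

# Usage: compare the current shell with its parent process.
if same_mnt_ns "$$" "$PPID"; then
    echo "same mount namespace as parent"
else
    echo "different mount namespace (or link not readable)"
fi
```

In a shell started by unshare --mount, comparing "$$" with the PID of a shell outside the namespace should report a difference.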

Let's do a simple experiment: mount 1.iso at /mnt/iso1 in the root namespace, and mount 2.iso at /mnt/iso2 in a newly created mount+uts namespace:

[~]->$ cd
[~]->$ mkdir iso
[~]->$ cd iso
[iso]->$ mkdir -p iso1/dir1
[iso]->$ mkdir -p iso2/dir2  
[iso]->$ mkisofs -o 1.iso iso1  # Build image file 1.iso from the iso1 directory
[iso]->$ mkisofs -o 2.iso iso2  # Build image file 2.iso from the iso2 directory
[iso]->$ ls
1.iso  2.iso  iso1  iso2
[iso]->$ sudo mkdir /mnt/{iso1,iso2}
[iso]->$ ls -l /proc/$$/ns/mnt
lrwxrwxrwx 1 ... /proc/26276/ns/mnt -> 'mnt:[4026531840]'

# Mount 1.iso at /mnt/iso1 in the root namespace
[iso]->$ sudo mount 1.iso /mnt/iso1  
mount: /mnt/iso1: WARNING: device write-protected, mounted read-only.
[iso]->$ mount | grep iso1
/home/longshuai/iso/1.iso on /mnt/iso1 type iso9660

# Create a mount+uts namespace
[iso]->$ sudo unshare -m -u /bin/bash
# Although this was created as a combined mount+uts namespace,
# note that the mnt namespace and the uts namespace have different inodes
root@longshuai-vm:/home/longshuai/iso# ls -l /proc/$$/ns
lrwxrwxrwx ... cgroup -> 'cgroup:[4026531835]'
lrwxrwxrwx ... ipc -> 'ipc:[4026531839]'
lrwxrwxrwx ... mnt -> 'mnt:[4026532588]'
lrwxrwxrwx ... net -> 'net:[4026531992]'
lrwxrwxrwx ... pid -> 'pid:[4026531836]'
lrwxrwxrwx ... pid_for_children -> 'pid:[4026531836]'
lrwxrwxrwx ... user -> 'user:[4026531837]'
lrwxrwxrwx ... uts -> 'uts:[4026532589]'

# Change the hostname to ns1
root@longshuai-vm:/home/longshuai/iso# hostname ns1
root@longshuai-vm:/home/longshuai/iso# exec $SHELL

# Inside the namespace, the mounts copied from the root namespace are visible
root@ns1:/home/longshuai/iso# mount | grep 'iso1'
/home/longshuai/iso/1.iso on /mnt/iso1 type iso9660

# Mount 2.iso at /mnt/iso2 inside the namespace
root@ns1:/home/longshuai/iso# mount 2.iso /mnt/iso2/
mount: /mnt/iso2: WARNING: device write-protected, mounted read-only.
root@ns1:/home/longshuai/iso# mount | grep 'iso[12]'
/home/longshuai/iso/1.iso on /mnt/iso1 type iso9660
/home/longshuai/iso/2.iso on /mnt/iso2 type iso9660

# Unmount iso1 inside the namespace
root@ns1:/home/longshuai/iso# umount /mnt/iso1/
root@ns1:/home/longshuai/iso# mount | grep 'iso[12]'
/home/longshuai/iso/2.iso on /mnt/iso2 type iso9660
root@ns1:/home/longshuai/iso# ls /mnt/iso1/
root@ns1:/home/longshuai/iso# ls /mnt/iso2
dir2

#### Open another shell terminal window (root namespace)
# The iso1 mount still exists here, and there is no iso2 mount
[iso]->$ mount | grep iso
/home/longshuai/iso/1.iso on /mnt/iso1 type iso9660
[iso]->$ ls /mnt/iso2
[iso]->$ ls /mnt/iso1
dir1

That covers the basics of the mount namespace. There is only one key point: when a mnt namespace is created, the mount point information of the current namespace is copied into it, and from then on the two namespaces are independent.

mnt namespace: shared subtrees

Every mount point in Linux has an attribute that determines whether it shares its child mount points; this feature is called shared subtrees. The attribute decides whether replicas of a mount point are updated in step when a child mount point is added or removed beneath it.

What is this feature for? The mnt namespace makes a convenient example for a brief introduction to shared subtrees.

Suppose a mnt namespace (ns1) is created from the root namespace, so that ns1 holds a copy of the root namespace's mount point information. If you then insert a new disk, format a partition on it, and mount it in the root namespace, by default the newly mounted file system is not visible in ns1. This default behavior can be changed through the shared subtrees attribute.

The purpose of creating a namespace is to provide a fully isolated running environment. For that reason, by default the shared subtrees attribute of every mount point in the new mount namespace is set to private, rather than being copied along with the mount point information.

Therefore, if /mnt/foo is a mount point in namespace ns1 and a mnt namespace ns2 is created from ns1, then by default:

  • Adding a mount point /mnt/foo/bar in ns1 does not affect /mnt/foo in ns2

  • Adding a mount point /mnt/foo/baz in ns2 does not affect /mnt/foo in ns1

  • The same holds for removing mount points

But this default behavior can be changed.

unshare has an option, --propagation private|shared|slave|unchanged, that controls how mount points are shared when a mnt namespace is created.

  • private: the shared subtrees attribute of the mount points in the newly created mnt namespace is set to private, so the mount points of ns1 and ns2 do not affect each other

  • shared: the shared subtrees attribute of the mount points in the newly created mnt namespace is set to shared, so child mount points added or removed in either ns1 or ns2 are propagated to the other

  • slave: the shared subtrees attribute of the mount points in the newly created mnt namespace is set to slave, so adding or removing child mount points in ns1 affects ns2, but changes in ns2 do not affect ns1

  • unchanged: the shared subtrees attribute is copied together with the mount point information, so a mount point that was originally shared is still shared in the new mnt namespace

  • When the --propagation option is not specified, the shared subtrees attribute of the mount points in the new mount namespace defaults to private

For example:

# root namespace: ns0
$ sudo mount --bind foo bar
$ sudo mount --make-shared bar    # Set the mount point to shared

# Create mnt namespace ns1 and copy the shared attribute as well
$ PS1="ns1$ " sudo unshare -m -u --propagation unchanged sh
[ns1]$ grep 'foo' /proc/self/mountinfo 
944 682 8:5 foo bar rw,relatime shared:1

# Add a new child mount point under bar in ns1; it is propagated to ns0
# Because foo is bind-mounted on bar and bar is shared, it also shows up under the foo directory
[ns1]$ sudo mount --bind baz bar/subfoo
[ns1]$ tree foo bar 
foo
└── subfoo
    └── subbaz
bar
└── subfoo
    └── subbaz
[ns1]$ grep 'foo' /proc/self/mountinfo
944 682 8:5 foo bar rw,relatime shared:1
945 944 8:5 baz bar/subfoo rw,relatime shared:1
947 682 8:5 baz foo/subfoo rw,relatime shared:1

# View the mount point information of ns0 in a second terminal session
# The new mounts have been propagated
$ grep 'foo' /proc/self/mountinfo
622 29 8:5  foo bar rw,relatime shared:1
948 622 8:5 baz bar/subfoo rw,relatime shared:1
946 29 8:5  baz foo/subfoo rw,relatime shared:1
$ tree foo bar 
foo
└── subfoo
    └── subbaz
bar
└── subfoo
    └── subbaz
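
The shared:1 tag in the mountinfo lines above is how the kernel reports the shared subtrees attribute. As a rough sketch (propagation_of is a hypothetical helper name), the following function classifies each mountinfo entry by its optional fields: shared:N marks a shared mount, master:N marks a slave mount, unbindable marks an unbindable mount, and an entry with none of these tags is private.

```shell
# Read mountinfo lines on stdin and print "mount_point propagation".
# The optional fields sit between field 6 and the "-" separator:
#   shared:N   -> the mount belongs to peer group N (shared)
#   master:N   -> the mount receives events from peer group N (slave)
#   unbindable -> the mount is unbindable
# An entry with none of these tags is private.
propagation_of() {
    awk '{
        prop = ""
        for (i = 7; i <= NF && $i != "-"; i++) {
            if ($i ~ /^shared:/)         prop = prop (prop ? "," : "") "shared"
            else if ($i ~ /^master:/)    prop = prop (prop ? "," : "") "slave"
            else if ($i == "unbindable") prop = prop (prop ? "," : "") "unbindable"
        }
        print $5, (prop ? prop : "private")
    }'
}

# Usage: show the propagation type of every mount visible to this process.
if [ -r /proc/self/mountinfo ]; then
    propagation_of < /proc/self/mountinfo
fi
```

A mount can carry both shared:N and master:M, in which case the sketch prints shared,slave. Where util-linux is installed, findmnt -o TARGET,PROPAGATION reports similar information.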

Tags: Virtualization

Posted by Wildthrust on Sat, 16 Apr 2022 02:30:04 +0930