08. Storage in Linux


Storage basics
Note: Block devices are represented as files under the /dev directory.
lsblk
to see the list of block devices in the system.
ls -l /dev/ | grep "^b"
lists the files under /dev whose type flag (the first character of the ls -l output) is 'b', that is, block devices.
Traditional spinning hard disks and SSDs are called block devices because data is read from and written to them in fixed-size chunks (blocks).
In the lsblk output, for example, sda is the whole disk, whereas sda1, sda2, and sda3 are partitions of that disk. The major number identifies the type of device. The minor number distinguishes the individual physical or logical devices of that type. Partitions let us segment the space on a disk and use each partition for a specific purpose.
The information about partitions is saved in a partition table. lsblk
is one of several tools that can show the partitions on a disk.
sudo fdisk -l /dev/sda
lists the partition table of /dev/sda. fdisk can also be used to create and delete partitions.
Types of disk partitions
- Primary, Extended, and Logical.
Primary: Can be used to boot an operating system. (Traditionally, disks were limited to no more than four primary partitions.)
Extended: A partition that cannot be used on its own but can host logical partitions. It works around the limit of four primary partitions: we create an extended partition and carve out logical partitions inside it. An extended partition is like a disk drive in its own right; it has a partition table that points to one or more logical partitions.
Logical: A partition created within an extended partition.
Note: How a disk is partitioned is defined by a partitioning scheme, also known as a partition table.
MBR (Master Boot Record) is the traditional partitioning scheme.
There can only be four primary partitions in MBR.
The maximum size per disk is 2 TB. (If we want more than four partitions on a disk, we need to create the fourth partition as an extended partition and carve out logical partitions within it.)
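Where the 2 TB figure comes from can be checked with a little shell arithmetic. The sketch below relies on the fact that MBR stores sector addresses in 32-bit fields, and it assumes 512-byte logical sectors (common, but not universal). The variable names are only for illustration.

```shell
# MBR keeps the start sector and sector count of a partition in 32-bit fields.
# Assumption: 512-byte logical sectors.
sector_size=512
max_sectors=$(( 1 << 32 ))                   # 2^32 addressable sectors
max_bytes=$(( max_sectors * sector_size ))   # largest addressable size in bytes
max_tib=$(( max_bytes >> 40 ))               # 1 TiB = 2^40 bytes
echo "$max_bytes bytes = $max_tib TiB"
```

With larger sector sizes the same 32-bit field addresses proportionally more space, which is why the limit is usually quoted for 512-byte sectors.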
GPT stands for GUID Partition Table. It is a more recent partitioning scheme that was created to address the limitations of MBR.
Theoretically, GPT can have an unlimited number of partitions per disk; in practice the number is limited by the operating system and the partitioning tool (a common default is 128). The 2 TB disk size limit does not exist with GPT.
Unless the operating system to be installed on the disk requires MBR, GPT is generally the better choice of partitioning scheme.
lsblk
list the block devices.
gdisk /dev/sdb
starts creating a partition on the sdb disk (gdisk
is an improved version of fdisk
that works with the GPT partition table). Follow the prompt and type '?' to list the available commands. Use 'n' to create a new partition and, after reviewing the settings, 'w' to write the partition table to the disk. This creates a new partition called /dev/sdb1.
lsblk
or sudo fdisk -l /dev/sdb
shows the new partition. gdisk -l /dev/sdb
prints the partition table.
File system
Partitioning alone does not make a disk usable by the OS. The disk and its partitions are seen by the Linux kernel as raw disks. To write to a disk or partition, we must first create a filesystem.
The filesystem defines how data is stored on a disk. After creating a filesystem, we must mount it on a directory; only then can we read or write data on it.
Most commonly used filesystems » Extended filesystem series: EXT2, EXT3, and EXT4.
Both EXT2 and EXT3 allow a maximum file size of 2 TB and a maximum volume size of 4 TB (the exact limits depend on the block size).
The significant difference between the two is that with EXT2, after an unclean shutdown such as one caused by a power outage, it can take some time for the system to boot back up. EXT3 does not have this drawback: it added journaling, which allows a quicker startup after an ungraceful shutdown.
EXT4 further improves on EXT3 and is still one of the most common general-purpose filesystems in use today. It supports a maximum file size of 16 TB and a volume size of up to 1 exabyte.
EXT3 and EXT4 are backward compatible.
An EXT4 filesystem can be mounted as EXT3 or EXT2, provided it does not use EXT4-only features such as extents. Similarly, an EXT3 filesystem can be mounted as EXT2.
mkfs.ext4 /dev/sdb1
to create an EXT4 filesystem.
mkdir /mnt/ext4
to create a mount-point directory under /mnt.
mount /dev/sdb1 /mnt/ext4
to mount the filesystem on that directory.
To check whether the filesystem is mounted, use mount | grep /dev/sdb1
or df -hP | grep /dev/sdb1
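For scripts, the same check can be made without grep by reading the kernel's mount table. This is a minimal sketch, assuming a Linux system where /proc/mounts is available. is_mounted is a hypothetical helper name, not a standard command.

```shell
# is_mounted DEVICE [MOUNTS_FILE]
# Succeeds (exit status 0) if DEVICE appears in the first column of the
# mount table. MOUNTS_FILE defaults to /proc/mounts and can be overridden,
# for example to test against a sample file.
is_mounted() {
    awk -v dev="$1" '$1 == dev { found = 1 } END { exit !found }' "${2:-/proc/mounts}"
}

if is_mounted /dev/sdb1; then
    echo "/dev/sdb1 is mounted"
else
    echo "/dev/sdb1 is not mounted"
fi
```

Matching the whole first column avoids the false positives a plain grep can give, for example /dev/sdb10 matching a search for /dev/sdb1.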
To make the mount persist across reboots, add an entry to the /etc/fstab
file.
echo "/dev/sdb1 /mnt/ext4 ext4 rw 0 0" >> /etc/fstab
The six fields are the device, the mount point, the filesystem type, the mount options, the dump field, and the pass field.
Dump field » set to 0 to disable or 1 to enable backups of the filesystem by the dump utility.
Pass field » the priority used by the filesystem check tool (fsck) to determine the order in which filesystems are checked during boot after a crash.
- 0 » ignore the file system check
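To make the layout of an fstab entry easier to see, the sketch below splits the entry used above into its six whitespace-separated fields. describe_fstab_entry and the field labels are hypothetical names chosen for illustration; nothing here reads or modifies /etc/fstab.

```shell
# Split one fstab entry into its six whitespace-separated fields and
# print each with a label. The labels are our own, not official names.
describe_fstab_entry() {
    read -r device mountpoint fstype options dump pass <<EOF
$1
EOF
    printf 'device=%s\n' "$device"
    printf 'mountpoint=%s\n' "$mountpoint"
    printf 'type=%s\n' "$fstype"
    printf 'options=%s\n' "$options"
    printf 'dump=%s\n' "$dump"
    printf 'pass=%s\n' "$pass"
}

describe_fstab_entry "/dev/sdb1 /mnt/ext4 ext4 rw 0 0"
```

For this entry, both the dump and pass fields come out as 0, so the filesystem is neither backed up by dump nor checked at boot.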
Commonly used external storage (DAS, NAS, SAN)
For a desktop environment, such as a laptop, we can get away with using the onboard storage or attaching an external drive for our data needs.
But this is not feasible for an enterprise-grade server environment, such as a production database or a web server storing a lot of data. There we use enterprise-grade, high-capacity external storage with high availability.
DAS » Direct Attached Storage
NAS » Network Attached Storage
SAN » Storage Area Network
SAN typically uses Fibre Channel to provide high-speed storage.
DAS
External storage is attached directly to the host system that requires the space. The host operating system sees a connected disk device as a block device. There is no network or firewall between the storage and the host, which means DAS provides excellent performance at a very affordable cost. It generally responds faster than a NAS device, where the data traffic goes over the network. The downside is that, since it is directly attached, it is dedicated to a single server.
As a result, DAS is not ideal for enterprise environments where multiple servers need storage, and is more suitable for small businesses.
NAS
It is suitable for mid-sized to large businesses. A NAS storage device is generally located apart from the hosts that consume space from it. The data traffic between the storage and the hosts goes through the network. The physical distance between the two may not be significant: they may be located in the same rack in a data center, yet the data still needs to traverse the network.
NAS is a file storage device, unlike DAS and SAN, both of which are block storage devices.
The storage is provided to the hosts in the form of a directory or a share that is physically present on the NAS device but exported via NFS to the hosts.
This type of storage is ideal for centralized shared storage that needs to be accessed simultaneously by several different hosts. A NAS combined with high-speed Ethernet connectivity to the hosts can provide a well-performing and highly available storage solution.
SAN
SAN provides block storage used by enterprises for business-critical applications that need high throughput and low latency. Storage is allocated to hosts in the form of a LUN (Logical Unit Number). A LUN is a range of blocks provisioned from a pool of shared storage and presented to the server as a logical disk. The host system detects this storage as a raw disk. We can then create partitions and filesystems on top of it, as we would with any other block device, and mount it on the system to store data. While a SAN can also be Ethernet-based, it mainly makes use of FCP (Fibre Channel Protocol). FCP is a high-speed data transfer protocol that relies on Fibre Channel switches to establish communication with the hosts. The host server uses an HBA (Host Bus Adapter) to connect to it.
NFS
NFS stores data in the form of files instead of blocks. It works on a server-client model.
Let's take the example of a software repository server. The directory /software/repos
exists on the repository server. This directory is shared over the network using NFS with the clients, which in this case are employee laptops. The data we see on a laptop may not physically reside on any of its attached disks. However, once mounted, the share can be used like any other filesystem in the operating system. In NFS, sharing a directory is called exporting.
The NFS server maintains a configuration file at /etc/exports
that defines which clients are allowed to access which directories on the server.
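An entry in /etc/exports names the exported directory, followed by the client and its options in parentheses. The sketch below only prints such a line and does not touch the real file. exports_entry is a hypothetical helper, and the rw,sync options are an assumption chosen for illustration rather than something taken from the setup described here.

```shell
# exports_entry DIRECTORY CLIENT OPTIONS
# Print one line in /etc/exports format: directory client(options)
exports_entry() {
    printf '%s %s(%s)\n' "$1" "$2" "$3"
}

# Example: share /software/repos with a single client, read-write.
exports_entry /software/repos 10.61.35.201 rw,sync
```

Note that there is no space between the client and the opening parenthesis; a space there changes how the options are applied.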
In an ideal setup, there is a network firewall between the NFS server and the clients. As a result, specific ports may have to be opened between the server and the clients for NFS to work.
Once the /etc/exports
file has been updated on the server, the directory is shared with the clients by using the exportfs
command.
exportfs -a
exports all the directories defined in the /etc/exports
file.
exportfs -o rw 10.61.35.201:/software/repos
manually exports a directory to a single client. The -o flag takes the export options (rw is used here only as an example) and is followed by the client and the path.
Once the directory is exported, we can mount it on a local directory such as /mnt/software/repos
by running the mount command on the client side: mount 10.61.112.101:/software/repos /mnt/software/repos
The network share should now be mounted on the client.
LVM (Logical Volume Manager)
LVM allows grouping multiple physical volumes, which are hard disks or partitions, into a volume group. From this volume group we can carve out logical volumes.
A volume group might be built from three partitions, for example, but it can contain anything from a single disk or partition to a large number of them, all grouped into a single volume group.
- LVM allows logical volumes to be resized dynamically as long as there is sufficient space in the volume group.
apt-get install lvm2
installs the lvm2 package.
The first step in configuring LVM is to identify free disks or partitions and then create physical volume objects for them. A physical volume object is how LVM identifies a disk or a partition. It is also called a PV.
pvcreate /dev/sdb
creates a physical volume from the device path (/dev/sdb).
vgcreate caleston_vg /dev/sdb
creates a volume group (VG) with the desired name. A volume group can contain one or more physical volumes.
pvdisplay
to see the details of the physical volumes. It lists all PVs with their names, sizes, and the volume group each is part of.
vgdisplay
to see more details of the VG.
lvcreate -L 1G -n vol1 caleston_vg
to create a logical volume. In this example we are creating a 1 GB volume named vol1 in the caleston_vg volume group. The -L option sets the size and -n sets the name. By default lvcreate creates a linear volume, which can span multiple physical volumes in the volume group if they are available.
lvdisplay
to see the logical volumes. Alternatively, use the lvs
command.
Once the volume has been created, we can create a filesystem on it using mkfs.ext4 /dev/caleston_vg/vol1
and then mount it (the mount point /mnt/vol1 must already exist):
mount -t ext4 /dev/caleston_vg/vol1 /mnt/vol1
The logical volume is now available for use.
To resize the filesystem on vol1 while it is mounted, we first need to check whether there is enough free space in the VG.
vgs
to list the VGs and their details.
lvresize -L +1G /dev/caleston_vg/vol1
to increase the volume by 1 GB.
If we now check the size of the filesystem with df -hP /mnt/vol1
we can see that it still has a capacity of 1 GB, because only the logical volume has been resized, not the filesystem we created on it. We need to run resize2fs /dev/caleston_vg/vol1
to resize the filesystem as well. Running df -hP /mnt/vol1
again will show the filesystem resized to 2 GB.
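The LVM steps in this section can be collected into one sequence. The sketch below is a dry run: the hypothetical run helper only prints each command instead of executing it, so it is safe to try on any machine. The device, volume group, and volume names are the ones used above; the mkdir step is an assumption added so that the mount point exists.

```shell
# Dry-run helper: print the command instead of executing it.
# To run the steps for real (as root), replace the body with: "$@"
run() {
    echo "+ $*"
}

lvm_workflow() {
    run pvcreate /dev/sdb                               # register the disk as a PV
    run vgcreate caleston_vg /dev/sdb                   # create the VG on that PV
    run lvcreate -L 1G -n vol1 caleston_vg              # carve out a 1 GB LV
    run mkfs.ext4 /dev/caleston_vg/vol1                 # create the filesystem
    run mkdir -p /mnt/vol1                              # assumed mount point
    run mount -t ext4 /dev/caleston_vg/vol1 /mnt/vol1   # mount the LV
    run lvresize -L +1G /dev/caleston_vg/vol1           # grow the LV by 1 GB
    run resize2fs /dev/caleston_vg/vol1                 # grow the filesystem to match
}

lvm_workflow
```

Keeping the resize as two explicit steps mirrors the explanation above: the logical volume grows first, and the filesystem only sees the new space after resize2fs.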
Logical volumes created with LVM are accessible at two paths:
/dev/<volume-group>/<logical-volume>
/dev/mapper/<volume-group>-<logical-volume>
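The /dev/mapper name is built from the volume group and logical volume names joined by a hyphen, with any hyphen inside either name doubled. Here is a small sketch of that rule; mapper_path is a hypothetical helper and not part of LVM, and my-vg is a made-up name used only to show the doubling.

```shell
# mapper_path VG LV
# Print the /dev/mapper path for a logical volume. Hyphens inside the
# VG or LV name are doubled so the single joining hyphen stays unambiguous.
mapper_path() {
    vg=$(printf '%s' "$1" | sed 's/-/--/g')
    lv=$(printf '%s' "$2" | sed 's/-/--/g')
    printf '/dev/mapper/%s-%s\n' "$vg" "$lv"
}

mapper_path caleston_vg vol1
mapper_path my-vg data
```

Both paths refer to the same device, so either can be passed to mkfs.ext4 or mount.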
Written by Arindam Baidya
