How to see the physical space of a disk?

Open Baimax-123 opened this issue 3 years ago • 10 comments

I set up VDO on a disk and want to check the actual disk usage with deduplication turned off and with it turned on. What command should I use? Is it sudo vdostats --hu? I suspect that only reports the size within VDO.

Baimax-123 avatar Apr 21 '22 10:04 Baimax-123

Hi @Baimax-123, yes. Using the vdostats utility will provide you with a df-style output that shows the physical usage of the volume.

Here is some output to show an example.

[root@localhost ~]# vdo create --name vdo0 --device /dev/sda --vdoLogicalSize 1T
Creating VDO vdo0
      The VDO volume can address 12 GB in 6 data slabs, each 2 GB.
      It can grow to address at most 16 TB of physical storage in 8192 slabs.
      If a larger maximum size might be needed, use bigger slabs.
Starting VDO vdo0
Starting compression on VDO vdo0
VDO instance 0 volume is ready at /dev/mapper/vdo0
[root@localhost ~]# mkfs.xfs -K /dev/mapper/vdo0
meta-data=/dev/mapper/vdo0       isize=512    agcount=4, agsize=67108864 blks
         =                       sectsz=4096  attr=2, projid32bit=1
         =                       crc=1        finobt=1, sparse=1, rmapbt=0
         =                       reflink=1    bigtime=0 inobtcount=0
data     =                       bsize=4096   blocks=268435456, imaxpct=5
         =                       sunit=0      swidth=0 blks
naming   =version 2              bsize=4096   ascii-ci=0, ftype=1
log      =internal log           bsize=4096   blocks=131072, version=2
         =                       sectsz=4096  sunit=1 blks, lazy-count=1
realtime =none                   extsz=4096   blocks=0, rtextents=0
[root@localhost ~]# mkdir /mnt/vdo
[root@localhost ~]# mount /dev/mapper/vdo0 /mnt/vdo
[root@localhost ~]# lsblk
NAME   MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
sda      7:0    0  15G  0 disk 
└─vdo0 252:0    0   1T  0 vdo  /mnt/vdo
vda    253:0    0  20G  0 disk 
└─vda1 253:1    0  20G  0 part /



# Note that the starting values show 7.2G used on the filesystem (df) and
# 3.0G physical used (vdostats).

[root@localhost ~]# df -h /mnt/vdo
Filesystem        Size  Used Avail Use% Mounted on
/dev/mapper/vdo0  1.0T  7.2G 1017G   1% /mnt/vdo
[root@localhost ~]# vdostats --human-readable
Device                    Size      Used Available Use% Space saving%
/dev/mapper/vdo0         15.0G      3.0G     12.0G  20%           99%


# Write 1G of unique data and see both values increase by 1G.

[root@localhost ~]# dd if=/dev/urandom of=/mnt/vdo/1G-file bs=1M count=1024 oflag=direct status=progress
1072693248 bytes (1.1 GB, 1023 MiB) copied, 24 s, 44.7 MB/s
1024+0 records in
1024+0 records out
1073741824 bytes (1.1 GB, 1.0 GiB) copied, 24.0207 s, 44.7 MB/s
[root@localhost ~]# df -h /mnt/vdo
Filesystem        Size  Used Avail Use% Mounted on
/dev/mapper/vdo0  1.0T  8.2G 1016G   1% /mnt/vdo
[root@localhost ~]# vdostats --human-readable
Device                    Size      Used Available Use% Space saving%
/dev/mapper/vdo0         15.0G      4.0G     11.0G  26%           33%


# Duplicate that data and see only the df (logical used) increase

[root@localhost ~]# cp -a /mnt/vdo/1G-file /mnt/vdo/1G-file-copied
[root@localhost ~]# sync
[root@localhost ~]# df -h /mnt/vdo
Filesystem        Size  Used Avail Use% Mounted on
/dev/mapper/vdo0  1.0T  9.2G 1015G   1% /mnt/vdo
[root@localhost ~]# vdostats --human-readable
Device                    Size      Used Available Use% Space saving%
/dev/mapper/vdo0         15.0G      4.0G     11.0G  26%           59%

rhawalsh avatar Apr 22 '22 02:04 rhawalsh
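For anyone who wants to track these numbers over time rather than read them by eye, here is a minimal Python sketch that parses the df-style table shown above. It assumes the six-column layout from this thread's output; the `VdoStatsRow` class and `parse_vdostats` function are hypothetical names made up for illustration and are not part of VDO.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class VdoStatsRow:
    device: str
    size: str
    used: str                      # physical space used, as printed
    available: str
    use_percent: int
    space_saving_percent: Optional[int]


def _percent(field: str) -> Optional[int]:
    """Turn '59%' into 59; return None for anything non-numeric."""
    digits = field.rstrip("%")
    return int(digits) if digits.isdigit() else None


def parse_vdostats(text: str) -> List[VdoStatsRow]:
    """Parse the table layout shown in this thread (an assumption, not a spec)."""
    rows = []
    for line in text.splitlines():
        fields = line.split()
        # The header splits into seven tokens ("Space saving%" has a space),
        # so only six-token lines are treated as data rows.
        if len(fields) != 6 or fields[0] == "Device":
            continue
        device, size, used, available, use_pct, saving_pct = fields
        rows.append(VdoStatsRow(device, size, used, available,
                                _percent(use_pct) or 0, _percent(saving_pct)))
    return rows


# Sample copied from the last vdostats output in the post above.
SAMPLE = """\
Device                    Size      Used Available Use% Space saving%
/dev/mapper/vdo0         15.0G      4.0G     11.0G  26%           59%
"""

rows = parse_vdostats(SAMPLE)
for row in rows:
    print(f"{row.device}: physical used {row.used} of {row.size}, "
          f"space saving {row.space_saving_percent}%")
```

The "Used" column here is the physical figure; the logical figure still has to come from df on the mounted filesystem.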

Hi @Baimax-123, what are you specifically trying to find out? df and vdostats are going to give you different metrics than smartctl. If you're just trying to understand usage information (i.e. how full is the device and how much more can I write to it?), then you should be using df and vdostats. If you're instead interested in understanding wear leveling or something along those lines, then smartctl would be the way to get that information. I don't think using smartctl to measure usage is going to tell you much other than that a read or write operation has happened on the device.

rhawalsh avatar Apr 22 '22 14:04 rhawalsh

Hi @rhawalsh, I have an SSD with built-in compression that can report its own physical usage. However, that result is far from what df or vdostats report. So maybe there is a gap between the actual physical usage on the disk and the output of the above commands?

Baimax-123 avatar Apr 22 '22 15:04 Baimax-123

Hi @Baimax-123, I did not realize your SSD was doing compression as well. If you want to compare realistic numbers, it's probably better to look at the df and vdostats outputs without the human-readable option, since the human-readable values are going to be rounded to the nearest GiB (or so...).

As you can tell the state of compression and/or deduplication affects the amount of data that actually gets cycled through the device. It will never be 1:1 because of the need to write out metadata, journal information for recovery, etc. Depending on the workload the ratio will vary up or down.

rhawalsh avatar Apr 22 '22 16:04 rhawalsh
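To illustrate why the raw numbers matter, here is a small Python sketch. It assumes the non-human-readable outputs are counts of 1 KiB blocks, and the two counts below are hypothetical values invented for the example. The `human` function is a simplified stand-in for one-decimal GiB formatting, not the exact rounding rule of df or vdostats.

```python
KIB_PER_GIB = 1024 * 1024


def kib_to_gib(kib: int) -> float:
    """Convert a count of 1 KiB blocks to GiB."""
    return kib / KIB_PER_GIB


def human(kib: int) -> str:
    """Simplified human-readable formatting: one decimal place, in GiB."""
    return f"{kib_to_gib(kib):.1f}G"


# Hypothetical "Used" values in 1 KiB blocks, before and after a small write.
before_kib = 4_150_000
after_kib = 4_240_000

print(human(before_kib), human(after_kib))   # both display as 4.0G
delta_mib = (after_kib - before_kib) / 1024
print(f"actual change: {delta_mib:.1f} MiB")  # 87.9 MiB hidden by rounding
```

A change of tens of MiB can disappear entirely in the rounded display, which is enough to throw off a comparison against the SSD's own counters.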

Hi @rhawalsh, thanks, I will try your suggestion and hope to reach some useful conclusions. Best.

Baimax-123 avatar Apr 22 '22 16:04 Baimax-123

Hi @Baimax-123, Please feel free to ask any questions you might have along the way!

I also intended to mention that inspecting the output of vdostats --verbose may give you some additional clues to the amount of data being written to the underlying storage. Typically you should be able to look at statistics that mention 'write', 'flush', and/or 'fua' to help tie things together.

rhawalsh avatar Apr 22 '22 16:04 rhawalsh
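As a rough aid for that kind of inspection, here is a Python sketch that keeps only the statistics whose names contain 'write', 'flush', or 'fua'. It assumes the verbose output is a list of "name : value" lines under a device heading. The sample text is illustrative only: the values are invented, and the field names are patterned on the ones mentioned in this thread rather than copied from a real run.

```python
from typing import Dict

KEYWORDS = ("write", "flush", "fua")


def parse_verbose(text: str) -> Dict[str, str]:
    """Parse 'name : value' lines; the device heading line has no value."""
    stats: Dict[str, str] = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        key, value = key.strip(), value.strip()
        if not sep or not key or not value:
            continue  # blank line or a heading such as "/dev/mapper/vdo0 :"
        stats[key] = value
    return stats


def io_related(stats: Dict[str, str]) -> Dict[str, str]:
    """Keep statistics whose name contains one of the keywords as a word."""
    return {k: v for k, v in stats.items()
            if any(word in k.split() for word in KEYWORDS)}


# Illustrative sample; all values are made up.
SAMPLE = """\
/dev/mapper/vdo0 :
  data blocks used                    : 262144
  bios in write                       : 1000
  bios in flush                       : 4
  bios in fua                         : 2
  bios out write                      : 600
  bios meta completed write           : 58
"""

selected = io_related(parse_verbose(SAMPLE))
for name, value in selected.items():
    print(f"{name:30} {value}")
```

In practice the text would come from running vdostats --verbose against your own volume, and comparing two snapshots taken around a workload is usually more informative than a single reading.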

Hi @rhawalsh, where can I find a detailed explanation of the fields in the vdostats --verbose output? Some of the data may be useful, but I can't work out what it actually means. For example: bios meta completed write.

Baimax-123 avatar Apr 24 '22 06:04 Baimax-123

Hi @rhawalsh, I used fio for random writes and iostat to monitor the VDO volume and the hard disk at the same time. The VDO volume shows only write bandwidth, but the hard disk shows both read and write bandwidth. Approximately, the disk's read bandwidth equals the VDO volume's write bandwidth, and the disk's write bandwidth is twice the VDO volume's write bandwidth. Can you roughly describe where this additional bandwidth comes from? Thanks.

The VDO volume is built directly on the hard disk. The fio write command is: sudo fio -filename=/dev/mapper/vdo_2 --bs=4k --output write4k.log --direct=1 --iodepth=128 --rw=randwrite --ioengine=libaio --buffer_compress_percentage=54 --buffer_compress_chunk=4096 --offset=0 --size=100% --runtime=50000s --time_based=1 --group_reporting --numjobs=4 (vdo_2 is the name of the VDO volume).

Baimax-123 avatar May 05 '22 03:05 Baimax-123

Hi @Baimax-123, I apologize for the delayed response.

To get some information about the output from vdostats --verbose, I'd point you at the RHEL docs, specifically "Table 30.9. vdostats --verbose Output" if the anchor doesn't put you there initially.

The IO for VDO involves doing read-compares when we encounter duplicate data. When a block comes in, VDO hashes it and sends the hash to UDS for advice. If UDS claims that the block is a duplicate and is likely stored at a particular location, the VDO device will then read that location to make sure that it actually is a duplicate. In the event that it's actually not a duplicate, the VDO device writes the block out as it normally would for a unique block. It is for reasons like this that you're seeing a bunch of read traffic, despite a purely write workload.

Please keep in mind that my description of the IO pattern is generalized. If you want or need more detail, just ask and I can try to get someone who is more knowledgeable than I am to provide better information. Of course you're always free to browse the code yourself as well, but that might be more work than it's worth.

rhawalsh avatar May 06 '22 16:05 rhawalsh
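The read-compare flow described above can be sketched as a toy Python model. This is not VDO's or UDS's implementation: `ToyDedupStore`, its in-memory index, and the use of SHA-256 are all assumptions made for illustration. It only shows why a pure write workload can generate reads on the backing device.

```python
import hashlib
from typing import Dict, List


class ToyDedupStore:
    """Toy model: a hash index gives advice, and a read verifies it."""

    def __init__(self) -> None:
        self.blocks: List[bytes] = []      # "physical" blocks; list index = address
        self.index: Dict[bytes, int] = {}  # hash -> advised address (may be stale)
        self.reads = 0
        self.writes = 0

    def _read(self, addr: int) -> bytes:
        self.reads += 1
        return self.blocks[addr]

    def _write_new(self, data: bytes) -> int:
        self.writes += 1
        self.blocks.append(data)
        return len(self.blocks) - 1

    def write(self, data: bytes) -> int:
        digest = hashlib.sha256(data).digest()
        advice = self.index.get(digest)
        # Advice is only a hint, so read the candidate block and compare.
        if advice is not None and self._read(advice) == data:
            return advice                  # verified duplicate: no new write
        addr = self._write_new(data)       # unique block or wrong advice
        self.index[digest] = addr
        return addr


store = ToyDedupStore()
block_a, block_b, block_c = b"A" * 4096, b"B" * 4096, b"C" * 4096

store.write(block_a)              # unique: 1 write
store.write(block_b)              # unique: 2 writes
dup_addr = store.write(block_a)   # advice found, 1 read, verified duplicate

store.blocks[1] = block_c         # simulate stale advice for block_b
new_addr = store.write(block_b)   # 1 more read, mismatch, so a new write

print(f"reads={store.reads} writes={store.writes}")
```

Every write whose hash is already in the index costs one read, whether or not the advice turns out to be correct. The toy ignores compression, metadata, and journaling entirely.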

Thanks, @rhawalsh. I have seen that. Meanwhile, I have another question: VDO provides deduplication and compression, and from the name of the feature it makes sense that read-compares are needed for deduplication. But if you turn off deduplication and compression, will the read-compares still happen?

And I will browse the code and hope to learn more about VDO. The amount of VDO code is still quite large. If I want to see the specific operation from VDO layer to physical storage layer, can you help me point out where to start? I believe it will save a lot of time. Thanks, again!

Baimax-123 avatar May 07 '22 01:05 Baimax-123