stratisd
Investigate 16 MiB of sector-by-sector synchronous writes when an encrypted pool is destroyed
The problem was identified in an experiment that used dm-delay to slow down writes to the underlying block devices. Destroying an encrypted pool produced a large number of synchronous, sector-by-sector writes.
---(begin script)---
#!/bin/bash
set -e
set -x

DRVSZ1=$(blockdev --getsz /dev/nvme0n1p3)
DRVSZ2=$(blockdev --getsz /dev/nvme0n1p4)

dmsetup create delaydev1 --table '0 '"$DRVSZ1"' delay /dev/nvme0n1p3 0 1'
dmsetup create delaydev2 --table '0 '"$DRVSZ2"' delay /dev/nvme0n1p4 0 1'

echo "12345" > keyfile
stratis --propagate key set --keyfile-path keyfile test_key
stratis --propagate pool create --key-desc test_key test_pool /dev/mapper/delaydev1
lsblk
stratis --propagate blockdev list
stratis --propagate key list
stratis --propagate fs create test_pool test_fs
stratis --propagate pool add-data test_pool /dev/mapper/delaydev2
stratis --propagate blockdev list
lsblk
stratis --propagate pool list
stratis --propagate key unset test_key
stratis --propagate fs destroy test_pool test_fs
echo "Sleeping..."
sleep 20
stratis --propagate pool destroy test_pool

dmsetup remove delaydev2
dmsetup remove delaydev1
---(end script)---
I started blktrace on the test device just before the "stratis pool destroy" command (during the "sleep 20" step), and I saw what looked like a series of synchronous, single-sector writes covering 16 MiB per device. The writes to both devices took about 132 seconds.
259,0 11 35 0.109004347 19740 A WS 125829160 + 24 <- (259,3) 40
259,0 11 36 0.109006231 19740 Q WS 125829160 + 24 [kworker/11:0]
259,0 11 37 0.109008997 19740 G WS 125829160 + 24 [kworker/11:0]
259,0 11 38 0.109012264 19740 D WS 125829160 + 24 [kworker/11:0]
259,0 11 39 0.109042074 0 C WS 125829160 + 24 [0]
259,0 11 40 0.111002931 19740 A WS 125829152 + 8 <- (259,3) 32
259,0 11 41 0.111003928 19740 Q WS 125829152 + 8 [kworker/11:0]
259,0 11 42 0.111007431 19740 G WS 125829152 + 8 [kworker/11:0]
259,0 11 43 0.111010848 19740 D WS 125829152 + 8 [kworker/11:0]
259,0 11 44 0.111027223 0 C WS 125829152 + 8 [0]
259,0 11 45 0.113005881 19740 A WS 125829120 + 1 <- (259,3) 0
259,0 11 46 0.113006545 19740 Q WS 125829120 + 1 [kworker/11:0]
259,0 11 47 0.113009595 19740 G WS 125829120 + 1 [kworker/11:0]
259,0 11 48 0.113012288 19740 D WS 125829120 + 1 [kworker/11:0]
259,0 11 49 0.113026644 0 C WS 125829120 + 1 [0]
259,0 11 50 0.115002668 19740 A WS 125829121 + 1 <- (259,3) 1
259,0 11 51 0.115003418 19740 Q WS 125829121 + 1 [kworker/11:0]
259,0 11 52 0.115005712 19740 G WS 125829121 + 1 [kworker/11:0]
259,0 11 53 0.115008401 19740 D WS 125829121 + 1 [kworker/11:0]
259,0 11 54 0.115022571 0 C WS 125829121 + 1 [0]
259,0 11 55 0.117003059 19740 A WS 125829122 + 1 <- (259,3) 2
259,0 11 56 0.117003635 19740 Q WS 125829122 + 1 [kworker/11:0]
259,0 11 57 0.117005612 19740 G WS 125829122 + 1 [kworker/11:0]
259,0 11 58 0.117007655 19740 D WS 125829122 + 1 [kworker/11:0]
259,0 11 59 0.117020818 0 C WS 125829122 + 1 [0]
259,0 11 60 0.119002112 19740 A WS 125829123 + 1 <- (259,3) 3
259,0 11 61 0.119002842 19740 Q WS 125829123 + 1 [kworker/11:0]
259,0 11 62 0.119005102 19740 G WS 125829123 + 1 [kworker/11:0]
259,0 11 63 0.119008009 19740 D WS 125829123 + 1 [kworker/11:0]
259,0 11 64 0.119023325 0 C WS 125829123 + 1 [0]
259,0 11 65 0.121002569 709 A WS 125829124 + 1 <- (259,3) 4
259,0 11 66 0.121003242 709 Q WS 125829124 + 1 [kworker/11:2]
259,0 11 67 0.121005542 709 G WS 125829124 + 1 [kworker/11:2]
259,0 11 68 0.121007612 709 D WS 125829124 + 1 [kworker/11:2]
259,0 11 69 0.121021395 0 C WS 125829124 + 1 [0]
259,0 11 70 0.123005062 709 A WS 125829125 + 1 <- (259,3) 5
259,0 11 71 0.123005896 709 Q WS 125829125 + 1 [kworker/11:2]
259,0 11 72 0.123008505 709 G WS 125829125 + 1 [kworker/11:2]
259,0 11 73 0.123011569 709 D WS 125829125 + 1 [kworker/11:2]
259,0 11 74 0.123026095 0 C WS 125829125 + 1 [0]
...
259,0 0 81941 65.651002775 1928 A WS 125861882 + 1 <- (259,3) 32762
259,0 0 81942 65.651003275 1928 Q WS 125861882 + 1 [kworker/0:1]
259,0 0 81943 65.651004958 1928 G WS 125861882 + 1 [kworker/0:1]
259,0 0 81944 65.651006878 1928 D WS 125861882 + 1 [kworker/0:1]
259,0 0 81945 65.651020071 0 C WS 125861882 + 1 [0]
259,0 0 81946 65.653000905 1928 A WS 125861883 + 1 <- (259,3) 32763
259,0 0 81947 65.653001429 1928 Q WS 125861883 + 1 [kworker/0:1]
259,0 0 81948 65.653002979 1928 G WS 125861883 + 1 [kworker/0:1]
259,0 0 81949 65.653004775 1928 D WS 125861883 + 1 [kworker/0:1]
259,0 0 81950 65.653019518 0 C WS 125861883 + 1 [0]
259,0 0 81951 65.655002752 1928 A WS 125861884 + 1 <- (259,3) 32764
259,0 0 81952 65.655003266 1928 Q WS 125861884 + 1 [kworker/0:1]
259,0 0 81953 65.655004889 1928 G WS 125861884 + 1 [kworker/0:1]
259,0 0 81954 65.655006792 1928 D WS 125861884 + 1 [kworker/0:1]
259,0 0 81955 65.655021458 0 C WS 125861884 + 1 [0]
259,0 0 81956 65.657000963 1928 A WS 125861885 + 1 <- (259,3) 32765
259,0 0 81957 65.657001449 1928 Q WS 125861885 + 1 [kworker/0:1]
259,0 0 81958 65.657003056 1928 G WS 125861885 + 1 [kworker/0:1]
259,0 0 81959 65.657004886 1928 D WS 125861885 + 1 [kworker/0:1]
259,0 0 81960 65.657021748 0 C WS 125861885 + 1 [0]
259,0 0 81961 65.659002770 1928 A WS 125861886 + 1 <- (259,3) 32766
259,0 0 81962 65.659003323 1928 Q WS 125861886 + 1 [kworker/0:1]
259,0 0 81963 65.659004829 1928 G WS 125861886 + 1 [kworker/0:1]
259,0 0 81964 65.659006706 1928 D WS 125861886 + 1 [kworker/0:1]
259,0 0 81965 65.659019779 0 C WS 125861886 + 1 [0]
259,0 0 81966 65.661000870 1928 A WS 125861887 + 1 <- (259,3) 32767
259,0 0 81967 65.661001356 1928 Q WS 125861887 + 1 [kworker/0:1]
259,0 0 81968 65.661002883 1928 G WS 125861887 + 1 [kworker/0:1]
259,0 0 81969 65.661004696 1928 D WS 125861887 + 1 [kworker/0:1]
259,0 0 81970 65.661019572 0 C WS 125861887 + 1 [0]
259,0 0 81971 65.663002996 1928 A WS 125861888 + 1 <- (259,3) 32768
259,0 0 81972 65.663003476 1928 Q WS 125861888 + 1 [kworker/0:1]
259,0 0 81973 65.663005080 1928 G WS 125861888 + 1 [kworker/0:1]
259,0 0 81974 65.663006956 1928 D WS 125861888 + 1 [kworker/0:1]
259,0 0 81975 65.663019999 0 C WS 125861888 + 1 [0]
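For anyone who wants to check the pattern without reading the whole trace, here is a rough sketch that counts the completed single-sector writes and reports the mean gap between them. It assumes the text layout shown above (device, CPU, sequence, timestamp, PID, action, RWBS, sector, "+", sector count); the function name is made up for illustration and is not part of blktrace or stratisd.

```shell
#!/bin/bash
# Hypothetical helper: summarize blkparse text output read from stdin.
# Prints "<count> <mean interval in seconds>" for completed ("C")
# single-sector writes. Field positions are an assumption based on the
# line layout in the trace above:
#   dev cpu seq time pid action rwbs sector + nsectors ...
summarize_single_sector_writes() {
    awk '
        $6 == "C" && $7 ~ /W/ && $9 == "+" && $10 == 1 {
            if (n == 0) first = $4   # timestamp of first matching completion
            last = $4                # timestamp of latest matching completion
            n++
        }
        END {
            if (n > 1) printf "%d %.6f\n", n, (last - first) / (n - 1)
            else       printf "%d n/a\n", n
        }
    '
}
```

It can be fed from a saved text trace, for example `summarize_single_sector_writes < trace.txt`, where `trace.txt` is a placeholder file name. On the excerpt above the gap between completions is roughly 2 ms.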
@mulkieran I believe this is expected. When wiping crypt devices on destroy, we use a wipe block size of one sector, and the LUKS2 metadata is approximately 16 MiB. Would you like to increase the wipe block size? If so, we should probably consult with the cryptsetup maintainers to see if there are any security implications.
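As a back-of-the-envelope check of those numbers against the trace, the sketch below estimates the duration of a fully synchronous wipe for a given wipe block size. The helper name is hypothetical, and it assumes each write costs about 2 ms regardless of its size. That is only plausible in this setup because the dm-delay latency dominates; the 4 KiB and 1 MiB block sizes are illustrative values, not anything stratisd or cryptsetup currently offers.

```shell
#!/bin/bash
# Hypothetical helper: estimate the duration (in ms) of a wipe in which
# every write completes before the next one is issued.
# Arguments: total bytes to wipe, wipe block size in bytes,
#            assumed time per write in milliseconds.
estimate_wipe_ms() {
    local total_bytes=$1 block_bytes=$2 ms_per_write=$3
    # Round up so a partial final block still counts as one write.
    local writes=$(( (total_bytes + block_bytes - 1) / block_bytes ))
    echo $(( writes * ms_per_write ))
}

luks2_area=$(( 16 * 1024 * 1024 ))   # ~16 MiB per device, as seen in the trace

estimate_wipe_ms "$luks2_area" 512 2       # one sector per write: 65536
estimate_wipe_ms "$luks2_area" 4096 2      # assumed 4 KiB block:  8192
estimate_wipe_ms "$luks2_area" 1048576 2   # assumed 1 MiB block:  32
```

The one-sector case gives about 65.5 seconds per device, so roughly 131 seconds for the two devices, which is in line with the 132 seconds reported above.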
@jbaublitz I see that it is what stratisd is asking for. I don't think we need to do anything now that it's explained. Feel free to close if you're satisfied.