Investigate 16 MiB of sector-by-sector synchronous writes when an encrypted pool is destroyed

Open mulkieran opened this issue 5 years ago • 2 comments

This was found in an experiment that used dm-delay to slow down writes to the underlying block devices. Destroying an encrypted pool produced a large number of synchronous, sector-by-sector writes.

---(begin script)---
#!/bin/bash
set -e
set -x

DRVSZ1=$(blockdev --getsz /dev/nvme0n1p3)
DRVSZ2=$(blockdev --getsz /dev/nvme0n1p4)

dmsetup create delaydev1 --table '0 '"$DRVSZ1"' delay /dev/nvme0n1p3 0 1'
dmsetup create delaydev2 --table '0 '"$DRVSZ2"' delay /dev/nvme0n1p4 0 1'

echo "12345" > keyfile
stratis --propagate key set --keyfile-path keyfile test_key
stratis --propagate pool create --key-desc test_key test_pool /dev/mapper/delaydev1
lsblk
stratis --propagate blockdev list
stratis --propagate key list
stratis --propagate fs create test_pool test_fs
stratis --propagate pool add-data test_pool /dev/mapper/delaydev2
stratis --propagate blockdev list
lsblk
stratis --propagate pool list
stratis --propagate key unset test_key
stratis --propagate fs destroy test_pool test_fs
echo "Sleeping..."
sleep 20
stratis --propagate pool destroy test_pool

dmsetup remove delaydev2
dmsetup remove delaydev1
---(end script)---

I started blktrace on the test device just before the "stratis pool destroy" command (during the "sleep 20" command), and I saw what looked like a series of per-sector synchronous writes covering 16 MiB per device. The writes to both devices took about 132 seconds.

259,0   11       35     0.109004347 19740  A  WS 125829160 + 24 <- (259,3) 40
259,0   11       36     0.109006231 19740  Q  WS 125829160 + 24 [kworker/11:0]
259,0   11       37     0.109008997 19740  G  WS 125829160 + 24 [kworker/11:0]
259,0   11       38     0.109012264 19740  D  WS 125829160 + 24 [kworker/11:0]
259,0   11       39     0.109042074     0  C  WS 125829160 + 24 [0]
259,0   11       40     0.111002931 19740  A  WS 125829152 + 8 <- (259,3) 32
259,0   11       41     0.111003928 19740  Q  WS 125829152 + 8 [kworker/11:0]
259,0   11       42     0.111007431 19740  G  WS 125829152 + 8 [kworker/11:0]
259,0   11       43     0.111010848 19740  D  WS 125829152 + 8 [kworker/11:0]
259,0   11       44     0.111027223     0  C  WS 125829152 + 8 [0]
259,0   11       45     0.113005881 19740  A  WS 125829120 + 1 <- (259,3) 0
259,0   11       46     0.113006545 19740  Q  WS 125829120 + 1 [kworker/11:0]
259,0   11       47     0.113009595 19740  G  WS 125829120 + 1 [kworker/11:0]
259,0   11       48     0.113012288 19740  D  WS 125829120 + 1 [kworker/11:0]
259,0   11       49     0.113026644     0  C  WS 125829120 + 1 [0]
259,0   11       50     0.115002668 19740  A  WS 125829121 + 1 <- (259,3) 1
259,0   11       51     0.115003418 19740  Q  WS 125829121 + 1 [kworker/11:0]
259,0   11       52     0.115005712 19740  G  WS 125829121 + 1 [kworker/11:0]
259,0   11       53     0.115008401 19740  D  WS 125829121 + 1 [kworker/11:0]
259,0   11       54     0.115022571     0  C  WS 125829121 + 1 [0]
259,0   11       55     0.117003059 19740  A  WS 125829122 + 1 <- (259,3) 2
259,0   11       56     0.117003635 19740  Q  WS 125829122 + 1 [kworker/11:0]
259,0   11       57     0.117005612 19740  G  WS 125829122 + 1 [kworker/11:0]
259,0   11       58     0.117007655 19740  D  WS 125829122 + 1 [kworker/11:0]
259,0   11       59     0.117020818     0  C  WS 125829122 + 1 [0]
259,0   11       60     0.119002112 19740  A  WS 125829123 + 1 <- (259,3) 3
259,0   11       61     0.119002842 19740  Q  WS 125829123 + 1 [kworker/11:0]
259,0   11       62     0.119005102 19740  G  WS 125829123 + 1 [kworker/11:0]
259,0   11       63     0.119008009 19740  D  WS 125829123 + 1 [kworker/11:0]
259,0   11       64     0.119023325     0  C  WS 125829123 + 1 [0]
259,0   11       65     0.121002569   709  A  WS 125829124 + 1 <- (259,3) 4
259,0   11       66     0.121003242   709  Q  WS 125829124 + 1 [kworker/11:2]
259,0   11       67     0.121005542   709  G  WS 125829124 + 1 [kworker/11:2]
259,0   11       68     0.121007612   709  D  WS 125829124 + 1 [kworker/11:2]
259,0   11       69     0.121021395     0  C  WS 125829124 + 1 [0]
259,0   11       70     0.123005062   709  A  WS 125829125 + 1 <- (259,3) 5
259,0   11       71     0.123005896   709  Q  WS 125829125 + 1 [kworker/11:2]
259,0   11       72     0.123008505   709  G  WS 125829125 + 1 [kworker/11:2]
259,0   11       73     0.123011569   709  D  WS 125829125 + 1 [kworker/11:2]
259,0   11       74     0.123026095     0  C  WS 125829125 + 1 [0]
...

259,0    0    81941    65.651002775  1928  A  WS 125861882 + 1 <- (259,3) 32762
259,0    0    81942    65.651003275  1928  Q  WS 125861882 + 1 [kworker/0:1]
259,0    0    81943    65.651004958  1928  G  WS 125861882 + 1 [kworker/0:1]
259,0    0    81944    65.651006878  1928  D  WS 125861882 + 1 [kworker/0:1]
259,0    0    81945    65.651020071     0  C  WS 125861882 + 1 [0]
259,0    0    81946    65.653000905  1928  A  WS 125861883 + 1 <- (259,3) 32763
259,0    0    81947    65.653001429  1928  Q  WS 125861883 + 1 [kworker/0:1]
259,0    0    81948    65.653002979  1928  G  WS 125861883 + 1 [kworker/0:1]
259,0    0    81949    65.653004775  1928  D  WS 125861883 + 1 [kworker/0:1]
259,0    0    81950    65.653019518     0  C  WS 125861883 + 1 [0]
259,0    0    81951    65.655002752  1928  A  WS 125861884 + 1 <- (259,3) 32764
259,0    0    81952    65.655003266  1928  Q  WS 125861884 + 1 [kworker/0:1]
259,0    0    81953    65.655004889  1928  G  WS 125861884 + 1 [kworker/0:1]
259,0    0    81954    65.655006792  1928  D  WS 125861884 + 1 [kworker/0:1]
259,0    0    81955    65.655021458     0  C  WS 125861884 + 1 [0]
259,0    0    81956    65.657000963  1928  A  WS 125861885 + 1 <- (259,3) 32765
259,0    0    81957    65.657001449  1928  Q  WS 125861885 + 1 [kworker/0:1]
259,0    0    81958    65.657003056  1928  G  WS 125861885 + 1 [kworker/0:1]
259,0    0    81959    65.657004886  1928  D  WS 125861885 + 1 [kworker/0:1]
259,0    0    81960    65.657021748     0  C  WS 125861885 + 1 [0]
259,0    0    81961    65.659002770  1928  A  WS 125861886 + 1 <- (259,3) 32766
259,0    0    81962    65.659003323  1928  Q  WS 125861886 + 1 [kworker/0:1]
259,0    0    81963    65.659004829  1928  G  WS 125861886 + 1 [kworker/0:1]
259,0    0    81964    65.659006706  1928  D  WS 125861886 + 1 [kworker/0:1]
259,0    0    81965    65.659019779     0  C  WS 125861886 + 1 [0]
259,0    0    81966    65.661000870  1928  A  WS 125861887 + 1 <- (259,3) 32767
259,0    0    81967    65.661001356  1928  Q  WS 125861887 + 1 [kworker/0:1]
259,0    0    81968    65.661002883  1928  G  WS 125861887 + 1 [kworker/0:1]
259,0    0    81969    65.661004696  1928  D  WS 125861887 + 1 [kworker/0:1]
259,0    0    81970    65.661019572     0  C  WS 125861887 + 1 [0]
259,0    0    81971    65.663002996  1928  A  WS 125861888 + 1 <- (259,3) 32768
259,0    0    81972    65.663003476  1928  Q  WS 125861888 + 1 [kworker/0:1]
259,0    0    81973    65.663005080  1928  G  WS 125861888 + 1 [kworker/0:1]
259,0    0    81974    65.663006956  1928  D  WS 125861888 + 1 [kworker/0:1]
259,0    0    81975    65.663019999     0  C  WS 125861888 + 1 [0]

mulkieran, Jun 18 '20 13:06
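
For anyone repeating the experiment, a long trace like the one above is easier to read once it is reduced to a few totals. The following is a minimal sketch, not something from the original report: the function name `summarize_writes` is hypothetical, the column positions are assumed to match the blkparse excerpt above, and sectors are assumed to be 512 bytes.

```shell
#!/bin/bash
# Summarize completed writes in blkparse text output read from stdin.
# Assumed column layout (taken from the excerpt in this issue):
#   dev cpu seq time pid action rwbs sector + nsectors ...
summarize_writes() {
    awk '
        # Keep only completion events ("C") whose RWBS field includes a write ("W").
        $6 == "C" && $7 ~ /W/ {
            writes++
            sectors += $10
            if ($10 == 1) single++
        }
        END {
            # Sector size of 512 bytes is an assumption.
            printf "writes=%d single_sector=%d sectors=%d bytes=%d\n",
                   writes, single, sectors, sectors * 512
        }
    '
}

# A few lines copied from the trace in this issue (four completions, one remap).
sample='259,0   11       39     0.109042074     0  C  WS 125829160 + 24 [0]
259,0   11       44     0.111027223     0  C  WS 125829152 + 8 [0]
259,0   11       49     0.113026644     0  C  WS 125829120 + 1 [0]
259,0   11       50     0.115002668 19740  A  WS 125829121 + 1 <- (259,3) 1
259,0   11       54     0.115022571     0  C  WS 125829121 + 1 [0]'

summary=$(printf '%s\n' "$sample" | summarize_writes)
echo "$summary"
```

On the sample lines this prints `writes=4 single_sector=2 sectors=34 bytes=17408`. Piping a full blkparse capture into the same function should show whether nearly all completed writes are single-sector, as the excerpt suggests.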

@mulkieran I believe this is expected. When wiping crypt devices on destroy, we use a wipe block size of one sector, and the LUKS2 metadata is approximately 16 MiB. Would you like to increase the wipe block size? If so, we should probably consult with the cryptsetup maintainers to see if there are any security implications.

jbaublitz, Jul 19 '22 15:07
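
The explanation above can be checked against the numbers in the original report with some quick arithmetic. This is a rough sketch under stated assumptions: the helper names `wipe_write_count` and `wipe_time_ms` are hypothetical, the 2 ms per write is read off the spacing between timestamps in the trace, and the 1 MiB block size is only an illustrative alternative, not a proposed value.

```shell
#!/bin/bash
# Number of writes needed to wipe total_bytes when each write covers block_bytes.
# Rounds up so that a partial final block still counts as one write.
wipe_write_count() {
    local total_bytes=$1 block_bytes=$2
    echo $(( (total_bytes + block_bytes - 1) / block_bytes ))
}

# Estimated elapsed time in milliseconds when the writes are issued one at a time.
wipe_time_ms() {
    local writes=$1 ms_per_write=$2
    echo $(( writes * ms_per_write ))
}

METADATA_BYTES=$(( 16 * 1024 * 1024 ))  # approximate LUKS2 metadata size, per the comment above
SECTOR_BYTES=512                         # one-sector wipe block (512-byte sector assumed)
MS_PER_WRITE=2                           # assumed from the ~2 ms spacing between writes in the trace

per_sector_writes=$(wipe_write_count "$METADATA_BYTES" "$SECTOR_BYTES")
per_sector_ms=$(wipe_time_ms "$per_sector_writes" "$MS_PER_WRITE")
echo "512 B blocks: $per_sector_writes writes, ~$per_sector_ms ms per device"

# Hypothetical 1 MiB wipe block, for comparison of the write count only.
large_block_writes=$(wipe_write_count "$METADATA_BYTES" $(( 1024 * 1024 )))
echo "1 MiB blocks: $large_block_writes writes"
```

With a one-sector wipe block this gives 32768 writes and roughly 65.5 seconds per device, so about 131 seconds for two devices, which is close to the 132 seconds reported. The hypothetical 1 MiB block would need only 16 writes, although the time per write would likely differ from the 2 ms assumed here.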

@jbaublitz I see that it is what stratisd is asking for. I don't think we need to do anything now that it's explained. Feel free to close if you're satisfied.

mulkieran, Jul 20 '22 16:07