Add temp directory for file transfer because of slow hdd write speed

Open GrayTsar opened this issue 3 years ago • 15 comments

Creating a plot takes 400 seconds on my system.  Transferring the plot to an HDD also takes around 400 seconds.

Here is the problem: when a new plot starts, the plotter waits during phase 1, table 2 for the previous plot to finish transferring. This adds 400 seconds to every subsequent plot.

The first plot finishes in 400 seconds. Every plot after that takes 800 seconds.

The request: add an option for a temp directory. Instead of writing directly to the final directory, the plotter first writes the plot to the temp directory (preferably an SSD), then moves it from the temp directory to the final directory.

GrayTsar avatar Jul 13 '21 13:07 GrayTsar

Or like madMax, use a separate process to copy the plot file to the final folder.
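
A minimal sketch of the two suggestions above: stage the plot on a fast drive, then move it to the final directory in the background so the next plot can start. `MovePlot` and `MovePlotAsync` are hypothetical names, not bladebit or madMax code, and the sketch assumes a C++17 compiler with `std::filesystem`:

```cpp
#include <cassert>
#include <filesystem>
#include <fstream>
#include <future>
#include <system_error>
#include <utility>

namespace fs = std::filesystem;

// Hypothetical helper: move a finished plot from the staging directory
// to the final directory. Returns true on success.
inline bool MovePlot( const fs::path& staged, const fs::path& finalDir )
{
    const fs::path dst = finalDir / staged.filename();
    std::error_code ec;

    // Fast path: rename() is cheap but only works within one filesystem.
    fs::rename( staged, dst, ec );
    if( !ec )
        return true;

    // Slow path (e.g. SSD -> HDD): copy under a temporary name so that a
    // partially copied file never appears under the final plot name.
    fs::path tmp = dst;
    tmp += ".tmp";

    ec.clear();
    fs::copy_file( staged, tmp, fs::copy_options::overwrite_existing, ec );
    if( ec )
        return false;

    fs::rename( tmp, dst, ec );
    if( ec )
        return false;

    fs::remove( staged, ec );
    return !ec;
}

// Hypothetical helper: start the move on another thread and return
// immediately, so the caller can begin the next plot.
inline std::future<bool> MovePlotAsync( fs::path staged, fs::path finalDir )
{
    return std::async( std::launch::async,
                       []( fs::path s, fs::path d ) { return MovePlot( s, d ); },
                       std::move( staged ), std::move( finalDir ) );
}
```

A caller would keep the returned future and check its result before reusing the staging space. With the copy overlapped like this, the time per plot in the example above would be bounded by the slower of the two steps rather than by their sum, provided the staging drive can hold a finished plot while the next one is being written.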

laudney avatar Jul 14 '21 08:07 laudney

If you have >=512 GB of RAM, it would be better not to use direct/synchronous I/O (O_DIRECT | O_SYNC) and instead rely on a large dirty system cache:

index 0e7ce41..7268138 100644
--- a/src/platform/unix/FileStream_Unix.cpp
+++ b/src/platform/unix/FileStream_Unix.cpp
@@ -33,8 +33,8 @@ bool FileStream::Open( const char* path, FileStream& file, FileMode mode, FileAc
                mode == FileMode::Append ? O_APPEND : 0;

     #if PLATFORM_IS_LINUX
-        if( IsFlagSet( flags, FileFlags::NoBuffering ) )
-            fdFlags |= O_DIRECT | O_SYNC;
+        //if( IsFlagSet( flags, FileFlags::NoBuffering ) )
+        //    fdFlags |= O_DIRECT | O_SYNC;

         if( IsFlagSet( flags, FileFlags::LargeFile )  )
             fdFlags |= O_LARGEFILE;
@@ -219,7 +219,8 @@ bool FileStream::Flush()
     if( !IsOpen() )
         return false;

-    int r = fsync( _fd );
+    //int r = fsync( _fd );
+    int r = 0;

     if( r )
     {

For this to work you may also need to increase dirty bytes a LOT :), I have:

vm.dirty_background_bytes = 16000000000
vm.dirty_bytes = 200000000000

This way I can plot directly to an HDD with ~200 MB/s write speed without plot generation stalling between plots.

mocksoul avatar Jul 16 '21 13:07 mocksoul

75 plots/day on 2690v2!

mocksoul avatar Jul 16 '21 13:07 mocksoul

> If you have >=512 GB of RAM, it would be better not to use direct/synchronous I/O (O_DIRECT | O_SYNC) and instead rely on a large dirty system cache

This is a great suggestion, and should probably be added as a configurable option. Didn't think people would use HDDs for this plotter, but that's a great way to do it.

I am also planning to add the ability to use more RAM for new buffers while the last buffers are still being written to disk, for cases where plotting is being held up by the write. Since most people using this will have 512 GB anyway, they might prefer using a bit more.
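
A minimal sketch of how the configurable option mentioned above might look. `WriteOptions`, `directIO`, and `PlotOpenFlags` are hypothetical names, not part of bladebit, and the base flags are an assumption; only the `O_DIRECT | O_SYNC` pair comes from the patch earlier in the thread:

```cpp
#include <cassert>
#include <fcntl.h>

// Hypothetical settings struct, for illustration only.
struct WriteOptions
{
    // true:  bypass the page cache, as the unpatched code does.
    // false: buffered writes that rely on the kernel's dirty page cache.
    bool directIO = true;
};

// Build the open(2) flags for the plot output file.
inline int PlotOpenFlags( const WriteOptions& opts )
{
    // Assumed base flags for creating a fresh output file.
    int fdFlags = O_WRONLY | O_CREAT | O_TRUNC;

    if( opts.directIO )
    {
        fdFlags |= O_SYNC;
#if defined( O_DIRECT )
        // O_DIRECT is a Linux extension, so guard it.
        fdFlags |= O_DIRECT;
#endif
    }

    return fdFlags;
}
```

With `directIO = false` the writes go through the page cache, so the `vm.dirty_*` limits discussed in this thread decide how much data can be buffered before the writer blocks.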

harold-b avatar Jul 16 '21 15:07 harold-b

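> Creating a plot takes 400 seconds on my system. Transferring the plot to an HDD also takes around 400 seconds.
>
> Here is the problem: when a new plot starts, the plotter waits during phase 1, table 2 for the old plot to be transferred. This increases the time for every subsequent plot by 400 seconds.
>
> The first plot finishes in 400 seconds. Every plot after that takes 800 seconds.
>
> The request: add an option for a temp directory. Instead of being written directly to the final directory, the plot first goes to the temp directory (preferably an SSD) and is then moved from the temp directory to the final directory.

What kind of OS do you use?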

drake618 avatar Jul 20 '21 08:07 drake618

AMD Ryzen Threadripper PRO 3995WX
ASUS Pro WS WRX80E-Sage SE WIFI
8x Samsung RDIMM 64GB, DDR4-3200, CL22-22-22, reg ECC
Ubuntu 21.04

GrayTsar avatar Jul 26 '21 11:07 GrayTsar

I have some slow-writing final storage on a SAS expansion enclosure, and I am seeing the plot write straight to HDD take increasingly longer as plotting continues. Writing to SSD is obviously faster, but I only have limited space and eventually have to dump the SSD to slow final storage, so "disk jockeying" takes even longer...

Thread count : 56
System Memory: 502/503 GiB.
6Gb/s SAS 7200 HDD

Finished plotting in 818.56 seconds (13.64 minutes).
Finished plotting in 1471.40 seconds (24.52 minutes).
Finished plotting in 1474.56 seconds (24.58 minutes).
Finished plotting in 1471.11 seconds (24.52 minutes).
Finished plotting in 1480.57 seconds (24.68 minutes).
Finished plotting in 1483.21 seconds (24.72 minutes).
Finished plotting in 1497.18 seconds (24.95 minutes).
Finished plotting in 1530.15 seconds (25.50 minutes).
Finished plotting in 1535.24 seconds (25.59 minutes).
Finished plotting in 1559.21 seconds (25.99 minutes).
Finished plotting in 1569.62 seconds (26.16 minutes).
Finished plotting in 1604.77 seconds (26.75 minutes).
Finished plotting in 1616.78 seconds (26.95 minutes).
Finished plotting in 1637.37 seconds (27.29 minutes).
Finished plotting in 1646.65 seconds (27.44 minutes).
Finished plotting in 1686.26 seconds (28.10 minutes).
Finished plotting in 1710.67 seconds (28.51 minutes).
Finished plotting in 1758.56 seconds (29.31 minutes).
Finished plotting in 1765.18 seconds (29.42 minutes).
Finished plotting in 1816.72 seconds (30.28 minutes).

CharlieTemplar avatar Jul 31 '21 11:07 CharlieTemplar

> seeing the plot write straight to HDD taking increasingly longer as the plotting continues

This is expected for rotational media (i.e. HDDs): sequential read/write speed is not constant across the disk. Usually the OS/NAS/whatever optimises for that by filling the faster "outer" tracks first. Thus, once the disk has been filled up with data, it can be about 2x slower (270 MB/s vs 115 MB/s for a regular 7200 RPM SATA HDD).

mocksoul avatar Jul 31 '21 12:07 mocksoul

Ok, that makes sense. I thought maybe some buffer or cache was filling up and a new instance of bladebit might reset that, but it does indeed seem to get progressively slower as the disk fills... (7200 RPM, 3TB SAS)

Size Used Avail Use%
2.8T 2.3T 463G 84%

System Memory: 502/503 GiB.
Memory required: 416 GiB.

Finished plotting in 816.21 seconds (13.60 minutes).
Finished plotting in 1922.10 seconds (32.04 minutes).
Finished plotting in 1933.37 seconds (32.22 minutes).

Thanks

CharlieTemplar avatar Jul 31 '21 13:07 CharlieTemplar

> I have some slow write final storage on sas expansion enclosure and seeing the plot write straight to HDD taking increasingly longer as the plotting continues. Writing to SSD is obviously faster but only have limited space and eventually have to dump the SSD to slow final storage, so "Disk Jockeying" takes even longer... Thread count : 56 System Memory: 502/503 GiB. 6Gb/s SAS 7200 HDD

Try adding "big_writes" as a mount option for the HDD. With the madMax plotter it increased my write speed considerably.

regonsite avatar Aug 04 '21 11:08 regonsite

> For this to work you may also need to increase dirty bytes a LOT :), I have:
>
> vm.dirty_background_bytes = 16000000000
> vm.dirty_bytes = 200000000000
>
> This way I can plot directly on HDD with ~200mb/s write speed without any stops of plot generation between plots.

How can this be set/enabled on Ubuntu 20.04? Please provide full instructions.

ari2asem avatar Aug 05 '21 10:08 ari2asem

I found it myself. Use these commands:

`sudo sysctl -w vm.dirty_background_bytes=16000000000`

`sudo sysctl -w vm.dirty_bytes=200000000000`

I still have to verify whether these settings have a positive effect on plotting speed. So far I have plotted 5 plots in a row, which took about 90 minutes in total, so the average is 18 minutes per plot. The final directory is on a SATA-600 7200 RPM disk.

My final vm settings are:

`vm.dirty_background_bytes = 16000000000`

`vm.dirty_background_ratio = 0`

`vm.dirty_bytes = 200000000000`

`vm.dirty_expire_centisecs = 3000`

`vm.dirty_ratio = 0`

`vm.dirty_writeback_centisecs = 200`

`vm.dirtytime_expire_seconds = 180`

EDIT: after a normal reboot all these values went back to their defaults. I had to set them back to my desired values manually.
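
The reset after a reboot is expected, because `sysctl -w` only changes the running kernel. A minimal sketch of making the values persistent, assuming a systemd-based Ubuntu where files under `/etc/sysctl.d/` are applied at boot; the file name `99-plotting.conf` is a hypothetical choice:

```
# /etc/sysctl.d/99-plotting.conf  (hypothetical file name)
# Values taken from the comments above; they assume a machine with ~512 GB of RAM.
vm.dirty_background_bytes = 16000000000
vm.dirty_bytes = 200000000000
```

Running `sudo sysctl --system` should load the file without a reboot. Setting the `*_bytes` values makes the kernel report the matching `*_ratio` values as 0, which is consistent with the listing above.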

ari2asem avatar Aug 05 '21 21:08 ari2asem

> If you have >=512 GB of RAM, it would be better not to use direct/synchronous I/O (O_DIRECT | O_SYNC) and instead rely on a large dirty system cache

Can I just use your forked version? Would it already have this set?

PuNkYsHuNgRy avatar Aug 09 '21 02:08 PuNkYsHuNgRy

> > If you have >=512 GB of RAM, it would be better not to use direct/synchronous I/O (O_DIRECT | O_SYNC) and instead rely on a large dirty system cache
>
> This is a great suggestion, and should probably be added as a configurable option. Didn't think people would use HDDs for this plotter, but that's a great way to do it.
>
> I am also planning on adding the ability to allow for more RAM to be used to create new buffers to write the last buffers to disk if you're being interrupted, since most people using this will have 512 anyway, they might prefer using a bit more.

@harold-b Any news on that?

preutrn avatar Sep 29 '21 11:09 preutrn

Unfortunately no news yet. Focus on other features currently trumps this, but I still have it pending.

harold-b avatar Oct 24 '21 18:10 harold-b