Refactor: Start using filexfer

Open puellanivis opened this issue 4 years ago • 6 comments

So, I’m checking in and pushing a PR very early here, so that others can get a sense of what the client refactor will look like after integrating filexfer.

The early attention here should be on the Client.close() function, which I felt would be the quickest way to show off how things would change. The changes to Conn and clientConn support the changes in close(), and allow for a parallel old-way/new-way hand-off from one set of wheels to the other while the car is still running. By the time the PR is merged, it will only have the new way.

puellanivis avatar Apr 21 '21 13:04 puellanivis

It’s a spring cleaning: some touch-ups (I came across a few TODOs that I have good experience with and could roll out really quickly while I was already there) along with the refactor.

So far, I’ve mostly only refactored the client side. There have been some changes to the server code to remove some utility functions that were mostly used by the client code, but that was more of a cleanup than a focused refactor. I’ve been sure to keep the code working on both sides the whole way, so since this will probably already be a pretty chonky PR, I’ll push the client refactor first, and then we can do the server-side work in a separate PR.

puellanivis avatar Apr 24 '21 07:04 puellanivis

From:

sftp.clean$ go tool pprof memprofile.out 
(pprof) top10
Showing nodes accounting for 500.49GB, 98.94% of 505.86GB total
Dropped 170 nodes (cum <= 2.53GB)
Showing top 10 nodes out of 28
      flat  flat%   sum%        cum   cum%
  403.43GB 79.75% 79.75%   403.43GB 79.75%  github.com/pkg/sftp.recvPacket
   40.09GB  7.92% 87.68%    40.09GB  7.92%  github.com/pkg/sftp.(*bufPool).Get
   15.94GB  3.15% 90.83%    15.94GB  3.15%  github.com/pkg/sftp.(*sshFxpWritePacket).marshalPacket
   15.48GB  3.06% 93.89%    15.48GB  3.06%  github.com/pkg/sftp.(*allocator).GetPage
    8.21GB  1.62% 95.51%     8.21GB  1.62%  github.com/pkg/sftp.(*sshFxInitPacket).MarshalBinary
    7.27GB  1.44% 96.95%     7.27GB  1.44%  github.com/pkg/sftp.(*sshFxpOpenPacket).MarshalBinary
    5.20GB  1.03% 97.97%     5.25GB  1.04%  github.com/pkg/sftp.benchmarkWriteTo
    2.19GB  0.43% 98.41%     3.55GB   0.7%  github.com/pkg/sftp.benchmarkReadFrom
    2.10GB  0.42% 98.82%     3.59GB  0.71%  github.com/pkg/sftp.benchmarkWrite
    0.59GB  0.12% 98.94%     2.69GB  0.53%  github.com/pkg/sftp.(*File).writeChunkAt

to:

sftp$ go tool pprof memprofile.out 
(pprof) top10
Showing nodes accounting for 93.60GB, 98.37% of 95.16GB total
Dropped 130 nodes (cum <= 0.48GB)
Showing top 10 nodes out of 42
      flat  flat%   sum%        cum   cum%
   54.95GB 57.75% 57.75%    54.95GB 57.75%  github.com/pkg/sftp.(*bufPool).Get
   16.50GB 17.34% 75.08%    16.50GB 17.34%  github.com/pkg/sftp.(*allocator).GetPage
    8.24GB  8.66% 83.74%     8.24GB  8.66%  github.com/pkg/sftp.(*sshFxInitPacket).MarshalBinary
    5.10GB  5.36% 89.11%     5.15GB  5.41%  github.com/pkg/sftp.benchmarkWriteTo
    2.21GB  2.32% 91.43%    28.50GB 29.95%  github.com/pkg/sftp.(*clientConn).sendPacket
    2.19GB  2.30% 93.73%     3.61GB  3.79%  github.com/pkg/sftp.benchmarkReadFrom
    1.97GB  2.07% 95.80%     3.05GB  3.21%  github.com/pkg/sftp.benchmarkWrite
    0.86GB   0.9% 96.71%     0.86GB   0.9%  github.com/pkg/sftp.(*delayedWriter).Write
    0.80GB  0.84% 97.54%    27.30GB 28.69%  github.com/pkg/sftp.(*File).readChunkAt
    0.78GB  0.82% 98.37%     0.78GB  0.82%  github.com/pkg/sftp.(*File).WriteTo.func2
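
In the first profile, recvPacket accounts for about 80% of all allocated bytes; in the second it no longer appears in the top 10. That pattern is consistent with reading each packet into a reusable buffer instead of allocating a fresh slice per packet. The following is only a minimal sketch of that general technique, not the code from this PR: `packetPool`, `readPacket`, `newStream`, and the `maxPacket` limit are all hypothetical names and values chosen for illustration. It assumes the usual SFTP framing of a 4-byte big-endian length followed by the payload.

```go
package main

import (
	"bytes"
	"encoding/binary"
	"errors"
	"fmt"
	"io"
	"sync"
)

// maxPacket is an assumed sanity limit for this sketch, not a value from pkg/sftp.
const maxPacket = 256 * 1024

// packetPool is a hypothetical stand-in for a buffer pool.
// It stores *[]byte so that Put does not box a slice header on every call.
var packetPool = sync.Pool{
	New: func() interface{} {
		b := make([]byte, 32*1024)
		return &b
	},
}

// readPacket reads one length-prefixed packet from r into buf.
// A new backing array is allocated only when buf is too small.
func readPacket(r io.Reader, buf []byte) ([]byte, error) {
	var hdr [4]byte
	if _, err := io.ReadFull(r, hdr[:]); err != nil {
		return nil, err
	}
	n := binary.BigEndian.Uint32(hdr[:])
	if n > maxPacket {
		return nil, errors.New("packet too large")
	}
	if uint32(cap(buf)) < n {
		buf = make([]byte, n)
	}
	buf = buf[:n]
	if _, err := io.ReadFull(r, buf); err != nil {
		return nil, err
	}
	return buf, nil
}

// newStream frames each payload with a 4-byte big-endian length prefix.
func newStream(payloads ...string) io.Reader {
	var b bytes.Buffer
	for _, p := range payloads {
		var hdr [4]byte
		binary.BigEndian.PutUint32(hdr[:], uint32(len(p)))
		b.Write(hdr[:])
		b.WriteString(p)
	}
	return &b
}

func main() {
	r := newStream("hello", "sftp")
	for {
		bp := packetPool.Get().(*[]byte)
		pkt, err := readPacket(r, *bp)
		if err == io.EOF {
			packetPool.Put(bp)
			return
		}
		if err != nil {
			panic(err)
		}
		fmt.Printf("%d bytes: %s\n", len(pkt), pkt)
		// Keep the (possibly grown) backing array for the next reader.
		*bp = pkt[:cap(pkt)]
		packetPool.Put(bp)
	}
}
```

The trade-off in a design like this is ownership: the returned slice aliases the pooled buffer, so the caller has to finish with (or copy) the payload before putting the buffer back.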

puellanivis avatar Apr 24 '21 14:04 puellanivis

benchstats:

name                          old time/op    new time/op     delta
AllocatorSerial-12               208ns ±18%      218ns ±13%      ~     (p=0.113 n=10+9)
AllocatorParallel/1-12           377ns ± 1%      339ns ± 1%   -10.10%  (p=0.000 n=9+9)
AllocatorParallel/2-12           386ns ± 1%      374ns ± 1%    -3.25%  (p=0.000 n=10+10)
AllocatorParallel/4-12           434ns ± 4%      425ns ± 5%      ~     (p=0.105 n=10+10)
AllocatorParallel/8-12           484ns ±10%      467ns ±14%      ~     (p=0.123 n=10+10)
Read1k-12                       53.7ms ± 3%     54.1ms ± 5%      ~     (p=0.393 n=10+10)
Read16k-12                      6.98ms ± 4%     5.83ms ± 4%   -16.52%  (p=0.000 n=10+10)
Read32k-12                      5.61ms ±20%     4.03ms ± 6%   -28.10%  (p=0.000 n=10+10)
Read128k-12                     3.85ms ± 4%     3.03ms ± 4%   -21.28%  (p=0.000 n=10+10)
Read512k-12                     3.60ms ± 2%     2.79ms ± 4%   -22.45%  (p=0.000 n=10+9)
Read1MiB-12                     4.06ms ± 5%     2.94ms ± 3%   -27.60%  (p=0.000 n=9+10)
Read4MiB-12                     4.63ms ± 2%     3.79ms ± 5%   -18.13%  (p=0.000 n=10+10)
Read4MiBDelay10Msec-12          71.3ms ± 1%     69.3ms ± 1%    -2.77%  (p=0.000 n=10+10)
Read4MiBDelay50Msec-12           313ms ± 0%      311ms ± 0%    -0.82%  (p=0.000 n=10+10)
Read4MiBDelay150Msec-12          914ms ± 0%      912ms ± 0%    -0.30%  (p=0.000 n=10+10)
Write1k-12                       100ms ± 8%      103ms ± 5%      ~     (p=0.063 n=10+10)
Write16k-12                     12.8ms ± 4%     13.2ms ± 3%    +3.41%  (p=0.009 n=10+8)
Write32k-12                     9.72ms ± 4%    10.01ms ± 5%    +2.94%  (p=0.035 n=10+10)
Write128k-12                    9.93ms ± 5%    10.24ms ± 7%      ~     (p=0.052 n=10+10)
Write512k-12                    9.86ms ± 3%    10.44ms ± 6%    +5.95%  (p=0.000 n=10+10)
Write1MiB-12                    10.0ms ± 5%     10.3ms ± 7%      ~     (p=0.143 n=10+10)
Write4MiB-12                    10.1ms ± 5%     10.6ms ± 7%    +5.06%  (p=0.009 n=10+10)
Write4MiBDelay10Msec-12          3.40s ± 0%      3.39s ± 0%      ~     (p=0.113 n=9+10)
Write4MiBDelay50Msec-12          16.3s ± 0%      16.3s ± 0%      ~     (p=0.340 n=9+9)
Write4MiBDelay150Msec-12         48.7s ± 0%      48.7s ± 0%      ~     (p=0.243 n=10+9)
ReadFrom1k-12                   9.36ms ± 1%     9.35ms ± 3%      ~     (p=0.720 n=10+9)
ReadFrom16k-12                  9.47ms ± 1%     9.54ms ± 3%      ~     (p=0.400 n=9+10)
ReadFrom32k-12                  9.78ms ± 7%     9.73ms ± 3%      ~     (p=0.780 n=10+9)
ReadFrom128k-12                 10.1ms ± 3%     10.3ms ± 7%      ~     (p=0.497 n=9+10)
ReadFrom512k-12                 10.1ms ± 4%     10.2ms ± 4%      ~     (p=0.165 n=10+10)
ReadFrom1MiB-12                 10.3ms ± 5%     10.4ms ± 3%      ~     (p=0.315 n=10+10)
ReadFrom4MiB-12                 10.5ms ± 7%     10.5ms ± 6%      ~     (p=0.739 n=10+10)
ReadFrom4MiBDelay10Msec-12       3.41s ± 0%      3.41s ± 0%      ~     (p=0.529 n=10+10)
ReadFrom4MiBDelay50Msec-12       16.4s ± 0%      16.4s ± 0%    +0.03%  (p=0.000 n=10+9)
ReadFrom4MiBDelay150Msec-12      48.8s ± 0%      48.8s ± 0%    +0.00%  (p=0.003 n=10+10)
WriteTo1k-12                    6.07ms ± 1%     5.03ms ± 1%   -17.12%  (p=0.000 n=10+10)
WriteTo16k-12                   6.11ms ± 1%     5.12ms ± 1%   -16.18%  (p=0.000 n=9+9)
WriteTo32k-12                   6.47ms ± 1%     5.39ms ± 4%   -16.68%  (p=0.000 n=9+10)
WriteTo128k-12                  6.64ms ± 3%     5.64ms ± 5%   -15.15%  (p=0.000 n=10+10)
WriteTo512k-12                  6.74ms ± 3%     5.75ms ± 3%   -14.61%  (p=0.000 n=10+10)
WriteTo1MiB-12                  6.86ms ± 2%     5.77ms ± 2%   -15.91%  (p=0.000 n=9+10)
WriteTo4MiB-12                  6.95ms ± 3%     5.84ms ± 3%   -16.01%  (p=0.000 n=10+10)
WriteTo4MiBDelay10Msec-12        107ms ± 1%      107ms ± 1%      ~     (p=0.905 n=9+10)
WriteTo4MiBDelay50Msec-12        510ms ± 0%      510ms ± 0%      ~     (p=0.796 n=10+10)
WriteTo4MiBDelay150Msec-12       1.51s ± 0%      1.51s ± 0%      ~     (p=0.971 n=10+10)
CopyDown10MiBDelay10Msec-12      108ms ± 2%      104ms ± 1%    -3.14%  (p=0.000 n=9+10)
CopyDown10MiBDelay50Msec-12      469ms ± 0%      465ms ± 0%    -0.85%  (p=0.000 n=10+10)
CopyDown10MiBDelay150Msec-12     1.37s ± 0%      1.37s ± 0%    -0.38%  (p=0.000 n=9+10)
CopyUp10MiBDelay10Msec-12        3.41s ± 0%      3.41s ± 0%      ~     (p=0.579 n=10+10)
CopyUp10MiBDelay50Msec-12        16.3s ± 0%      16.3s ± 0%      ~     (p=0.739 n=10+10)
CopyUp10MiBDelay150Msec-12       48.5s ± 0%      48.5s ± 0%      ~     (p=0.052 n=10+10)
MarshalInit-12                   105ns ±21%       93ns ±16%   -11.33%  (p=0.027 n=10+10)
MarshalOpen-12                  99.4ns ±27%     39.5ns ± 6%   -60.30%  (p=0.000 n=10+10)
MarshalWriteWorstCase-12        95.8ns ±22%     40.3ns ± 3%   -57.92%  (p=0.000 n=10+10)
MarshalWrite1k-12               99.3ns ±10%     39.6ns ± 0%   -60.10%  (p=0.000 n=10+8)

name                          old alloc/op   new alloc/op    delta
AllocatorSerial-12               72.0B ± 0%      72.0B ± 0%      ~     (all equal)
AllocatorParallel/1-12           72.0B ± 0%      72.0B ± 0%      ~     (all equal)
AllocatorParallel/2-12           72.0B ± 0%      72.0B ± 0%      ~     (all equal)
AllocatorParallel/4-12           72.0B ± 0%      72.0B ± 0%      ~     (all equal)
AllocatorParallel/8-12           72.0B ± 0%      72.0B ± 0%      ~     (all equal)
Read1k-12                       6.99MB ± 0%     0.70MB ± 0%   -89.94%  (p=0.000 n=10+8)
Read16k-12                      5.99MB ± 0%     0.04MB ± 0%   -99.26%  (p=0.000 n=10+8)
Read32k-12                      6.63MB ± 0%     0.02MB ± 0%   -99.66%  (p=0.000 n=10+10)
Read128k-12                     6.77MB ± 0%     0.04MB ± 0%   -99.47%  (p=0.000 n=9+8)
Read512k-12                     7.25MB ± 0%     0.03MB ± 1%   -99.60%  (p=0.000 n=10+10)
Read1MiB-12                     7.91MB ± 0%     0.03MB ± 3%   -99.61%  (p=0.000 n=9+10)
Read4MiB-12                     10.5MB ± 0%      0.0MB ± 6%   -99.60%  (p=0.000 n=10+10)
Read4MiBDelay10Msec-12          10.5MB ± 0%      0.1MB ± 8%   -99.17%  (p=0.000 n=10+8)
Read4MiBDelay50Msec-12          10.5MB ± 0%      0.2MB ±21%   -98.30%  (p=0.000 n=10+10)
Read4MiBDelay150Msec-12         10.5MB ± 0%      0.3MB ±17%   -97.27%  (p=0.000 n=10+10)
Write1k-12                      3.36MB ± 0%     1.89MB ± 0%   -43.73%  (p=0.000 n=10+10)
Write16k-12                      212kB ± 0%      120kB ± 0%   -43.45%  (p=0.000 n=10+9)
Write32k-12                      107kB ± 0%       61kB ± 0%   -43.15%  (p=0.000 n=10+9)
Write128k-12                    72.1kB ± 0%     60.7kB ± 0%   -15.84%  (p=0.000 n=10+10)
Write512k-12                    63.5kB ± 0%     60.7kB ± 0%    -4.38%  (p=0.000 n=10+9)
Write1MiB-12                    62.0kB ± 0%     60.7kB ± 0%    -2.17%  (p=0.000 n=10+10)
Write4MiB-12                    60.9kB ± 0%     60.7kB ± 0%    -0.28%  (p=0.000 n=10+10)
Write4MiBDelay10Msec-12         10.6MB ± 0%     10.6MB ± 0%    +0.39%  (p=0.000 n=9+10)
Write4MiBDelay50Msec-12         10.6MB ± 0%     10.6MB ± 0%    +0.38%  (p=0.000 n=9+9)
Write4MiBDelay150Msec-12        10.6MB ± 0%     10.6MB ± 0%    +0.39%  (p=0.000 n=8+9)
ReadFrom1k-12                   93.7kB ± 0%     93.6kB ± 0%    -0.09%  (p=0.002 n=9+8)
ReadFrom16k-12                  93.7kB ± 0%     93.6kB ± 0%    -0.11%  (p=0.000 n=10+10)
ReadFrom32k-12                  93.7kB ± 0%     93.7kB ± 0%    -0.08%  (p=0.000 n=10+9)
ReadFrom128k-12                 93.7kB ± 0%     93.7kB ± 0%    -0.07%  (p=0.006 n=10+9)
ReadFrom512k-12                 93.7kB ± 0%     93.7kB ± 0%    -0.04%  (p=0.030 n=10+10)
ReadFrom1MiB-12                 93.7kB ± 0%     93.7kB ± 0%    -0.04%  (p=0.028 n=10+9)
ReadFrom4MiB-12                 93.7kB ± 0%     93.7kB ± 0%      ~     (p=0.071 n=9+7)
ReadFrom4MiBDelay10Msec-12      10.6MB ± 0%     10.6MB ± 0%    +0.39%  (p=0.000 n=8+10)
ReadFrom4MiBDelay50Msec-12      10.6MB ± 0%     10.6MB ± 0%    +0.38%  (p=0.000 n=9+10)
ReadFrom4MiBDelay150Msec-12     10.6MB ± 0%     10.6MB ± 0%    +0.39%  (p=0.000 n=10+10)
WriteTo1k-12                    15.3MB ± 0%      2.7MB ± 0%   -82.25%  (p=0.000 n=10+10)
WriteTo16k-12                   15.3MB ± 0%      2.7MB ± 0%   -82.24%  (p=0.000 n=9+10)
WriteTo32k-12                   15.3MB ± 0%      2.7MB ± 0%   -82.24%  (p=0.000 n=10+10)
WriteTo128k-12                  15.3MB ± 0%      2.7MB ± 0%   -82.25%  (p=0.000 n=10+9)
WriteTo512k-12                  15.3MB ± 0%      2.7MB ± 0%   -82.24%  (p=0.000 n=10+10)
WriteTo1MiB-12                  15.3MB ± 0%      2.7MB ± 0%   -82.25%  (p=0.000 n=10+10)
WriteTo4MiB-12                  15.3MB ± 0%      2.7MB ± 0%   -82.24%  (p=0.000 n=10+10)
WriteTo4MiBDelay10Msec-12       15.3MB ± 0%      2.8MB ± 3%   -81.81%  (p=0.000 n=10+10)
WriteTo4MiBDelay50Msec-12       15.3MB ± 0%      3.0MB ± 3%   -80.71%  (p=0.000 n=10+10)
WriteTo4MiBDelay150Msec-12      15.3MB ± 0%      3.2MB ±15%   -79.40%  (p=0.000 n=10+10)
CopyDown10MiBDelay10Msec-12     15.3MB ± 0%      2.8MB ± 4%   -81.72%  (p=0.000 n=10+9)
CopyDown10MiBDelay50Msec-12     15.3MB ± 0%      2.9MB ± 2%   -81.17%  (p=0.000 n=10+8)
CopyDown10MiBDelay150Msec-12    15.3MB ± 0%      3.2MB ± 5%   -79.21%  (p=0.000 n=10+10)
CopyUp10MiBDelay10Msec-12       10.6MB ± 0%     10.7MB ± 0%    +0.79%  (p=0.000 n=9+10)
CopyUp10MiBDelay50Msec-12       10.6MB ± 0%     10.7MB ± 0%    +0.77%  (p=0.000 n=9+8)
CopyUp10MiBDelay150Msec-12      10.6MB ± 0%     10.7MB ± 0%    +0.77%  (p=0.000 n=10+10)
MarshalInit-12                   48.0B ± 0%      48.0B ± 0%      ~     (all equal)
MarshalOpen-12                   48.0B ± 0%       0.0B       -100.00%  (p=0.000 n=10+10)
MarshalWriteWorstCase-12         48.0B ± 0%       0.0B       -100.00%  (p=0.000 n=10+10)
MarshalWrite1k-12                48.0B ± 0%       0.0B       -100.00%  (p=0.000 n=10+10)

name                          old allocs/op  new allocs/op   delta
AllocatorSerial-12                2.00 ± 0%       2.00 ± 0%      ~     (all equal)
AllocatorParallel/1-12            2.00 ± 0%       2.00 ± 0%      ~     (all equal)
AllocatorParallel/2-12            2.00 ± 0%       2.00 ± 0%      ~     (all equal)
AllocatorParallel/4-12            2.00 ± 0%       2.00 ± 0%      ~     (all equal)
AllocatorParallel/8-12            2.00 ± 0%       2.00 ± 0%      ~     (all equal)
Read1k-12                        30.7k ± 0%      15.4k ± 0%   -50.00%  (p=0.000 n=9+10)
Read16k-12                       1.94k ± 0%      0.97k ± 0%   -49.95%  (p=0.000 n=10+10)
Read32k-12                         982 ± 0%        492 ± 0%   -49.90%  (p=0.000 n=10+10)
Read128k-12                      1.25k ± 0%      0.67k ± 0%   -46.71%  (p=0.002 n=8+10)
Read512k-12                      1.14k ± 0%      0.58k ± 0%   -49.16%  (p=0.000 n=10+10)
Read1MiB-12                      1.22k ± 0%      0.61k ± 0%   -49.89%  (p=0.000 n=9+8)
Read4MiB-12                      1.32k ± 0%      0.79k ± 0%   -40.21%  (p=0.000 n=10+8)
Read4MiBDelay10Msec-12           1.57k ± 0%      1.06k ± 1%   -32.85%  (p=0.000 n=10+10)
Read4MiBDelay50Msec-12           1.58k ± 1%      1.09k ± 1%   -31.02%  (p=0.000 n=10+10)
Read4MiBDelay150Msec-12          1.59k ± 2%      1.14k ± 2%   -28.02%  (p=0.000 n=10+10)
Write1k-12                       82.0k ± 0%      41.0k ± 0%   -49.99%  (p=0.000 n=10+10)
Write16k-12                      5.16k ± 0%      2.59k ± 0%   -49.84%  (p=0.000 n=10+10)
Write32k-12                      2.60k ± 0%      1.31k ± 0%   -49.69%  (p=0.000 n=9+9)
Write128k-12                     2.12k ± 0%      1.31k ± 0%   -38.29%  (p=0.000 n=10+9)
Write512k-12                     2.00k ± 0%      1.31k ± 0%   -34.58%  (p=0.000 n=10+9)
Write1MiB-12                     1.98k ± 0%      1.31k ± 0%   -33.89%  (p=0.000 n=8+10)
Write4MiB-12                     1.96k ± 0%      1.31k ± 0%   -33.35%  (p=0.000 n=8+10)
Write4MiBDelay10Msec-12          2.62k ± 2%      1.98k ± 2%   -24.46%  (p=0.000 n=9+10)
Write4MiBDelay50Msec-12          2.61k ± 0%      1.96k ± 0%   -25.05%  (p=0.000 n=9+9)
Write4MiBDelay150Msec-12         2.61k ± 0%      1.96k ± 1%   -24.83%  (p=0.000 n=8+9)
ReadFrom1k-12                    1.97k ± 0%      1.31k ± 0%   -33.26%  (p=0.000 n=8+10)
ReadFrom16k-12                   1.97k ± 0%      1.31k ± 0%   -33.27%  (p=0.000 n=9+10)
ReadFrom32k-12                   1.97k ± 0%      1.31k ± 0%   -33.25%  (p=0.000 n=8+8)
ReadFrom128k-12                  1.97k ± 0%      1.31k ± 0%   -33.26%  (p=0.000 n=9+10)
ReadFrom512k-12                  1.97k ± 0%      1.31k ± 0%   -33.26%  (p=0.000 n=8+10)
ReadFrom1MiB-12                  1.97k ± 0%      1.31k ± 0%   -33.25%  (p=0.000 n=8+8)
ReadFrom4MiB-12                  1.97k ± 0%      1.31k ± 0%   -33.25%  (p=0.000 n=8+8)
ReadFrom4MiBDelay10Msec-12       2.61k ± 0%      1.97k ± 1%   -24.59%  (p=0.000 n=8+10)
ReadFrom4MiBDelay50Msec-12       2.62k ± 1%      1.97k ± 1%   -24.81%  (p=0.000 n=9+10)
ReadFrom4MiBDelay150Msec-12      2.62k ± 1%      1.98k ± 2%   -24.65%  (p=0.000 n=10+10)
WriteTo1k-12                     2.29k ± 0%      1.76k ± 0%   -23.14%  (p=0.000 n=10+10)
WriteTo16k-12                    2.29k ± 0%      1.76k ± 0%   -23.11%  (p=0.000 n=10+8)
WriteTo32k-12                    2.29k ± 0%      1.76k ± 0%   -23.11%  (p=0.000 n=10+10)
WriteTo128k-12                   2.29k ± 0%      1.76k ± 0%   -23.11%  (p=0.000 n=10+6)
WriteTo512k-12                   2.29k ± 0%      1.76k ± 0%   -23.07%  (p=0.000 n=10+10)
WriteTo1MiB-12                   2.29k ± 0%      1.76k ± 0%   -23.09%  (p=0.000 n=10+10)
WriteTo4MiB-12                   2.29k ± 0%      1.76k ± 0%   -23.05%  (p=0.000 n=10+10)
WriteTo4MiBDelay10Msec-12        2.68k ± 0%      2.17k ± 1%   -19.04%  (p=0.000 n=10+10)
WriteTo4MiBDelay50Msec-12        2.69k ± 1%      2.24k ± 1%   -16.77%  (p=0.000 n=10+10)
WriteTo4MiBDelay150Msec-12       2.71k ± 1%      2.32k ± 2%   -14.51%  (p=0.000 n=10+10)
CopyDown10MiBDelay10Msec-12      2.68k ± 0%      2.18k ± 0%   -18.94%  (p=0.000 n=10+10)
CopyDown10MiBDelay50Msec-12      2.69k ± 1%      2.21k ± 1%   -17.82%  (p=0.000 n=10+10)
CopyDown10MiBDelay150Msec-12     2.70k ± 1%      2.31k ± 1%   -14.34%  (p=0.000 n=10+10)
CopyUp10MiBDelay10Msec-12        2.60k ± 0%      1.97k ± 1%   -24.23%  (p=0.000 n=9+10)
CopyUp10MiBDelay50Msec-12        2.61k ± 1%      1.96k ± 0%   -24.88%  (p=0.000 n=9+8)
CopyUp10MiBDelay150Msec-12       2.62k ± 1%      1.97k ± 1%   -24.82%  (p=0.000 n=10+10)
MarshalInit-12                    1.00 ± 0%       1.00 ± 0%      ~     (all equal)
MarshalOpen-12                    1.00 ± 0%       0.00       -100.00%  (p=0.000 n=10+10)
MarshalWriteWorstCase-12          1.00 ± 0%       0.00       -100.00%  (p=0.000 n=10+10)
MarshalWrite1k-12                 1.00 ± 0%       0.00       -100.00%  (p=0.000 n=10+10)

name                          old speed      new speed       delta
Read1k-12                      195MB/s ± 3%    194MB/s ± 6%      ~     (p=0.393 n=10+10)
Read16k-12                    1.50GB/s ± 4%   1.80GB/s ± 4%   +19.78%  (p=0.000 n=10+10)
Read32k-12                    1.88GB/s ±18%   2.60GB/s ± 5%   +38.14%  (p=0.000 n=10+10)
Read128k-12                   2.73GB/s ± 4%   3.47GB/s ± 4%   +27.07%  (p=0.000 n=10+10)
Read512k-12                   2.91GB/s ± 2%   3.75GB/s ± 4%   +28.98%  (p=0.000 n=10+9)
Read1MiB-12                   2.58GB/s ± 5%   3.56GB/s ± 3%   +38.10%  (p=0.000 n=9+10)
Read4MiB-12                   2.27GB/s ± 2%   2.77GB/s ± 5%   +22.23%  (p=0.000 n=10+10)
Read4MiBDelay10Msec-12         147MB/s ± 1%    151MB/s ± 1%    +2.85%  (p=0.000 n=10+10)
Read4MiBDelay50Msec-12        33.5MB/s ± 0%   33.8MB/s ± 0%    +0.82%  (p=0.000 n=10+10)
Read4MiBDelay150Msec-12       11.5MB/s ± 0%   11.5MB/s ± 0%    +0.31%  (p=0.000 n=10+10)
Write1k-12                     105MB/s ± 7%    102MB/s ± 5%      ~     (p=0.063 n=10+10)
Write16k-12                    820MB/s ± 4%    793MB/s ± 3%    -3.32%  (p=0.009 n=10+8)
Write32k-12                   1.08GB/s ± 5%   1.05GB/s ± 5%    -2.85%  (p=0.035 n=10+10)
Write128k-12                  1.06GB/s ± 5%   1.03GB/s ± 7%      ~     (p=0.052 n=10+10)
Write512k-12                  1.06GB/s ± 3%   1.00GB/s ± 5%    -5.55%  (p=0.000 n=10+10)
Write1MiB-12                  1.05GB/s ± 6%   1.02GB/s ± 6%      ~     (p=0.143 n=10+10)
Write4MiB-12                  1.04GB/s ± 5%   0.99GB/s ± 7%    -4.72%  (p=0.009 n=10+10)
Write4MiBDelay10Msec-12       3.09MB/s ± 0%   3.09MB/s ± 0%      ~     (all equal)
Write4MiBDelay50Msec-12        640kB/s ± 0%    640kB/s ± 0%      ~     (all equal)
Write4MiBDelay150Msec-12       220kB/s ± 0%    220kB/s ± 0%      ~     (all equal)
ReadFrom1k-12                 1.12GB/s ± 1%   1.12GB/s ± 3%      ~     (p=0.720 n=10+9)
ReadFrom16k-12                1.11GB/s ± 2%   1.10GB/s ± 3%      ~     (p=0.400 n=9+10)
ReadFrom32k-12                1.07GB/s ± 7%   1.07GB/s ± 7%      ~     (p=0.529 n=10+10)
ReadFrom128k-12               1.03GB/s ± 7%   1.02GB/s ± 7%      ~     (p=0.739 n=10+10)
ReadFrom512k-12               1.04GB/s ± 4%   1.03GB/s ± 4%      ~     (p=0.165 n=10+10)
ReadFrom1MiB-12               1.02GB/s ± 6%   1.01GB/s ± 3%      ~     (p=0.315 n=10+10)
ReadFrom4MiB-12               1.00GB/s ± 7%   1.00GB/s ± 6%      ~     (p=0.739 n=10+10)
ReadFrom4MiBDelay10Msec-12    3.08MB/s ± 0%   3.07MB/s ± 0%    -0.16%  (p=0.022 n=9+10)
ReadFrom4MiBDelay50Msec-12     640kB/s ± 0%    640kB/s ± 0%      ~     (all equal)
ReadFrom4MiBDelay150Msec-12    210kB/s ± 0%    210kB/s ± 0%      ~     (all equal)
WriteTo1k-12                  1.73GB/s ± 1%   2.08GB/s ± 1%   +20.66%  (p=0.000 n=10+10)
WriteTo16k-12                 1.72GB/s ± 1%   2.05GB/s ± 1%   +19.31%  (p=0.000 n=9+9)
WriteTo32k-12                 1.62GB/s ± 1%   1.95GB/s ± 4%   +20.07%  (p=0.000 n=9+10)
WriteTo128k-12                1.58GB/s ± 3%   1.86GB/s ± 5%   +17.91%  (p=0.000 n=10+10)
WriteTo512k-12                1.56GB/s ± 3%   1.82GB/s ± 3%   +17.13%  (p=0.000 n=10+10)
WriteTo1MiB-12                1.53GB/s ± 2%   1.82GB/s ± 2%   +18.92%  (p=0.000 n=9+10)
WriteTo4MiB-12                1.51GB/s ± 3%   1.80GB/s ± 3%   +19.06%  (p=0.000 n=10+10)
WriteTo4MiBDelay10Msec-12     97.6MB/s ± 1%   97.7MB/s ± 1%      ~     (p=0.921 n=9+10)
WriteTo4MiBDelay50Msec-12     20.5MB/s ± 0%   20.6MB/s ± 0%      ~     (p=0.807 n=10+10)
WriteTo4MiBDelay150Msec-12    6.93MB/s ± 0%   6.93MB/s ± 0%      ~     (p=1.000 n=9+10)
CopyDown10MiBDelay10Msec-12   97.5MB/s ± 2%  100.7MB/s ± 1%    +3.23%  (p=0.000 n=9+10)
CopyDown10MiBDelay50Msec-12   22.3MB/s ± 0%   22.5MB/s ± 0%    +0.85%  (p=0.000 n=10+10)
CopyDown10MiBDelay150Msec-12  7.63MB/s ± 0%   7.67MB/s ± 0%    +0.44%  (p=0.000 n=8+10)
CopyUp10MiBDelay10Msec-12     3.08MB/s ± 0%   3.08MB/s ± 0%      ~     (p=1.000 n=10+10)
CopyUp10MiBDelay50Msec-12      640kB/s ± 0%    640kB/s ± 0%      ~     (all equal)
CopyUp10MiBDelay150Msec-12     220kB/s ± 0%    220kB/s ± 0%      ~     (all equal)
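
The MarshalOpen, MarshalWriteWorstCase, and MarshalWrite1k rows above drop from 48 B and 1 alloc per op to zero. One common way to get that kind of result is to encode into a caller-supplied buffer rather than returning a freshly allocated slice. The sketch below illustrates the idea under stated assumptions: `openRequest`, `appendTo`, `marshal`, and the simplified field layout are hypothetical and are not the packet types or encoding used by pkg/sftp or filexfer. It uses `testing.AllocsPerRun` from the standard library to compare the two approaches.

```go
package main

import (
	"fmt"
	"testing"
)

// openRequest is a hypothetical, simplified request for this sketch only.
type openRequest struct {
	ID   uint32
	Path string
}

// appendUint32 appends v to b in big-endian byte order.
func appendUint32(b []byte, v uint32) []byte {
	return append(b, byte(v>>24), byte(v>>16), byte(v>>8), byte(v))
}

// appendTo encodes p onto the end of b: a length prefix, a type byte,
// the request ID, and the path as a length-prefixed string.
func (p *openRequest) appendTo(b []byte) []byte {
	size := 1 + 4 + 4 + len(p.Path) // everything after the length prefix
	b = appendUint32(b, uint32(size))
	b = append(b, 3) // type byte; 3 is SSH_FXP_OPEN in the SFTP drafts
	b = appendUint32(b, p.ID)
	b = appendUint32(b, uint32(len(p.Path)))
	return append(b, p.Path...)
}

// marshal allocates a new slice on every call.
func (p *openRequest) marshal() []byte {
	return p.appendTo(make([]byte, 0, 4+1+4+4+len(p.Path)))
}

// sink forces the result of marshal to escape to the heap.
var sink []byte

// allocsFresh measures allocations per call when a new slice is made each time.
func allocsFresh(p *openRequest) float64 {
	return testing.AllocsPerRun(100, func() { sink = p.marshal() })
}

// allocsReused measures allocations per call when one scratch buffer is reused.
func allocsReused(p *openRequest) float64 {
	scratch := make([]byte, 0, 64)
	return testing.AllocsPerRun(100, func() { scratch = p.appendTo(scratch[:0]) })
}

func main() {
	p := &openRequest{ID: 1, Path: "/tmp/file"}
	fmt.Printf("% x\n", p.marshal())
	fmt.Println("fresh allocs/op: ", allocsFresh(p))
	fmt.Println("reused allocs/op:", allocsReused(p))
}
```

The reused path stays allocation-free only while the scratch buffer has enough capacity for the encoded packet; a longer path would make `append` grow it once.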

puellanivis avatar Apr 24 '21 14:04 puellanivis

This looks like great work, thank you!

drakkan avatar Apr 25 '21 15:04 drakkan

I was just looking at the memory profile from the new version:

      flat  flat%   sum%        cum   cum%
    8.24GB  8.66% 83.74%     8.24GB  8.66%  github.com/pkg/sftp.(*sshFxInitPacket).MarshalBinary

:laughing: 8.66% of the allocations are from a packet that is sent only once per connection!

puellanivis avatar Apr 25 '21 19:04 puellanivis

Converting to draft, as there are a lot of merge conflicts that have to be resolved…

puellanivis avatar Jul 02 '21 08:07 puellanivis

I’m going to go ahead and close this. There are so many conflicts now that it’s probably best to just restart. (The PR will be kept around as a reference anyway, so there’s no point in keeping it in an “open” state.)

puellanivis avatar May 19 '23 05:05 puellanivis