Refactor: Start using filexfer
So, I’m checking in and pushing a PR very early here, so that others can get a sense of what the client refactor will look like after integrating filexfer.
The early attention here should be on the Client.close() function, which I felt would be the quickest way to show off how things would change. The changes to Conn and clientConn are there to support the changes in close(), and to allow a parallel old-way/new-way hand-off from one set of wheels to the other while the car is still running. Before the PR is merged, only the new way will remain.
It’s a spring cleaning and some touch-up along with the refactor. (I came across a few TODOs in areas where I have good experience, so I could roll them out quickly while I was already there.)
So far, I’ve mostly only refactored the client side. There have been some changes to the server code to remove utility functions that were mostly used by the client code, so that was more of a cleanup than a focused refactor. I’ve been sure to keep the code working on both sides the whole way, so since this will probably already be a pretty chonky PR, I’m thinking I’ll push the client refactor first, and then we can do the server-side work in a separate PR.
From:
sftp.clean$ go tool pprof memprofile.out
(pprof) top10
Showing nodes accounting for 500.49GB, 98.94% of 505.86GB total
Dropped 170 nodes (cum <= 2.53GB)
Showing top 10 nodes out of 28
flat flat% sum% cum cum%
403.43GB 79.75% 79.75% 403.43GB 79.75% github.com/pkg/sftp.recvPacket
40.09GB 7.92% 87.68% 40.09GB 7.92% github.com/pkg/sftp.(*bufPool).Get
15.94GB 3.15% 90.83% 15.94GB 3.15% github.com/pkg/sftp.(*sshFxpWritePacket).marshalPacket
15.48GB 3.06% 93.89% 15.48GB 3.06% github.com/pkg/sftp.(*allocator).GetPage
8.21GB 1.62% 95.51% 8.21GB 1.62% github.com/pkg/sftp.(*sshFxInitPacket).MarshalBinary
7.27GB 1.44% 96.95% 7.27GB 1.44% github.com/pkg/sftp.(*sshFxpOpenPacket).MarshalBinary
5.20GB 1.03% 97.97% 5.25GB 1.04% github.com/pkg/sftp.benchmarkWriteTo
2.19GB 0.43% 98.41% 3.55GB 0.7% github.com/pkg/sftp.benchmarkReadFrom
2.10GB 0.42% 98.82% 3.59GB 0.71% github.com/pkg/sftp.benchmarkWrite
0.59GB 0.12% 98.94% 2.69GB 0.53% github.com/pkg/sftp.(*File).writeChunkAt
to:
sftp$ go tool pprof memprofile.out
(pprof) top10
Showing nodes accounting for 93.60GB, 98.37% of 95.16GB total
Dropped 130 nodes (cum <= 0.48GB)
Showing top 10 nodes out of 42
flat flat% sum% cum cum%
54.95GB 57.75% 57.75% 54.95GB 57.75% github.com/pkg/sftp.(*bufPool).Get
16.50GB 17.34% 75.08% 16.50GB 17.34% github.com/pkg/sftp.(*allocator).GetPage
8.24GB 8.66% 83.74% 8.24GB 8.66% github.com/pkg/sftp.(*sshFxInitPacket).MarshalBinary
5.10GB 5.36% 89.11% 5.15GB 5.41% github.com/pkg/sftp.benchmarkWriteTo
2.21GB 2.32% 91.43% 28.50GB 29.95% github.com/pkg/sftp.(*clientConn).sendPacket
2.19GB 2.30% 93.73% 3.61GB 3.79% github.com/pkg/sftp.benchmarkReadFrom
1.97GB 2.07% 95.80% 3.05GB 3.21% github.com/pkg/sftp.benchmarkWrite
0.86GB 0.9% 96.71% 0.86GB 0.9% github.com/pkg/sftp.(*delayedWriter).Write
0.80GB 0.84% 97.54% 27.30GB 28.69% github.com/pkg/sftp.(*File).readChunkAt
0.78GB 0.82% 98.37% 0.78GB 0.82% github.com/pkg/sftp.(*File).WriteTo.func2
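As a quick sanity check on the two profiles above, the totals can be compared directly. This is only a sketch: the helper name reductionPercent is made up for illustration, and the figures are copied from the pprof output.

```go
package main

import "fmt"

// reductionPercent reports how much smaller after is than before, in percent.
// (Hypothetical helper, not part of pkg/sftp.)
func reductionPercent(before, after float64) float64 {
	return (before - after) / before * 100
}

func main() {
	// Totals in GB, taken from the "Showing nodes accounting for ..." lines.
	const oldTotal, newTotal = 505.86, 95.16

	fmt.Printf("total allocated: %.1f%% less\n", reductionPercent(oldTotal, newTotal))
	// prints "total allocated: 81.2% less"
}
```

Most of that difference lines up with recvPacket, which accounts for 403.43GB in the old top10 and does not appear in the new top10 at all.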
benchstats:
name old time/op new time/op delta
AllocatorSerial-12 208ns ±18% 218ns ±13% ~ (p=0.113 n=10+9)
AllocatorParallel/1-12 377ns ± 1% 339ns ± 1% -10.10% (p=0.000 n=9+9)
AllocatorParallel/2-12 386ns ± 1% 374ns ± 1% -3.25% (p=0.000 n=10+10)
AllocatorParallel/4-12 434ns ± 4% 425ns ± 5% ~ (p=0.105 n=10+10)
AllocatorParallel/8-12 484ns ±10% 467ns ±14% ~ (p=0.123 n=10+10)
Read1k-12 53.7ms ± 3% 54.1ms ± 5% ~ (p=0.393 n=10+10)
Read16k-12 6.98ms ± 4% 5.83ms ± 4% -16.52% (p=0.000 n=10+10)
Read32k-12 5.61ms ±20% 4.03ms ± 6% -28.10% (p=0.000 n=10+10)
Read128k-12 3.85ms ± 4% 3.03ms ± 4% -21.28% (p=0.000 n=10+10)
Read512k-12 3.60ms ± 2% 2.79ms ± 4% -22.45% (p=0.000 n=10+9)
Read1MiB-12 4.06ms ± 5% 2.94ms ± 3% -27.60% (p=0.000 n=9+10)
Read4MiB-12 4.63ms ± 2% 3.79ms ± 5% -18.13% (p=0.000 n=10+10)
Read4MiBDelay10Msec-12 71.3ms ± 1% 69.3ms ± 1% -2.77% (p=0.000 n=10+10)
Read4MiBDelay50Msec-12 313ms ± 0% 311ms ± 0% -0.82% (p=0.000 n=10+10)
Read4MiBDelay150Msec-12 914ms ± 0% 912ms ± 0% -0.30% (p=0.000 n=10+10)
Write1k-12 100ms ± 8% 103ms ± 5% ~ (p=0.063 n=10+10)
Write16k-12 12.8ms ± 4% 13.2ms ± 3% +3.41% (p=0.009 n=10+8)
Write32k-12 9.72ms ± 4% 10.01ms ± 5% +2.94% (p=0.035 n=10+10)
Write128k-12 9.93ms ± 5% 10.24ms ± 7% ~ (p=0.052 n=10+10)
Write512k-12 9.86ms ± 3% 10.44ms ± 6% +5.95% (p=0.000 n=10+10)
Write1MiB-12 10.0ms ± 5% 10.3ms ± 7% ~ (p=0.143 n=10+10)
Write4MiB-12 10.1ms ± 5% 10.6ms ± 7% +5.06% (p=0.009 n=10+10)
Write4MiBDelay10Msec-12 3.40s ± 0% 3.39s ± 0% ~ (p=0.113 n=9+10)
Write4MiBDelay50Msec-12 16.3s ± 0% 16.3s ± 0% ~ (p=0.340 n=9+9)
Write4MiBDelay150Msec-12 48.7s ± 0% 48.7s ± 0% ~ (p=0.243 n=10+9)
ReadFrom1k-12 9.36ms ± 1% 9.35ms ± 3% ~ (p=0.720 n=10+9)
ReadFrom16k-12 9.47ms ± 1% 9.54ms ± 3% ~ (p=0.400 n=9+10)
ReadFrom32k-12 9.78ms ± 7% 9.73ms ± 3% ~ (p=0.780 n=10+9)
ReadFrom128k-12 10.1ms ± 3% 10.3ms ± 7% ~ (p=0.497 n=9+10)
ReadFrom512k-12 10.1ms ± 4% 10.2ms ± 4% ~ (p=0.165 n=10+10)
ReadFrom1MiB-12 10.3ms ± 5% 10.4ms ± 3% ~ (p=0.315 n=10+10)
ReadFrom4MiB-12 10.5ms ± 7% 10.5ms ± 6% ~ (p=0.739 n=10+10)
ReadFrom4MiBDelay10Msec-12 3.41s ± 0% 3.41s ± 0% ~ (p=0.529 n=10+10)
ReadFrom4MiBDelay50Msec-12 16.4s ± 0% 16.4s ± 0% +0.03% (p=0.000 n=10+9)
ReadFrom4MiBDelay150Msec-12 48.8s ± 0% 48.8s ± 0% +0.00% (p=0.003 n=10+10)
WriteTo1k-12 6.07ms ± 1% 5.03ms ± 1% -17.12% (p=0.000 n=10+10)
WriteTo16k-12 6.11ms ± 1% 5.12ms ± 1% -16.18% (p=0.000 n=9+9)
WriteTo32k-12 6.47ms ± 1% 5.39ms ± 4% -16.68% (p=0.000 n=9+10)
WriteTo128k-12 6.64ms ± 3% 5.64ms ± 5% -15.15% (p=0.000 n=10+10)
WriteTo512k-12 6.74ms ± 3% 5.75ms ± 3% -14.61% (p=0.000 n=10+10)
WriteTo1MiB-12 6.86ms ± 2% 5.77ms ± 2% -15.91% (p=0.000 n=9+10)
WriteTo4MiB-12 6.95ms ± 3% 5.84ms ± 3% -16.01% (p=0.000 n=10+10)
WriteTo4MiBDelay10Msec-12 107ms ± 1% 107ms ± 1% ~ (p=0.905 n=9+10)
WriteTo4MiBDelay50Msec-12 510ms ± 0% 510ms ± 0% ~ (p=0.796 n=10+10)
WriteTo4MiBDelay150Msec-12 1.51s ± 0% 1.51s ± 0% ~ (p=0.971 n=10+10)
CopyDown10MiBDelay10Msec-12 108ms ± 2% 104ms ± 1% -3.14% (p=0.000 n=9+10)
CopyDown10MiBDelay50Msec-12 469ms ± 0% 465ms ± 0% -0.85% (p=0.000 n=10+10)
CopyDown10MiBDelay150Msec-12 1.37s ± 0% 1.37s ± 0% -0.38% (p=0.000 n=9+10)
CopyUp10MiBDelay10Msec-12 3.41s ± 0% 3.41s ± 0% ~ (p=0.579 n=10+10)
CopyUp10MiBDelay50Msec-12 16.3s ± 0% 16.3s ± 0% ~ (p=0.739 n=10+10)
CopyUp10MiBDelay150Msec-12 48.5s ± 0% 48.5s ± 0% ~ (p=0.052 n=10+10)
MarshalInit-12 105ns ±21% 93ns ±16% -11.33% (p=0.027 n=10+10)
MarshalOpen-12 99.4ns ±27% 39.5ns ± 6% -60.30% (p=0.000 n=10+10)
MarshalWriteWorstCase-12 95.8ns ±22% 40.3ns ± 3% -57.92% (p=0.000 n=10+10)
MarshalWrite1k-12 99.3ns ±10% 39.6ns ± 0% -60.10% (p=0.000 n=10+8)
name old alloc/op new alloc/op delta
AllocatorSerial-12 72.0B ± 0% 72.0B ± 0% ~ (all equal)
AllocatorParallel/1-12 72.0B ± 0% 72.0B ± 0% ~ (all equal)
AllocatorParallel/2-12 72.0B ± 0% 72.0B ± 0% ~ (all equal)
AllocatorParallel/4-12 72.0B ± 0% 72.0B ± 0% ~ (all equal)
AllocatorParallel/8-12 72.0B ± 0% 72.0B ± 0% ~ (all equal)
Read1k-12 6.99MB ± 0% 0.70MB ± 0% -89.94% (p=0.000 n=10+8)
Read16k-12 5.99MB ± 0% 0.04MB ± 0% -99.26% (p=0.000 n=10+8)
Read32k-12 6.63MB ± 0% 0.02MB ± 0% -99.66% (p=0.000 n=10+10)
Read128k-12 6.77MB ± 0% 0.04MB ± 0% -99.47% (p=0.000 n=9+8)
Read512k-12 7.25MB ± 0% 0.03MB ± 1% -99.60% (p=0.000 n=10+10)
Read1MiB-12 7.91MB ± 0% 0.03MB ± 3% -99.61% (p=0.000 n=9+10)
Read4MiB-12 10.5MB ± 0% 0.0MB ± 6% -99.60% (p=0.000 n=10+10)
Read4MiBDelay10Msec-12 10.5MB ± 0% 0.1MB ± 8% -99.17% (p=0.000 n=10+8)
Read4MiBDelay50Msec-12 10.5MB ± 0% 0.2MB ±21% -98.30% (p=0.000 n=10+10)
Read4MiBDelay150Msec-12 10.5MB ± 0% 0.3MB ±17% -97.27% (p=0.000 n=10+10)
Write1k-12 3.36MB ± 0% 1.89MB ± 0% -43.73% (p=0.000 n=10+10)
Write16k-12 212kB ± 0% 120kB ± 0% -43.45% (p=0.000 n=10+9)
Write32k-12 107kB ± 0% 61kB ± 0% -43.15% (p=0.000 n=10+9)
Write128k-12 72.1kB ± 0% 60.7kB ± 0% -15.84% (p=0.000 n=10+10)
Write512k-12 63.5kB ± 0% 60.7kB ± 0% -4.38% (p=0.000 n=10+9)
Write1MiB-12 62.0kB ± 0% 60.7kB ± 0% -2.17% (p=0.000 n=10+10)
Write4MiB-12 60.9kB ± 0% 60.7kB ± 0% -0.28% (p=0.000 n=10+10)
Write4MiBDelay10Msec-12 10.6MB ± 0% 10.6MB ± 0% +0.39% (p=0.000 n=9+10)
Write4MiBDelay50Msec-12 10.6MB ± 0% 10.6MB ± 0% +0.38% (p=0.000 n=9+9)
Write4MiBDelay150Msec-12 10.6MB ± 0% 10.6MB ± 0% +0.39% (p=0.000 n=8+9)
ReadFrom1k-12 93.7kB ± 0% 93.6kB ± 0% -0.09% (p=0.002 n=9+8)
ReadFrom16k-12 93.7kB ± 0% 93.6kB ± 0% -0.11% (p=0.000 n=10+10)
ReadFrom32k-12 93.7kB ± 0% 93.7kB ± 0% -0.08% (p=0.000 n=10+9)
ReadFrom128k-12 93.7kB ± 0% 93.7kB ± 0% -0.07% (p=0.006 n=10+9)
ReadFrom512k-12 93.7kB ± 0% 93.7kB ± 0% -0.04% (p=0.030 n=10+10)
ReadFrom1MiB-12 93.7kB ± 0% 93.7kB ± 0% -0.04% (p=0.028 n=10+9)
ReadFrom4MiB-12 93.7kB ± 0% 93.7kB ± 0% ~ (p=0.071 n=9+7)
ReadFrom4MiBDelay10Msec-12 10.6MB ± 0% 10.6MB ± 0% +0.39% (p=0.000 n=8+10)
ReadFrom4MiBDelay50Msec-12 10.6MB ± 0% 10.6MB ± 0% +0.38% (p=0.000 n=9+10)
ReadFrom4MiBDelay150Msec-12 10.6MB ± 0% 10.6MB ± 0% +0.39% (p=0.000 n=10+10)
WriteTo1k-12 15.3MB ± 0% 2.7MB ± 0% -82.25% (p=0.000 n=10+10)
WriteTo16k-12 15.3MB ± 0% 2.7MB ± 0% -82.24% (p=0.000 n=9+10)
WriteTo32k-12 15.3MB ± 0% 2.7MB ± 0% -82.24% (p=0.000 n=10+10)
WriteTo128k-12 15.3MB ± 0% 2.7MB ± 0% -82.25% (p=0.000 n=10+9)
WriteTo512k-12 15.3MB ± 0% 2.7MB ± 0% -82.24% (p=0.000 n=10+10)
WriteTo1MiB-12 15.3MB ± 0% 2.7MB ± 0% -82.25% (p=0.000 n=10+10)
WriteTo4MiB-12 15.3MB ± 0% 2.7MB ± 0% -82.24% (p=0.000 n=10+10)
WriteTo4MiBDelay10Msec-12 15.3MB ± 0% 2.8MB ± 3% -81.81% (p=0.000 n=10+10)
WriteTo4MiBDelay50Msec-12 15.3MB ± 0% 3.0MB ± 3% -80.71% (p=0.000 n=10+10)
WriteTo4MiBDelay150Msec-12 15.3MB ± 0% 3.2MB ±15% -79.40% (p=0.000 n=10+10)
CopyDown10MiBDelay10Msec-12 15.3MB ± 0% 2.8MB ± 4% -81.72% (p=0.000 n=10+9)
CopyDown10MiBDelay50Msec-12 15.3MB ± 0% 2.9MB ± 2% -81.17% (p=0.000 n=10+8)
CopyDown10MiBDelay150Msec-12 15.3MB ± 0% 3.2MB ± 5% -79.21% (p=0.000 n=10+10)
CopyUp10MiBDelay10Msec-12 10.6MB ± 0% 10.7MB ± 0% +0.79% (p=0.000 n=9+10)
CopyUp10MiBDelay50Msec-12 10.6MB ± 0% 10.7MB ± 0% +0.77% (p=0.000 n=9+8)
CopyUp10MiBDelay150Msec-12 10.6MB ± 0% 10.7MB ± 0% +0.77% (p=0.000 n=10+10)
MarshalInit-12 48.0B ± 0% 48.0B ± 0% ~ (all equal)
MarshalOpen-12 48.0B ± 0% 0.0B -100.00% (p=0.000 n=10+10)
MarshalWriteWorstCase-12 48.0B ± 0% 0.0B -100.00% (p=0.000 n=10+10)
MarshalWrite1k-12 48.0B ± 0% 0.0B -100.00% (p=0.000 n=10+10)
name old allocs/op new allocs/op delta
AllocatorSerial-12 2.00 ± 0% 2.00 ± 0% ~ (all equal)
AllocatorParallel/1-12 2.00 ± 0% 2.00 ± 0% ~ (all equal)
AllocatorParallel/2-12 2.00 ± 0% 2.00 ± 0% ~ (all equal)
AllocatorParallel/4-12 2.00 ± 0% 2.00 ± 0% ~ (all equal)
AllocatorParallel/8-12 2.00 ± 0% 2.00 ± 0% ~ (all equal)
Read1k-12 30.7k ± 0% 15.4k ± 0% -50.00% (p=0.000 n=9+10)
Read16k-12 1.94k ± 0% 0.97k ± 0% -49.95% (p=0.000 n=10+10)
Read32k-12 982 ± 0% 492 ± 0% -49.90% (p=0.000 n=10+10)
Read128k-12 1.25k ± 0% 0.67k ± 0% -46.71% (p=0.002 n=8+10)
Read512k-12 1.14k ± 0% 0.58k ± 0% -49.16% (p=0.000 n=10+10)
Read1MiB-12 1.22k ± 0% 0.61k ± 0% -49.89% (p=0.000 n=9+8)
Read4MiB-12 1.32k ± 0% 0.79k ± 0% -40.21% (p=0.000 n=10+8)
Read4MiBDelay10Msec-12 1.57k ± 0% 1.06k ± 1% -32.85% (p=0.000 n=10+10)
Read4MiBDelay50Msec-12 1.58k ± 1% 1.09k ± 1% -31.02% (p=0.000 n=10+10)
Read4MiBDelay150Msec-12 1.59k ± 2% 1.14k ± 2% -28.02% (p=0.000 n=10+10)
Write1k-12 82.0k ± 0% 41.0k ± 0% -49.99% (p=0.000 n=10+10)
Write16k-12 5.16k ± 0% 2.59k ± 0% -49.84% (p=0.000 n=10+10)
Write32k-12 2.60k ± 0% 1.31k ± 0% -49.69% (p=0.000 n=9+9)
Write128k-12 2.12k ± 0% 1.31k ± 0% -38.29% (p=0.000 n=10+9)
Write512k-12 2.00k ± 0% 1.31k ± 0% -34.58% (p=0.000 n=10+9)
Write1MiB-12 1.98k ± 0% 1.31k ± 0% -33.89% (p=0.000 n=8+10)
Write4MiB-12 1.96k ± 0% 1.31k ± 0% -33.35% (p=0.000 n=8+10)
Write4MiBDelay10Msec-12 2.62k ± 2% 1.98k ± 2% -24.46% (p=0.000 n=9+10)
Write4MiBDelay50Msec-12 2.61k ± 0% 1.96k ± 0% -25.05% (p=0.000 n=9+9)
Write4MiBDelay150Msec-12 2.61k ± 0% 1.96k ± 1% -24.83% (p=0.000 n=8+9)
ReadFrom1k-12 1.97k ± 0% 1.31k ± 0% -33.26% (p=0.000 n=8+10)
ReadFrom16k-12 1.97k ± 0% 1.31k ± 0% -33.27% (p=0.000 n=9+10)
ReadFrom32k-12 1.97k ± 0% 1.31k ± 0% -33.25% (p=0.000 n=8+8)
ReadFrom128k-12 1.97k ± 0% 1.31k ± 0% -33.26% (p=0.000 n=9+10)
ReadFrom512k-12 1.97k ± 0% 1.31k ± 0% -33.26% (p=0.000 n=8+10)
ReadFrom1MiB-12 1.97k ± 0% 1.31k ± 0% -33.25% (p=0.000 n=8+8)
ReadFrom4MiB-12 1.97k ± 0% 1.31k ± 0% -33.25% (p=0.000 n=8+8)
ReadFrom4MiBDelay10Msec-12 2.61k ± 0% 1.97k ± 1% -24.59% (p=0.000 n=8+10)
ReadFrom4MiBDelay50Msec-12 2.62k ± 1% 1.97k ± 1% -24.81% (p=0.000 n=9+10)
ReadFrom4MiBDelay150Msec-12 2.62k ± 1% 1.98k ± 2% -24.65% (p=0.000 n=10+10)
WriteTo1k-12 2.29k ± 0% 1.76k ± 0% -23.14% (p=0.000 n=10+10)
WriteTo16k-12 2.29k ± 0% 1.76k ± 0% -23.11% (p=0.000 n=10+8)
WriteTo32k-12 2.29k ± 0% 1.76k ± 0% -23.11% (p=0.000 n=10+10)
WriteTo128k-12 2.29k ± 0% 1.76k ± 0% -23.11% (p=0.000 n=10+6)
WriteTo512k-12 2.29k ± 0% 1.76k ± 0% -23.07% (p=0.000 n=10+10)
WriteTo1MiB-12 2.29k ± 0% 1.76k ± 0% -23.09% (p=0.000 n=10+10)
WriteTo4MiB-12 2.29k ± 0% 1.76k ± 0% -23.05% (p=0.000 n=10+10)
WriteTo4MiBDelay10Msec-12 2.68k ± 0% 2.17k ± 1% -19.04% (p=0.000 n=10+10)
WriteTo4MiBDelay50Msec-12 2.69k ± 1% 2.24k ± 1% -16.77% (p=0.000 n=10+10)
WriteTo4MiBDelay150Msec-12 2.71k ± 1% 2.32k ± 2% -14.51% (p=0.000 n=10+10)
CopyDown10MiBDelay10Msec-12 2.68k ± 0% 2.18k ± 0% -18.94% (p=0.000 n=10+10)
CopyDown10MiBDelay50Msec-12 2.69k ± 1% 2.21k ± 1% -17.82% (p=0.000 n=10+10)
CopyDown10MiBDelay150Msec-12 2.70k ± 1% 2.31k ± 1% -14.34% (p=0.000 n=10+10)
CopyUp10MiBDelay10Msec-12 2.60k ± 0% 1.97k ± 1% -24.23% (p=0.000 n=9+10)
CopyUp10MiBDelay50Msec-12 2.61k ± 1% 1.96k ± 0% -24.88% (p=0.000 n=9+8)
CopyUp10MiBDelay150Msec-12 2.62k ± 1% 1.97k ± 1% -24.82% (p=0.000 n=10+10)
MarshalInit-12 1.00 ± 0% 1.00 ± 0% ~ (all equal)
MarshalOpen-12 1.00 ± 0% 0.00 -100.00% (p=0.000 n=10+10)
MarshalWriteWorstCase-12 1.00 ± 0% 0.00 -100.00% (p=0.000 n=10+10)
MarshalWrite1k-12 1.00 ± 0% 0.00 -100.00% (p=0.000 n=10+10)
name old speed new speed delta
Read1k-12 195MB/s ± 3% 194MB/s ± 6% ~ (p=0.393 n=10+10)
Read16k-12 1.50GB/s ± 4% 1.80GB/s ± 4% +19.78% (p=0.000 n=10+10)
Read32k-12 1.88GB/s ±18% 2.60GB/s ± 5% +38.14% (p=0.000 n=10+10)
Read128k-12 2.73GB/s ± 4% 3.47GB/s ± 4% +27.07% (p=0.000 n=10+10)
Read512k-12 2.91GB/s ± 2% 3.75GB/s ± 4% +28.98% (p=0.000 n=10+9)
Read1MiB-12 2.58GB/s ± 5% 3.56GB/s ± 3% +38.10% (p=0.000 n=9+10)
Read4MiB-12 2.27GB/s ± 2% 2.77GB/s ± 5% +22.23% (p=0.000 n=10+10)
Read4MiBDelay10Msec-12 147MB/s ± 1% 151MB/s ± 1% +2.85% (p=0.000 n=10+10)
Read4MiBDelay50Msec-12 33.5MB/s ± 0% 33.8MB/s ± 0% +0.82% (p=0.000 n=10+10)
Read4MiBDelay150Msec-12 11.5MB/s ± 0% 11.5MB/s ± 0% +0.31% (p=0.000 n=10+10)
Write1k-12 105MB/s ± 7% 102MB/s ± 5% ~ (p=0.063 n=10+10)
Write16k-12 820MB/s ± 4% 793MB/s ± 3% -3.32% (p=0.009 n=10+8)
Write32k-12 1.08GB/s ± 5% 1.05GB/s ± 5% -2.85% (p=0.035 n=10+10)
Write128k-12 1.06GB/s ± 5% 1.03GB/s ± 7% ~ (p=0.052 n=10+10)
Write512k-12 1.06GB/s ± 3% 1.00GB/s ± 5% -5.55% (p=0.000 n=10+10)
Write1MiB-12 1.05GB/s ± 6% 1.02GB/s ± 6% ~ (p=0.143 n=10+10)
Write4MiB-12 1.04GB/s ± 5% 0.99GB/s ± 7% -4.72% (p=0.009 n=10+10)
Write4MiBDelay10Msec-12 3.09MB/s ± 0% 3.09MB/s ± 0% ~ (all equal)
Write4MiBDelay50Msec-12 640kB/s ± 0% 640kB/s ± 0% ~ (all equal)
Write4MiBDelay150Msec-12 220kB/s ± 0% 220kB/s ± 0% ~ (all equal)
ReadFrom1k-12 1.12GB/s ± 1% 1.12GB/s ± 3% ~ (p=0.720 n=10+9)
ReadFrom16k-12 1.11GB/s ± 2% 1.10GB/s ± 3% ~ (p=0.400 n=9+10)
ReadFrom32k-12 1.07GB/s ± 7% 1.07GB/s ± 7% ~ (p=0.529 n=10+10)
ReadFrom128k-12 1.03GB/s ± 7% 1.02GB/s ± 7% ~ (p=0.739 n=10+10)
ReadFrom512k-12 1.04GB/s ± 4% 1.03GB/s ± 4% ~ (p=0.165 n=10+10)
ReadFrom1MiB-12 1.02GB/s ± 6% 1.01GB/s ± 3% ~ (p=0.315 n=10+10)
ReadFrom4MiB-12 1.00GB/s ± 7% 1.00GB/s ± 6% ~ (p=0.739 n=10+10)
ReadFrom4MiBDelay10Msec-12 3.08MB/s ± 0% 3.07MB/s ± 0% -0.16% (p=0.022 n=9+10)
ReadFrom4MiBDelay50Msec-12 640kB/s ± 0% 640kB/s ± 0% ~ (all equal)
ReadFrom4MiBDelay150Msec-12 210kB/s ± 0% 210kB/s ± 0% ~ (all equal)
WriteTo1k-12 1.73GB/s ± 1% 2.08GB/s ± 1% +20.66% (p=0.000 n=10+10)
WriteTo16k-12 1.72GB/s ± 1% 2.05GB/s ± 1% +19.31% (p=0.000 n=9+9)
WriteTo32k-12 1.62GB/s ± 1% 1.95GB/s ± 4% +20.07% (p=0.000 n=9+10)
WriteTo128k-12 1.58GB/s ± 3% 1.86GB/s ± 5% +17.91% (p=0.000 n=10+10)
WriteTo512k-12 1.56GB/s ± 3% 1.82GB/s ± 3% +17.13% (p=0.000 n=10+10)
WriteTo1MiB-12 1.53GB/s ± 2% 1.82GB/s ± 2% +18.92% (p=0.000 n=9+10)
WriteTo4MiB-12 1.51GB/s ± 3% 1.80GB/s ± 3% +19.06% (p=0.000 n=10+10)
WriteTo4MiBDelay10Msec-12 97.6MB/s ± 1% 97.7MB/s ± 1% ~ (p=0.921 n=9+10)
WriteTo4MiBDelay50Msec-12 20.5MB/s ± 0% 20.6MB/s ± 0% ~ (p=0.807 n=10+10)
WriteTo4MiBDelay150Msec-12 6.93MB/s ± 0% 6.93MB/s ± 0% ~ (p=1.000 n=9+10)
CopyDown10MiBDelay10Msec-12 97.5MB/s ± 2% 100.7MB/s ± 1% +3.23% (p=0.000 n=9+10)
CopyDown10MiBDelay50Msec-12 22.3MB/s ± 0% 22.5MB/s ± 0% +0.85% (p=0.000 n=10+10)
CopyDown10MiBDelay150Msec-12 7.63MB/s ± 0% 7.67MB/s ± 0% +0.44% (p=0.000 n=8+10)
CopyUp10MiBDelay10Msec-12 3.08MB/s ± 0% 3.08MB/s ± 0% ~ (p=1.000 n=10+10)
CopyUp10MiBDelay50Msec-12 640kB/s ± 0% 640kB/s ± 0% ~ (all equal)
CopyUp10MiBDelay150Msec-12 220kB/s ± 0% 220kB/s ± 0% ~ (all equal)
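The MarshalOpen and MarshalWrite rows above go from one allocation per op to zero. One common way to get that effect in Go is to encode into a buffer the caller already owns instead of returning a fresh slice. The sketch below illustrates that general technique only: openPacket, appendTo and marshalAlloc are hypothetical names with a simplified field layout, not the pkg/sftp or filexfer API.

```go
package main

import (
	"fmt"
	"testing"
)

// openPacket is a hypothetical stand-in for a request packet;
// it is not a pkg/sftp type and its layout is simplified.
type openPacket struct {
	ID     uint32
	Path   string
	Pflags uint32
}

// appendUint32 appends v to b in big-endian byte order.
func appendUint32(b []byte, v uint32) []byte {
	return append(b, byte(v>>24), byte(v>>16), byte(v>>8), byte(v))
}

// appendTo appends the encoded packet to b and returns the extended slice.
// It only allocates when b has too little spare capacity.
func (p *openPacket) appendTo(b []byte) []byte {
	b = appendUint32(b, p.ID)
	b = appendUint32(b, uint32(len(p.Path)))
	b = append(b, p.Path...)
	return appendUint32(b, p.Pflags)
}

// marshalAlloc encodes into a freshly allocated slice on every call.
func (p *openPacket) marshalAlloc() []byte {
	return p.appendTo(make([]byte, 0, 12+len(p.Path)))
}

// sink keeps results reachable so the compiler cannot elide the allocation.
var sink []byte

func main() {
	p := &openPacket{ID: 1, Path: "/tmp/file", Pflags: 2}
	buf := make([]byte, 0, 64) // allocated once, reused for every call

	fresh := testing.AllocsPerRun(1000, func() { sink = p.marshalAlloc() })
	reused := testing.AllocsPerRun(1000, func() { buf = p.appendTo(buf[:0]) })

	fmt.Printf("fresh buffer: %.0f allocs/op, reused buffer: %.0f allocs/op\n", fresh, reused)
}
```

The trade-off is that the caller now owns the buffer's lifetime: the encoded bytes are only valid until the next call that reuses the same buffer.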
This looks like great work, thank you.
I was just looking at the memory profile from the new version:
flat flat% sum% cum cum%
8.24GB 8.66% 83.74% 8.24GB 8.66% github.com/pkg/sftp.(*sshFxInitPacket).MarshalBinary
:laughing: 8.66% of the allocations are from a packet that is sent only once per connection!
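For context, the absolute figure for that node barely moved between the two profiles (8.21GB before, 8.24GB after); its share grew because the total shrank. A small sketch of the arithmetic, with a made-up helper name (sharePercent) and the numbers copied from the pprof output in the description:

```go
package main

import "fmt"

// sharePercent returns part as a percentage of total.
// (Hypothetical helper, not part of pkg/sftp.)
func sharePercent(part, total float64) float64 {
	return part / total * 100
}

func main() {
	// (*sshFxInitPacket).MarshalBinary flat size and profile total, in GB.
	fmt.Printf("old: %.2f%%\n", sharePercent(8.21, 505.86)) // prints "old: 1.62%"
	fmt.Printf("new: %.2f%%\n", sharePercent(8.24, 95.16))  // prints "new: 8.66%"
}
```

The benchstat table also lists a MarshalInit benchmark at 48.0B/op in both runs, so a plausible reading (an assumption, not something the profile states) is that the benchmark loop itself produces this volume rather than real connections.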
Converting to draft, as there are a lot of merge conflicts that have to be resolved…
I’m going to go ahead and close this. There are so many conflicts now that it’s probably best to just restart. (The PR will be kept around as a reference anyway, so there’s no point in keeping it in an “open” state.)