
fastPut generates corrupted zip files

Open dvlato opened this issue 5 years ago • 11 comments

When we upload a ZIP file (size 37430637 bytes) using fastPut with default options (I am using the ssh2-sftp-client package, so I am just doing sftp.put(origin,target).then(sftp.end())), the uploaded file is corrupted.

In my tests, most of the time (probably always on Node.js 8) the uploaded file has a different size from the original (I've also reproduced the file size difference with this smaller attached file, though not consistently).
images.zip

I would say that with Node 10 the file size is correct with the default options, but if I change the options to {concurrency: 32, chunkSize: 8192}, the file size is 37422445 instead of 37430637 (the difference is exactly equal to chunkSize!).
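
For reference, the repro is roughly shaped like this (a minimal sketch using the ssh2-sftp-client API; the host, credentials, and paths are placeholders, not the real ones):

```js
// Minimal sketch of the failing upload via ssh2-sftp-client.
// Host/credentials/paths are placeholders.
const SftpClient = require('ssh2-sftp-client');

const sftp = new SftpClient();
sftp
  .connect({ host: 'sftp.example.com', username: 'user', password: 'secret' })
  // default options, as described above...
  .then(() => sftp.put('/local/images.zip', '/remote/images.zip'))
  .then(() => sftp.end())
  .catch((err) => console.error('upload failed:', err));

// ...or, for the non-default test, explicitly via fastPut:
//   sftp.fastPut('/local/images.zip', '/remote/images.zip',
//                { concurrency: 32, chunkSize: 8192 })
```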

However, even when using Node.js 10 with default options, the file contents are not the same (even though the file size seems to match); this is the difference that 'cmp' reports:

differ: char 655361, line 2098

I can reproduce this issue with different operating systems and computers.

dvlato avatar May 01 '19 15:05 dvlato

I understand the file gets corrupted because it's a ZIP consisting of thousands of very small files, so the zip structure can be damaged easily...

dvlato avatar May 01 '19 15:05 dvlato

Which version of ssh2-streams is being used here (npm ls should tell you)? I just recently published a version that should have fixed this.

mscdex avatar May 01 '19 17:05 mscdex

I've tried both 0.4.4 and 0.4.2 and found the issue with both, so it seems to be different from the issue you fixed yesterday.

dvlato avatar May 01 '19 18:05 dvlato

I can't reproduce the issue with latest ssh2/ssh2-streams with the configurations you mentioned and with a file of the same exact size. The sha1sums always match on both sides.

mscdex avatar May 02 '19 06:05 mscdex

Hi, thanks a lot for the quick response! I have checked, and it seems I only get these corrupt files when connecting to that specific SFTP server (Akamai's NetStorage). This is with fastPut; put works fine but is very slow (about 2 minutes compared to a few seconds with lftp). The banner I get is: SSH-2.0-Server-VIII-hpn14v11. Do you have any idea how I should troubleshoot this issue? Is there any special feature or SFTP version needed?

dvlato avatar May 02 '19 15:05 dvlato

I don't know what their underlying platform is, so the best I'm able to do is use hpn14v11 with OpenSSH v7.3 (the only OpenSSH version that particular hpn version was designed for). Again, every upload I try in that scenario comes out the same: the sha1sums match on both sides.

Are you saying it only happens with this particular server and if you use fastPut() to upload the file to another server, the file contents match exactly?

As far as debugging goes, there is nothing special: just debug fastXfer() however you like, with an IDE or by inserting debug statements.

mscdex avatar May 03 '19 01:05 mscdex

Also, as far as transfer speed goes, ssh2 isn't usually dramatically far off in comparison to a standard OpenSSH sftp transfer using the same cipher/MAC selection. With ssh2 you can get much faster transfer speeds if you use an AES-GCM cipher on a machine that has AES acceleration, since that is more efficient than having a separate cipher and MAC. If your CPU supports it (most modern x86 processors do), you might consider promoting those ciphers by way of an explicit algorithms option in your connection config object, along the lines of the sketch below.
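
For illustration only (a minimal sketch, not code from this thread; the host, credentials, and exact cipher list are placeholders that should be adapted to what the server actually offers):

```js
// Sketch: prefer AES-GCM ciphers via the ssh2 connection config.
// Host/credentials are placeholders; trim the cipher list to what your server supports.
const { Client } = require('ssh2');

const conn = new Client();
conn
  .on('ready', () => {
    console.log('connected (AEAD cipher negotiated if the server supports one)');
    conn.end();
  })
  .connect({
    host: 'sftp.example.com',
    username: 'user',
    password: 'secret',
    algorithms: {
      // AEAD ciphers first so they are preferred during negotiation
      cipher: ['aes128-gcm@openssh.com', 'aes256-gcm@openssh.com', 'aes128-ctr'],
    },
  });
```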

If you're using an OpenSSH client that has the hpn patch(es) and the server also has hpn patch(es), then there are special behavioral changes (protocol-wise) that allow for better performance, so that has some impact.

mscdex avatar May 03 '19 03:05 mscdex

Thank you for your quick and informative responses. That's what I meant: I've tested the code against two other SFTP servers, and for those the md5sum of the same file matches correctly. I was wondering (I haven't looked into the implementation) what part of fastPut could be problematic for our server... for instance, it might require some 'special' feature that is not required for put().

Also, as mentioned, put() is several orders of magnitude slower than using OpenSSH directly or FileZilla. I'm on Mac OS X and I haven't installed any hpn patches (nor do I see them mentioned in my client's output), so I don't think that's the cause. From your response I see that's clearly not normal, so I will try changing the algorithms and see if that helps.

dvlato avatar May 03 '19 08:05 dvlato

You could also set debug: console.log in the connection config and compare the resulting output between one of the servers that transfers OK and the problematic server, to see if there are any obvious differences that might explain things.

You might also try using fs.createReadStream() and sftp.createWriteStream() and piping the two together (a rough sketch follows below). This should be equivalent to using concurrency: 1 for fastPut() (which you could also try) and might help rule out any read-buffer reuse issues. If all of that checks out, then you'd probably have to dig into the fastXfer() code as described previously.
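
A minimal sketch of that stream-based upload with the plain ssh2 client, assuming placeholder connection details and paths (the debug: console.log option from the previous suggestion is included as well):

```js
// Sketch: upload via piped streams instead of fastPut(), with debug output enabled.
// Host/credentials/paths are placeholders; error handling is kept minimal.
const fs = require('fs');
const { Client } = require('ssh2');

const conn = new Client();
conn
  .on('ready', () => {
    conn.sftp((err, sftp) => {
      if (err) throw err;
      const readStream = fs.createReadStream('/local/images.zip');
      const writeStream = sftp.createWriteStream('/remote/images.zip');
      writeStream.on('close', () => {
        console.log('upload finished');
        conn.end();
      });
      readStream.pipe(writeStream);
    });
  })
  .connect({
    host: 'sftp.example.com',
    username: 'user',
    password: 'secret',
    debug: console.log, // compare this output between the good and the bad server
  });
```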

mscdex avatar May 03 '19 14:05 mscdex

Hi, sorry to reopen this issue. When you say "ssh2 isn't usually dramatically far off in comparison to a standard OpenSSH sftp transfer using the same cipher/MAC selection", are you talking about the fast transfer or the standard one? For a standard put (or fastPut with concurrency: 1), when I set the concurrency to 1 and the algorithms to "aes128-ctr" and "hmac-md5" (I haven't set the buffer size or any other options), my 37430637-byte file takes nearly 3 minutes instead of a few seconds, even with one of the 'correct' servers.

Is that something you would expect or should I check my code? I don't know what might make it so slow...

dvlato avatar May 07 '19 15:05 dvlato

Follow-up for anyone having the same issue: we have found that the problem was caused by the server side. The transfer itself was correct, but the server's saving to disk was not (and they are not going to fix it, as they don't support 'in-place updates' such as seek and write).

However, we still find ssh2-streams extremely slow compared to an OpenSSH sftp transfer, so we have stopped using it. I would love to revisit this when the performance issue is solved, or if it turns out to be incorrect usage on our side (a sample showing performance comparable to OpenSSH would be welcome).

dvlato avatar Jun 11 '19 17:06 dvlato