RoboSharp icon indicating copy to clipboard operation
RoboSharp copied to clipboard

RoboSharp.Extensions - Async File Copier

Open RFBomb opened this issue 1 year ago • 60 comments

So for the past few weeks I've been noodling with this, and after a brief review of the Extensions package as it stood, I decided to place my work here. The extensions package already contains references the the Kerner32.dll file for handling SymLinks. Since it is windows specific already, I'm adding in the support for CopyFileEx, which resides within that same system dll.

This PR provides various functionality for wrapping CopyFileEx and its MoveFile equivalent. This allows native Progress Reporting methods for file copy/move operations for use with the custom IRoboCommand objects. CopyFileEx can only run Synchronously, so all Async methods that utilize it wrap it inside of a Task.Run().

In order to decouple the caller from the copy operation various methods have been introduced into the FileFunctions class, such as using a Progress object to report progress. A new FileCopier class has been introduced that extends the existing FilePair class to simplify calling of these functions for custom commands.

As an aside, part of the reason for combining this here instead of another nuget package is that I believe under the hood robocopy utilizes CopyFileEx for its progress reporting and restartable mode, as these are also available within that call's api.

The FileCopier class intentionally does not do any retry logic, as that should be implemented within the custom command if desired. It does provide a ProgressUpdated event that creates the appropriate event args though. The RoboSharp.CopyOptions object directly translates some of its booleans to the enum that CopyFileEx uses, so an extension method has been added into the Helpers namespace to allow for this translation.

RFBomb avatar Feb 26 '24 18:02 RFBomb

Sounds good, can you add a couple of code samples here to demonstrate how to use it? then I can have a play with it when I get a moment

PCAssistSoftware avatar Feb 26 '24 18:02 PCAssistSoftware

Yea I'm working on that side of things. One of the things I'm sorta debating though is if this is even a 'good idea'. As CopyFileEx (while simply to use and likely wont be deprecated any time soon), is superceded by CopyFile2 apparently. So I'm wondering if the PR should instead have a base implementation laid out instead, and let the caller decide to plug in their own progress-reporting copier.

CopyFileEx has been around since XP. CopyFile2 requires Windows8 or newer (most development is this scenario, but potentially not all, legacy systems exist)

RFBomb avatar Feb 26 '24 19:02 RFBomb

Yes awkward one that but can't see CopyFileEx going anywhere for a long time as used by so many products and other functions. To be honest I had heard of that but never of CopyFile2 so that says a lot.

PCAssistSoftware avatar Feb 26 '24 19:02 PCAssistSoftware

Yes I discovered CopyFile2 while perusing the web for CopyFileEx examples, and stumbled upon a few threads that recomended 'if an *Ex function is superseded you should use the new method instead.'

The major difference between them that I understand is that CopyFile2 uses an IntPtr struct to pass in all arguments, and has already gone through several revisions of that struct, adding some additional capabilities to it (the CopyFile2 call depends on the struct you pass in). Whereas CopyFileEx has a specific signature that doesn't allow for that flexibility.

I think for the needs here, CopyFileEx suits the needs just fine. But I'm going to write it in such a way that CopyFile2 (or just filestreams) can be used if the consumer wishes to implement it.

RFBomb avatar Feb 26 '24 19:02 RFBomb

@PCAssistSoftware This is coming along fairly well. Unfortuantely I did run into a bit of inconsistency:

  • On my home computer, the Unit Tests run fine!
  • On my work computer, they fail.

image

I was hoping you could download this PR and run the CopyFileEx_CopyFile test yourself to see if it also fails. This seems to be an inconsistency in CopyFileEx invoking across systems.

RFBomb avatar Apr 09 '24 18:04 RFBomb

Worked fine for me.

image

Let me know anything else I can test for you.

PCAssistSoftware avatar Apr 09 '24 18:04 PCAssistSoftware

for info the one which fails for me is CopyFileAsyncTest

image

PCAssistSoftware avatar Apr 09 '24 20:04 PCAssistSoftware

I HAVE A CONUNDRUM.

image

VS2022 PASSES the unit test and VS2019 FAILS it!!! what in the world........

RFBomb avatar Apr 09 '24 20:04 RFBomb

That is so odd?!?

PCAssistSoftware avatar Apr 09 '24 20:04 PCAssistSoftware

So obviously, if I can reliably reproduce these results on two different computers. it is a crapshoot as to what the behavior of the nuget package that appveyor produces will result in. (But that's why I write the test).

I have created a discussion post on the.net/runtime Because I don't know where else to report the issue.

Other than this hiccup, This PR is coming along fairly well I think. It still has a long way to go because I intend to incorporate some of the changes into the test app for demo/test purposes. Much of that is already marked up but I have to finalize everything.

RFBomb avatar Apr 09 '24 21:04 RFBomb

Sounds good. Will be interested to hear if you get any response to help explain it.

PCAssistSoftware avatar Apr 09 '24 22:04 PCAssistSoftware

Alright, I did get a response, implemented the change, and its now unit testing fine on both VS2022 and 2019 on my pc. It is now back to what I think I had originally, so not sure what I did wrong in the interim, but its working again so thats that.

More unit tests have been added and things have been refined. This is now 'functional'. The required changes to the UI have not yet been implemented, so its not ready for merging yet.

As an overview of usage:

FileCopyExFactory and StreamedCopierFactory are the two new factory objects to produce IFileCopier objects. These IFileCopierFactory objects are usable for custom implementations, such as the new RoboBatchCommand.

RoboBatchCommand was my first run at a custom IRoboCommand for my application, and it works fairly well. Basically provide it a collection of source/destination pairs and it will run through the collection, copying each to their respective destination. This allows to say grab 20 files from 20 seperate folders into either one destination or up to 20 ( 1 destination per source ), and run as a single IRoboCommand

var batch = new RoboBatchCommand(new StreamedCopierFactory()); // this is platform agnostic - hello linux :)
batch.AddCopier("SomeSource", "SomeDestination");
batch.AddCopier("SomeOtherSource", "SomeOtherDestination");
await batch.Start()

RFBomb avatar Apr 10 '24 18:04 RFBomb

Excellent!

Just run all tests here and I am getting 3 failures as per screenshot below - all have same error relating to attempting to read or write protected memory.

image

This is looking really great, fantastic work :)

PCAssistSoftware avatar Apr 10 '24 18:04 PCAssistSoftware

Strange. I bet it's because the path is inside of your OneDrive folder. Mine is C:\Repos. Since it's iterating across the test in <500ms one drive probably is picking up the new file, holding it, then the issue occurs.

RFBomb avatar Apr 10 '24 18:04 RFBomb

Strange. I bet it's because the path is inside of your OneDrive folder. Mine is C:\Repos. Since it's iterating across the test in <500ms one drive probably is picking up the new file, holding it, then the issue occurs.

moved it outside of my OneDrive and now get 6 failures....??!?!?

image

PCAssistSoftware avatar Apr 10 '24 19:04 PCAssistSoftware

Moved it to the root of another drive so path was shorter to rule out path length and got 4 failures. Very odd..

image

PCAssistSoftware avatar Apr 10 '24 19:04 PCAssistSoftware

Not that it helps here, but I ran successfully >< image

The last commit changed where I'm writing the files for the test, maybe that helps? I really don't know why it would be having that error

RFBomb avatar Apr 10 '24 19:04 RFBomb

Just tried last commit and getting 5 failures, 1 for the same error as before "System.AccessViolationException: Attempted to read or write protected memory" and the other 4 for a new error of "System.IO.DirectoryNotFoundException: Could not find a part of the path 'C:\Users\darre\AppData\Local\RoboSharpUnitTesting\auw35chf.sbu'." (with obviously a different filename for each of the 4 failures)

image

PCAssistSoftware avatar Apr 10 '24 19:04 PCAssistSoftware

Passing in VS2019 and failing in VS2022. debugging now...

RFBomb avatar Apr 10 '24 19:04 RFBomb

Success!!!

image

PCAssistSoftware avatar Apr 10 '24 19:04 PCAssistSoftware

I find it very odd that one of the tests passed and other didn't in VS2022, more specifically I find it strange that VS2022 threw the error and VS2019 was not. But it is now unit testing successfully in both environments for me.

OF COURSE APPVEYOR FAILED. ><

RFBomb avatar Apr 10 '24 19:04 RFBomb

I find it very odd that one of the tests passed and other didn't in VS2022, more specifically I find it strange that VS2022 threw the error and VS2019 was not. But it is now unit testing successfully in both environments for me.

OF COURSE APPVEYOR FAILED. ><

yes as per previous reply all fine here in 2022 as well. All very odd though

PCAssistSoftware avatar Apr 10 '24 20:04 PCAssistSoftware

Alright so to summary all the issues that were causing discrepancy from 2019 to 2022 :

  • P/Invoking is picky! (as it should be)
  • The signature for the call to CopyFileEx was wrong initially in that I was using ref int pbCancel instead of ref bool pbCancel. But I had changed it to bool pbCancel. This resulted in the protected memory exceptions vs2022 was reporting (can't change a constant, so error). Changing it back to ref bool pbCancel resolve this exception.
  • parameter lpData was originally an IntPtr but troubleshooting the above issue I had changed it to object. vs2022 did not care about this, as the only usage was an IntPtr and as such it compiled it like one. But VS2019 compiled into an object, which the P/Invoke did not like, and the result was cutting off the flag parameter. Changing this to the correct IntPtr lpData resolved the issue.
  • Eventually resolving the signature to be correct resolved all tests.

RFBomb avatar Apr 10 '24 20:04 RFBomb

#Alright, @PCAssistSoftware I overhauled this (again), but this time with a force-push, bringing it from 60+ confusing noodles of commits to 16 (relatively) organized ones. I'm quite satisfied from my own cursory review here, but will need to re-look at my consuming library to double check. That wont happen until next week at the earliest, probably later.

But feel free to poke around, it should be a bit more organized than previously. Note that many things were re-arranged, and Windows-Specific functionality is now contained in the RoboSharp.Extensions.Windows category (CopyFileEx and FileFunctions.MoveWithProgress)

As a note, this is MOSTLY the same as capability-wise as previous, but much smaller number of commits instead of piece-meal many times over.

RFBomb avatar Jun 12 '24 01:06 RFBomb

HI @RFBomb - just acknowledging receipt of your above message. Snowed under with work at present but will try and take a good look at it next week :)

PCAssistSoftware avatar Jun 13 '24 16:06 PCAssistSoftware

Appreciated. This work was part of an over-arching refactoring of my other project, so I still have yet to complete that, which is when I will be doing my own more in depth testing

RFBomb avatar Jun 13 '24 22:06 RFBomb

Can you just give me a quick overview of what new commands are added by your PR and how I would use them, just want to ensure I am testing all the right areas for you. Thanks

PCAssistSoftware avatar Jun 16 '24 21:06 PCAssistSoftware

The high-level summary is that this introduces IFileCopier, and two implementations of it (streamed copier and copyfileX).

The idea is that you pass one of those factories into the new batch command and from there you add in your custom batch of files ( file one destination one, file two destination, two, etc.) in the form of IFilePair objects, or using the add methods.

When you start the command it will take the files specified and copy them. The idea with this is not so much for large server to server transfers where RoboCopy is likely more efficient, but for local transfers where you specify a couple files, Maybe from five or six different directories and copy them all to some destination directory, or even into wildly different locations.

The app that I have uses this to copy various files off of the c drive and onto a specialized folder structure on a USB. so not all of the files exist within the same source directory, nor do they all go into the same destination directory, it is just a batch of files that need to be collected, copied to some destination, and they will copy one after another

RFBomb avatar Jun 17 '24 13:06 RFBomb

Also, the current name is RoboBatchCommand, and the more I thought about it the more I dislike it. It feels too similar to RoboQueue. I'm thinking of shortening it as simply FileBatch or maybe BatchCommand.

Also, that object currently does not handle copying directories as a whole, Just files. So I was contemplating Support for directories. But then again, that is where you could use a RoboQueue that contains a file batch and a RoboCommand to perform that operation (which I personally think is probably the cleaner method, leave the file batch file batch, and a directory to a proper RoboCommand)

RFBomb avatar Jun 17 '24 13:06 RFBomb

Thanks, had an initial look over it and all seems to work fine as far as I can see from my basic testing and looking through the code. Is a really useful addition!

I agree re changing the name, BatchCommand stood out for me.

PCAssistSoftware avatar Jun 17 '24 15:06 PCAssistSoftware