msbuild icon indicating copy to clipboard operation
msbuild copied to clipboard

Copy task should support copying a directory easily

Open KirillOsenkov opened this issue 3 years ago • 20 comments

Copy task doesn't make it easy to copy the entire directory, it forces you to mess with wildcards etc. and it's gnarly

We should consider adding a new SourceFolder parameter on the Copy task that, if specified, pre-populates the SourceFiles with the entire directory. Of course SourceFolder and SourcesFiles should be mutually exclusive.

KirillOsenkov avatar Nov 13 '20 07:11 KirillOsenkov

Microsoft.Build.Artifacts can do this, as a workaround. Still agree it would be nice for the copy to be built in and something like Microsoft.Build.Artifacts would just call it.

jeffkl avatar Nov 13 '20 16:11 jeffkl

Ugh, we should really fix this. Copying some files into a directory and preserving the directory structure is super painful.

KirillOsenkov avatar Mar 19 '21 02:03 KirillOsenkov

I'm imagining two new properties, SourceRoot and DestinationRoot, that when set, act as the corresponding roots, and the relative path from SourceRoot is preserved for each file.

KirillOsenkov avatar Mar 19 '21 02:03 KirillOsenkov

This way this won't be a breaking change.

KirillOsenkov avatar Mar 19 '21 02:03 KirillOsenkov

or maybe just SourceRoot and PreserveRelativePaths combined with the existing DestinationFolder.

KirillOsenkov avatar Mar 19 '21 02:03 KirillOsenkov

When I eventually come back here, here's how to do it:

<Project DefaultTargets="Copy">

  <PropertyGroup>
    <SourceDir>C:\temp\a</SourceDir>
    <DestinationDir>C:\temp\b</DestinationDir>
  </PropertyGroup>

  <Target Name="Copy">

    <!-- 
    PITFALL: if this runs during evaluation, the files might not exist on disk yet.
    So ensure this runs during execution and the glob is expanded immediately
    before the actual copy.
    -->
    <ItemGroup>
      <SourceFile Include="$(SourceDir)\**\*" />
      <DestinationFile Include="@(SourceFile->'$(DestinationDir)\%(RecursiveDir)%(Filename)%(Extension)')" />
    </ItemGroup>

    <Copy SourceFiles="@(SourceFile)"
          DestinationFiles="@(DestinationFile)"
          SkipUnchangedFiles="true" />
  </Target>

</Project>

KirillOsenkov avatar Mar 19 '21 02:03 KirillOsenkov

Another issue is that with the current approach you can't copy symbolic links from a source to a destination directory, the link gets resolved and the actual file copied instead of the symlink.

This will probably need to wait for BCL support (https://github.com/dotnet/runtime/issues/24271) though.

akoeplinger avatar Mar 19 '21 08:03 akoeplinger

@KirillOsenkov did you try Microsoft.Build.Artifacts?

jeffkl avatar Mar 19 '21 16:03 jeffkl

No, I’m using vanilla MSBuild.

KirillOsenkov avatar Mar 19 '21 16:03 KirillOsenkov

I just corrected the snippet I posted earlier to expand the globs during execution (inside the target), and not during evaluation. When it ran during evaluation the project may be evaluated too early, and the glob might not pick up the files which are copied later by another target. Expanding the glob from inside the target is more reliable.

KirillOsenkov avatar Mar 20 '21 00:03 KirillOsenkov

This works for me: https://stackoverflow.com/questions/119271/copy-all-files-and-folders-using-msbuild

floreseken avatar Jul 07 '22 11:07 floreseken

Task batching is idiomatic in MSBuild and the documentation of the Copy task includes an example of copying a directory. Speaking for myself, I don't find it 'gnarly'. But I do understand that MSBuild has a problem where many people have expectations of how MSBuild works and behaves that are very different from how MSBuild actually works and behaves. And I understand that, for various reasons, batching often falls into that gap.

This feature request is not about adding functionality that the Copy task currently lacks. It is about adding syntactic sugar to the Copy task. I'm not against syntactic sugar but I do have concerns about adding more parameters to the Copy task (which currently has 9 input parameters) especially when the parameters are not extending functionality.

Instead of extending the Copy task, I would propose creating a new CopyDir task.

Like MakeDir and RemoveDir, CopyDir would accept an ItemGroup of directories. CopyDir would also take a DestinationFolder and support an output ItemGroup of the directories copied (CopiedDirectories).

The CopyDir task would be a specialization of the Copy task and could be viewed as redundant, but it may succeed as even less 'gnarly' than extending the Copy task because it would be more focused conceptually and would have fewer parameters.

jrdodds avatar Jul 29 '22 16:07 jrdodds

I personally don't see anything wrong with adding another parameter to Copy, because this is where people will go looking for it. "Too many parameters" is not an objective downside.

A brand new CopyDir task is not discoverable. In fact, it will restore symmetry, since DestinationFolder already exists, so only makes sense to support <Copy SourceFolder="A" DestinationFolder="B" />.

In fact the danger with adding CopyDir is that people might already have their own custom/third-party CopyDir task, so adding the in-the-box one will mess with their builds and is a breaking change we can't afford. Whereas adding SourceFolder to Copy is not a breaking change.

KirillOsenkov avatar Jul 29 '22 17:07 KirillOsenkov

I disagree that

"Too many parameters" is not an objective downside.

There is general consensus that a function or class method should be limited in the number of arguments. The same principle applies here. But we can agree that we just see that differently and that, among all the possible trade-offs, it may be minor.

Some of your points can be generalized as essentially that no new tasks should ever be added. I understand your comments are about adding a CopyDir task, but that's the larger implication. Any new task can be considered to be not discoverable, and any new task might be a conflict and a breaking change for someone. The breaking change issue is an important point because UsingTask elements don't override the same way as Properties.

Related to the re-design of the Copy task:

From your description I assume you are envisioning SourceFolder as an ITaskItem and not an ITaskItem[]. Would specifying <Copy SourceFolder="A" DestinationFiles="B" /> be an error?

The following would be accepted?

<Copy SourcesFiles="A" DestinationFolder="B" />
<Copy SourcesFiles="A" DestinationFiles="B" />
<Copy SourceFolder="A" DestinationFolder="B" />

And the following would be errors?

<Copy SourceFolder="A" DestinationFiles="B" />
<Copy SourceFolder="A" DestinationFolder="B" DestinationFiles="C" />
<Copy SourceFolder="A" SourcesFiles="B" DestinationFiles="C" />
<Copy SourceFolder="A" SourcesFiles="B" DestinationFolder="C" />

jrdodds avatar Jul 29 '22 18:07 jrdodds

@jrdodds You're taking things off-topic. The feedback is that the current behavior feels limited and counterintuitive. There is perhaps unfortunately a very opinionated suggested solution @KirillOsenkov brought up, but that should be fine; maintainers can take the solution or not.

There is general consensus that a function or class method should be limited in the number of arguments.

Do you have data and actual numbers on this? This sounds like an opinion being framed as a universal truth. This response is starting to argue about the suggested solution rather than keeping the thread focused on the feedback.

mikerochip avatar Jul 29 '22 18:07 mikerochip

Hi @mikerochip, The feature request is to extend the Copy task and it has been tagged as 'needs-design' and 'up-for-grabs'. I see discussion of the design issues and trade-offs involved as being very on-topic. @KirillOsenkov has made several points about the design and I see that as helpful, constructive, and on-topic.

Regarding the number of arguments to a function: No, not a universal truth but it is a long-standing coding guideline (at least in my understanding and experience). If you would like a citation that shows this is not a personal opinion of my own invention, here is a SonarQube rule for C#: Methods should not have too many parameters.

My apologies if my response came across as argumentative. My intent was to be constructive and to explore the issues. If I decide to take this issue or pass on this issue, I want to be clear about the changes. Thanks

jrdodds avatar Jul 29 '22 19:07 jrdodds

Appreciate the clarification @jrdodds . My only horse in this race is that I have to do this exact thing (copy source folder into target folder, maintain directory structure) about once every year and a half, and I forget how to do it every time, and google searching brings me here, so I watched this issue.

I don't have a strong opinion on what the solution is, but I like @KirillOsenkov's solution. Yours would do the trick as well.

My main desire is getting to the point where this task is intuitive enough where I can just look at the Copy task docs and be on my way, rather than google search and end up in a years-long unresolved thread; I think I and many other folks on GitHub probably have experienced that more times than we want 😂 . At least for this issue there are options and this issue is effectively just that the current state isn't as ergonomic as it could be.

mikerochip avatar Jul 29 '22 20:07 mikerochip

@mikerochip It may be a recent change (I haven't looked at the history) but Example 2 in the Copy task documentation is what you need to copy a directory.

jrdodds avatar Jul 29 '22 20:07 jrdodds

Invoking the Copy task with the RecursiveDir metadata is invoking task batching.

<Copy SourceFiles="@(MySourceFiles)" DestinationFolder="$(DestDir)\%(RecursiveDir)" />

Task batching will invoke the Copy task for each 'batch'. I haven't tested but it seems likely, especially for a source directory with many sub-directories (and hence many unique RecursiveDir values each of which will be a separate batch), that an option that bypasses batching will perform the copy faster and more efficiently.

The Copy task has logic for retries and for parallelizing. After looking at the code, I think it is fair to say that factoring the existing Copy task into Copy and CopyDir tasks that share and re-use the existing logic is not a small effort. More to the point, it would be a larger effort than @KirillOsenkov's concept of adding a SourceFolder parameter on the Copy task. (So I have come around but for a rather different set of reasons. :grinning:)

jrdodds avatar Aug 05 '22 22:08 jrdodds

Design Proposal

This issue is currently tagged as 'needs-design'. This is a possible way that the feature request could be satisfied.

Background

For source and destination, the Copy task currently has the following parameters:

Name Req or Opt Type Note
SourceFiles Required ITaskItem[]
DestinationFiles Optional ITaskItem[] is an error if used with DestinationFolder;
is an error if not one-to-one with SourceFiles
DestinationFolder Optional ITaskItem is an error if used with DestinationFiles;
is expected to be a directory

The following are accepted:

<Copy SourcesFiles="AFile;BFile" DestinationFolder="BDirectory" />
<Copy SourcesFiles="AFile;BFile" DestinationFiles="CFile;DFile" />

:x: The following are errors:

<Copy SourcesFiles="AFile;BFile" DestinationFolder="BDirectory" DestinationFiles="CFile;DFile" />
<Copy SourcesFiles="AFile;BFile" DestinationFiles="CFile" />

Sources and destinations are resolved into calls to either File.Copy or OS specific methods for creating either hard or symbolic links.

Enhancement to the Copy Task to copy Directories

Task Parameters

Add a SourceFolders parameter.

SourceFolders will be an ITaskItem[] of folders to be copied to DestinationFolder. Like DestinationFolder, the members of SourceFolders will be expected to be directories.

Name Req or Opt Type Note
SourceFiles Optional ITaskItem[]
SourceFolders Optional ITaskItem[] is an error if used with DestinationFiles
DestinationFiles Optional ITaskItem[] is an error if used with DestinationFolder;
is an error if not one-to-one with SourceFiles
DestinationFolder Optional ITaskItem is an error if used with DestinationFiles;
is expected to be a directory

It would be an error if neither SourceFiles nor SourceFolders is provided to the task.

A table of which parameters could be used together:

SourceFiles SourceFolders DestinationFiles DestinationFolder
SourceFiles :heavy_check_mark: Yes :heavy_check_mark: Yes :heavy_check_mark: Yes
SourceFolders :heavy_check_mark: Yes :x: No :heavy_check_mark: Yes
DestinationFiles :heavy_check_mark: Yes :x: No :x: No
DestinationFolder :heavy_check_mark: Yes :heavy_check_mark: Yes :x: No

The following would be accepted:

<Copy SourceFolders="ADirectory;BDirectory" DestinationFolder="CDirectory" />
<Copy SourceFolders="ADirectory;BDirectory" SourcesFiles="AFile;BFile" DestinationFolder="CDirectory" />

:x: The following would be errors:

<Copy SourceFolders="ADirectory;BDirectory" DestinationFiles="CFile;DFile" />
<Copy SourceFolders="ADirectory;BDirectory" DestinationFiles="CFile;DFile" DestinationFolder="CDirectory" />
<Copy SourceFolders="ADirectory;BDirectory" SourcesFiles="AFile;BFile" DestinationFiles="CFile;DFile" />

Given directories c:\example\target and c:\example\work, the following Copy task would create a copy of the target directory under the work directory, i.e. c:\example\work\target.

<Copy SourceFolders="c:\example\target" DestinationFolder="c:\example\work" />

This matches the existing behavior of SourceFiles and DestinationFolder and allows for the SourceFolders and SourceFiles parameters to be used together.

For each directory in SourceFolders, it would be expected that each child of the directory would be copied with the same relative path, e.g. c:\example\target\foo\bar.cs would be copied to c:\example\work\target\foo\bar.cs.

The following example shows how files and folders from a 'root' directory could be dynamically 'discovered' and copied without copying the 'root'.

<PropertyGroup>
  <SourceRootDirectory>c:\example\target</SourceRootDirectory >
</PropertyGroup>
<ItemGroup Condition="Exists('$(SourceDirectory)')">
  <SrcDirectories Include="$([System.IO.Directory]::GetDirectories($(SourceRootDirectory)))" />
  <SrcFiles Include="$(SourceRootDirectory)\*.*" />
</ItemGroup>
<Copy SourceFolders="@(SrcDirectories)" SourcesFiles="$(SrcFiles)" DestinationFolder="c:\example\work" />

Internals

New validations and errors would need to be added.

When SourceFolders is present, each directory in SourceFolders would be recursed and the files enumerated and added to the set of files to copy.

:question: Should empty directories be copied?

With the following code, empty directories are not included and are not copied.

<ItemGroup>
  <MySourceFiles Include="c:\MySourceTree\**\*.*"/>
</ItemGroup>
<Copy SourceFiles="@(MySourceFiles)" DestinationFolder="$(DestDir)\%(RecursiveDir)" />

But SourceFolders could easily include empty directories. Is this an enhancement? Or an unwanted change in behavior?

jrdodds avatar Aug 06 '22 19:08 jrdodds

MSBuild Team triage: we like the proposal, @jrdodds.

❓ Should empty directories be copied?

With the following code, empty directories are not included and are not copied.

<ItemGroup>
  <MySourceFiles Include="c:\MySourceTree\**\*.*"/>
</ItemGroup>
<Copy SourceFiles="@(MySourceFiles)" DestinationFolder="$(DestDir)\%(RecursiveDir)" />

But SourceFolders could easily include empty directories. Is this an enhancement? Or an unwanted change in behavior?

@Forgind and I lean toward "yes, copy empty directories". It feels more intuitive, and if it is desired but not done, it's a pain to recreate the behavior with manual directory creation. This may require a bit more effort in the implementation though so we can revisit if it's horrible.

Removing the needs-design label since it doesn't any more.

@jrdodds are you interested in contributing the implementation?

rainersigwald avatar Oct 13 '22 16:10 rainersigwald

Let's elaborate on SourceFolder vs. SourceFolders. What is the behavior if multiple source folders are specified? Is this really necessary? Can we achieve what we want with just a single SourceFolder?

concretely, if I have SourceFolders="C:\A\A1;C:\B\B1" and the DestinationFolder="C:\D", what is the resulting layout on disk? C:\D\A1 and C:\D\B1 or are the contents of A1 and B1 copied directly to D?

If it's not immediately clear to me, it is probably a source of confusion to many.

Whereas copying from a single source directory to a single destination directory is well defined and understood semantics used by all the tools like cp, copy, xcopy, etc. It means exactly "take all contents of the source and copy it to the destination".

Basically I want <Copy SourceFolder="C:\A" DestinationFolder="C:\B" /> to work exactly like robocopy would (but no /MIR of course).

KirillOsenkov avatar Oct 13 '22 16:10 KirillOsenkov

@jrdodds are you interested in contributing the implementation?

Yes. Thank you.

jrdodds avatar Oct 13 '22 17:10 jrdodds

... if I have SourceFolders="C:\A\A1;C:\B\B1" and the DestinationFolder="C:\D", what is the resulting layout on disk? C:\D\A1 and C:\D\B1 or are the contents of A1 and B1 copied directly to D?

@KirillOsenkov Given

<Copy SourceFolders="C:\A\A1;C:\B\B1" DestinationFolder="C:\D" />

The result would be C:\D\A1 and C:\D\B1.

To quote from the proposal:

Given directories c:\example\target and c:\example\work, the following Copy task would create a copy of the target directory under the work directory, i.e. c:\example\work\target.

<Copy SourceFolders="c:\example\target" DestinationFolder="c:\example\work" />

This matches the existing behavior of SourceFiles and DestinationFolder and allows for the SourceFolders and SourceFiles parameters to be used together.

The intent is to extend the current Copy task while maintaining the task's existing semantics and behavior. SourceFiles is a set of files; SourceFolders is a set of folders. Both sets of items will be copied to the DestinationFolder.

The current Copy task is already very unlike robocopy and supporting an ItemGroup for SourceFolders makes sense for MSBuild. The ItemGroup can be created with an Exclude or otherwise filtered before being used.

SourceFolders can do something that a single source folder can't, which is that a single Copy invocation can pull from multiple sources as in the SourceFolders="C:\A\A1;C:\B\B1" example.

jrdodds avatar Oct 13 '22 20:10 jrdodds

OK and how do I achieve the semantics that I want? Copy all files and directories from Source to Destination? Again, if I understand your proposal correctly, it’s highly unintuitive and unexpected.

KirillOsenkov avatar Oct 13 '22 22:10 KirillOsenkov

BTW this is only tangentially related, but I wrote a tool called ContentSync that copies one directory to another one incrementally (similar to robocopy /MIR), but if a file contents are identical it doesn't touch the file (it uses content hash instead of timestamps).

https://github.com/KirillOsenkov/ContentSync

KirillOsenkov avatar Oct 31 '22 17:10 KirillOsenkov

@JaynieBai Please assign this issue to me. Thanks

jrdodds avatar Mar 09 '23 14:03 jrdodds

Not an answer or solution to this feature request but here is something to help with the current "super painful" task of copying the content of a folder into another folder.

Given two properties, SourceFolder and DestinationFolder, that are both paths to directories, then the following will copy the contents, directories and files, of the directory in the SourceFolder property to the directory in the DestinationFolder property:

    <ItemGroup>
      <FilesToCopy Include="$([MSBuild]::EnsureTrailingSlash('$(SourceFolder)'))**\*.*"/>
    </ItemGroup>
    <Copy SourceFiles="@(FilesToCopy)" DestinationFolder="$([MSBuild]::EnsureTrailingSlash('$(DestinationFolder)'))%(RecursiveDir)"/>

Save the above as a code snippet and, when you need to copy a folder, paste this snippet and replace the property names as needed.

The above code is essentially Example 2 in the Copy task documentation. You can copy from the example.

Either way, if you find it difficult, there is no need to write this from scratch every time.

jrdodds avatar May 08 '23 02:05 jrdodds

I was asked to take a look at this from the design perspective now that we are getting close to an implementation, so here I am.

First off, thank you for writing such a detailed spec comment/example @jrdodds - that made it very easy to work through the scenarios you had in mind. I agree with @KirillOsenkov that at first glance the subtleties of SourceFiles vs SourceFolders can be confusing - but I think keeping the Item-centric nature of MSBuild in mind helped me reason through the question in the same way that @jrdodds and @rainersigwald ended up at. MSBuild is already distinct from other tools in the way the the Copy task works, so we should clarify intent and behavior in the docs for the Copy Task when each version should be used.

baronfel avatar Jun 19 '23 15:06 baronfel

It seems unfortunate that you need to specify folders and files to copy separately, if you want to copy both. E.g., in the example above:

<Copy SourceFolders="ADirectory;BDirectory" SourcesFiles="AFile;BFile" DestinationFolder="CDirectory" />

It seems like it would be more uniform to have a single "Sources" list that contains either directories/folders or files, and tree copies the directories or individually copies the files, as appropriate. E.g.,

<Copy Sources="ADirectory;BDirectory;AFile;BFile" DestinationFolder="CDirectory" />

This way, if you have an Item that contains a mix, you can use:

<Copy Sources="@(Paths)" DestinationFolder="CDirectory" />

BruceForstall avatar Feb 25 '24 22:02 BruceForstall