ganga icon indicating copy to clipboard operation
ganga copied to clipboard

Inconsistency between behaviour of wildcards and exact filenames when outpufiles.localdir is not empty

Open vedanshbhartia opened this issue 1 year ago • 1 comments

The Issue:

The behaviour of LocalFile is inconsistent between the use of wildcards and exact filenames in outputfiles. This issue is seen only when the localdir of outputfile is not empty. For wildcards, the outputfiles are not copied to the localdir, and the job status is failed.

Reproduction:

Assume the following code is placed in the ~/ganga directory, and run using the ganga batch mode from the same directory. Code to reproduce:

j = Job()
j.application.exe = "touch"
j.application.args = ["a.txt", "b.txt"]

# After the job completes, the files a.txt and b.txt are placed in the ~/ganga directory, and the job status is completed
j.outputfiles = [LocalFile("./a.txt"), LocalFile("./b.txt")]

# If the following outputfiles list is used, no files are placed in the ~/ganga directory, and the job status is failed
# j.outputfiles = [LocalFile("./*.txt")]

j.submit()

A recording of the reproduction is also available at https://asciinema.org/a/utqAYu97FVkC0MzxcGVnMiXkO

Analysis

The copying issue seems to be because of the way the put method in LocalFile.py is implemented.

    def put(self):
        """
        Copy the file to the destination (in the case of LocalFile the localDir)
        """
        # This is useful for placing the LocalFile in a subdir at the end of a job

        # FIXME this method should be written to work with some other parameter than localDir for job outputs but for now this 'works'
        if self.localDir:
            try:
                job = self.getJobObject()
            except AssertionError:
                return

            # Copy to 'desitnation'

            if path.isfile(path.join(job.outputdir, self.namePattern)):
                if not path.exists(path.join(job.outputdir, self.localDir)):
                    os.makedirs(path.join(job.outputdir, self.localDir))
                shutil.copy(path.join(job.outputdir, self.namePattern),
                            path.join(job.outputdir, self.localDir, self.namePattern))

It tries to check whether the outputfiles are present in the job's temporary directory (job.outputdir in the code), and copies them to path.join(job.outputdir, self.localDir) (this should correspond to ~/ganga in the example at the top).

Please not the use of the ./ in the reproduction script at the beginning of the filename, which causes an absolute path to be used. The absolute path causes the job.outputdir to be ignored in the path.join call, and the file gets copied to self.localDir Though the reproduction uses this weird quirk of path.join, this issue should still exist when self.localdir is a relative subdirectory of job.outputdir.

For wildcards, the namePattern is *.txt, while for exact file names, the namePattern would be a.txt Since the code looks for an exact match, and no file is present that is named *.txt, no files are copied when wildcards are used.

Fix

If the put method is slightly modified to use pattern matching rather than looking for exact filenames, wildcard files are copied too.

    def put(self):
        """
        Copy the file to the destination (in the case of LocalFile the localDir)
        """
        # This is useful for placing the LocalFile in a subdir at the end of a job

        # FIXME this method should be written to work with some other parameter than localDir for job outputs but for now this 'works'
        if self.localDir:
            try:
                job = self.getJobObject()
            except AssertionError:
                return

            if not path.exists(path.join(job.outputdir, self.localDir)):
                os.makedirs(path.join(job.outputdir, self.localDir))

            for outputFile in glob.glob(path.join(job.outputdir, self.namePattern)):
                shutil.copy(outputFile,
                            path.join(job.outputdir, self.localDir, path.basename(outputFile)))

This does not fix the issue with the job failing, I have yet to check what causes that particular problem.

vedanshbhartia avatar Mar 01 '23 06:03 vedanshbhartia

This is a little bit of a tricky issue.

  • When we use LocalFile in the inputfiles attribute, we want to make sure that any relative PATH is expanded to an absolute one (this ensures that what is persisted in the Ganga repository doesn't depend on from which directory Ganga was started). However, when it is used in the outputfiles attribute, you could argue that we should only expand the relative filename when we need to use it.
  • If we have the same filename in multiple subdirectories (so ./a/file.txt and ./b/file.txt), then a wildcard ./*/file.txt would lead to two files with the same name copied back. For a LocalFile we might just replicate that directory structure but it is not obvious that this would work for DiracFile, GoogleFile etc.

egede avatar Mar 06 '23 10:03 egede