
dependencies is a set not a list

Open dvot197007 opened this issue 7 years ago • 13 comments

Hello, The documentation section titled "keywords with task metadata" says:

dependencies: list of file_dep

In trying dependencies[0] in a task, the interpreter told me that dependencies is actually a set and can't be indexed. Hence, the documentation should be changed to reflect that.

By the way, the workaround suggested on Stack Exchange is to use list(dependencies)[0].
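For reference, a minimal plain-Python sketch of the error and the workaround. The `dependencies` set here is hand-made for illustration; it stands in for the value doit passed to the action and does not involve doit itself.

```python
# Hypothetical stand-in for the set doit passed to the action.
dependencies = {'a.txt', 'b.txt'}

try:
    dependencies[0]
    indexing_error = None
except TypeError as exc:
    # Sets do not support indexing, which is the error described above.
    indexing_error = exc

# The workaround: convert to a list first.
# Which element ends up first is not guaranteed for a set.
first = list(dependencies)[0]

# If a deterministic pick is needed, sorting gives one.
first_sorted = sorted(dependencies)[0]
print(first_sorted)
```

Note that list(dependencies)[0] only makes sense when there is a single dependency or when any one of them will do.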

I haven't checked, but it may be that the keyword 'changed' also refers to a set rather than a list as stated in the documentation?

Thanks, David

dvot197007 avatar Apr 17 '18 11:04 dvot197007

I think it's better to change the code :smile:

Internally doit converts file_dep to a set, but I think it makes more sense to pass a list, as mentioned in the docs.

Should be an easy fix...

schettino72 avatar May 26 '18 05:05 schettino72

Is the order of dependencies passed to the command or python actions guaranteed?

I would think it would match the order it was declared in the task's dictionary. i.e. if I only specify file_dep to be a list of [a, b, c] then %(dependencies)s should guarantee a b c and not b c a.

averagehat avatar Aug 01 '19 00:08 averagehat

I just ran into this - the order seems to be arbitrary. I specified a file_dep as ['a.txt', 'b.txt'] and then in the task assigned a, b = dependencies. I ended up with a == 'b.txt' and b == 'a.txt'. This was quite surprising.

valrus avatar Nov 24 '19 01:11 valrus

Can you try master? I will do a release next week (hopefully).

schettino72 avatar Nov 24 '19 06:11 schettino72

Just checked and it looks like the order of dependencies is maintained in master. Thanks @schettino72!

valrus avatar Nov 25 '19 02:11 valrus

Oops, I was mistaken - sorry. dependencies is now a list, not a set, but apparently it's still a set somewhere in the process because the order of the list isn't stable. I stuck a debugger call in a Task and ran it twice, inspecting a two-element dependencies list during each run, and the first run had ['a.txt', 'b.txt'] and the second had ['b.txt', 'a.txt'].
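This behaviour is what you would expect if the list is built from a set at some point. A plain-Python sketch of the effect (not doit's actual code; `file_dep` here is just a local list):

```python
file_dep = ['b.txt', 'a.txt', 'b.txt']

# Round-tripping through a set removes duplicates but forgets the declared order.
# For strings, the resulting order can differ between interpreter runs
# because string hashing is randomized per process by default.
via_set = list(set(file_dep))

# dict.fromkeys also removes duplicates, and dicts keep insertion order (Python 3.7+).
ordered = list(dict.fromkeys(file_dep))
print(ordered)
```

Running the script several times may show `via_set` in different orders, while `ordered` always follows the declaration order.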

valrus avatar Nov 25 '19 05:11 valrus

Just ran into this. I have a task with several dependencies of different types (data structures). How am I supposed to process them when the order from the file_dep field is not preserved? I can't split it into multiple tasks. Are file dependencies supposed to be of the same type in doit, because of set()?

gambolputty avatar Aug 08 '22 20:08 gambolputty

@schettino72 if the order of file_dep is not guaranteed, what's the expected/correct way to handle multiple file dependencies? I'm still struggling with this today, and I've resorted to searching the file paths for substrings.

sdahdah avatar Oct 10 '23 14:10 sdahdah

@sdahdah why would you need the order of file_dep to be guaranteed?

@gambolputty just save your structure of which filename is what in a separate place. I guess you could use the meta task parameter for this.

schettino72 avatar Oct 10 '23 20:10 schettino72

@schettino72 Thanks for answering so fast. I have two different datasets from different sensors that are both required by the algorithm I'm running. One is a CSV and one is a pickle.

My current workaround is to search the dependency list for *.pickle to load the first file, and *.csv to load the second file. How would you recommend I handle this? I'd appreciate any insight you have
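Roughly, the workaround looks like the sketch below. `pick_by_suffix` and the file names are made up for illustration; nothing here is part of doit.

```python
from pathlib import PurePath


def pick_by_suffix(dependencies, suffix):
    """Return the single dependency whose file extension equals `suffix`."""
    matches = [d for d in dependencies if PurePath(d).suffix == suffix]
    if len(matches) != 1:
        # Fail loudly instead of silently picking the wrong file.
        raise ValueError(f'expected exactly one {suffix} file, got {matches}')
    return matches[0]


# Hypothetical dependencies, in whatever order they happen to arrive.
deps = {'sensor_b.csv', 'sensor_a.pickle'}
pickle_path = pick_by_suffix(deps, '.pickle')
csv_path = pick_by_suffix(deps, '.csv')
print(pickle_path, csv_path)
```

It works regardless of ordering, but it breaks down as soon as two dependencies share an extension.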

sdahdah avatar Oct 10 '23 20:10 sdahdah

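def task_hello():
    # `pickle` and `json` are passed explicitly as positional arguments;
    # `dependencies` is filled in by doit from the task metadata.
    def python_hello(pickle, json, dependencies):
        print(f'Dependencies are: {dependencies}')
        print(f'pickle is: {pickle}')
        print(f'json is: {json}')

    pickle = 'foo.pickle'
    json = 'bar.json'
    return {
        # The argument order here is fixed, independent of the order of file_dep.
        'actions': [(python_hello, [pickle, json])],
        'file_dep': [pickle, json],
        'verbosity': 2,
    }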

schettino72 avatar Oct 11 '23 01:10 schettino72

@schettino72 This makes sense! I was making all my Python actions have the signature python_hello(dependencies, targets). I did not realize adding extra parameters was the way to go. Thank you!

sdahdah avatar Oct 11 '23 12:10 sdahdah

I think preserving dependency order is valuable when the order of a program's input files affects the hash of its output, and you expect doit to build your files into an output with a specific hash.
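A small sketch of the effect, using hashlib on in-memory bytes rather than real files. `digest_of` and the contents are illustrative assumptions, standing in for a program that concatenates its inputs in the order given.

```python
import hashlib


def digest_of(parts):
    """Hash the parts in the given order, as if they were concatenated."""
    h = hashlib.sha256()
    for part in parts:
        h.update(part)
    return h.hexdigest()


a = b'contents of a.txt\n'
b = b'contents of b.txt\n'

# Same inputs in the same order give the same digest.
same_order = digest_of([a, b]) == digest_of([a, b])
# Swapping the inputs changes the digest.
swapped = digest_of([a, b]) == digest_of([b, a])
print(same_order, swapped)
```

So if the dependency order changes between runs, the output hash can change even though no input file did.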

cosineblast avatar Jan 19 '24 03:01 cosineblast