tools-devteam

Make Tools Collection-Aware

Open jmchilton opened this issue 10 years ago • 17 comments

Most tools do not require modifications - but tools that consume pairs of datasets should probably be updated to accept paired inputs via a <data_collection collection_type="paired"... parameter, and tools that perform reductions over datasets should be reworked to use multiple="true" data inputs or augmented to allow data_collection parameters.
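As a rough illustration of the paired-input case, the sketch below shows how a tool might declare a paired collection parameter and reference its two members in the command. It assumes Galaxy's `type="data_collection"` param syntax; the tool id, the `example_aligner` executable, its flags, and the parameter names are hypothetical and not taken from any tool in this repository.

```xml
<!-- Hypothetical tool: id, executable, flags and parameter names are illustrative only. -->
<tool id="example_paired_aligner" name="Example paired aligner" version="0.1.0">
    <command><![CDATA[
        example_aligner
            --forward '$reads.forward'
            --reverse '$reads.reverse'
            > '$output'
    ]]></command>
    <inputs>
        <!-- A single parameter that accepts one forward/reverse pair. -->
        <param name="reads" type="data_collection" collection_type="paired"
               format="fastqsanger" label="Paired reads"/>
    </inputs>
    <outputs>
        <data name="output" format="sam"/>
    </outputs>
</tool>
```

With a declaration along these lines, the user selects one paired collection instead of picking the forward and reverse datasets separately.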

Tools Consuming Pairs

  • [x] tophat2
  • [x] bowtie2
  • [x] bwa (maybe...)
  • [ ] fastq_paired_end_deinterlacer, fastq_paired_end_interlacer, fastq_paired_end_joiner, fastq_paired_end_splitter.

Dataset Reduction Tools

  • [x] cuff*

jmchilton avatar Nov 21 '14 13:11 jmchilton

Also this new tool markup needs documenting on https://wiki.galaxyproject.org/Admin/Tools/ToolConfigSyntax

peterjc avatar Nov 21 '14 13:11 peterjc

@peterjc markup has been documented there and there is now a tool tutorial for collections in progress as part of planemo's extended tool development documentation (http://planemo.readthedocs.org/en/master/writing_standalone.html#collections).

jmchilton avatar Mar 24 '15 13:03 jmchilton

Thanks @jmchilton.

It hasn't been migrated to this repository yet, but I'd like the "Concatenate Dataset" tool (cat, aka tools/filters/catWrapper.xml) updated too. i.e. use multiple="true" instead of the repeat block: https://github.com/galaxyproject/galaxy/blob/dev/tools/filters/catWrapper.xml
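A minimal sketch of what a `multiple="true"` version of a concatenation tool could look like is given below. It is not the actual catWrapper.xml; the tool id, parameter names, and formats are assumptions chosen for illustration.

```xml
<!-- Hypothetical tool: not the real cat1 wrapper; names and formats are illustrative only. -->
<tool id="example_concatenate" name="Example concatenate" version="0.1.0">
    <command><![CDATA[
        cat
        #for $dataset in $inputs
            '$dataset'
        #end for
        > '$out_file'
    ]]></command>
    <inputs>
        <!-- One multi-select data input replaces a repeat block of single inputs. -->
        <param name="inputs" type="data" format="txt" multiple="true"
               label="Datasets to concatenate"/>
    </inputs>
    <outputs>
        <data name="out_file" format="txt"/>
    </outputs>
</tool>
```

Because the input is a single multi-select parameter rather than a repeat block, a dataset collection can be supplied to it directly and is reduced to one output.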

peterjc avatar Mar 24 '15 13:03 peterjc

See also #92.

jmchilton avatar Mar 27 '15 14:03 jmchilton

@peterjc Yeah - there are probably a handful of utilities like that - that should exist to make working with collections (and multiple files generally) more useful. I am not sure if they should be in the tool shed or in the distribution. I also don't know if we should touch something like cat1 that must be part of so many workflows or if we should just make a new set of tools. Maybe I will create an issue so we can at least summarize what the common set of utilities should be.

jmchilton avatar Mar 27 '15 14:03 jmchilton

@jmchilton In the case of cat1, this would likely break many old workflows so there is a lot to be said for moving it to the Tool Shed first (so that multiple versions can be installed at once), or simply deprecating/hiding the old tool and adding a new one with a different identifier. A new issue for that seems best (where? Trello or here on tools-devteam?)

peterjc avatar Mar 27 '15 14:03 peterjc

I was going in the direction of creating a new set of tools: https://github.com/bgruening/galaxytools/tree/master/tools/text_processing/text_processing. This is based on core-utils and should be much more efficient than current text processing tools. But with new code comes new errors: https://trello.com/c/1MUUPdq2

bgruening avatar Mar 27 '15 14:03 bgruening

As collections are now fully functional and I guess with 15.07 they will be used a lot, should we hold a small virtual hackathon to upgrade as many tools as possible and offer a few nice reduce tools? Ideas welcome.

bgruening avatar Aug 01 '15 22:08 bgruening

@bgruening - I like this idea a lot. I was thinking we should start to do more small, virtual hackathons. I was thinking of starting with workflows - but collections may be an even better idea.

How about we pick two days in September (that way we have time to inform people and get the word out). Maybe September 17th and 18th? We would have 4 Google Hangouts - one in the morning US time / midday Europe and one in the early afternoon US / evening Europe time each day - to discuss things and plan out how to proceed.

jmchilton avatar Aug 03 '15 18:08 jmchilton

And also non-stop stream from the lab :dancer: :+1:

martenson avatar Aug 03 '15 19:08 martenson

Awesome! How many people from the lab are on board? I would like to send out a lot of information material beforehand (planemo, appliances, todo lists?) so we can get a jump start. @peterjc @nsoranzo @erasche can we label it as an IUC event?

bgruening avatar Aug 03 '15 19:08 bgruening

Interesting idea - I've pencilled those dates in on my calendar, although I'm not sure I can commit the time. I would try to follow along.

As long as it isn't seen as being committee members only, branding this as an IUC event sounds fine to me.

peterjc avatar Aug 04 '15 09:08 peterjc

Oh no sorry, the idea was to say the IUC is organising a virtual hackathon to improve tools. Join us!

bgruening avatar Aug 04 '15 10:08 bgruening

:+1:

peterjc avatar Aug 04 '15 10:08 peterjc

Ping @Takadonet @apetkau - would either of you (or maybe Eric) be able to set aside some time for hacking these days - or maybe different days in August? I am not aware of anyone that has done more collection tool and workflow building than your lab - it would be great to have your experience at hand.

jmchilton avatar Aug 04 '15 12:08 jmchilton

Sorry, apparently I'd muted this/ignored it. Will definitely be available to help hack :)

hexylena avatar Aug 10 '15 15:08 hexylena

Hey, sorry about that. Didn't see this until now. September's pretty busy for me so I'm not sure if I'll have time, however I'll see. If I can get a lot of my work done in the next month I may be able to set aside some time for tool hacking. Great idea by the way.

apetkau avatar Aug 10 '15 18:08 apetkau