Per Unneberg
Data output is now collected via rules, making heavy use of pandas. This functionality should also be present in standalone scripts, along the following lines:

```
collect_results.py --samples SAMPLE1 SAMPLE2......
```
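A minimal sketch of what such a standalone script could look like, built on argparse and pandas; the script name and the `--samples` flag come from the item above, while the per-sample file layout, the `--indir`/`--outfile` flags, and the metrics format are assumptions:

```python
#!/usr/bin/env python
"""Hypothetical sketch of collect_results.py."""
import argparse
import os
import pandas as pd

def collect(samples, indir):
    """Concatenate per-sample metrics tables into a single data frame."""
    frames = []
    for sample in samples:
        # Assumed layout: <indir>/<sample>/metrics.csv
        metrics = pd.read_csv(os.path.join(indir, sample, "metrics.csv"))
        metrics["sample"] = sample
        frames.append(metrics)
    return pd.concat(frames, ignore_index=True)

if __name__ == "__main__":
    p = argparse.ArgumentParser(description="Collect per-sample results into one table")
    p.add_argument("--samples", nargs="+", required=True)
    p.add_argument("--indir", default=".")
    p.add_argument("--outfile", default="results.csv")
    args = p.parse_args()
    collect(args.samples, args.indir).to_csv(args.outfile, index=False)
```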
The current rule is not adapted to the "file:" syntax, as options -b and -B duplicate the values of 'threeprime' and 'fiveprime'.
backend.**global_config** should be updated once all parameters have been updated correctly. This is currently neither working nor implemented.
(Long-term goal?) Integrate with SLURM/drmaa along the lines of luigi.hadoop and luigi.hadoop_jar. Currently, using the local scheduler on the nodes works well enough.
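A rough sketch of what the drmaa side could look like, assuming the drmaa Python bindings and a SLURM-enabled libdrmaa are available; how this would hook into a luigi job runner is left open, and the bwa example arguments are placeholders:

```python
import drmaa

def run_via_drmaa(command, args):
    """Submit one command through DRMAA and block until it finishes.

    Sketch only: native SLURM options (partition, time limit, log paths)
    and error handling are omitted.
    """
    session = drmaa.Session()
    session.initialize()
    try:
        jt = session.createJobTemplate()
        jt.remoteCommand = command
        jt.args = list(args)
        jobid = session.runJob(jt)
        retval = session.wait(jobid, drmaa.Session.TIMEOUT_WAIT_FOREVER)
        session.deleteJobTemplate(jt)
        return retval.exitStatus
    finally:
        session.exit()

# e.g. run_via_drmaa("bwa", ["aln", "-t", "4", "ref.fa", "reads.fq"])
```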
How should the number of workers/threads in use be controlled? An example best explains the issue: alignment with bwa aln can be done with multiple threads, whereas bwa sampe is single-threaded, and uses...
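One hedged way to express this asymmetry is luigi's resources mechanism, shown below with made-up task bodies; whether this actually answers the worker/thread question is exactly the open issue:

```python
import luigi

class BwaAln(luigi.Task):
    """Multi-threaded alignment step: claims 4 cpu units from the scheduler."""
    resources = {"cpu": 4}

    def run(self):
        pass  # would run: bwa aln -t 4 ...

class BwaSampe(luigi.Task):
    """Single-threaded pairing step: claims 1 cpu unit."""
    resources = {"cpu": 1}

    def run(self):
        pass  # would run: bwa sampe ...
```

The scheduler-wide cpu limit would then be set in the luigi configuration (a `[resources]` section with e.g. `cpu=8`), capping how many of these tasks run concurrently.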
Integrate with Hadoop. This may be extremely easy: set the job runner for the JobTasks via the config file; by default they use DefaultShellJobRunner, but they could also use a (customized...
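A sketch of config-driven runner selection; DefaultShellJobRunner is mentioned above, but the dotted-path convention, the module name, and the config key are hypothetical:

```python
import importlib

def get_job_runner(config):
    """Resolve a job runner class from a dotted path given in the config.

    Hypothetical config entry:
        job_runner: mypipeline.jobrunners.DefaultShellJobRunner
    """
    dotted = config.get("job_runner", "mypipeline.jobrunners.DefaultShellJobRunner")
    module_name, _, class_name = dotted.rpartition(".")
    module = importlib.import_module(module_name)
    return getattr(module, class_name)()
```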
Implement options --restart and --restart-from that restart from scratch or from a given task. This would require calculating the target names between any two vertices in the dependency graph. The idea...
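For --restart-from, the vertices between two tasks could be computed as the intersection of one task's descendants and the other's ancestors; a sketch with networkx and a made-up graph (mapping vertices back to target names would be the missing piece):

```python
import networkx as nx

def tasks_between(dag, start, end):
    """Return every task that lies on some path from start to end
    in the dependency DAG, endpoints included."""
    between = nx.descendants(dag, start) & nx.ancestors(dag, end)
    return between | {start, end}

# Toy dependency graph: edges point from a task to the task that depends on it
dag = nx.DiGraph([("fastq", "aln"), ("aln", "sampe"),
                  ("sampe", "dedup"), ("fastq", "qc")])
print(tasks_between(dag, "fastq", "dedup"))  # {'fastq', 'aln', 'sampe', 'dedup'}
```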
Add a task for cleaning up intermediate output (related to the issue on pipes). Tmp files could be removed if is_tmp=True?
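A sketch of what such a cleanup task could do, assuming outputs carry the is_tmp flag speculated above; the Target stand-in and the file names are made up:

```python
import os

class Target:
    """Stand-in for a pipeline output; is_tmp mirrors the flag speculated above."""
    def __init__(self, path, is_tmp=False):
        self.path = path
        self.is_tmp = is_tmp

def cleanup(targets):
    """Remove every output flagged as temporary, leaving final results alone."""
    for target in targets:
        if target.is_tmp and os.path.exists(target.path):
            os.remove(target.path)

# Example: the intermediate .sai file goes, the final .bam stays
cleanup([Target("sample.sai", is_tmp=True), Target("sample.bam")])
```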
I have added start_time and end_time to BaseJobTask, but currently the times don't get submitted to the graph/table interface. This would allow monitoring execution times and identifying pipeline bottlenecks.
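A sketch of how the recorded times could feed the table interface; start_time/end_time mirror the BaseJobTask attributes mentioned above, while the wrapper and the DataFrame layout are assumptions:

```python
import time
import pandas as pd

records = []

def record_task_times(task_name, fn):
    """Wrap a task body, recording start/end times like BaseJobTask would."""
    start_time = time.time()
    fn()
    end_time = time.time()
    records.append({"task": task_name, "start_time": start_time,
                    "end_time": end_time, "duration": end_time - start_time})

record_task_times("bwa_aln", lambda: time.sleep(0.1))
record_task_times("bwa_sampe", lambda: time.sleep(0.05))

# The table interface could then surface the slowest tasks first:
df = pd.DataFrame(records)
print(df.sort_values("duration", ascending=False))
```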