nextflow
nextflow copied to clipboard
Enhance the azcopy options configuration scope
This PR builds upon the available settings with the core azcopy
tool which is used for uploads and downloads in the context of Azure Blob storage service.
@pditommaso this PR is ready from my side - but depends on a sister PR https://github.com/nextflow-io/azcopy-tool/pull/1 where I've mentioned the major benefit of the update since we source the azcopy
tool from that repo.
UPDATE: Keeping this as draft so as not to merge accidentally before the sister PR.
Quick update:
I've had to remove the options which are not compatible with the older 10.8.0
version of azcopy
which we use and now the behavior is as expected on the Azure side
The following config with custom azcopy opts are reflected on the generated .command.run
files ✅
azcopy {
blockSize = "17"
blobTier = "Cool"
putMD5 = true
overwrite = false
checkMD5 = "NoCheck"
copyToolInstallMode= 'task'
}
The only problem remains to be solved is the last test in AzBashLibTest
, for which @jorgeaguileraseqera has graciously offered to help (thanks!)
https://github.com/nextflow-io/nextflow/blob/db89af1e4a5f2021972a2bdf2e0bb2aed997245e/plugins/nf-azure/src/test/nextflow/cloud/azure/file/AzBashLibTest.groovy#L163
The unit tests for the code and the functional test (via rnaseq-nf
) are passing now
(base) bash-5.0$ ../../nextflow/launch.sh run main.nf -profile azb
N E X T F L O W ~ version 22.08.0-edge
Launching `main.nf` [fervent_descartes] DSL2 - revision: 4ba66eb0c8
R N A S E Q - N F P I P E L I N E
===================================
transcriptome: /Users/eklavya/projects/code/nextflow_on_azure/rnaseq-nf/data/ggal/ggal_1_48850000_49020000.Ggal71.500bpflank.fa
reads : /Users/eklavya/projects/code/nextflow_on_azure/rnaseq-nf/data/ggal/ggal_gut_{1,2}.fq
outdir : az://batch-jobs/rnaseq-nf-publishdir
Uploading local `bin` scripts folder to az://batch-jobs/rnaseq-nf-workdir/tmp/7e/03ec1928a1e8327db898dd00687297/bin
[e3/17ed0c] process > RNASEQ:INDEX (ggal_1_48850000_49020000) [100%] 1 of 1 ✔
[7e/54d1f2] process > RNASEQ:FASTQC (FASTQC on ggal_gut) [100%] 1 of 1 ✔
[5d/3cbff4] process > RNASEQ:QUANT (ggal_gut) [100%] 1 of 1 ✔
[04/c40c43] process > MULTIQC [100%] 1 of 1 ✔
Staging foreign file: /Users/eklavya/projects/code/nextflow_on_azure/rnaseq-nf/multiqc
Done! Open the following report in your browser --> az://batch-jobs/rnaseq-nf-publishdir/multiqc_report.html
Completed at: 05-Aug-2022 11:26:07
Duration : 1m 37s
CPU hours : (a few seconds)
Succeeded : 4
@pditommaso , thanks for the review. I have accommodated the change requests now.
@pditommaso , I've made the MD5 (check/put) behavior as the default and resolved all merge conflicts.
Shall we merge ahead?
The next round of changes would be done only after the azcopy
tool has been upgraded via https://github.com/nextflow-io/azcopy-tool/pull/1 (WIP)
:warning: 7 God Classes were detected by Lift in this project. Visit the Lift web console for more details.
@pditommaso , shall we take this merge forward?
@pditommaso , shall we merge the PR with this edge release cycle?
Abi, there's the usual problem your PRs do not run the tests and I've tested locally and it fails
Abi, there's the usual problem your PRs do not run the tests and I've tested locally and it fails
🤔 this is curious. Let me take a look why there's no automated testing here.
We lost the momentum on this. Closing it because I don't know the status of this PR
True 😞
But quick question, do you think that this effort would be a useful addition since fusion
is now available for Azure as well?
That's another good point. The ideal goal is to not rely anymore on azcopy and use instead fusion in the mid-term
I see, in that case let me know if there's a need for the retouches on this PR and then I can make the iteration and circle back. Otherwise, I agree, good opportunity to showcase fusionfs
for this 👍