TESK
PureFTPd fails with some workflows
The RD workflow fails when it is run against a PureFTPd server. It is not yet clear why this happens, or which component (WES, TESK, or the FTP server) is at fault.
I will run the workflow again and collect more log data. For the time being, here are a few of the errors I found while going through Slack:
jlcwl pure-ftpd: ([email protected]) [NOTICE] /mnt/data/input//cae43838-6e57-4c03-952b-fd49be6383fa/output_15f17c58-0771-4b90-a9f0-45cf8238d213/dbsnp_138.b37.vcf.gz uploaded (1472953420 bytes, 38655.42KB/sec)
jlcwl pure-ftpd: ([email protected]) [ERROR] Can't open cwl.output.json: No such file or directory
jlcwl pure-ftpd: ([email protected]) [INFO] Can't change directory to cwl.output.json: No such file or directory
jlcwl pure-ftpd: ([email protected]) [INFO] Can't change directory to //cae43838-6e57-4c03-952b-fd49be6383fa/output_15f17c58-0771-4b90-a9f0-45cf8238d213/cwl.output.json: No such file or directory
schema_salad.validate.ValidationException: location \"ftp://195.148.31.210//cae43838-6e57-4c03-952b-fd49be6383fa/output_1a0cdc01-c336-4d73-94ee-e5ea8f1701c1/\" ends with \"/\" but is not a Directory
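The last error complains about a location that ends with "/", and the logged paths also contain a doubled slash right after the host. As a minimal sketch for spotting both patterns in other log lines, the snippet below inspects a location string. The helper name `inspect_location` is hypothetical and is not part of WES, TESK, or cwltool; whether either pattern is the actual cause of the failure is not established here.

```python
from urllib.parse import urlsplit


def inspect_location(url):
    """Report slash patterns in a location URL (hypothetical helper)."""
    path = urlsplit(url).path
    return {
        "path": path,
        # A trailing slash is what the schema_salad message complains about.
        "ends_with_slash": path.endswith("/"),
        # Two consecutive slashes, as in "//cae43838-...", mean an empty path segment.
        "has_double_slash": "//" in path,
    }


if __name__ == "__main__":
    # Location copied from the ValidationException above.
    loc = ("ftp://195.148.31.210//cae43838-6e57-4c03-952b-fd49be6383fa/"
           "output_1a0cdc01-c336-4d73-94ee-e5ea8f1701c1/")
    print(inspect_location(loc))
```

Running this over the locations in the task logs might help narrow down which component produces the trailing or doubled slash.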
Here is what we got for a failed RD workflow:
This is just one of the tasks; I can send all the tasks if needed:
task-84008576-lkn67.log task-84008576-outputs-filer-n5zm5.log task-84008576-inputs-filer-lgnsd.log task-84008576-ex-00-2jrn5.log
I forgot to mention that the file is indeed uploaded to the FTP server:
[cloud-user@ecp-cla WES-cli]$ lftp vm1976.kaj.pouta.csc.fi
lftp [email protected]:~> ls c4
c44bad4a-adce-4aed-a2f9-904b334ab283/ c47f4ac9-0a55-4035-bef0-eb2f2745493e/
lftp [email protected]:~> ls c47f4ac9-0a55-4035-bef0-eb2f2745493e/
drwxr-xr-x 6 1001 jarno 210 Jul 2 07:35 .
drwxr-xr-x 6 1001 jarno 210 Jul 2 07:35 ..
drwxr-xr-x 2 1001 jarno 58 Jul 2 07:34 output_3dc07dbe-ff84-482a-8029-3684bdc77fd4
drwxr-xr-x 2 1001 jarno 26 Jul 2 07:35 output_4c3d5a87-c991-47bd-bf4b-651ab6ab3d71
drwxr-xr-x 2 1001 jarno 34 Jul 2 07:35 output_65a92b8b-6d23-4e97-bc98-90141e5019dc
drwxr-xr-x 2 1001 jarno 84 Jul 2 07:35 output_fa71aec4-b7da-4cc7-a43f-6e5d6bc013b4
lftp [email protected]:/> ls c47f4ac9-0a55-4035-bef0-eb2f2745493e/output_4c3d5a87-c991-47bd-bf4b-651ab6ab3d71/
drwxr-xr-x 2 1001 jarno 26 Jul 2 07:35 .
drwxr-xr-x 2 1001 jarno 26 Jul 2 07:35 ..
-rw-r--r-- 1 1001 jarno 892326179 Jul 2 07:35 hs37d5.fa.gz
[cloud-user@jlcwl ~]$ sudo grep c47f4ac9-0a55-4035-bef0-eb2f2745493e /var/log/pureftpd.log
195.148.30.238 - input [02/Jul/2020:07:34:42 -0000] "PUT /mnt/data/input/c47f4ac9-0a55-4035-bef0-eb2f2745493e/output_3dc07dbe-ff84-482a-8029-3684bdc77fd4/Mills_and_1000G_gold_standard.indels.b37.vcf" 200 86369975
195.148.30.238 - input [02/Jul/2020:07:35:02 -0000] "PUT /mnt/data/input/c47f4ac9-0a55-4035-bef0-eb2f2745493e/output_fa71aec4-b7da-4cc7-a43f-6e5d6bc013b4/U5c_CCGTCC_L001_R1_001.fastq.gz" 200 487611787
195.148.30.238 - input [02/Jul/2020:07:35:06 -0000] "PUT /mnt/data/input/c47f4ac9-0a55-4035-bef0-eb2f2745493e/output_fa71aec4-b7da-4cc7-a43f-6e5d6bc013b4/U5c_CCGTCC_L001_R2_001.fastq.gz" 200 546806668
195.148.30.238 - input [02/Jul/2020:07:35:13 -0000] "PUT /mnt/data/input/c47f4ac9-0a55-4035-bef0-eb2f2745493e/output_4c3d5a87-c991-47bd-bf4b-651ab6ab3d71/hs37d5.fa.gz" 200 892326179
195.148.30.238 - input [02/Jul/2020:07:35:38 -0000] "PUT /mnt/data/input/c47f4ac9-0a55-4035-bef0-eb2f2745493e/output_65a92b8b-6d23-4e97-bc98-90141e5019dc/dbsnp_138.b37.vcf.gz" 200 1472953420
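To cross-check the upload log against the directory listing, a small parser can help. The sketch below assumes the log lines follow the layout shown above (client, user, timestamp, quoted request, status, byte count); the regular expression and the function name `uploaded_sizes` are my own and are not part of Pure-FTPd.

```python
import re
from posixpath import basename

# Layout assumed from the log excerpt above:
# client - user [timestamp] "METHOD path" status bytes
LOG_LINE = re.compile(
    r'^(?P<client>\S+) \S+ (?P<user>\S+) \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\w+) (?P<path>.+)" (?P<status>\d{3}) (?P<size>\d+)$'
)


def uploaded_sizes(lines):
    """Map file name -> byte count for PUT requests logged with status 200."""
    sizes = {}
    for line in lines:
        m = LOG_LINE.match(line.strip())
        if m and m.group("method") == "PUT" and m.group("status") == "200":
            sizes[basename(m.group("path"))] = int(m.group("size"))
    return sizes


if __name__ == "__main__":
    log = [
        '195.148.30.238 - input [02/Jul/2020:07:35:13 -0000] '
        '"PUT /mnt/data/input/c47f4ac9-0a55-4035-bef0-eb2f2745493e/'
        'output_4c3d5a87-c991-47bd-bf4b-651ab6ab3d71/hs37d5.fa.gz" 200 892326179',
    ]
    listed = {"hs37d5.fa.gz": 892326179}  # size taken from the lftp listing above
    for name, size in uploaded_sizes(log).items():
        print(name, size, "matches listing" if listed.get(name) == size else "differs")
```

For `hs37d5.fa.gz` the logged byte count and the size in the lftp listing are both 892326179, so the upload itself appears complete.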
I forgot to mention how to reproduce this:
WES URL: csc-wes.c03.k8s-popup.csc.fi
workflow_type = cwl
workflow_type_version = v1.0
workflow_url = https://github.com/jarnolaitinen/RD_pipeline/blob/master/workflow.cwl
Input {"curl_fastq_urls":{"class":"File","path":"http://195.148.30.67:8000/fastq_files_urls.txt"},"curl_reference_genome_url":{"class":"File","path":"http://195.148.30.67:8000/reference_seq_url.txt"},"curl_known_indels_url":{"class":"File","path":"http://195.148.30.67:8000/known_indels_url.txt"},"curl_known_sites_url":{"class":"File","path":"http://195.148.30.67:8000/known_sites_url.txt"},"readgroup_str":"@RG\tID:Seq01p\tSM:Seq01\tPL:ILLUMINA\tPI:330","sample_name":"abc1","threads":"10","gqb":[20,25,30,35,40,45,50,70,90,99]}
I copied the input files to: ftp://ftp-private.ebi.ac.uk:/upload/RD-files/
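For anyone reproducing this from a script, the parameters above could be assembled into a WES run request roughly as follows. This is only a sketch: it assumes the server exposes the standard GA4GH path `/ga4gh/wes/v1/runs` over HTTPS, which I have not verified for this deployment, and the function name `build_run_request` is hypothetical. It only builds the form fields and does not send anything or handle authentication.

```python
import json


def build_run_request(wes_host, workflow_url, params):
    """Return (url, form_fields) for a WES run submission (hypothetical helper)."""
    # Assumption: standard GA4GH WES base path, served over HTTPS.
    url = f"https://{wes_host}/ga4gh/wes/v1/runs"
    fields = {
        "workflow_type": "cwl",
        "workflow_type_version": "v1.0",
        "workflow_url": workflow_url,
        # WES expects the workflow inputs as a JSON string.
        "workflow_params": json.dumps(params),
    }
    return url, fields


if __name__ == "__main__":
    base = "http://195.148.30.67:8000/"
    params = {
        "curl_fastq_urls": {"class": "File", "path": base + "fastq_files_urls.txt"},
        "curl_reference_genome_url": {"class": "File", "path": base + "reference_seq_url.txt"},
        "curl_known_indels_url": {"class": "File", "path": base + "known_indels_url.txt"},
        "curl_known_sites_url": {"class": "File", "path": base + "known_sites_url.txt"},
        "readgroup_str": "@RG\tID:Seq01p\tSM:Seq01\tPL:ILLUMINA\tPI:330",
        "sample_name": "abc1",
        "threads": "10",
        "gqb": [20, 25, 30, 35, 40, 45, 50, 70, 90, 99],
    }
    url, fields = build_run_request(
        "csc-wes.c03.k8s-popup.csc.fi",
        "https://github.com/jarnolaitinen/RD_pipeline/blob/master/workflow.cwl",
        params,
    )
    print(url)
    print(fields["workflow_params"])
```

The returned fields would then be posted as multipart/form-data with whatever HTTP client and credentials the deployment requires.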