proteomics-sample-metadata
SDRF file format problem (px-submission-tool vs. sdrf-pipeline)
Hi All,
Regarding the PXD022713 dataset I submitted last week (private for the moment), I ran into a problem with the SDRF file, which I generated and submitted for the first time (as suggested by the px-submission-tool v2.5.2).
Could you please help me understand what was wrong with the format? I used the Python "sdrf-pipeline" to validate it before the PRIDE submission.
I used the "sdrf-default.tsv" file to fill out the sample and file annotations. I had to duplicate the "characteristics[organism]" column to be able to define two species (hoping that is the right way to do it; I could not find a related example). I also had to replace the instrument annotation with free text, although the ontology URL was accepted by the "sdrf-pipeline".
I attach two files (renamed to .txt so they could be attached): the one validated by the Python script and the one I submitted, "SDRF_ND9.tsv" (as "other" rather than "experimental design", otherwise an error still appeared in the log):
sdrf-pipeline
parse_sdrf validate-sdrf --sdrf_file SDRF_ND9_toValidate.tsv
Everything seems to be fine. Well done.
px-submission log
2020-11-24 12:43:24,054 INFO [pool-1-thread-8] u.a.e.p.s.v.Main [Main.java:107] ERROR : The number of columns in the SDRF ({}) is smaller than the number of mandatory fields ({})', value='', row=0, column='N/A'
2020-11-24 12:43:24,072 INFO [pool-1-thread-8] u.a.e.p.s.v.Main [Main.java:107] ERROR : Invalid columns present: name, experiment, fraction ', value='', row=0, column=' name, experiment, fraction'
2020-11-24 12:43:24,072 INFO [pool-1-thread-8] u.a.e.p.s.v.Main [Main.java:107] ERROR : The following columns are mandatory and not present in the SDRF: source name, characteristics[organism part], characteristics[disease], characteristics[organism], characteristics[cell type], assay name, comment[fraction identifier], comment[data file]', value='', row=0, column='N/A'
I also submitted another dataset PXD022725 with the same problem.
SDRF_ND9.txt SDRF_ND9_validatedBySDRF.txt
Thanks for your help. Oana
Hi ovigy, the px-submission-tool uses jsdrf to verify SDRF files. I tested the SDRF_ND9.tsv file with sdrf-pipelines and everything seems to be fine, but there is a problem with the jsdrf verification. @ovigy
@daichengxin, can you explain what the error is?
I may need to look at the jsdrf code; for now I can’t see which fields are missing in this file. @ypriverol
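In the meantime, the mandatory columns named in the px-submission log can be compared against the header row of the file with a few lines of Python. This is only a rough sketch, not how jsdrf or sdrf-pipelines actually validate: the column list is copied from the log message, the function names are made up for illustration, and the lower-casing and whitespace trimming are assumptions about how column names might be matched.

```python
import csv
import io

# Mandatory columns exactly as listed in the px-submission log message.
MANDATORY_COLUMNS = [
    "source name",
    "characteristics[organism part]",
    "characteristics[disease]",
    "characteristics[organism]",
    "characteristics[cell type]",
    "assay name",
    "comment[fraction identifier]",
    "comment[data file]",
]


def normalize(name):
    """Lower-case a column name and collapse whitespace (an assumption)."""
    return " ".join(name.strip().lower().split())


def read_header(handle):
    """Return the first row of a tab-separated file as a list of names."""
    return next(csv.reader(handle, delimiter="\t"), [])


def missing_mandatory(header):
    """Mandatory columns that do not appear in the header."""
    present = {normalize(col) for col in header}
    return [col for col in MANDATORY_COLUMNS if col not in present]


def duplicated_columns(header):
    """Column names that appear more than once (reported, not judged)."""
    seen, dups = set(), []
    for col in map(normalize, header):
        if col in seen and col not in dups:
            dups.append(col)
        seen.add(col)
    return dups


# Hypothetical header for illustration; it is not taken from SDRF_ND9.tsv.
sample = io.StringIO(
    "source name\tcharacteristics[organism]\tcharacteristics[organism]\t"
    "assay name\tcomment[data file]\n"
)
header = read_header(sample)
print("missing:", missing_mandatory(header))
print("duplicated:", duplicated_columns(header))
```

To run it on a real file, replace the in-memory sample with `open("SDRF_ND9.tsv", newline="", encoding="utf-8")`. If nothing is reported as missing there, the problem is more likely in how the header is parsed than in the file itself.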
Hi! Thanks for your replies. I first tried to use the Java tool, but the user documentation was too light for me; sorry, I'm not familiar with Maven and Java libraries. I will give it another try. It was easier for me with Python. Thanks
Thanks @ovigy, we will review the code and get back to you soon.
Thanks for your feedback @ovigy, jsdrf has bugs and they will be fixed.
You're welcome. Thanks in return for the documentation and your help. Have a good day