CSV header/data length mismatch 5 != 3 on row that does not exist
Hi! I am joining four CSV tables, each with the same number of rows. The mlr command runs inside a Nextflow process.
Command:
script:
"""
mlr --csv join -u --ul --ur -j SequenceName -f ${stanford} ${comet} |
mlr --csv join -u --ul --ur -j SequenceName -f ${g2p} |
mlr --csv join -u --ul --ur -j SequenceName -f ${rega} > joint_${comet.getSimpleName().split('comet_')[1]}.csv
"""
comet, stanford, rega, and g2p are my csv tables.
The join worked without any problems last week, but as of today I get this error:
CSV header/data length mismatch 5 != 3 at filename (stdin) row 1176.
The thing is that row 1176 does not exist in any of my CSV tables. Each table has 3 columns and 1175 rows.
Any idea what is going on here?
Thanks, Vera
@vera-rykalina is it possible for you to share your data files, e.g. at gist.github.com?
Also, I suspect that the output of mlr --csv join -u --ul --ur -j SequenceName -f ${g2p} is intermediate data which does have 1176 rows (which can happen if there is a duplicate value of SequenceName) ...
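One way to check this without Miller is the rough Python sketch below. It is only a sketch: it assumes each file has a SequenceName header column, and the function names and the file names in the usage comment are made up for illustration. It reports which SequenceName values occur more than once in a file, and which values present in some file are missing from another. The second check is there because with --ul and --ur an unmatched key is emitted as an unpaired record carrying only its own file's columns, which could also account for a 3-field row under a 5-field header.

```python
import csv
from collections import Counter


def key_counts(path, key="SequenceName"):
    """Count how often each join-key value occurs in one CSV file."""
    with open(path, newline="") as handle:
        return Counter(row[key] for row in csv.DictReader(handle))


def report(paths, key="SequenceName"):
    """For each file, list duplicated keys and keys seen only in other files."""
    counts = {path: key_counts(path, key) for path in paths}
    # Union of every key value seen in any of the files.
    all_keys = set().union(*counts.values())
    result = {}
    for path, counter in counts.items():
        result[path] = {
            "duplicated": sorted(k for k, n in counter.items() if n > 1),
            "missing": sorted(all_keys - set(counter)),
        }
    return result


# Hypothetical usage (file names are placeholders, not the real inputs):
# for path, info in report(["comet.csv", "stanford.csv", "g2p.csv", "rega.csv"]).items():
#     print(path, info)
```

If every file comes back with empty "duplicated" and "missing" lists, the keys line up one-to-one and the cause is probably something else.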
Closing as I believe this is resolved -- if this is in error please re-open and I'm happy to discuss further -- thank you!