immunarch

Problem with importing columns that are mostly NAs

Open christianwoe opened this issue 1 year ago • 3 comments

Hi all,

🐛 Bug

I was trying to import some data from MiXCR 4.3.1 tsv files and noticed warning messages for some of the samples. After further checking, it seems that in rare cases a column is assigned type logical even though some clones have character content in it. Those values are then replaced by 'NA', so the information is discarded. It looks to me like the readr function inside repLoad is guessing the wrong column type, probably because it only checks a subset of rows.

It would be helpful to be able to modify the parameter provided to the readr function, either 'col_types' or 'guess_max'. Or is there already another solution?
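To illustrate what these two parameters do, here is a minimal sketch that calls readr directly rather than going through repLoad, so it is a way to inspect a file and not a fix inside immunarch. The toy table, its values, and the names `toy`, `path`, `dat_typed` and `dat_all` are made up for the example; only the column name 'allDHitsWithScore' comes from the MiXCR output discussed in this issue.

```r
library(readr)

# Toy stand-in for a MiXCR clones table (made-up values): 2000 clonotypes,
# only the last one has a D hit.
n <- 2000
toy <- data.frame(
  cloneId = seq_len(n) - 1,
  allDHitsWithScore = c(rep(NA_character_, n - 1), "TRDD1*00(45)")
)
path <- tempfile(fileext = ".tsv")
write_tsv(toy, path, na = "")  # missing values are written as empty fields

# Option 1: declare the sparse column as character, guess the rest.
dat_typed <- read_tsv(
  path,
  col_types = cols(allDHitsWithScore = col_character())
)

# Option 2: let readr look at every row before guessing.
dat_all <- read_tsv(path, guess_max = Inf, show_col_types = FALSE)

# In both cases the single non-NA value is kept as a string.
tail(dat_typed$allDHitsWithScore, 1)
tail(dat_all$allDHitsWithScore, 1)
```

Option 1 is the safer choice when the problematic columns are known in advance; option 2 needs no column names but makes readr scan the whole file for guessing, which may be slower on large samples.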

To Reproduce

Steps to reproduce the behavior:

  1. repLoad(pathname)

This is the warning message.

Warning: One or more parsing issues, call `problems()` on your data frame for details, e.g.:
  dat <- vroom(...)
  problems(dat)
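For anyone who wants to see which rows and columns trigger this warning, the sketch below reproduces it on a tiny made-up file by deliberately forcing the sparse column to logical, then calls `problems()` as the message suggests. It reads the file with readr directly, since the object returned by repLoad is not the data frame that readr produced; the table contents and the names `toy`, `path` and `dat` are assumptions for illustration.

```r
library(readr)

# Tiny made-up clones table: only the last clonotype has a D hit.
toy <- data.frame(
  cloneId = 0:4,
  allDHitsWithScore = c(NA, NA, NA, NA, "TRDD1*00(45)")
)
path <- tempfile(fileext = ".tsv")
write_tsv(toy, path, na = "")

# Force the type that a wrong guess would produce, to mimic the reported behavior.
dat <- read_tsv(
  path,
  col_types = cols(
    cloneId = col_double(),
    allDHitsWithScore = col_logical()
  )
)

# Lists each value that could not be parsed as the declared type.
problems(dat)

# The string value has been replaced by NA, as described in the report.
dat$allDHitsWithScore
```

The `col` and `expected`/`actual` fields in the `problems()` output point at the affected column, which can then be declared as character when reading the file.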

Expected behavior

Columns with at least one non-NA value should not be assigned type logical.

Many thanks and kind regards, Christian

christianwoe, Sep 12 '23 10:09

Hi @christianwoe

Thank you for opening the issue. Could you share an example of such data please? What columns are usually the problematic ones?

I'm open to scheduling a short call to discuss this issue over Zoom if this accelerates things.

vadimnazarov, Oct 05 '23 21:10

Hi, here is an example based on test data where I think the 'allDHitsWithScore' column is causing the warning, because only one of the clonotypes has a value assigned there.

Best wishes, Christian

Multi_TRA_FS115_2_S150.clones_TRAD.tsv.zip

christianwoe, Oct 06 '23 13:10

Hey everyone, I'm facing a similar issue here. Was this fixed in the latest update? Cheers, Nicole

yls2g13, Jan 29 '24 05:01