dandelion
Singularity Container Preprocessing Error
Description of the bug
Hi Zewen, great package, and I love the container you made! I was trying to do the preprocessing manually but had some issues with file paths, so I decided to use your container and its preprocessing function instead. There was an issue, though, which I think comes from how I labeled the individual column. Stupidly, I didn't think to make the values start with an alphabetical character, and I think the container read them in as int64 (if I am understanding this correctly). Maybe a type assignment when importing in preprocessing.py would help? I will try again with the individual changed to ms'3'. Thanks, and let me know if you need anything else!
Minimal reproducible example
sample,prefix,individual
3s_bcr,3s,3
4s_bcr,4s,4
5s_bcr,5s,5
6s_bcr,6s,6
7s_bcr,7s,7
8s_bcr,8s,8
3b_bcr,3b,3
4b_bcr,4b,4
5b_bcr,5b,5
6b_bcr,6b,6
7b_bcr,7b,7
8b_bcr,8b,8
apptainer run -B $PWD ~/kt16_default_sc-dandelion.sif dandelion-preprocess --org=mouse --filter_to_high_confidence --meta ./sample_info.csv
The error message produced by the code above
Traceback (most recent call last):
  File "/share/dandelion_preprocess.py", line 378, in <module>
    main()
  File "/share/dandelion_preprocess.py", line 288, in main
    ddl.pp.reassign_alleles(
  File "/opt/conda/envs/sc-dandelion-container/lib/python3.11/site-packages/dandelion/preprocessing/_preprocessing.py", line 1439, in reassign_alleles
    out_dir = Path(combined_folder)
              ^^^^^^^^^^^^^^^^^^^^^
  File "/opt/conda/envs/sc-dandelion-container/lib/python3.11/pathlib.py", line 871, in __new__
    self = cls._from_parts(args)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/opt/conda/envs/sc-dandelion-container/lib/python3.11/pathlib.py", line 509, in _from_parts
    drv, root, parts = self._parse_args(args)
                       ^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/conda/envs/sc-dandelion-container/lib/python3.11/pathlib.py", line 493, in _parse_args
    a = os.fspath(a)
        ^^^^^^^^^^^^
TypeError: expected str, bytes or os.PathLike object, not int64
OS information
MacOS container most recent
Version information
command line
Additional context
No response
Hi @bpr4242, thanks! Yes, you are right: it's a problem with the way pandas.read_csv interprets numbers by default. A column that contains only digits is read as integers rather than strings.
I would normally never name files or folders as bare numbers, as it causes issues like this. So yeah, changing the values to actual strings should work.
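For anyone hitting the same error before a fix is released, the renaming workaround can be scripted rather than done by hand. This is a minimal sketch using only the standard library; the helper name `prefix_individuals` and the default prefix `"ms"` are hypothetical choices for illustration and are not part of dandelion.

```python
import csv
import io


def prefix_individuals(csv_text: str, prefix: str = "ms") -> str:
    """Return csv_text with `prefix` added to purely numeric 'individual' values.

    Hypothetical helper for illustration; not part of dandelion.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=reader.fieldnames, lineterminator="\n")
    writer.writeheader()
    for row in reader:
        # Only touch values that would otherwise be parsed as integers.
        if row["individual"].isdigit():
            row["individual"] = prefix + row["individual"]
        writer.writerow(row)
    return out.getvalue()


# Two rows taken from the sample sheet in the report above.
example = "sample,prefix,individual\n3s_bcr,3s,3\n3b_bcr,3b,3\n"
fixed = prefix_individuals(example)
print(fixed)
```

Writing the result to a new file and passing that file to `--meta` keeps the original sample sheet untouched.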
Potentially solved with #403, which will be included in a new release shortly.
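The type inference behind the error can be reproduced outside the container. The snippet below is only a sketch of the pandas behaviour and of one generic way to avoid it (`dtype=str`); it is not necessarily what #403 changes, and it assumes nothing about dandelion's internals beyond the `Path(...)` call shown in the traceback.

```python
import io
from pathlib import Path

import pandas as pd

# Two rows from the sample sheet in the report, inlined so the example is self-contained.
CSV_TEXT = "sample,prefix,individual\n3s_bcr,3s,3\n4s_bcr,4s,4\n"

# Default behaviour: the all-digit 'individual' column is inferred as an integer dtype.
inferred = pd.read_csv(io.StringIO(CSV_TEXT))

# Forcing every column to be read as text keeps '3' a string.
as_text = pd.read_csv(io.StringIO(CSV_TEXT), dtype=str)

# Passing the inferred integer to Path raises the same kind of TypeError as in the traceback.
try:
    Path(inferred["individual"].iloc[0])
    path_failed = False
except TypeError:
    path_failed = True

# The string version is accepted without complaint.
out_dir = Path(as_text["individual"].iloc[0])
print(inferred["individual"].dtype, path_failed, out_dir)
```

Reading the metadata with `dtype=str` (or casting the column with `.astype(str)`) makes numeric-looking labels safe to use as folder names, at the cost of having to convert any genuinely numeric columns explicitly.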