CITE-seq-Count

ValueError: columns cannot be a set

Open jamesboot opened this issue 2 years ago • 5 comments

Hello,

I'm getting the following error during UMI correction, when running CITE-seq-Count 1.4.5, Python 3.8.

Correcting umis
Traceback (most recent call last):
  File "/data/home/hmy961/citeseq-counts-env/bin/CITE-seq-Count", line 8, in <module>
    sys.exit(main())
  File "/data/home/hmy961/citeseq-counts-env/lib/python3.8/site-packages/cite_seq_count/__main__.py", line 603, in main
    io.write_dense(
  File "/data/home/hmy961/citeseq-counts-env/lib/python3.8/site-packages/cite_seq_count/io.py", line 48, in write_dense
    pandas_dense = pd.DataFrame(sparse_matrix.todense(), columns=columns, index=index)
  File "/data/home/hmy961/citeseq-counts-env/lib/python3.8/site-packages/pandas/core/frame.py", line 639, in __init__
    raise ValueError("columns cannot be a set")
ValueError: columns cannot be a set

Command and options for running:

CITE-seq-Count -R1 $READ1 -R2 $READ2 -t $TAGS -cbf 1 -cbl 16 -umif 17 -umil 26 -trim 10 -T 4 --max-error 3 -cells $CELLS --whitelist $WHITELIST -o $OUTDIR

Also, may or may not be related, I am getting the following warning at the start of the processing. I've always used the options above so not sure why this warning is appearing now.

[WARNING] Read1 length is 28bp but you are using 26bp for Cell and UMI barcodes combined.
This might lead to wrong cell attribution and skewed umi counts.
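
For reference, the arithmetic behind this warning can be sketched as below. This is only an illustration, not CITE-seq-Count's own code: the helper name `barcode_span` is hypothetical, and it assumes the `-cbf`/`-cbl`/`-umif`/`-umil` positions are 1-based and inclusive, which is consistent with the 26bp reported in the warning.

```python
def barcode_span(cb_first, cb_last, umi_first, umi_last):
    """Return (cell barcode length, UMI length, last Read1 base used).

    Hypothetical helper; positions are assumed to be 1-based and inclusive.
    """
    cell_len = cb_last - cb_first + 1
    umi_len = umi_last - umi_first + 1
    return cell_len, umi_len, max(cb_last, umi_last)


# Values taken from the command above: -cbf 1 -cbl 16 -umif 17 -umil 26
cell_len, umi_len, bases_used = barcode_span(1, 16, 17, 26)
read1_len = 28  # Read1 length reported in the warning

print(cell_len, umi_len, bases_used, read1_len - bases_used)
# prints: 16 10 26 2
```

So the options cover a 16bp cell barcode plus a 10bp UMI, and the last 2 bases of the 28bp Read1 are left unused, which is what the warning is pointing at.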

Any help would be much appreciated!

jamesboot avatar Oct 15 '22 10:10 jamesboot

Hello @jamesboot A few things I would test out.

I think this version doesn't check for 0 counts, and that might be what is happening here.

  1. Are you sure about the UMI stopping at 16bp? Not super important, but it might give you a little increase in your UMI counts depending on your library's diversity and its size.
  2. Can you run the same without the whitelist? I suspect there is no cell barcode overlap. This usually happens on SCv3 from 10x runs.

Hoohm avatar Oct 16 '22 09:10 Hoohm

Hi Patrick,

Thanks very much for your quick response. I tried running without the whitelist (all other options the same as above) but still got the same error message:

Correcting umis
Traceback (most recent call last):
  File "/data/home/hmy961/citeseq-counts-env/bin/CITE-seq-Count", line 8, in <module>
    sys.exit(main())
  File "/data/home/hmy961/citeseq-counts-env/lib/python3.8/site-packages/cite_seq_count/__main__.py", line 603, in main
    io.write_dense(
  File "/data/home/hmy961/citeseq-counts-env/lib/python3.8/site-packages/cite_seq_count/io.py", line 48, in write_dense
    pandas_dense = pd.DataFrame(sparse_matrix.todense(), columns=columns, index=index)
  File "/data/home/hmy961/citeseq-counts-env/lib/python3.8/site-packages/pandas/core/frame.py", line 639, in __init__
    raise ValueError("columns cannot be a set")
ValueError: columns cannot be a set

I'm pretty sure the UMI stops at 16bp. We are using 10X 5' v2 chemistry if that helps.

jamesboot avatar Oct 17 '22 08:10 jamesboot

I think this is a bug with the newest pandas version: https://github.com/facebook/Ax/issues/1153

Can you try to reinstall with pandas 1.4?
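
To check which pandas version the environment currently has before reinstalling, something like the following could help. It is a rough sketch: the helper name `needs_pin` is made up for illustration, and the 1.5 threshold reflects the pandas release in which passing a set as `columns` started raising this `ValueError`.

```python
from importlib.metadata import PackageNotFoundError, version


def needs_pin(pandas_version):
    """Return True if this pandas version is 1.5 or newer.

    Hypothetical helper; only the major and minor numbers are compared.
    """
    major, minor = (int(part) for part in pandas_version.split(".")[:2])
    return (major, minor) >= (1, 5)


try:
    installed = version("pandas")
except PackageNotFoundError:
    print("pandas is not installed in this environment")
else:
    if needs_pin(installed):
        print(f"pandas {installed} found: try a 1.4.x release instead")
    else:
        print(f"pandas {installed} found: older than 1.5")
```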

Hoohm avatar Oct 22 '22 11:10 Hoohm

> I think this is a bug with the newest pandas version: facebook/Ax#1153
>
> Can you try to reinstall with pandas 1.4?

This fixed the error for me. I think it's because pandas 1.5 no longer accepts a set for the columns of a DataFrame, and io.py has a line that creates a DataFrame with a set passed as columns.
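
As an alternative to pinning pandas, the set could be converted to a list before it reaches `pd.DataFrame`. The snippet below is only a sketch of that idea, not a patch to io.py: `to_label_list` is a hypothetical helper and the barcodes and counts are made-up values. One caveat is that the list order has to match the order in which the matrix columns were built; `list()` on an unmodified set gives the same iteration order every time within one run, whereas sorting could misalign labels and counts.

```python
def to_label_list(labels):
    """Return labels as a list, which pandas accepts for columns or index.

    Hypothetical helper. list() keeps the iteration order of the input,
    so the labels stay aligned with a matrix built by iterating the same set.
    """
    return list(labels)


try:
    import pandas as pd
except ImportError:  # lets the sketch run even where pandas is absent
    pd = None

if pd is not None:
    barcodes = {"AAACCTG", "CCCGTTA", "GGGTACA"}  # made-up cell barcodes
    counts = [[1, 0, 2], [0, 3, 1]]  # made-up tag-by-cell counts
    dense = pd.DataFrame(
        counts,
        columns=to_label_list(barcodes),  # a list instead of a set
        index=["tag_A", "tag_B"],
    )
    print(dense.shape)
```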

ymjzhang avatar Oct 28 '22 15:10 ymjzhang

Sorry for taking some time to come back to this. Running with pandas 1.4 fixed the problem for me too! Thanks for your help!

jamesboot avatar Nov 01 '22 09:11 jamesboot