CITE-seq-Count

Pandas error after correcting UMIs

Open • ejohnson643 opened this issue 9 months ago • 1 comment

Everything looked to be running fine:

!CITE-seq-Count \
-R1 $Hash_R1 \
-R2 $Hash_R2 \
-t feature_barcode_onlist.csv \
-cbf 1 -cbl 16 \
-umif 17 -umil 28 \
-wl 737K-august-2016.txt \
-T 8 \
-cells 10000 \
-o ../Hash/output/CITEseqCount/

Gave some output...

Loading whitelist
Detected 4 files to run on.
Counting number of reads
Started mapping
Processing 8,749,912 reads
CITE-seq-Count is running with 8 cores.
Processed 1,000,000 reads in 5.752 seconds. Total reads: 1,000,000 in child 39209
Mapping done for process 39209. Processed 1,093,739 reads
Processed 1,000,000 reads in 7.888 seconds. Total reads: 1,000,000 in child 39210
Mapping done for process 39210. Processed 1,093,739 reads
Processed 1,000,000 reads in 10.12 seconds. Total reads: 1,000,000 in child 39211
Mapping done for process 39211. Processed 1,093,739 reads
Processed 1,000,000 reads in 12.52 seconds. Total reads: 1,000,000 in child 39212
Mapping done for process 39212. Processed 1,093,739 reads
Processed 1,000,000 reads in 14.68 seconds. Total reads: 1,000,000 in child 39213
Mapping done for process 39213. Processed 1,093,739 reads
Processed 1,000,000 reads in 16.82 seconds. Total reads: 1,000,000 in child 39214
Mapping done for process 39214. Processed 1,093,739 reads
Processed 1,000,000 reads in 18.76 seconds. Total reads: 1,000,000 in child 39215
Mapping done for process 39215. Processed 1,093,739 reads
Processed 1,000,000 reads in 21.02 seconds. Total reads: 1,000,000 in child 39216
Mapping done for process 39216. Processed 1,093,739 reads
Mapping done
Merging results
Counting number of reads
Started mapping
Processing 17,736,725 reads
CITE-seq-Count is running with 8 cores.
Processed 1,000,000 reads in 5.921 seconds. Total reads: 1,000,000 in child 39247
Processed 1,000,000 reads in 10.75 seconds. Total reads: 1,000,000 in child 39246
Processed 1,000,000 reads in 8.693 seconds. Total reads: 2,000,000 in child 39247
Mapping done for process 39247. Processed 2,217,090 reads
Processed 1,000,000 reads in 18.81 seconds. Total reads: 1,000,000 in child 39248
Mapping done for process 39251. Processed 0 reads
Mapping done for process 39250. Processed 118,453 reads
Mapping done for process 39252. Processed 0 reads
Mapping done for process 39253. Processed 0 reads
Processed 1,000,000 reads in 26.54 seconds. Total reads: 1,000,000 in child 39249
Processed 1,000,000 reads in 40.86 seconds. Total reads: 2,000,000 in child 39246
Mapping done for process 39246. Processed 2,217,090 reads
Processed 1,000,000 reads in 48.54 seconds. Total reads: 2,000,000 in child 39248
Processed 1,000,000 reads in 46.78 seconds. Total reads: 2,000,000 in child 39249
Mapping done for process 39248. Processed 2,217,090 reads
Mapping done for process 39249. Processed 2,217,090 reads
Mapping done
Merging results
Counting number of reads
Started mapping
Processing 19,046,516 reads
CITE-seq-Count is running with 8 cores.
Mapping done for process 39371. Processed 0 reads
Mapping done for process 39373. Processed 0 reads
Mapping done for process 39374. Processed 0 reads
Processed 1,000,000 reads in 6.071 seconds. Total reads: 1,000,000 in child 39372
Mapping done for process 39375. Processed 0 reads
Mapping done for process 39376. Processed 0 reads
Mapping done for process 39377. Processed 0 reads
Mapping done for process 39372. Processed 1,309,791 reads
Mapping done for process 39378. Processed 0 reads
Mapping done
Merging results
Counting number of reads
Started mapping
Processing 20,398,249 reads
CITE-seq-Count is running with 8 cores.
Mapping done for process 39397. Processed 0 reads
Mapping done for process 39398. Processed 0 reads
Mapping done for process 39399. Processed 0 reads
Processed 1,000,000 reads in 5.978 seconds. Total reads: 1,000,000 in child 39396
Mapping done for process 39400. Processed 0 reads
Mapping done for process 39401. Processed 0 reads
Mapping done for process 39396. Processed 1,351,733 reads
Mapping done for process 39402. Processed 0 reads
Mapping done for process 39403. Processed 0 reads
Mapping done
Merging results
Correcting cell barcodes
Generated barcode tree from whitelist
Finding reference candidates
Processing 529,377 cell barcodes
Collapsing cell barcodes
Correcting umis
Traceback (most recent call last):
  File "/opt/homebrew/Caskroom/mambaforge/base/envs/OverTCR/bin/CITE-seq-Count", line 8, in <module>
    sys.exit(main())
             ^^^^^^
  File "/opt/homebrew/Caskroom/mambaforge/base/envs/OverTCR/lib/python3.11/site-packages/cite_seq_count/__main__.py", line 603, in main
    io.write_dense(
  File "/opt/homebrew/Caskroom/mambaforge/base/envs/OverTCR/lib/python3.11/site-packages/cite_seq_count/io.py", line 48, in write_dense
    pandas_dense = pd.DataFrame(sparse_matrix.todense(), columns=columns, index=index)
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/homebrew/Caskroom/mambaforge/base/envs/OverTCR/lib/python3.11/site-packages/pandas/core/frame.py", line 702, in __init__
    raise ValueError("columns cannot be a set")
ValueError: columns cannot be a set

This seems like an incompatibility with a Pandas update.
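The traceback shows `write_dense` passing `columns` straight to `pd.DataFrame`, and the installed pandas refuses a `set` there. The sketch below reproduces that on toy data and shows one possible workaround: converting the labels to a list first. The data and the `build_dense_frame` helper are hypothetical stand-ins, not CITE-seq-Count code, and the sketch assumes the counts were laid out in the set's iteration order.

```python
import pandas as pd

# Toy stand-ins for the objects named in the traceback (hypothetical data).
cell_barcodes = {"AAACCTGA", "CCGTTAGC", "GGTACCAT"}  # a set, like `columns`
tags = ["Hashtag1", "Hashtag2"]                       # row labels, like `index`
counts = [[1, 0, 3], [0, 5, 2]]                       # 2 tags x 3 barcodes


def build_dense_frame(counts, columns, index):
    """Build a dense DataFrame while tolerating set-like labels.

    list() keeps the iteration order of an unmodified set, so the labels
    line up with the data only if the data was filled in that same order.
    """
    return pd.DataFrame(counts, columns=list(columns), index=list(index))


# Passing the set directly: newer pandas raises, older versions may accept it.
try:
    pd.DataFrame(counts, columns=cell_barcodes, index=tags)
except (TypeError, ValueError) as err:
    print("pandas rejected the set:", err)
else:
    print("this pandas version accepted a set for columns")

# Converting to a list first works regardless of the pandas version.
dense = build_dense_frame(counts, cell_barcodes, tags)
print(dense.shape)
```

Using `sorted(columns)` instead of `list(columns)` would only be safe if the matrix had been built in sorted order, since it can change which label lands on which column.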

ejohnson643 • Oct 03 '23 18:10

@ejohnson643 CITE-seq-Count cannot run with Python 3.11; it is restricted to these versions:

  • Pandas 1.3.5
  • Python 3.7.16

I would suggest creating a separate conda environment and installing CITE-seq-Count there. For example:

conda create -n citeseq python=3.7.16

Then activate the environment:

conda activate citeseq

Followed by installing CITE-seq-Count:

pip install CITE-seq-Count==1.4.5
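Once the install finishes, a quick check can confirm that the environment matches the versions listed above. This is a minimal sketch using only the standard library; `find_mismatches` is a hypothetical helper, and comparing only the Python major and minor version is an assumption, since it is not stated whether the patch release matters.

```python
import sys

EXPECTED_PYTHON = (3, 7)   # from the list above: Python 3.7.16
EXPECTED_PANDAS = "1.3.5"  # from the list above: Pandas 1.3.5


def find_mismatches(python_version, pandas_version):
    """Return a list of problems; an empty list means both pins are met.

    python_version is a tuple such as sys.version_info; pandas_version is a
    version string, or None when pandas could not be imported.
    """
    problems = []
    found_python = tuple(python_version[:2])
    if found_python != EXPECTED_PYTHON:
        problems.append(
            "Python {}.{} found, expected {}.{}".format(
                found_python[0], found_python[1], *EXPECTED_PYTHON
            )
        )
    if pandas_version is None:
        problems.append("pandas is not installed")
    elif pandas_version != EXPECTED_PANDAS:
        problems.append(
            "pandas {} found, expected {}".format(pandas_version, EXPECTED_PANDAS)
        )
    return problems


if __name__ == "__main__":
    try:
        import pandas
        installed_pandas = pandas.__version__
    except ImportError:
        installed_pandas = None
    for problem in find_mismatches(sys.version_info, installed_pandas):
        print(problem)
```

Run it inside the activated `citeseq` environment; no output means both versions match.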

Let me know how this works for you.

cpflueger2016 • Oct 30 '23 02:10