tcrdist3
`IndexError` when running `TCRrep` constructor
When I run the following, as described in the tcrdist documentation, on my own dataframe of sequences (which, as far as I can tell, are all formatted correctly), I get the error below:
tr = TCRrep(cell_df = dff,
            organism = 'mouse',
            chains = ['alpha','beta'],
            db_file = 'alphabeta_gammadelta_db.tsv',
            compute_distances = False)
---------------------------------------------------------------------------
IndexError Traceback (most recent call last)
Cell In[110], line 2
1 # Run TCRrep
----> 2 tr = TCRrep(cell_df = dff,
3 organism = 'mouse',
4 chains = ['alpha','beta'],
5 db_file = 'alphabeta_gammadelta_db.tsv',
6 compute_distances = False)
File ~/anaconda3/envs/tcrdist3/lib/python3.8/site-packages/tcrdist/repertoire.py:182, in TCRrep.__init__(self, organism, chains, db_file, archive_name, blank, cell_df, clone_df, imgt_aligned, infer_all_genes, infer_cdrs, infer_index_cols, deduplicate, use_defaults, store_all_cdr, compute_distances, index_cols, cpus, df2, archive_result)
180 if infer_cdrs:
181 for chain in self.chains:
--> 182 self.infer_cdrs_from_v_gene(chain = chain, imgt_aligned = self.imgt_aligned)
183 # Assume all provided columns are index columns, except 'count' 'cell_id', 'clone_id'
185 if infer_index_cols:
File ~/anaconda3/envs/tcrdist3/lib/python3.8/site-packages/tcrdist/repertoire.py:518, in TCRrep.infer_cdrs_from_v_gene(self, chain, imgt_aligned)
513 self.cell_df = self.cell_df.assign(cdr1_a_aa=list(map(f0, self.cell_df.v_a_gene)),
514 cdr2_a_aa=list(map(f1, self.cell_df.v_a_gene)),
515 pmhc_a_aa=list(map(f2, self.cell_df.v_a_gene)))
516 if chain == "beta":
517 self.cell_df = self.cell_df.assign(cdr1_b_aa=list(map(f0, self.cell_df.v_b_gene)),
--> 518 cdr2_b_aa=list(map(f1, self.cell_df.v_b_gene)),
519 pmhc_b_aa=list(map(f2, self.cell_df.v_b_gene)))
...
--> 743 aa_string = self.all_genes[organism][gene].__dict__[attr][cdr]
744 except KeyError:
745 aa_string = None
IndexError: list index out of range
I have no idea what this code in `tcrdist/repertoire.py` is doing, but I assume that, for now, I can add `IndexError` to the `try`/`except` so that `aa_string` is also set to `None` when this exception is encountered.
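A minimal sketch of that workaround, written as a standalone function rather than a patch to the library. Only the indexing expression is taken from the traceback; the function name `lookup_cdr`, the gene names, and the `all_genes` stand-in data are hypothetical. Judging from the traceback, the `IndexError` most likely comes from the final `[cdr]` index, which would mean the gene was found but its list of CDR strings is shorter than expected.

```python
from types import SimpleNamespace


def lookup_cdr(all_genes, organism, gene, attr, cdr):
    """Return the CDR string for a gene, or None if the lookup fails."""
    try:
        # Same indexing expression as line 743 of the traceback.
        return all_genes[organism][gene].__dict__[attr][cdr]
    except (KeyError, IndexError):
        # KeyError: the organism, gene, or attribute is missing.
        # IndexError: the gene exists, but its list has no entry at `cdr`.
        return None


# Hypothetical stand-in for the reference data (not real gene names).
all_genes = {
    "mouse": {
        "GENE_OK*01": SimpleNamespace(cdrs=["AAA", "BBB", "CCC"]),
        "GENE_SHORT*01": SimpleNamespace(cdrs=["AAA"]),
    }
}

ok = lookup_cdr(all_genes, "mouse", "GENE_OK*01", "cdrs", 1)
short = lookup_cdr(all_genes, "mouse", "GENE_SHORT*01", "cdrs", 1)
missing = lookup_cdr(all_genes, "mouse", "GENE_MISSING*01", "cdrs", 1)
```

Note that this silences the error rather than explaining it: rows with an affected V gene would end up with `None` in the inferred CDR columns, which may cause problems later when distances are computed.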
I would like to unpack what is going on here and provide a more informative error message, as I am not sure what in the input dataframe causes this in the first place.
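One way to get a more informative message might be to check every distinct gene name up front and report why each failing lookup fails. The sketch below assumes the reference data has the nested shape implied by the traceback (`all_genes[organism][gene]` is an object whose attribute `attr` is a list of CDR strings). The helper `diagnose_genes`, the default of three expected entries, and the example data are all hypothetical and not part of tcrdist3.

```python
from types import SimpleNamespace


def diagnose_genes(all_genes, organism, genes, attr, n_cdrs=3):
    """Map each problematic gene name to a human-readable reason.

    n_cdrs is an assumed minimum list length, not a documented value.
    """
    problems = {}
    # dict.fromkeys de-duplicates while tolerating non-string values (e.g. NaN).
    for gene in dict.fromkeys(genes):
        try:
            entry = all_genes[organism][gene]
        except KeyError:
            problems[gene] = f"{gene!r} not found for organism {organism!r}"
            continue
        values = entry.__dict__.get(attr)
        if values is None:
            problems[gene] = f"{gene!r} has no attribute {attr!r}"
        elif len(values) < n_cdrs:
            problems[gene] = (
                f"{gene!r} has {len(values)} entries in {attr!r}, "
                f"expected at least {n_cdrs}"
            )
    return problems


# Hypothetical stand-in for the reference data (not real gene names).
all_genes = {
    "mouse": {
        "GENE_OK*01": SimpleNamespace(cdrs=["AAA", "BBB", "CCC"]),
        "GENE_SHORT*01": SimpleNamespace(cdrs=["AAA"]),
    }
}

genes = ["GENE_OK*01", "GENE_SHORT*01", "GENE_MISSING*01", "GENE_OK*01"]
problems = diagnose_genes(all_genes, "mouse", genes, "cdrs")
for reason in problems.values():
    print(reason)
```

With real data, `genes` would be the V gene column of the input dataframe (the traceback references `v_a_gene` and `v_b_gene`). How to obtain the library's actual lookup table outside the constructor is not shown in the traceback, so that step is left out here.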