AbNumber
ChainParseError: 2 antibody domains in sequence
anarci supports two antibody domains in one sequence, while abnumber does not:
abnumber.exceptions.ChainParseError: Found 2 antibody domains in sequence: "DIQLTQSPSFLSASVGDRVTITCSARSSISFMYWYQQKPGKAPKLLIYDTSNLASGVPSRFSGSGSGTEFTLTISSLEAEDAATYYCQQWSSYPLTFGQGTKLEIKGGGSGGGGEVQLVESGGGLVQPGGSLRLSCAASGFTFSTYAMNWVRQAPGKGLEWVGRIRSKYNNYATYYADSVKDRFTISRDDSKNSLYLQMNSLKTEDTAVYYCVRHGNFGNSYVSWFAYWGQGTLVTVSSGGCGGGEVAALEKEVAALEKEVAALEKEVAALEKGGGDKTHTCPPCPAPEAAGGPSVFLFPPKPKDTLMISRTPEVTCVVVDVSHEDPEVKFNWYVDGVEVHNAKTKPREEQYNSTYRVVSVLTVLHQDWLNGKEYKCKVSNKALPAPIEKISKAKGQPREPQVYTLPPSREEMTKNQVSLWCLVKGFYPSDIAVEWESNGQPENNYKTTPPVLDSDGSFFLYSKLTVDKSRWQQGNVFSCSVMHEALHNHYTQKSLSLSPGK"
Hi @deweihu96, thanks for reporting this. I would like to support this in the future, and a pull request would be welcome.
The current AbNumber Chain object can only hold a single variable domain, with a single CDR3, etc. So this probably cannot be supported through chain = Chain(seq, 'imgt'), but it could be supported through a separate call like chains = Chain.parse_domains(seq, 'imgt').
So if you have a sequence like Var1Const1Var2Const2, you would get two Chain objects, where chain.tail corresponds to any sequence that immediately follows the variable domain (chain1.tail = "Const1").
Hi @prihoda ~ Thanks for your reply. The simplest way that I came up with is:
- Use anarci to find the two domains and slice the sequence into one subsequence per domain;
- Use abnumber to number each of the two subsequences.
@deweihu96 sounds good. Can you share the part of the code where you parse the anarci output?
@prihoda
>>> import anarci
>>> seq = 'QIQLVQSGSELKKPGASVKVSCKASGYTFTHYAMNWVRQAPGQGLEWMGWINTNTGEPTYAQGFTGRFVFSLDTSVSTAYLQISSLKAEDTAVYYCAREREPGMDEWGQGTLVTVSSGGGGSSSSSSDVVMTQSPLSLPVTLGQPASISCRSSQSLVHANTNTYLEWYQQRPGQSPRLLIYKVSNRFSGVPDRFSGSGSGTDFTLKISRVEAEDVGVYYCFQGTHVPNTFGQGTKLEIK'
>>> sequences, numbered, alignment_details, hit_tables = anarci.run_anarci(seq,'kabat',allowed_species='human')
>>> alignment_details
#[[
#{'id': 'human_H', 'description': '', 'evalue': 1.4e-55, 'bitscore': 178.0, 'bias': 1.0, 'query_start': 0, 'query_end': 117, 'species': 'human', 'chain_type': 'H', 'scheme': 'imgt', 'query_name': 'Input sequence'},
#{'id': 'human_K', 'description': '', 'evalue': 1.9e-56, 'bitscore': 180.6, 'bias': 0.1, 'query_start': 127, 'query_end': 239, 'species': 'human', 'chain_type': 'K', 'scheme': 'imgt', 'query_name': 'Input sequence'}]]
Once you have the start and end positions, slice the sequence and parse each slice with abnumber :)
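The slicing step could look roughly like the sketch below. It is only an illustration, not part of anarci or abnumber: slice_domains and number_domains are hypothetical helper names, and the hits list is a trimmed copy of alignment_details[0] from the output above. It assumes query_start and query_end form a half-open interval, which matches this example (the heavy chain is 117 residues long and the hit reports 0 to 117).

```python
from typing import Dict, List, Tuple


def slice_domains(seq: str, domain_hits: List[Dict]) -> List[Tuple[str, str]]:
    """Return (chain_type, subsequence) for each domain hit, in sequence order.

    Hypothetical helper: domain_hits is the list of per-domain dicts that
    anarci reports for one input sequence (alignment_details[0] above).
    """
    domains = []
    for hit in sorted(domain_hits, key=lambda h: h["query_start"]):
        # Assumption: query_end is exclusive, as in the output shown above.
        start, end = hit["query_start"], hit["query_end"]
        domains.append((hit["chain_type"], seq[start:end]))
    return domains


def number_domains(domains: List[Tuple[str, str]], scheme: str = "imgt"):
    """Number each sliced domain with abnumber (hypothetical helper)."""
    from abnumber import Chain  # imported lazily so slicing works without abnumber

    return [Chain(sub_seq, scheme=scheme) for _, sub_seq in domains]


# Sequence and hit positions copied from the anarci session above.
seq = (
    "QIQLVQSGSELKKPGASVKVSCKASGYTFTHYAMNWVRQAPGQGLEWMGWINTNTGEPTYAQGFTGRFVF"
    "SLDTSVSTAYLQISSLKAEDTAVYYCAREREPGMDEWGQGTLVTVSSGGGGSSSSSSDVVMTQSPLSLPV"
    "TLGQPASISCRSSQSLVHANTNTYLEWYQQRPGQSPRLLIYKVSNRFSGVPDRFSGSGSGTDFTLKISRV"
    "EAEDVGVYYCFQGTHVPNTFGQGTKLEIK"
)
hits = [
    {"id": "human_H", "chain_type": "H", "query_start": 0, "query_end": 117},
    {"id": "human_K", "chain_type": "K", "query_start": 127, "query_end": 239},
]

domains = slice_domains(seq, hits)
for chain_type, sub_seq in domains:
    print(chain_type, len(sub_seq), sub_seq[:6], sub_seq[-5:])
```

With abnumber installed, number_domains(domains) would then give one Chain object per domain. The linker between the two hits (positions 117 to 127 here) is simply dropped by this approach.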
I noticed that you're also one of the authors of biophi. I just want to say that it's really great work!