
Antibody Annotation - Annotate VH and VL sequences (FR and CDR) in Python

annotate_v

Antibody Annotation - A simple-to-use Python script that annotates the VH or VL sequence of an antibody using the Kabat, Chothia, or Martin scheme. It uses the REST API of Abnum from Dr Andrew Martin's group at UCL.

Description

The user provides the single-letter amino acid sequence of the VH or VL chain of an antibody and specifies an annotation scheme (Kabat, Chothia, or Martin). The script sends a request to Abnum, which returns a string with each residue matched to its number. The script then identifies the FR and CDR regions using the definitions outlined here [1]. It prints the amino acid sequence of each FR and CDR and returns a list of two dicts. The first dict consists of region:sequence pairs. The second dict consists of number:residue pairs.
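The region-splitting step described above can be sketched as follows. This is a minimal illustration, not the script's own code: the function names `position` and `split_regions` and the table `LIGHT_CHOTHIA_ENDS` are hypothetical, and the boundaries are simply read off the light-chain Chothia example output shown in this README. The numbered dict is rebuilt locally from that example, so no request to Abnum is made.

```python
import re

# Last position (inclusive) of each light-chain region, read off the Chothia
# example output in this README. Assumption: the real script may define its
# boundaries differently, and other schemes or heavy chains need other values.
LIGHT_CHOTHIA_ENDS = [
    ("L-FR1", 23), ("L-CDR1", 34), ("L-FR2", 49), ("L-CDR2", 56),
    ("L-FR3", 88), ("L-CDR3", 97), ("L-FR4", None),  # None = runs to the end
]

def position(label):
    """Return the numeric part of a label such as 'L27' or 'L27A'."""
    return int(re.match(r"[HL](\d+)", label).group(1))

def split_regions(numbered, region_ends=LIGHT_CHOTHIA_ENDS):
    """Group a number:residue dict into region:sequence pairs."""
    regions = {name: "" for name, _ in region_ends}
    for label, residue in numbered.items():
        pos = position(label)
        # Assign the residue to the first region whose end is not yet passed.
        for name, end in region_ends:
            if end is None or pos <= end:
                regions[name] += residue
                break
    return regions

# Rebuild the number:residue dict of the README example: positions 1-107,
# with 31, 94 and 95 unused in that output.
aaseq = ("EIVLTQSPAIMSASPGERVTMTCSASSGVNYMHWYQQKPGTSPRRWIYDTSKLASGVPARFSGSGSG"
         "TDYSLTISSMEPEDAATYYCHQRGSYTFGGGTKLEIK")
labels = [f"L{n}" for n in range(1, 108) if n not in (31, 94, 95)]
numbered = dict(zip(labels, aaseq))

regions = split_regions(numbered)
print(regions["L-CDR3"])  # HQRGSYT
```

Keeping the boundaries in a plain table means another scheme could be supported by swapping the table, although the correct values would have to come from the definitions the script actually uses.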

Dependencies

  • Requires the requests module

Simple to use

First assign the variables and create an instance of the class annotate.

aaseq: STRING amino acid sequence in single-letter code. It must be a complete VH or VL sequence. Upper or lower case is accepted.

scheme: STRING annotation scheme, one of the following: "kabat", "chothia". Must be lowercase.

aaseq="EIVLTQSPAIMSASPGERVTMTCSASSGVNYMHWYQQKPGTSPRRWIYDTSKLASGVPARFSGSGSGTDYSLTISSMEPEDAATYYCHQRGSYTFGGGTKLEIK"
scheme="chothia"
seq=annotate(aaseq,scheme)

Then invoke the retrieve() method

result=seq.retrieve()
print(result[0]) #prints the first dict (region vs seq)
print(result[1]) #prints the second dict (number vs residue)

Example output

Annotation in Chothia scheme:
L-FR1:   EIVLTQSPAIMSASPGERVTMTC
L-CDR1:  SASSGVNYMH
L-FR2:   WYQQKPGTSPRRWIY
L-CDR2:  DTSKLAS
L-FR3:   GVPARFSGSGSGTDYSLTISSMEPEDAATYYC
L-CDR3:  HQRGSYT
L-FR4:   FGGGTKLEIK
{'L-FR1': 'EIVLTQSPAIMSASPGERVTMTC', 'L-CDR1': 'SASSGVNYMH', 'L-FR2': 'WYQQKPGTSPRRWIY', 'L-CDR2': 'DTSKLAS', 'L-FR3': 'GVPARFSGSGSGTDYSLTISSMEPEDAATYYC', 'L-CDR3': 'HQRGSYT', 'L-FR4': 'FGGGTKLEIK'}
{'L1': 'E', 'L2': 'I', 'L3': 'V', 'L4': 'L', 'L5': 'T', 'L6': 'Q', 'L7': 'S', 'L8': 'P', 'L9': 'A', 'L10': 'I', 'L11': 'M', 'L12': 'S', 'L13': 'A', 'L14': 'S', 'L15': 'P', 'L16': 'G', 'L17': 'E', 'L18': 'R', 'L19': 'V', 'L20': 'T', 'L21': 'M', 'L22': 'T', 'L23': 'C', 'L24': 'S', 'L25': 'A', 'L26': 'S', 'L27': 'S', 'L28': 'G', 'L29': 'V', 'L30': 'N', 'L32': 'Y', 'L33': 'M', 'L34': 'H', 'L35': 'W', 'L36': 'Y', 'L37': 'Q', 'L38': 'Q', 'L39': 'K', 'L40': 'P', 'L41': 'G', 'L42': 'T', 'L43': 'S', 'L44': 'P', 'L45': 'R', 'L46': 'R', 'L47': 'W', 'L48': 'I', 'L49': 'Y', 'L50': 'D', 'L51': 'T', 'L52': 'S', 'L53': 'K', 'L54': 'L', 'L55': 'A', 'L56': 'S', 'L57': 'G', 'L58': 'V', 'L59': 'P', 'L60': 'A', 'L61': 'R', 'L62': 'F', 'L63': 'S', 'L64': 'G', 'L65': 'S', 'L66': 'G', 'L67': 'S', 'L68': 'G', 'L69': 'T', 'L70': 'D', 'L71': 'Y', 'L72': 'S', 'L73': 'L', 'L74': 'T', 'L75': 'I', 'L76': 'S', 'L77': 'S', 'L78': 'M', 'L79': 'E', 'L80': 'P', 'L81': 'E', 'L82': 'D', 'L83': 'A', 'L84': 'A', 'L85': 'T', 'L86': 'Y', 'L87': 'Y', 'L88': 'C', 'L89': 'H', 'L90': 'Q', 'L91': 'R', 'L92': 'G', 'L93': 'S', 'L96': 'Y', 'L97': 'T', 'L98': 'F', 'L99': 'G', 'L100': 'G', 'L101': 'G', 'L102': 'T', 'L103': 'K', 'L104': 'L', 'L105': 'E', 'L106': 'I', 'L107': 'K'}
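As a small illustration of working with the first returned dict, the sketch below pulls out only the CDRs and checks that the regions, joined in order, reproduce the input sequence. The dict is copied from the example output rather than fetched from Abnum, and the check relies on Python 3.7+ dicts keeping insertion order.

```python
# Region dict copied from the example output (what result[0] would hold).
regions = {
    "L-FR1": "EIVLTQSPAIMSASPGERVTMTC",
    "L-CDR1": "SASSGVNYMH",
    "L-FR2": "WYQQKPGTSPRRWIY",
    "L-CDR2": "DTSKLAS",
    "L-FR3": "GVPARFSGSGSGTDYSLTISSMEPEDAATYYC",
    "L-CDR3": "HQRGSYT",
    "L-FR4": "FGGGTKLEIK",
}
aaseq = ("EIVLTQSPAIMSASPGERVTMTCSASSGVNYMHWYQQKPGTSPRRWIYDTSKLASGVPARFSGSGSG"
         "TDYSLTISSMEPEDAATYYCHQRGSYTFGGGTKLEIK")

# Keep only the CDR entries.
cdrs = {name: seq for name, seq in regions.items() if "CDR" in name}

# Joining all regions in order should give back the input sequence.
reassembled = "".join(regions.values())
print(reassembled == aaseq)

for name, seq in cdrs.items():
    print(f"{name}: {seq} ({len(seq)} residues)")
```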

Limitations

  • Relies on an internet connection to the Abnum website
  • Currently it can only annotate using the Kabat, Chothia, or Martin scheme, one scheme per call
  • An incomplete VH or VL sequence might not be annotated
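Because every annotation costs a round trip to Abnum, it may be worth checking the input locally before creating the annotate instance. The helper below is a hypothetical pre-check, not part of the script: `clean_sequence` and `VALID_RESIDUES` are names introduced here, and the check only covers the 20 standard single-letter codes. It cannot tell whether a sequence is a complete VH or VL domain.

```python
# The 20 standard amino acids in single-letter code.
VALID_RESIDUES = set("ACDEFGHIKLMNPQRSTVWY")

def clean_sequence(aaseq):
    """Strip whitespace, uppercase, and reject non-standard residue codes."""
    seq = "".join(aaseq.split()).upper()
    if not seq:
        raise ValueError("Empty sequence")
    bad = sorted(set(seq) - VALID_RESIDUES)
    if bad:
        raise ValueError(f"Unexpected residue codes: {', '.join(bad)}")
    return seq

print(clean_sequence("eivltq spaim"))  # EIVLTQSPAIM
```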