crisprDesign

Unexpected PAM truncation

Open RussBainer opened this issue 9 months ago • 1 comment

Hi JP and team, I'm trying to make a new CrisprNuclease object based on an enzyme that has been shown to have a more permissive PAM sequence, which I initially tried to encode by specifying more PAMs and weights. When I did this, I found that the PAMs returned by pams() appear to be internally capped at 4:

> pams
 [1] "(3/3)ACC" "(3/3)CCC" "(3/3)TCC" "(3/3)GCC" "(3/3)ACA" "(3/3)CCA" "(3/3)TCA" "(3/3)GCA" "(3/3)ACG" "(3/3)CCG" "(3/3)TCG" "(3/3)GCG"
[13] "(3/3)ACT" "(3/3)CCT" "(3/3)TCT" "(3/3)GCT"
> pw
 [1] 0.40 0.40 0.40 0.40 0.43 0.43 0.43 0.43 0.32 0.32 0.32 0.32 0.30 0.30 0.30 0.30
> 
> eNme2c <- CrisprNuclease("eNme2c",
+                          targetType="DNA",
+                          pams=pams,
+                          weights=pw,
+                          metadata=list(description="eNme2c nuclease, Cas9 variant from Neisseria meningitidis"),
+                          pam_side="3prime",
+                          spacer_length=20)
> 
> pams(eNme2c)
DNAStringSet object of length 4:
    width seq                                                                                                            names               
[1]     3 ACA                                                                                                            ACA
[2]     3 CCA                                                                                                            CCA
[3]     3 TCA                                                                                                            TCA
[4]     3 GCA                                                                                                            GCA

This does not happen when I make a simple Nuclease object, but it is introduced when I turn that into a CrisprNuclease:

> flarg <- Nuclease('Flarg', 'DNA', motifs = pams, weights = pw)
> motifs(flarg)
DNAStringSet object of length 16:
     width seq
 [1]     3 ACC
 [2]     3 CCC
 [3]     3 TCC
 [4]     3 GCC
 [5]     3 ACA
 ...   ... ...
[12]     3 GCG
[13]     3 ACT
[14]     3 CCT
[15]     3 TCT
[16]     3 GCT
> flarg.cn <- new("CrisprNuclease", flarg, pam_side="3prime", spacer_length = as.integer(20))
> pams(flarg.cn)
DNAStringSet object of length 4:
    width seq                                                                                                            names               
[1]     3 ACA                                                                                                            ACA
[2]     3 CCA                                                                                                            CCA
[3]     3 TCA                                                                                                            TCA
[4]     3 GCA                                                                                                            GCA
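One pattern in the output above may help narrow this down: the four PAMs that survive in both cases are exactly the ones carrying the highest weight (0.43). That suggests the accessor may be keeping only the top-weighted motifs rather than capping the count at 4, though this is an assumption from the output and not something confirmed against the package source. A minimal base R sketch of that check, using only the `pams` and `pw` vectors shown earlier and no crisprBase internals:

```r
# PAMs and weights exactly as printed at the top of this issue
pams <- c("(3/3)ACC", "(3/3)CCC", "(3/3)TCC", "(3/3)GCC",
          "(3/3)ACA", "(3/3)CCA", "(3/3)TCA", "(3/3)GCA",
          "(3/3)ACG", "(3/3)CCG", "(3/3)TCG", "(3/3)GCG",
          "(3/3)ACT", "(3/3)CCT", "(3/3)TCT", "(3/3)GCT")
pw <- c(rep(0.40, 4), rep(0.43, 4), rep(0.32, 4), rep(0.30, 4))

# Strip the "(3/3)" prefix to get the bare 3-nt sequences
seqs <- sub("^\\(3/3\\)", "", pams)

# Keep only the entries whose weight equals the maximum weight
top <- seqs[pw == max(pw)]
print(top)
# [1] "ACA" "CCA" "TCA" "GCA"
```

The result matches the four sequences returned by `pams(eNme2c)` and `pams(flarg.cn)`, which is consistent with (but does not prove) a max-weight filter.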

I personally have a workaround for this use case, but I thought I would raise it in case this isn't the functionality you want.

Thanks again for this awesome toolset!

RussBainer, May 11 '24 00:05