Vectorization of hydrophobicity

Open richierocks opened this issue 9 years ago • 1 comments

hydrophobicity is now vectorized. I also tweaked the data loading code in that function to avoid NOTEs in R CMD check.

Removing spaces from sequences is now outsourced to .remove_spaces (the leading dot keeps it internal). I only used it in the hydrophobicity function, but there are a dozen other places where it could be used. Also, I kept the logic as is, but you might want to change the regular expression to "[[:space:]]" so that it also removes other whitespace characters like tabs and non-breaking spaces.
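For illustration, here is a minimal sketch of what such a helper could look like with the broader "[[:space:]]" pattern. The body is an assumption, not the code from the pull request; only the name .remove_spaces comes from the description above.

```r
# Hypothetical stand-in for the internal helper; the real body may differ.
# gsub() is vectorized over its input, so this handles a whole character
# vector of sequences in one call.
.remove_spaces <- function(seq) {
  gsub("[[:space:]]", "", seq)
}

# Spaces, tabs and newlines are all stripped.
.remove_spaces(c("A C\tD", " K L\nM "))
```

Whether "[[:space:]]" also matches a non-breaking space can depend on the string encoding and the regex engine in use, so that case is worth checking on the target platform.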

There is also a small problem with the package licence that I haven't fixed. If you declare GPL-2, you can't also include a license file. See section 1.1.2 of Writing R Extensions.

richierocks avatar May 25 '16 04:05 richierocks

I've had a change of heart on the best way to write this function. Using stri_count_fixed from the stringi package is faster for long sequences and long vectors. (I'm getting a better than 3x speedup for cases of 1e6 sequences.)
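A rough sketch of the counting approach, assuming the Kyte-Doolittle scale and uppercase, space-free sequences made only of the 20 standard residues. The function name hydrophobicity_sketch and the variable kd are hypothetical, and the package's actual implementation may be organised differently.

```r
library(stringi)

# Kyte-Doolittle hydropathy values, used here only as an example scale.
kd <- c(A = 1.8, R = -4.5, N = -3.5, D = -3.5, C = 2.5,
        Q = -3.5, E = -3.5, G = -0.4, H = -3.2, I = 4.5,
        L = 3.8, K = -3.9, M = 1.9, F = 2.8, P = -1.6,
        S = -0.8, T = -0.7, W = -0.9, Y = -1.3, V = 4.2)

hydrophobicity_sketch <- function(seq, scale = kd) {
  # counts[i, j] = number of times residue j occurs in sequence i.
  # stri_count_fixed() is vectorized over both arguments, so outer()
  # can fill the whole matrix with a single call.
  counts <- outer(seq, names(scale), stri_count_fixed)
  # Weighted sum of the scale values, divided by sequence length.
  as.vector(counts %*% scale) / nchar(seq)
}

hydrophobicity_sketch(c("AAAA", "IVL"))
```

Counting each residue once per sequence avoids splitting every sequence into individual characters, which is one plausible source of the speedup for large inputs.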

I also rewrote aindex using the same technique; the code is faster and clearer (to me at least).
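As a sketch of how the same counting technique could apply to the aliphatic index, assuming the usual definition (mole percent of Ala, plus 2.9 times that of Val, plus 3.9 times that of Ile and Leu combined). The name aindex_sketch is hypothetical and this is not the rewritten aindex itself.

```r
library(stringi)

aindex_sketch <- function(seq) {
  # Mole percent of one residue in each sequence (vectorized over seq).
  mole_pct <- function(aa) 100 * stri_count_fixed(seq, aa) / nchar(seq)
  mole_pct("A") + 2.9 * mole_pct("V") +
    3.9 * (mole_pct("I") + mole_pct("L"))
}

aindex_sketch(c("AVIL", "GGGG"))
```

Each term is a single vectorized count, so the formula reads almost exactly as it is written on paper.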

richierocks avatar May 25 '16 06:05 richierocks