spexs2
an exhaustive sequence pattern search tool
Allow sequences with multiple counted datasets. Example input:
```
1 0 8 ACGT
2 4 0 CGTC
```
Requires some thought on how to represent these properly in the configuration file.
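A minimal sketch of how such input could be read, assuming each input line holds one integer count per dataset followed by the sequence itself. The names `CountedSequence` and `parse_counted_input` are hypothetical and not part of spexs2:

```python
from typing import List, NamedTuple, Tuple


class CountedSequence(NamedTuple):
    counts: Tuple[int, ...]  # one count per dataset (assumed layout)
    sequence: str


def parse_counted_line(line: str) -> CountedSequence:
    # Everything before the last whitespace-separated field is a count;
    # the last field is the sequence.
    *count_fields, sequence = line.split()
    return CountedSequence(tuple(int(c) for c in count_fields), sequence)


def parse_counted_input(text: str) -> List[CountedSequence]:
    # Blank lines are skipped.
    return [parse_counted_line(line) for line in text.splitlines() if line.strip()]


records = parse_counted_input("1 0 8 ACGT\n2 4 0 CGTC\n")
```

With this layout the number of datasets is implied by the number of count fields, so the configuration file would only need to name them.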
A way to automatically add group characters. It requires some configurable heuristic. Example `xA, xC => x[AC]`, when `p(xA) + p(xC) > 0.1`.
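One possible heuristic, sketched under assumptions: patterns that share a prefix and differ only in their last character are merged into a group when their summed probability exceeds a configurable threshold. The function name `merge_into_groups` and the dictionary-based input are illustrative only, and this version always merges all sibling extensions rather than searching for the best subset:

```python
from collections import defaultdict
from typing import Dict


def merge_into_groups(probabilities: Dict[str, float],
                      threshold: float = 0.1) -> Dict[str, float]:
    # Bucket patterns by their prefix (everything except the last character).
    by_prefix = defaultdict(list)
    for pattern, p in probabilities.items():
        by_prefix[pattern[:-1]].append((pattern[-1], p))

    merged = {}
    for prefix, extensions in by_prefix.items():
        if len(extensions) < 2:
            continue  # a group needs at least two characters
        total = sum(p for _, p in extensions)
        if total > threshold:
            chars = "".join(sorted(ch for ch, _ in extensions))
            merged[f"{prefix}[{chars}]"] = total
    return merged


groups = merge_into_groups({"xA": 0.06, "xC": 0.07, "yG": 0.5})
```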
Allow extending with subgroupings. For example extending with (s | sh).
Add the possibility to define a minimum and maximum gap for the star extender.
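A rough sketch of what a bounded gap could mean, assuming the star extender skips any number of characters before matching the next one. The function `star_extend` and its parameters `min_gap` and `max_gap` are hypothetical and do not reflect the actual spexs2 extender interface:

```python
from typing import List, Optional


def star_extend(sequence: str, position: int, char: str,
                min_gap: int = 0, max_gap: Optional[int] = None) -> List[int]:
    # `position` is the index just after the current match.
    # The gap is the number of characters skipped before `char`.
    first = position + min_gap
    last = len(sequence) - 1
    if max_gap is not None:
        last = min(last, position + max_gap)
    # Return the index just after each accepted occurrence of `char`.
    return [i + 1 for i in range(first, last + 1) if sequence[i] == char]


unbounded = star_extend("PAATAT", 1, "T")
```

Leaving `max_gap` as `None` keeps the unbounded behaviour, so existing configurations would not need to change.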
Make a hashing function for the pattern position set. This allows easy comparison of whether some patterns are located in the same places.
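As an illustration only, the sketch below hashes a position set in an order-independent way by sorting it first. It assumes positions are non-negative `(sequence_index, offset)` pairs, which may differ from how spexs2 stores them, and the name `position_set_hash` is made up for this example:

```python
import hashlib
from typing import Iterable, Tuple


def position_set_hash(positions: Iterable[Tuple[int, int]]) -> str:
    digest = hashlib.sha256()
    # Deduplicate and sort so that insertion order does not affect the hash.
    for seq_index, offset in sorted(set(positions)):
        # Fixed-width encoding keeps (1, 23) distinct from (12, 3).
        digest.update(seq_index.to_bytes(8, "big"))
        digest.update(offset.to_bytes(8, "big"))
    return digest.hexdigest()


same = position_set_hash([(0, 1), (2, 3)]) == position_set_hash([(2, 3), (0, 1), (0, 1)])
```

Two patterns with equal hashes can then be treated as candidates for occupying the same places, with a full set comparison as a final check if collisions matter.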
Allow the spexs implementation to start from an arbitrary seed pattern. For example, find all interesting patterns that start with "P.*T".
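One way to picture seeding, as a sketch rather than the intended design: locate the seed in every sequence and use the match end positions as the starting position set for further extension. This example treats the seed as a Python regular expression, which is an assumption, and `seed_positions` is a hypothetical helper. Note that `re.finditer` yields only non-overlapping matches and `.*` is greedy, so an exhaustive search would need to enumerate every possible match instead:

```python
import re
from typing import List, Tuple


def seed_positions(sequences: List[str], seed: str) -> List[Tuple[int, int]]:
    compiled = re.compile(seed)
    positions = []
    for index, sequence in enumerate(sequences):
        for match in compiled.finditer(sequence):
            # Record where the seed match ends; extension would continue from there.
            positions.append((index, match.end()))
    return positions


starts = seed_positions(["APXXTQ", "GGG", "PT"], "P.*T")
```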