tfmodisco-lite

`example_idx` doesn't trace back to original contribution scores

Open: mlweilert opened this issue 1 year ago • 1 comment

When using "modisco.h5" to track down the original seqlet coordinates, the `example_idx` values appear to be arbitrary: they don't have a 1:1 match with the rows of the original input .npy, i.e. with the individual contribution score windows. This makes it essentially impossible to trace seqlets back to their genomic coordinates using only the metadata saved in "modisco.h5". You can reimplement parts of the "extract_seqlets.py" code to find which of your pos/neg_regions made the cut, but would it be possible to make the indices stored in "modisco.h5" match the rows of the input contribution scores?

Specifically, I think this function is where the information is lost: https://github.com/jmschrei/tfmodisco-lite/blob/main/modiscolite/extract_seqlets.py#L59
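For illustration, here is a minimal sketch of the kind of bookkeeping being asked for. It assumes the mismatch comes from a filtering step that can be described as a boolean mask over the rows of the input array; that is an assumption, not a statement about what tfmodisco-lite does internally. The names `to_original_index`, `keep_mask`, and `regions`, and all of the toy data, are hypothetical and not part of the library.

```python
import numpy as np

def to_original_index(example_idx, keep_mask):
    """Map indices into a filtered array back to rows of the unfiltered array."""
    # Original row number of each row that survived the (assumed) filter.
    kept_rows = np.flatnonzero(keep_mask)
    return kept_rows[np.asarray(example_idx)]

# Toy data: six input windows as (chrom, window_start); three survive the filter.
regions = [("chr1", 100), ("chr1", 500), ("chr2", 50),
           ("chr2", 900), ("chr3", 10), ("chr3", 700)]
keep_mask = np.array([False, True, False, True, True, False])

# Hypothetical seqlet records: an index into the filtered array plus
# start/end offsets within the window.
seqlet_example_idx = np.array([0, 2, 1])
seqlet_start = np.array([12, 40, 7])
seqlet_end = seqlet_start + 20  # fixed-length seqlets, for illustration only

original_rows = to_original_index(seqlet_example_idx, keep_mask)

# Genomic coordinates = window start + offset within the window.
coords = [
    (regions[r][0], regions[r][1] + int(s), regions[r][1] + int(e))
    for r, s, e in zip(original_rows, seqlet_start, seqlet_end)
]
```

If the index of each surviving row were saved alongside the seqlets, a lookup like `kept_rows[example_idx]` would be enough to recover the input row, and from there the genomic coordinates.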

mlweilert, Sep 19 '23 19:09