
refactor: use dataclass for structured tsv reads

Open lavafroth opened this issue 11 months ago • 4 comments

Changes

  • Create a TSV structure to hold start, size and strand.
  • Bind the current read to a variable instead of repeatedly indexing into support.
  • Create a slice to extract left and right sequences.
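A rough sketch of what these three changes could look like together. The names `SupportRead`, `flanks` and `flank` are made up for illustration and are not taken from `extract_repeats.py`; only the three fields (start, size, strand) come from the list above.

```python
from dataclasses import dataclass


@dataclass
class SupportRead:
    # Hypothetical record holding the three fields named in the change list.
    start: int
    size: int
    strand: str


def flanks(seq: str, read: SupportRead, flank: int = 5):
    """Return the sequences immediately left and right of the repeat."""
    # Build the slices once and reuse them, instead of repeating index math.
    left = slice(max(read.start - flank, 0), read.start)
    right = slice(read.start + read.size, read.start + read.size + flank)
    return seq[left], seq[right]


# Bind the current read to a variable rather than indexing a container repeatedly.
read = SupportRead(start=10, size=6, strand="+")
seq = "AAAAACCCCC" + "CAGCAG" + "GGGGGTTTTT"
left, right = flanks(seq, read)
print(left, right, read.strand)
```

Named fields such as `read.start` replace positional lookups like `support[i][0]`, which is where most of the readability gain would come from.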

lavafroth avatar Mar 26 '24 06:03 lavafroth

I've been leery of the amount of memory used to create an object for every support read instead of a simple tuple.

readmanchiu avatar Apr 03 '24 21:04 readmanchiu

Once the code is compiled to bytecode, the class (effectively a struct) has just three fields, so memory consumption should very likely go down. The slice objects also take the same memory as a tuple, so the changed code improves readability without incurring a performance penalty.
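The per-object cost can be measured rather than estimated. Below is a minimal sketch, assuming CPython and using hypothetical class names (`PlainRead`, `SlottedRead`). A regular dataclass instance carries a `__dict__`, which `sys.getsizeof` does not count on its own, so the helper adds it. Declaring `__slots__` by hand works with `@dataclass` on Python 3.7+ as long as the fields have no defaults.

```python
import sys
from dataclasses import dataclass


@dataclass
class PlainRead:
    # Hypothetical record: a regular dataclass, instances have a __dict__.
    start: int
    size: int
    strand: str


@dataclass
class SlottedRead:
    # Same fields, but __slots__ removes the per-instance __dict__.
    __slots__ = ("start", "size", "strand")
    start: int
    size: int
    strand: str


def footprint(obj) -> int:
    """Shallow size of obj plus its __dict__, if it has one."""
    total = sys.getsizeof(obj)
    if hasattr(obj, "__dict__"):
        total += sys.getsizeof(obj.__dict__)
    return total


as_tuple = (10, 6, "+")
plain = PlainRead(10, 6, "+")
slotted = SlottedRead(10, 6, "+")

for label, obj in (("tuple", as_tuple), ("dataclass", plain), ("slotted", slotted)):
    print(f"{label:>10}: {footprint(obj)} bytes")
```

The exact numbers depend on the interpreter version and platform, so they are best read on the machine that actually runs straglr. These are shallow sizes only: the field values themselves are shared between all three representations.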

lavafroth avatar Apr 06 '24 01:04 lavafroth

Traceback (most recent call last):
  File "/projects/btl_scratch/rchiu/tmp/straglr/extract_repeats.py", line 7, in <module>
    from dataclasses import dataclass
  File "/home/rchiu/miniconda2/lib/python3.9/dataclasses.py", line 5, in <module>
    import inspect
  File "/home/rchiu/miniconda2/lib/python3.9/inspect.py", line 39, in <module>
    import importlib.machinery
  File "<frozen importlib._bootstrap>", line 1007, in _find_and_load
  File "<frozen importlib._bootstrap>", line 986, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 680, in _load_unlocked
  File "<frozen importlib._bootstrap_external>", line 786, in exec_module
  File "<frozen importlib._bootstrap_external>", line 881, in get_code
  File "<frozen importlib._bootstrap_external>", line 980, in get_data

readmanchiu avatar Apr 16 '24 00:04 readmanchiu

You might have conflicting modules, perhaps a local file with the same name as a standard-library module. Dataclasses were introduced in Python 3.7.
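One way to check for a name clash is to ask the import system where each module in the traceback would be loaded from. This is only a sketch: the helper names `resolved_origin` and `looks_shadowed` are made up, and it assumes a standard CPython layout where the standard library lives under the directory reported by `sysconfig`.

```python
import importlib.util
import sysconfig
from pathlib import Path

# Directory that holds the standard library for the running interpreter.
stdlib_dir = Path(sysconfig.get_paths()["stdlib"]).resolve()


def resolved_origin(name):
    """Return the file a top-level module would be loaded from, or None."""
    spec = importlib.util.find_spec(name)
    return spec.origin if spec else None


def looks_shadowed(name):
    """True if the module resolves to a file outside the stdlib directory."""
    origin = resolved_origin(name)
    if origin is None or origin in ("built-in", "frozen"):
        return False
    return stdlib_dir not in Path(origin).resolve().parents


# Modules that appear in the traceback above.
for name in ("dataclasses", "inspect", "importlib"):
    print(name, resolved_origin(name), looks_shadowed(name))
```

Run it from the same directory and with the same interpreter as `extract_repeats.py`. If any line reports a path inside the project or working directory, that file is shadowing the standard-library module.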

lavafroth avatar Apr 17 '24 03:04 lavafroth