Refactor tsk_bit_array_* structures
I'm a bit confused about what the tsk_bit_array_ struct really is. I thought it was a straightforward bit-set implementation, but it also has the notion of rows, which confuses me. Is it a 2D bit set, i.e. a list of N independent bit sets? If so, I think we should change the API to make this explicit, so that operations take a row and a bit directly, rather than using the get_row operation to fetch a row and then calling methods that work on a single row (like intersect, subtract, etc).
Also, I think it would be clearer if we used set-theoretic names throughout, so add -> union etc.
So, to be clear, we'd have operations like
tsk_bit_array_set_bit(self, row, bit)
tsk_bit_array_contains(self, row, bit)
etc
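Roughly, the row-and-bit style could be sketched like this. This is only an illustration: the `bitset_t` type, the `bitset_*` function names, the 32-bit word size, and the row-major layout are all assumptions made for the example, not the existing tsk_bit_array_ code.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical sketch: N independent bit sets ("rows") in one allocation. */
typedef struct {
    uint32_t *data;  /* num_rows * row_len words, stored row-major */
    size_t num_rows; /* number of independent bit sets */
    size_t num_bits; /* bits available in each row */
    size_t row_len;  /* 32-bit words per row */
} bitset_t;

/* Allocate a zeroed bitset; assumes num_rows > 0 and num_bits > 0.
 * Returns 0 on success and -1 if the allocation fails. */
static int
bitset_init(bitset_t *self, size_t num_rows, size_t num_bits)
{
    self->num_rows = num_rows;
    self->num_bits = num_bits;
    self->row_len = (num_bits + 31) / 32;
    self->data = calloc(num_rows * self->row_len, sizeof(*self->data));
    return self->data == NULL ? -1 : 0;
}

static void
bitset_free(bitset_t *self)
{
    free(self->data);
    self->data = NULL;
}

/* Add `bit` to the set stored in `row`. */
static void
bitset_set_bit(bitset_t *self, size_t row, size_t bit)
{
    assert(row < self->num_rows && bit < self->num_bits);
    self->data[row * self->row_len + bit / 32] |= (uint32_t) 1 << (bit % 32);
}

/* Return 1 if `bit` is in the set stored in `row`, 0 otherwise. */
static int
bitset_contains(const bitset_t *self, size_t row, size_t bit)
{
    assert(row < self->num_rows && bit < self->num_bits);
    return (int) ((self->data[row * self->row_len + bit / 32] >> (bit % 32)) & 1u);
}
```

The point is that the caller never holds a separate "row" object; every call names the row it acts on.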
What do you think @lkirk?
Yes, these arrays are indeed a list of N independent bit sets. I actually prefer the term "bit set" to "bit array". In addition to the refactor, maybe renaming things to tsk_bitset_* is a good idea too?
I like your suggestions, they will simplify the calling code quite a bit.
Closing for inactivity and labelling "future", please re-open if you plan to work on this.
Has this been done already @lkirk? If so, we can remove the "future" label
@jeromekelleher it hasn't yet. I wanted to get all of the underlying machinery for two-locus stats worked out before refactoring. Once my next PR goes through, I can take care of this. I think it'd be nice to have parity between the Python and C code (the Python code in test_ld_matrix.py was created to match the ideas laid out here).
Also, the ability to specify the row index in the various methods for bit arrays would add a lot of clarity to the code where they're consumed (especially where we're accessing a lot of rows as temporary variables).
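As a rough illustration of what row-indexed operations might look like at a call site, here is a minimal sketch. The pared-down `bitset_t` struct and the `bitset_intersect`, `bitset_subtract` and `bitset_count` names and signatures are hypothetical, chosen only for this example, and do not reflect the current C code.

```c
#include <assert.h>
#include <stdint.h>
#include <stddef.h>

/* Hypothetical, pared-down 2D bitset: num_rows rows of row_len words each. */
typedef struct {
    uint32_t *data;
    size_t num_rows;
    size_t row_len;
} bitset_t;

/* out[out_row] = a[a_row] intersected with b[b_row] */
static void
bitset_intersect(const bitset_t *a, size_t a_row, const bitset_t *b, size_t b_row,
    bitset_t *out, size_t out_row)
{
    size_t j;

    assert(a->row_len == b->row_len && a->row_len == out->row_len);
    assert(a_row < a->num_rows && b_row < b->num_rows && out_row < out->num_rows);
    for (j = 0; j < out->row_len; j++) {
        out->data[out_row * out->row_len + j]
            = a->data[a_row * a->row_len + j] & b->data[b_row * b->row_len + j];
    }
}

/* out[out_row] = a[a_row] minus b[b_row] (set difference) */
static void
bitset_subtract(const bitset_t *a, size_t a_row, const bitset_t *b, size_t b_row,
    bitset_t *out, size_t out_row)
{
    size_t j;

    assert(a->row_len == b->row_len && a->row_len == out->row_len);
    assert(a_row < a->num_rows && b_row < b->num_rows && out_row < out->num_rows);
    for (j = 0; j < out->row_len; j++) {
        out->data[out_row * out->row_len + j]
            = a->data[a_row * a->row_len + j] & ~b->data[b_row * b->row_len + j];
    }
}

/* Number of elements in the set stored in `row`. */
static size_t
bitset_count(const bitset_t *self, size_t row)
{
    size_t j, count = 0;
    uint32_t word;

    assert(row < self->num_rows);
    for (j = 0; j < self->row_len; j++) {
        /* Clear the lowest set bit on each pass until the word is empty. */
        for (word = self->data[row * self->row_len + j]; word != 0; word &= word - 1) {
            count++;
        }
    }
    return count;
}
```

With this shape, a temporary is just another row index in a scratch bitset, so the calling code does not need a local variable per row.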