tskit icon indicating copy to clipboard operation
tskit copied to clipboard

`allele_frequency_spectrum` with a 1-d sample set throws uninformative error

Open petrelharp opened this issue 10 months ago • 1 comments

@tforest points out that right now,

>>> ts.allele_frequency_spectrum([[0, 1, 2]], mode='branch').shape
(4,)

but

>>> ts.allele_frequency_spectrum([0, 1, 2], mode='branch').shape
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/peter/projects/tskit-dev/tsvenv/lib/python3.12/site-packages/tskit/trees.py", line 9221, in allele_frequency_spectrum
    return self.__one_way_sample_set_stat(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/peter/projects/tskit-dev/tsvenv/lib/python3.12/site-packages/tskit/trees.py", line 7668, in __one_way_sample_set_stat
    stat = stat.reshape(stat.shape[:-1])
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

The error is happening because we're trying to drop a dimension for having a 1-D sample set (instead of 2-D); as described in the docs, but we can't, because for the AFS there is actually no dimension to drop.

Two options:

  1. make ts.allele_frequency_spectrum([0, 1, 2]) the same as ts.allele_frequency_spectrum([[0, 1, 2]]) (i.e., if there's only one sample set, allow it to be passed straight instead of in a 2D array).
  2. throw a more informative error.

A possible downside of (1) is that someone might do

ts.allele_frequency_spectrum([0, 1, 2])

when they mean

ts.allele_frequency_spectrum([[0], [1], [2]])`

and then be confused?

However, this seems unlikely - we think that the better option is (1).

petrelharp avatar Jan 25 '25 00:01 petrelharp

(1) seems OK to me

jeromekelleher avatar Jan 25 '25 12:01 jeromekelleher