sgkit
Make vector generation consistent for dask and numpy in distance tests
In the following code, the dask branch returns integers in [0, 2], while the NumPy branch returns floats in [0, 1), so the two array types are not tested on comparable data.
https://github.com/pystatgen/sgkit/blob/9cc4490d89c27d5e00322b517a74c626043b105d/sgkit/tests/test_distance.py#L20-L28
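To illustrate the mismatch, here is a minimal NumPy-only sketch. The shape `(4, 3)` and the `"i8"` dtype are assumptions chosen for illustration and are not taken from the test file.

```python
import numpy as np

size = (4, 3)  # hypothetical shape, for illustration only
rng = np.random.RandomState(0)

# Integer path (what the dask branch draws): values in {0, 1, 2}.
ints = rng.randint(0, 3, size=size)

# Float path (what the NumPy branch draws): values in [0, 1).
floats = rng.rand(*size)

# If the requested dtype is an integer type, casting the floats
# truncates every value toward zero, giving an all-zero array.
floats_as_int = floats.astype("i8")

print(ints.min(), ints.max())
print(floats.min(), floats.max())
print(np.unique(floats_as_int))
```

With an integer dtype the NumPy branch therefore degenerates to a constant array, which makes distance tests on it far less informative than the dask ones.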
The following change was suggested by @eric-czech in https://github.com/pystatgen/sgkit/pull/498#discussion_r614265136:
+ rs = da.random.RandomState(0)
+ x = rs.randint(0, 3, size=size, chunks=chunk).astype(dtype)
+ return x if array_type == "da" else np.asarray(x)
- if array_type == "da":
-     rs = da.random.RandomState(0)
-     x = rs.randint(0, 3, size=size).astype(dtype).rechunk(chunk)
- else:
-     x = np.random.rand(size[0], size[1]).astype(dtype)
- return x
However, this change fails on Windows for reasons that are not yet understood: https://github.com/pystatgen/sgkit/runs/2419912014?check_suite_focus=true
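For reference, the suggested change can be written out as a standalone helper. This is only a sketch: the function name `make_vectors` and the default arguments are hypothetical, and it does not attempt to address the Windows failure.

```python
import dask.array as da
import numpy as np


def make_vectors(array_type="da", dtype="i8", size=(100, 100), chunk=(20, 10)):
    """Generate the same random integer data for both array types.

    Name and defaults are assumptions for illustration only.
    """
    # Always draw from dask's seeded generator so both paths share values.
    rs = da.random.RandomState(0)
    x = rs.randint(0, 3, size=size, chunks=chunk).astype(dtype)
    # For the NumPy case, materialize the dask array instead of
    # drawing from a different distribution.
    return x if array_type == "da" else np.asarray(x)


dask_x = make_vectors("da")
numpy_x = make_vectors("np")
print(type(dask_x).__name__, type(numpy_x).__name__)
print(np.array_equal(dask_x.compute(), numpy_x))
```

Because both branches use the same seed, shape, and chunking, the NumPy array is simply the computed form of the dask array, so any difference in test results comes from the code under test and not from the inputs.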