Make vector generation consistent for dask and numpy in distance tests

Open aktech opened this issue 4 years ago • 0 comments

In the following code, the dask branch returns integers in [0, 2], while the numpy branch returns floats in [0, 1):
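To make the mismatch concrete, here is a minimal numpy-only sketch of the two generation strategies. The variable names and the `size` value are illustrative assumptions, not taken from the test file. It also shows that, if the requested `dtype` is an integer type, casting the `rand` output would collapse every value to 0.

```python
import numpy as np

size = (100, 100)  # illustrative shape, not necessarily the one used in the tests
rs = np.random.RandomState(0)

# Strategy used by the dask branch: integers drawn from {0, 1, 2}
# (the upper bound of randint is exclusive).
ints = rs.randint(0, 3, size=size)

# Strategy used by the numpy branch: floats drawn from [0, 1).
floats = rs.rand(*size)

# Casting floats in [0, 1) to an integer dtype truncates toward zero,
# so every element becomes 0.
floats_as_int = floats.astype("i8")

print(ints.min(), ints.max())
print(floats.min() >= 0.0, floats.max() < 1.0)
print(np.unique(floats_as_int))
```

So the two branches do not just differ in range; for integer dtypes the numpy branch may produce a degenerate all-zero input.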

https://github.com/pystatgen/sgkit/blob/9cc4490d89c27d5e00322b517a74c626043b105d/sgkit/tests/test_distance.py#L20-L28

The following change was suggested by @eric-czech in https://github.com/pystatgen/sgkit/pull/498#discussion_r614265136:

+    rs = da.random.RandomState(0)
+    x = rs.randint(0, 3, size=size, chunks=chunk).astype(dtype)
+    return x if array_type == "da" else np.asarray(x)
-    if array_type == "da":
-        rs = da.random.RandomState(0)
-        x = rs.randint(0, 3, size=size).astype(dtype).rechunk(chunk)
-    else:
-        x = np.random.rand(size[0], size[1]).astype(dtype)
-    return x
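For reference, the suggested version could look roughly like this as a standalone helper. This is a sketch only: the function name `make_vectors` and the default arguments are assumptions for illustration, and it does not address the Windows failure.

```python
import dask.array as da
import numpy as np


def make_vectors(array_type="da", dtype="i8", size=(100, 100), chunk=(20, 10)):
    """Hypothetical helper: generate the same values for dask and numpy."""
    # Always draw from the dask random state so both array types share
    # the same seed, chunking, and value range {0, 1, 2}.
    rs = da.random.RandomState(0)
    x = rs.randint(0, 3, size=size, chunks=chunk).astype(dtype)
    # np.asarray computes the dask array into an in-memory numpy array.
    return x if array_type == "da" else np.asarray(x)


x_da = make_vectors("da")
x_np = make_vectors("np")
print(type(x_da).__name__, type(x_np).__name__)
print(np.array_equal(x_da.compute(), x_np))
```

Because both branches go through the same seeded dask generator, the numpy array is just the computed form of the dask array, which is what a consistency check between the two code paths would need.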

However, this fails on Windows for reasons that are not yet understood: https://github.com/pystatgen/sgkit/runs/2419912014?check_suite_focus=true

aktech, Apr 23 '21 17:04