Allow random arrays of arbitrary size / speed up random arrays

Open mrfh92 opened this issue 10 months ago • 4 comments

The current very elaborate implementation of our random generator and random arrays suffers from two drawbacks:

  • the maximum number of entries is quite low (~maxint32 / maxint64 for the respective data types)
  • it is comparatively slow.

I suggest considering something like the following:

global_seed = <...> 
local_seed = global_seed + ht.MPI_WORLD.rank 
local_rand = torch.rand(local_shape, ...) # with local_seed as seed 
global_array = ht.array(local_rand, ...) 
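
A minimal runnable sketch of the idea above, using plain PyTorch only and simulating the ranks in a loop so that it runs without MPI. The helper name `local_random_chunk`, the seed value 1234, and the shapes are hypothetical choices for illustration and are not part of heat. In an actual heat program each process would call the helper once with its own `ht.MPI_WORLD.rank` and then wrap the result, presumably via something like `ht.array(local_rand, is_split=0)`; that last step is an assumption and is not exercised here.

```python
import torch


def local_random_chunk(global_seed: int, rank: int, local_shape: tuple) -> torch.Tensor:
    """Draw one rank's chunk from a generator seeded with global_seed + rank."""
    # A dedicated CPU generator keeps torch's global RNG state untouched.
    gen = torch.Generator()
    gen.manual_seed(global_seed + rank)
    return torch.rand(local_shape, generator=gen)


# Simulate 4 ranks in one process; under MPI every rank would make
# exactly one of these calls, with its own rank number.
global_seed = 1234  # hypothetical value
world_size = 4
chunks = [local_random_chunk(global_seed, r, (3, 2)) for r in range(world_size)]

# Stand-in for the distributed array: concatenate along the split axis.
global_like = torch.cat(chunks, dim=0)
print(global_like.shape)
```

One consequence of this design: the values depend on the number of processes, so a given `global_seed` reproduces the same array only when the run uses the same world size and the same local shapes.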

To discuss:

  • does this destroy the "quality" of the resulting pseudo-random array? (I know, as a trained mathematician I should be able to answer that question, but at the moment I can't)
  • even if the quality turns out to be lower, I suggest adding this version of random arrays, e.g., behind a keyword argument such as quick_and_dirty=True
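
On the quality question, one possible mitigation (an assumption on my side, not something heat does) is to avoid seeding neighbouring ranks with adjacent integers and instead derive each local seed by hashing the pair (global seed, rank). The sketch below uses only the standard library; `derive_seed` is a hypothetical helper name. It does not prove that the resulting streams are statistically independent; it only ensures that the seeds handed to the generators are spread over the 63-bit range rather than differing by 1.

```python
import hashlib


def derive_seed(global_seed: int, rank: int) -> int:
    """Map (global_seed, rank) to a well-spread, non-negative 63-bit seed."""
    payload = f"{global_seed}:{rank}".encode("utf-8")
    digest = hashlib.sha256(payload).digest()
    # Take the first 8 bytes and drop one bit so the value fits a signed int64.
    return int.from_bytes(digest[:8], "big") >> 1


# Seeds for 4 simulated ranks; 1234 is a hypothetical global seed.
seeds = [derive_seed(1234, rank) for rank in range(4)]
print(seeds)
```

Each rank could then pass its derived seed to its local generator in place of `global_seed + rank`.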

mrfh92 • Apr 19 '24 12:04