Update random number generator
Changes
Update AcquistionGeometry, ImageGeometry and VectorGeometry allocate RANDOM and RANDOM_INT methods to optionally use numpy.Generator.default_rng random number generator.
- Avoids creating an array of float64 then casting to the array dtype, which can allocate more memory than is needed.
- Old methods can be accessed with new fill type labels
RANDOM_DEPRECATEDandRANDOM_INT_DEPRECATED. - The NEW versions will no longer allow users to set the random seed globally, which may have an impact where algorithms don't pass the seed to allocate.
- For now use deprecated methods in tests where they expect to be able to use a global seed. In the future we should aim to use the new methods as default. An example is https://github.com/TomographicImaging/CIL/pull/2106
Testing you performed
Tests in test_DataContainer and test_AcquisitionGeometry
Please add any demo scripts to https://github.com/TomographicImaging/CIL-Demos/tree/main/misc
Related issues/links
Closes #2033 Might close #804
Checklist
- [x] I have performed a self-review of my code
- [x] I have added docstrings in line with the guidance in the developer guide
- [x] I have updated the relevant documentation
- [x] I have implemented unit tests that cover any new or modified functionality
- [ ] CHANGELOG.md has been updated with any functionality change
- [ ] Request review from all relevant developers
- [ ] Change pull request label to 'Waiting for review'
Contribution Notes
Please read and adhere to the developer guide and local patterns and conventions.
- [x] The content of this Pull Request (the Contribution) is intentionally submitted for inclusion in CIL (the Work) under the terms and conditions of the Apache-2.0 License
- [x] I confirm that the contribution does not violate any intellectual property rights of third parties
I looked at some different methods for random number generation
- Previous method
Using
numpy.random.seed()andnumpy.random.random()directly. Cons: doesn't let us choose the data type - Calling
numpy.random.Generator()in each file Cons: no global seed - Use
numpy.random.Generator()as a global rng in a utils file, with a set_seed method
stream = np.random.PCG64DXSM
global_rng = np.random.Generator(stream(None))
def set_seed(seed):
global global_rng
global_rng = np.random.Generator(stream(seed))
Pros: global seed Cons: using a fixed bit generator
- Use
numpy.random.Generator()in a RandomGenerator class in a utils file
class RandomGenerator:
def __init__(self, seed=None):
self._seed = seed
self._rng = np.random.Generator(np.random.PCG64DXSM(seed))
def set_seed(self, seed):
self._seed = seed
self._rng = np.random.Generator(np.random.PCG64DXSM(seed))
def __getattr__(self, name):
return getattr(self._rng, name)
global_rng = RandomGenerator(seed=None)
Use __getattr__ method so we can directly call the random numbers like global_rng.random() rather than something like global_rng.get_rng().random()
Pros: global seed
Cons: complicated
| Method | Pros | Cons |
|---|---|---|
numpy.random.seed() in each file |
Global seed | Doesn't let us choose the data type |
numpy.random.Generator(seed) in each file |
No global seed | |
| Global Generator with set_seed method | Global seed | Less flexible for future changes |
RandomGenerator Class with __getattr__ |
Global seed | Complicated |
Thanks Hannah, just a quick few comments, mainly on the documentation. I think most comments relate to all three files - image, acquisition and vector geometry
Thanks @MargaretDuff I think I've updated all three files with the suggestions
Discussion in dev meeting
- Use out=allocate(empty) then rng.random(out=out). To be decided: moving the random functionality to fill and just call fill from geometry.allocate? How do we want to handle the deprecated version? Could keep deprecated version in allocate and new version in fill?
- Allocate and fill should be able to do these functions and use standard np functions to do it https://data-apis.org/array-api/latest/API_specification/creation_functions.html
- Move allocate None to the top
- Update the CHANGELOG
Has an issue been opened/will an issue be opened to address removing the use of the deprecated methods?
After discussion with @MargaretDuff
- Noticed we can currently use geometry.allocate with an array because it just passes whatever is in value to fill
- Perhaps most logical usage would be allocate to be used for generate data e.g. None, uniform or random array. And fill for fill with an array.