torchgfn
Implement a uniform forward policy as a random baseline method for easy benchmarking.
Such a policy would not be useful for applications, but it may be useful for either A) verifying that your trained GFN is learning (relative to this baseline) or B) reporting a random baseline in a paper.
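
To make the request concrete, here is a minimal, library-independent sketch of what "uniform forward policy" could mean: every valid action in a state gets equal probability, and invalid actions get probability zero. It deliberately does not use the torchgfn API; the function names `uniform_log_probs` and `sample_uniform_action` and the boolean `action_mask` representation are hypothetical, chosen only for illustration.

```python
import math
import random
from typing import List, Sequence


def uniform_log_probs(action_mask: Sequence[bool]) -> List[float]:
    """Log-probabilities of a uniform policy over the valid actions.

    `action_mask[i]` is True when action i is allowed in the current state
    (hypothetical representation, not taken from torchgfn).
    """
    n_valid = sum(action_mask)
    if n_valid == 0:
        raise ValueError("at least one action must be valid")
    log_p = -math.log(n_valid)
    # Invalid actions get log-probability -inf, i.e. probability zero.
    return [log_p if valid else -math.inf for valid in action_mask]


def sample_uniform_action(action_mask: Sequence[bool], rng: random.Random) -> int:
    """Sample the index of one valid action uniformly at random."""
    valid_indices = [i for i, valid in enumerate(action_mask) if valid]
    if not valid_indices:
        raise ValueError("at least one action must be valid")
    return rng.choice(valid_indices)


if __name__ == "__main__":
    mask = [True, False, True, True]
    print(uniform_log_probs(mask))
    print(sample_uniform_action(mask, random.Random(0)))
```

Because the policy has no parameters, its log-probabilities depend only on how many actions are valid in each state, so a baseline like this needs no training and could be evaluated with the same metrics as a trained GFN.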