ScenTrees.jl icon indicating copy to clipboard operation
ScenTrees.jl copied to clipboard

[JOSS] make functions run deterministic

Open matbesancon opened this issue 5 years ago • 23 comments

Following the convention from Julia stdlib's Random and various packages from the JuliaStats ecosystem, every function which performs a random sampling should take in parameter the Random Number Generator (which is usually a subtype of Random.AbstractRNG)

matbesancon avatar Nov 29 '19 14:11 matbesancon

@matbesancon Is this really necessary for functions like rand() and randn()?

kirui93 avatar Nov 29 '19 15:11 kirui93

That's precisely for these functions. You would take a RNG in all functions and pass it to the rand and randn calls

matbesancon avatar Nov 29 '19 15:11 matbesancon

Yes. I have seen what you mean. This means that I need to use the package Random for this. Thank you. I will implement immediately.

kirui93 avatar Nov 29 '19 15:11 kirui93

Linking to the review: https://github.com/openjournals/joss-reviews/issues/1912

matbesancon avatar Dec 02 '19 10:12 matbesancon

@matbesancon I will close this issue then because I have already included using Random RNG in the package through MersenneTwister.

kirui93 avatar Dec 13 '19 12:12 kirui93

According to Julia's documentation, every function which performs a random sampling should take in a parameter the Random Number Generator (which is usually a subtype of Random.AbstractRNG. Also, random number generation uses the Mersenne Twister library via MersenneTwister objects. We employed this idea and included this functionality for all the parts of the code where we generate random numbers.

kirui93 avatar Dec 17 '19 08:12 kirui93

https://github.com/kirui93/ScenTrees.jl/blob/master/src/StochPaths.jl

This is not quite right, the rng as a global variable makes each run depend on the number of calls of a function. The rng should be an argument passed to functions instead

matbesancon avatar Jan 07 '20 08:01 matbesancon

https://docs.julialang.org/en/v1/stdlib/Random/

I used the concept in the above julia documentation. Another option was to use Random.seed!().

kirui93 avatar Jan 07 '20 08:01 kirui93

Both MersenneTwister and Random.seed!() works in a similar manner. Consider the below,

julia> MersenneTwister(01102019) == Random.seed!(01102019) true

kirui93 avatar Jan 07 '20 08:01 kirui93

In https://github.com/kirui93/ScenTrees.jl/blob/master/src/StochPaths.jl, rng = MersenneTwister(0101219) and I have passed it as an argument of randn() in all the functions.

kirui93 avatar Jan 07 '20 08:01 kirui93

Hello @matbesancon, do you suggest using as in the following example instead of the above?

julia> rand(rng::AbstractRNG, 3,2)
3×2 Array{Float64,2}:
 0.496169  0.449182 
 0.732     0.875096 
 0.299058  0.0462887

kirui93 avatar Jan 07 '20 13:01 kirui93

In any function that needs rng, it should be an argument of this function:

GaussianSamplePath1D()

# replace with 
GaussianSamplePath1D(rng)

matbesancon avatar Jan 07 '20 14:01 matbesancon

For the stochastic approximation process, we need a function that doesn't take any inputs. That is why we coded these examples without any inputs. This is why we preferred to have rng as global variable.

kirui93 avatar Jan 07 '20 15:01 kirui93

If at some point you need the function to be run randomly, you can always call it with:

GaussianSamplePath1D(Random.GLOBAL_RNG)

while if it is defined without argument, making it deterministic is more painful, since one needs to set the seed outside of it

matbesancon avatar Jan 08 '20 16:01 matbesancon

@matbesancon These functions that you have pointed out were just examples of the functions that you can pass to the stochastic approximation process. The main factor that we are insisting is that these kind of functions should not take any inputs and that is why I prefer to have these examples in the current state. Having the seed outside is really expensive but we prefer to have it that way to be in the scope of the rules of the function that we are passing to the approximating functions.

kirui93 avatar Jan 08 '20 16:01 kirui93

Hello @matbesancon Any comments so far? Thank you.

kirui93 avatar Jan 21 '20 09:01 kirui93

The main factor that we are insisting is that these kind of functions should not take any inputs

Is there a reason for that? I fail to see why. In my example above, the function is taking the RNG as a normal argument, but you could also pass is as a keyword argument with a default to Random.GLOBAL_RNG, which means the default would be the same as you are currently having

matbesancon avatar Jan 21 '20 17:01 matbesancon

Hello @matbesancon , In the functions TreeApproximation! and LatticeApproximation, we pass these kind of functions as an argument and in the process, we keep calling them for every iteration we make. In this case, therefore, our requirement was to have a function to be approximated that takes no inputs. The user can use function closures or anything else for these kind of functions as long as he/she will pass a function with no inputs.

kirui93 avatar Jan 21 '20 17:01 kirui93

In the functions TreeApproximation! and LatticeApproximation, we pass these kind of functions as an argument and in the process

It is not a breaking change with your current code, if rng is taken as keyword argument, you can still call the function without any argument and it would work. The difference is that now you can also call it with a controlled input source of randomness

matbesancon avatar Jan 21 '20 20:01 matbesancon

In the functions TreeApproximation! and LatticeApproximation, we pass these kind of functions as an argument and in the process

It is not a breaking change with your current code, if rng is taken as keyword argument, you can still call the function without any argument and it would work. The difference is that now you can also call it with a controlled input source of randomness

I prefer to have it the way it is now. These functions GaussianSamplePath1D() and the rest are just examples of stochastic processes that we require the user to provide.

kirui93 avatar Jan 22 '20 07:01 kirui93

Ok, I don't think it's the ideal setting, but I will not consider that blocking for the submission to JOSS.

matbesancon avatar Jan 22 '20 09:01 matbesancon

I will still consider what you have proposed and see how having rng as an input to the functions affects the main functions.

kirui93 avatar Jan 22 '20 10:01 kirui93

Let me refer you to this line, for example,

https://github.com/kirui93/ScenTrees.jl/blob/master/src/TreeApprox.jl#L31

if we include the option of using rng for these stochastic functions, then we are forcing the user of the package to input a function which takes rng as an input all the times. This may not be the case for all situations. That is why we prefer a function with no inputs.

kirui93 avatar Jan 22 '20 10:01 kirui93