
Reproducibility across fittings

Open nchelaru opened this issue 5 years ago • 3 comments

Hello!

Thank you so much for making this tool, it is very useful!

I noticed that across multiple fittings to the same set of data, different "best" distributions are shown. Is this intended behaviour? Is there a way to ensure reproducibility across runs, like setting a random seed?

Cheers, Nancy

nchelaru avatar Oct 14 '20 01:10 nchelaru

Thanks for using fitter. That is interesting behaviour, and it is not intended. It may be that several distributions have exactly the same score, in which case the sort on the score may not be deterministic, although I doubt it since the sorting is done with pandas. Would you have an example?
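To illustrate the tie hypothesis above, here is a minimal sketch in plain Python. It is not fitter's code and it does not use pandas; the distribution names, the scores, and both helper functions are made up for the example. It only shows that when two candidates share the same score, a sort on the score alone returns whichever one came first in the input, while adding the name as a secondary key gives the same winner every time.

```python
# Hypothetical fit results as (distribution name, error score) pairs.
# The names and scores are invented for illustration only.
results_run_a = [("gamma", 0.012), ("lognorm", 0.012), ("norm", 0.050)]
results_run_b = [("lognorm", 0.012), ("gamma", 0.012), ("norm", 0.050)]


def best_by_score(results):
    """Sort by score only; tied entries keep their input order."""
    return sorted(results, key=lambda item: item[1])[0][0]


def best_by_score_then_name(results):
    """Sort by score, then by name, so ties always resolve the same way."""
    return sorted(results, key=lambda item: (item[1], item[0]))[0][0]


# Same scores, different input order: the "best" name differs.
print(best_by_score(results_run_a), best_by_score(results_run_b))

# With the name as a tie-breaker, both runs agree.
print(best_by_score_then_name(results_run_a),
      best_by_score_then_name(results_run_b))
```

Whether ties actually occur in a given fitting can only be checked by comparing the reported error values between runs.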

cokelaer avatar Nov 22 '20 17:11 cokelaer

@cokelaer thanks a lot for such a great tool! @nchelaru, at first glance I thought I was facing the same issue. But a closer look revealed that the 'best distributions' common to the different fittings had the same error values. Looking deeper, I saw that a few of my best-fit distributions had failed to converge within the default 30s timeout. This may be due to differences in the function's internal processing between runs. The apparent issue was solved when I increased the timeout. Hope this helps. Cheers!
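The timeout effect described above can be sketched with the standard library alone. This is a simulation, not fitter's implementation: the two "distributions", their durations, their error scores, and the `best_distribution` helper are all hypothetical. A fit that does not finish within the time limit is simply left out of the ranking, so the reported best changes with the limit.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from concurrent.futures import TimeoutError as FutureTimeout

# Hypothetical fits: name -> (simulated duration in seconds, error score).
SIMULATED_FITS = {
    "slow_but_best": (0.5, 0.01),
    "fast_but_worse": (0.0, 0.05),
}


def simulated_fit(name):
    """Pretend to fit a distribution: wait, then return its error score."""
    duration, error = SIMULATED_FITS[name]
    time.sleep(duration)
    return error


def best_distribution(timeout):
    """Return the lowest-error fit among those finishing within `timeout` seconds."""
    errors = {}
    with ThreadPoolExecutor(max_workers=len(SIMULATED_FITS)) as pool:
        futures = {name: pool.submit(simulated_fit, name) for name in SIMULATED_FITS}
        for name, future in futures.items():
            try:
                errors[name] = future.result(timeout=timeout)
            except FutureTimeout:
                pass  # the fit did not finish in time, so it is not ranked
    return min(errors, key=errors.get)


print(best_distribution(timeout=0.1))  # the slow fit is dropped
print(best_distribution(timeout=2.0))  # the slow fit finishes and wins
```

With real data the fitting time can vary with machine load, so a distribution close to the limit may make it into one run and not another. In fitter itself the limit is, as far as I know, set through the `timeout` argument of the `Fitter` constructor, which is the value I increased.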

vartak16 avatar Dec 03 '20 20:12 vartak16

@vartak16 thanks, this is indeed probably the reason. Thanks for using fitter and for the encouragement.

cokelaer avatar Dec 06 '20 16:12 cokelaer