khmer

Is the normalize-by-median.py script deterministic?

Open zmunro opened this issue 4 years ago • 2 comments

I'm trying to recreate some previously computed results. There are a few places in my workflow where nondeterminism could be present, and I was wondering whether this script is one of them.

zmunro avatar Jun 12 '20 19:06 zmunro

I'll let @ctb correct me if I'm wrong. But if nothing else has changed in the environment, running normalize-by-median.py multiple times with the same arguments and the same inputs should produce identical output in single-threaded mode. There can be some non-deterministic behavior when counting k-mers in multi-threaded mode, so I guess there's a chance you might get slightly different results there. But I wouldn't expect the differences to be very much.

standage avatar Jun 12 '20 19:06 standage

On Fri, Jun 12, 2020 at 12:41:31PM -0700, Daniel Standage wrote:

> I'll let @ctb correct me if I'm wrong. But if nothing else has changed in the environment, running normalize-by-median.py multiple times with the same arguments and the same inputs should produce identical output in single-threaded mode. There can be some non-deterministic behavior when counting k-mers in multi-threaded mode, so I guess there's a chance you might get slightly different results there. But I wouldn't expect the differences to be very much.

Agreed!

ctb avatar Jun 22 '20 15:06 ctb
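
One way to check the single-threaded claim above on your own data is to run the script twice with identical arguments and inputs, then compare the two outputs byte for byte. The following is a minimal sketch of that comparison, not part of khmer; the helper names `sha256_of` and `outputs_identical` and the file names in the usage note are hypothetical.

```python
import hashlib


def sha256_of(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read fixed-size chunks so large sequence files are not loaded whole.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def outputs_identical(path_a, path_b):
    """True if the two files have the same SHA-256 digest."""
    return sha256_of(path_a) == sha256_of(path_b)
```

For example, after saving the output of two separate runs under the (hypothetical) names `run1.keep` and `run2.keep`, `outputs_identical("run1.keep", "run2.keep")` returning `True` would be consistent with deterministic behavior for that input. A single matching pair does not prove determinism in general, and a mismatch is worth investigating for environment differences before blaming the script.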