ml5-library

Implementing random seed in kmeans

Open tlsaeger opened this issue 4 years ago • 1 comments

Dear ml5 community,

I'm submitting a new issue. Please see the details below.

As discussed in issue #1136 with @bomanimc, I would like to improve the kmeans implementation by adding an option to specify a random seed. This would ensure that the output does not vary between runs when the seed is set.

A seed option is implemented in the original TensorFlow, but I could not find a kmeans implementation in tf.js at all. @bomanimc, it would be cool if you could help me get started here, or anyone else who might want to join. I have never done anything like this before.

Thanks!

tlsaeger avatar Apr 16 '21 16:04 tlsaeger

Chiming in here one year later (I wrote the ml5.js implementation of kmeans):

The non-deterministic behavior of kmeans is expected. It's also expected across many other ML models (basically any that rely on randomly initialized weights or, as with kmeans, randomly chosen starting points).

For that reason, maybe it makes sense to:

  1. implement a random seed function that can be shared across all model classes
  2. import/reuse that func in relevant model implementations
  3. give each model an optional parameter to use that random seed.
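
A rough sketch of how those three steps could fit together. Everything named here is hypothetical and not part of ml5: `createRandom`, `ExampleModel`, and the `seed` option are made up for illustration, and mulberry32 is just one small seedable PRNG picked because `Math.random` cannot be seeded.

```javascript
// mulberry32: a small 32-bit seeded PRNG that returns floats in [0, 1).
// Swapped in here for illustration; ml5 does not ship this.
function mulberry32(seed) {
  let a = seed >>> 0;
  return function next() {
    a = (a + 0x6d2b79f5) | 0;
    let t = Math.imul(a ^ (a >>> 15), 1 | a);
    t = (t + Math.imul(t ^ (t >>> 7), 61 | t)) ^ t;
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

// Step 1: one shared helper (hypothetical name).
// With an integer seed it returns a repeatable generator,
// otherwise it falls back to the unseeded Math.random.
function createRandom(seed) {
  return Number.isInteger(seed) ? mulberry32(seed) : Math.random;
}

// Steps 2 and 3: a model reuses the helper and takes an optional seed.
// ExampleModel stands in for any model class; it is not a real ml5 class.
class ExampleModel {
  constructor(options = {}) {
    this.random = createRandom(options.seed);
  }

  // Pick a random index in [0, n), e.g. for a starting point.
  randomIndex(n) {
    return Math.floor(this.random() * n);
  }
}

// Two models built with the same seed draw the same indices.
const modelA = new ExampleModel({ seed: 42 });
const modelB = new ExampleModel({ seed: 42 });
console.log(modelA.randomIndex(100) === modelB.randomIndex(100));
```

Keeping the fallback to `Math.random` means existing code that never passes a seed keeps behaving exactly as before.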

If you want to update just kmeans, the easiest way is probably:

  1. Add a seeded random generator to the random.js utils file and have the randomSample func use it (with a seed parameter)
  2. Add a new param to DEFAULTS and update line 86 of this file to use the new function (and feed in the new seed value)
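
For the kmeans-only route, something along these lines might work. This is a sketch under assumptions: the `randomSample(items, count, seed)` signature, the `seed` entry, and the other `DEFAULTS` values are illustrative rather than copied from the ml5 source, and the mulberry32 generator is a stand-in for whatever seeded generator ends up in random.js.

```javascript
// Stand-in seeded generator (mulberry32); returns floats in [0, 1).
function seededRandom(seed) {
  let a = seed >>> 0;
  return function next() {
    a = (a + 0x6d2b79f5) | 0;
    let t = Math.imul(a ^ (a >>> 15), 1 | a);
    t = (t + Math.imul(t ^ (t >>> 7), 61 | t)) ^ t;
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

// Illustrative defaults; `seed` is the new, optional parameter.
const DEFAULTS = { k: 3, maxIter: 5, threshold: 0.5, seed: undefined };

// Draw `count` distinct items with a partial Fisher-Yates shuffle.
// With an integer seed the draw is repeatable; without one it uses Math.random.
function randomSample(items, count, seed) {
  const random = Number.isInteger(seed) ? seededRandom(seed) : Math.random;
  const pool = items.slice(); // copy so the caller's array is untouched
  const n = Math.min(count, pool.length);
  for (let i = 0; i < n; i++) {
    const j = i + Math.floor(random() * (pool.length - i));
    [pool[i], pool[j]] = [pool[j], pool[i]];
  }
  return pool.slice(0, n);
}

// Picking starting centroids with the seed taken from the merged config.
const points = [[1, 1], [2, 1], [8, 9], [9, 8], [5, 5], [0, 4]];
const config = { ...DEFAULTS, seed: 42 };
const centroids = randomSample(points, config.k, config.seed);
console.log(centroids);
```

With the same `seed` and the same input order, the starting centroids are the same on every run, so the rest of the clustering should follow the same path.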

jwilber avatar Apr 29 '22 01:04 jwilber