
cluster center initialization

Open jbfuehrer opened this issue 5 years ago • 4 comments

Hey,

Is there a reason why the method for choosing the initial cluster centers changed from the k-means++ (KMPP) algorithm used in DBoW2 to the version now used in fbow?

I noticed that, especially with smaller vocabularies, the exact same feature is sometimes chosen multiple times as an initial cluster center. One of the duplicate clusters then always stays empty (all matching features fall into whichever one the linear search finds first), which generates unused/meaningless words.
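To make the duplicate-center problem concrete, here is a minimal sketch of k-means++ style seeding that cannot pick the same feature value twice. It is not the code from DBoW2 or fbow: the `Feature` alias, the `squaredDistance` and `kmppSeed` names, and the use of float descriptors with squared Euclidean distance are all assumptions made for illustration (binary descriptors would need a Hamming distance instead).

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <random>
#include <vector>

// Hypothetical descriptor type for this sketch: a plain float vector.
using Feature = std::vector<float>;

// Squared Euclidean distance between two features of equal length.
double squaredDistance(const Feature& a, const Feature& b) {
    assert(a.size() == b.size());
    double sum = 0.0;
    for (std::size_t i = 0; i < a.size(); ++i) {
        const double d = static_cast<double>(a[i]) - static_cast<double>(b[i]);
        sum += d * d;
    }
    return sum;
}

// k-means++ style seeding. Returns the indices of the chosen initial centers.
// Fewer than k indices are returned when the data holds fewer than k
// distinct feature values.
std::vector<std::size_t> kmppSeed(const std::vector<Feature>& features,
                                  std::size_t k, std::mt19937& rng) {
    std::vector<std::size_t> centers;
    if (features.empty() || k == 0) return centers;

    // First center: uniformly at random.
    std::uniform_int_distribution<std::size_t> pick(0, features.size() - 1);
    centers.push_back(pick(rng));

    // minDist[i]: squared distance from feature i to its nearest center so far.
    std::vector<double> minDist(features.size());
    for (std::size_t i = 0; i < features.size(); ++i)
        minDist[i] = squaredDistance(features[i], features[centers[0]]);

    while (centers.size() < k) {
        double total = 0.0;
        for (double d : minDist) total += d;
        if (total <= 0.0) break;  // every feature coincides with some center

        // Sample an index with probability proportional to minDist[i].
        // Features at distance 0 (copies of a center) are skipped outright,
        // so a duplicate can never become a new center.
        std::uniform_real_distribution<double> uniform(0.0, total);
        const double r = uniform(rng);
        std::size_t next = 0;
        double acc = 0.0;
        for (std::size_t i = 0; i < features.size(); ++i) {
            if (minDist[i] <= 0.0) continue;
            next = i;
            acc += minDist[i];
            if (acc > r) break;
        }
        centers.push_back(next);

        // Update nearest-center distances with the newly added center.
        for (std::size_t i = 0; i < features.size(); ++i)
            minDist[i] = std::min(minDist[i],
                                  squaredDistance(features[i], features[next]));
    }
    return centers;
}
```

Because a new center is only ever drawn from features with a non-zero distance to all existing centers, two centers with identical values cannot occur, and the empty-cluster case described above does not arise from seeding. Returning fewer than `k` centers when the data runs out of distinct values is a design choice of this sketch; a caller could also reduce the branching factor for that node.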

I ported the DBoW2 KMPP implementation over to fbow and can do a PR. Just wanted to make sure I'm not missing any domain knowledge before doing so.

Greets

jbfuehrer, Mar 05 '19 15:03

Same thoughts here. The new algorithm for choosing the initial cluster centers doesn't make sense to me, either.

dukeNashor, Apr 16 '19 07:04

@rmsalinas Can you please comment on this? Any plans to fix the issue? @jbfuehrer Could you please at least commit your implementation to your fork? It would be much appreciated.

S-o-T, May 16 '19 13:05

@S-o-T Done, and I have also created a PR now.

jbfuehrer, May 16 '19 19:05

@jbfuehrer Thank you.

S-o-T, May 16 '19 19:05