namemaker

Feature request: suggest an order value based on training data

Open MrDowntempo opened this issue 1 year ago • 0 comments

3 is a great default for order. However, 2 would make more sense for really sparse datasets. Of greater interest to me are very large datasets, where a higher order makes more sense and could produce better results. I've been tinkering with this idea myself, trying to find a good order value for my personal datasets. Scaling the order with the number of names in the source dataset (maybe logarithmically?) seems like a good place to start. However, there are probably some additional considerations that could help, such as whether certain chains lead to dead ends even in a large dataset. That is to say, even if you have 100k names in a dataset, if "th" only ever shows up as "thi" (say, in a single name), then any "th" will always lead to "i" at low orders.

Having the order value either picked more 'thoughtfully' by an algorithm, or having a method that could optionally be run to determine what it thinks the best order value would be, might produce better results out of the box.
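To make the idea concrete, here is a rough sketch of the kind of heuristic I mean. It is not based on namemaker's internals: `deterministic_fraction`, `suggest_order`, and the thresholds (`min_order`, `max_order`, `max_forced`) are all hypothetical names and values I made up for illustration. It starts from a log-scaled baseline and then backs off while too many contexts have only one possible continuation (the "dead end" case above).

```python
import math
from collections import defaultdict


def deterministic_fraction(names, order):
    """Fraction of length-`order` contexts followed by exactly one distinct character."""
    followers = defaultdict(set)
    for name in names:
        padded = name.lower() + "\n"  # "\n" marks the end of a name
        for i in range(len(padded) - order):
            followers[padded[i:i + order]].add(padded[i + order])
    if not followers:
        return 1.0  # no contexts at this order: treat as fully forced
    forced = sum(1 for nxt in followers.values() if len(nxt) == 1)
    return forced / len(followers)


def suggest_order(names, min_order=2, max_order=6, max_forced=0.5):
    """Hypothetical heuristic: log-scaled baseline, lowered while contexts are mostly forced."""
    names = [n for n in names if n]
    if not names:
        return min_order
    # Assumed baseline: roughly 100 names -> 2, 1k names -> 3, 10k names -> 4.
    baseline = round(math.log10(len(names)))
    order = max(min_order, min(max_order, baseline))
    # Back off while more than `max_forced` of the contexts have a single continuation.
    while order > min_order and deterministic_fraction(names, order) > max_forced:
        order -= 1
    return order
```

Something like `suggest_order(my_names)` could then be run once on a dataset and the result passed in as the order. The 0.5 cutoff and the log base are guesses on my part and would need tuning against real datasets.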

MrDowntempo Jun 01 '23 15:06