Ryan Henderson

45 comments by Ryan Henderson

Hi @ecdeveloper, Yes, the signatures are pretty heavy and your ballpark figures are correct. Disk space was not a concern for us when we developed this, so we didn't put...

That's exactly right. With the Elasticsearch implementation, the actual image distances are only used to sort the final results. This could easily be done with the Elasticsearch `match` score. So...

You shouldn't need to alter the database query. ``SignatureES`` overrides the ``__init__`` method. You can have it explicitly take a ``dist`` argument. Make the default ``None``. Then, pass the base...
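A minimal sketch of the kind of `__init__` override described above. The comment is cut off, so what exactly gets passed to the base class is a guess here: the stand-in base class, its `distance_cutoff` parameter, and the default value `0.45` are all assumptions made for illustration and are not confirmed by the comment.

```python
class SignatureDatabaseBase:
    """Hypothetical stand-in for the real base class."""

    def __init__(self, distance_cutoff=0.45):
        # Assumed parameter name and default, for illustration only.
        self.distance_cutoff = distance_cutoff


class SignatureES(SignatureDatabaseBase):
    """Sketch of a subclass whose __init__ accepts an explicit `dist`."""

    def __init__(self, es, index="images", dist=None, **kwargs):
        self.es = es
        self.index = index
        # Only override the base-class cutoff when the caller supplies one;
        # with dist=None the base-class default is left untouched.
        if dist is not None:
            kwargs["distance_cutoff"] = dist
        super().__init__(**kwargs)


default_db = SignatureES(es=None)
strict_db = SignatureES(es=None, dist=0.3)
```

Keeping `None` as the default means existing callers that never pass `dist` see no change in behavior.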

Actually this looks good to me. As long as we choose the default threshold to still get most of the positive results (high recall) it's ok to have worse specificity....
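To make the recall versus specificity trade-off concrete, here is a small sketch that scores a distance threshold against labelled pairs. The function name, the distances, and the labels are all made up for illustration; they are not measurements from the library.

```python
def recall_and_specificity(distances, is_match, threshold):
    """Treat every pair with distance <= threshold as a predicted match."""
    tp = fn = tn = fp = 0
    for distance, match in zip(distances, is_match):
        predicted = distance <= threshold
        if match and predicted:
            tp += 1
        elif match and not predicted:
            fn += 1
        elif not match and predicted:
            fp += 1
        else:
            tn += 1
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    return recall, specificity


# Hypothetical distances and ground-truth labels, invented for this example.
distances = [0.10, 0.25, 0.40, 0.35, 0.60, 0.80]
is_match = [True, True, True, False, False, False]

strict = recall_and_specificity(distances, is_match, 0.30)
loose = recall_and_specificity(distances, is_match, 0.45)
```

On this toy data the looser threshold recovers every true match at the cost of one false positive, which is the trade the comment accepts for a default.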

Something between 1 and 10? :rofl: Seriously though, just pick something in that range. If you really want to fix narrower bounds on the value, you can look at the tests...

There are 63 words available for search. The Tf/Idf top term means all those words are an exact match, and the Idf means that every word is unique in the...

That's ok. It's a typical feature of "search engine"-like applications: some of the results are basically meaningless. But if you were doing a search, wouldn't you rather have a look at...

Even better than expected! You can try reducing the number of words (`N`) and see how it affects your results.
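For readers unfamiliar with what `N` controls, here is a rough sketch of cutting a signature into `N` evenly spaced words. It is an illustration under assumptions, not the library's actual code: the function name `get_words`, the word length of 16, the zero padding at the end, and the toy signature are all hypothetical.

```python
def get_words(signature, k, n):
    """Cut `signature` into `n` evenly spaced words of length `k`.

    Words may overlap; a word that runs past the end is zero-padded.
    """
    length = len(signature)
    if k > length:
        raise ValueError("word length cannot exceed signature length")
    words = []
    for i in range(n):
        start = (i * length) // n          # evenly spaced start positions
        word = list(signature[start:start + k])
        word += [0] * (k - len(word))      # pad words that hit the end
        words.append(word)
    return words


# Toy signature; real signatures are much longer.
signature = list(range(100))

many_words = get_words(signature, k=16, n=63)
fewer_words = get_words(signature, k=16, n=10)
```

Lowering `n` produces fewer search terms per image, so comparing results for a few values of `n` is a direct way to run the experiment suggested above.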

Yes, it should strictly decrease accuracy, but I don't know by how much. The original paper used 100 words of length 10, so more and shorter words. I don't find...