allow user to tweak imgdb parameters

Open ricardocabral opened this issue 12 years ago • 5 comments

Currently, the only imgdb parameter exposed in the interface is the number of “buckets” used for the average-luminosity part of the search (which controls how precise the calculation is: more buckets give a more precise but slower calculation). Expose more of these parameters so that a system can be built to tune them optimally.

ricardocabral avatar Jan 22 '12 17:01 ricardocabral

@ricardocabral can you explain how to tweak imgdb params via the API?

tegansnyder avatar Oct 09 '15 01:10 tegansnyder

Hi. Unfortunately not much can be calibrated through the API right now. One example of parameters that could be tuned for each application (typical queries, typical photos added to the database) is the set of hardcoded weights at https://github.com/ricardocabral/iskdaemon/blob/master/src/imgSeekLib/imgdb.h#L39

These could be specified via API parameters, or config files, and a supervised learning algorithm could be used to tweak these parameters randomly to see what gives the best score, least errors etc for your specific application.
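As a rough illustration of the random-tweaking idea, here is a minimal random-search sketch. Everything in it is hypothetical: `random_search`, `evaluate` and `toy_evaluate` are made-up names, the weights are a flat list rather than the actual table in imgdb.h, and iskdaemon does not currently expose a way to set these weights at runtime. In a real setup `evaluate` would run your labelled queries with the candidate weights and return a quality score (higher is better).

```python
import random


def random_search(evaluate, base_weights, n_trials=200, scale=0.2, seed=0):
    """Randomly perturb weights and keep the best-scoring candidate.

    evaluate: callable taking a list of floats and returning a score
              (higher is better). Hypothetical; supplied by the caller.
    """
    rng = random.Random(seed)
    best_w = list(base_weights)
    best_s = evaluate(best_w)
    for _ in range(n_trials):
        # Scale each weight by a random factor in [1 - scale, 1 + scale],
        # clamping at zero so weights never go negative.
        cand = [max(0.0, w * (1.0 + rng.uniform(-scale, scale))) for w in best_w]
        s = evaluate(cand)
        if s > best_s:
            best_w, best_s = cand, s
    return best_w, best_s


def toy_evaluate(weights, target=(5.0, 1.0, 0.5)):
    # Stand-in objective for demonstration only: closer to `target` is better.
    return -sum((w - t) ** 2 for w, t in zip(weights, target))


best_weights, best_score = random_search(toy_evaluate, [4.0, 2.0, 1.0])
```

The search only ever accepts improvements, so the returned score is never worse than the starting one. It says nothing about overfitting, so the labelled queries used for scoring should be kept separate from the ones used for a final check.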

ricardocabral avatar Oct 09 '15 14:10 ricardocabral

Thanks for the reference. I was digging into the source code and noticed these. I need to understand this a bit more before I proceed with experimentation.

I was experimenting with imgSeek to build a way to identify similar product images to a trusted source of product imagery. Think of it as a reverse image search based fuzzy matching algorithm for identifying similar products as an addition to web crawling activities.

My initial load of 128,000 product images yielded various successes when matching found product images on white backgrounds against other known source images. My goal was to build a way to identify similar product images found via crawling the web. It is hit or miss. It seems that colors play a major role (color histogram), along with the wavelet functions.

I think imgSeek is a good way to get similarity of images that are slight variations of each other, but the untrained nature of it leads to false positives. One example I see is when I use a photo of a product I take from my mobile phone that may introduce some noise or light: it does yield the same results as a photo found via the web. I'm going to be looking into training a supervised model using OpenCV to pick out product images, but prior to going that route I will try experimenting with tweaking the imgSeek sketch weights. Do you have any suggestions on where to start?

Thanks!

tegansnyder avatar Oct 10 '15 03:10 tegansnyder

The paper that imgseek/isk-daemon is based on has some details on the steps the authors went through to get to these "magic" weights that are hardcoded: http://grail.cs.washington.edu/projects/query/ I'd suggest starting there for some ideas.
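For orientation, here is a minimal sketch of how per-bin weights of this kind can enter a similarity score, loosely following the metric described in that paper as I understand it: a weighted difference of the average intensities, minus a weight for every significant wavelet coefficient that matches in position and sign. It is a simplification, not the imgdb code: it handles a single channel, the names `bin_index` and `score` and the signature layout are made up, and the weight values are placeholders rather than the ones in imgdb.h.

```python
def bin_index(i, j):
    # Group coefficient positions into 6 bins: min(max(i, j), 5).
    return min(max(i, j), 5)


def score(query, target, weights):
    """Return a dissimilarity score; lower means more similar.

    query / target: dicts with
      "avg": average intensity of the image (float)
      "sig": set of (i, j, sign) tuples for the largest wavelet coefficients
    weights: list of 6 floats; weights[0] also scales the average term.
    """
    s = weights[0] * abs(query["avg"] - target["avg"])
    # Each coefficient matching in position and sign lowers the score
    # by the weight of its bin.
    for i, j, _sign in query["sig"] & target["sig"]:
        s -= weights[bin_index(i, j)]
    return s


# Placeholder weights and signatures, for illustration only.
example_weights = [5.0, 1.0, 0.5, 0.3, 0.2, 0.1]
q = {"avg": 0.5, "sig": {(1, 2, 1), (7, 3, -1)}}
t = {"avg": 0.25, "sig": {(1, 2, 1), (7, 3, 1)}}
example_score = score(q, t, example_weights)
```

In this picture, raising the weight of a bin makes matches at those positions count for more, while raising `weights[0]` makes overall brightness differences dominate; this is the trade-off any tuning of the hardcoded table would be adjusting.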

ricardocabral avatar Oct 10 '15 13:10 ricardocabral

Thanks @ricardocabral, I appreciate your input.

tegansnyder avatar Oct 10 '15 14:10 tegansnyder