Feature - human-readable version of write_settings method

Open maxkadel opened this issue 11 months ago • 2 comments

I work for a university library, and I'm exploring using dedupe for identifying partner MARC records that are duplicates of our existing records.

I've had some success, using dedupe in combination with the pymarc library.

My issue is that I don't think I can get buy-in to use this work in production without more transparency about what the machine learning algorithm has learned from the training data. Basically, I want some version of the write_settings method that writes those settings in a more human-readable way, even if the reader needs some expertise to interpret them.
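
A minimal sketch of what such a human-readable report could look like, assuming the learned model reduces to a linear (logistic) classifier with one weight per field comparison. The function names, field names, and weights below are hypothetical illustrations and are not part of dedupe's API; how to pull the actual weights out of a trained dedupe object is not shown here and would need checking against the library's documentation.

```python
import math


def describe_weights(weights, bias):
    """Return readable report lines for a linear scoring model.

    `weights` maps a (hypothetical) feature name to its learned coefficient.
    Features are listed from most to least influential by absolute weight.
    """
    lines = [f"bias: {bias:+.3f}"]
    ranked = sorted(weights.items(), key=lambda kv: abs(kv[1]), reverse=True)
    for name, w in ranked:
        if w > 0:
            effect = "raises the match score as the feature grows"
        elif w < 0:
            effect = "lowers the match score as the feature grows"
        else:
            effect = "has no effect on the match score"
        lines.append(f"{name}: {w:+.3f} ({effect})")
    return lines


def match_probability(weights, bias, features):
    """Logistic score for one record pair, given its feature values."""
    z = bias + sum(w * features[name] for name, w in weights.items())
    return 1.0 / (1.0 + math.exp(-z))


if __name__ == "__main__":
    # Made-up weights for illustration only, not taken from a real model.
    example_weights = {"title_distance": -2.0, "author_distance": -0.5}
    for line in describe_weights(example_weights, bias=1.0):
        print(line)
```

Sorting by absolute weight puts the fields that most influence the match decision at the top, which is usually what a reviewer wants to see first.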

maxkadel, Jan 17 '25 14:01

I'm not a dedupe specialist, but in machine learning and neural networks generally, a model's internal structure is almost impossible to extract in a form people can interpret.

SeregaKR, Jan 23 '25 08:01