ml-commons [FEATURE] Support local cross-encoder model

Open HenryL27 opened this issue 2 years ago • 5 comments

Is your feature request related to a problem? We're trying to put a bunch of local model types in ml-commons (#1164). One such type is a cross-encoder. This will allow us to support reranking in the neural-search plugin, which a lot of people have asked for.

What solution would you like? Users will be able to upload a custom cross-encoder model, deploy it, and use it with the upcoming neural-search reranking processor.

What alternatives have you considered? External hosting: Still would have to deal with the inputs and outputs, and then we also get the pleasure of figuring out some solution for externally hosting cross-encoders.

LTR (Learning to Rank): That can do reranking, but not cross-encoder reranking, so it doesn't cover this request.


Nov 03 '23 21:11 HenryL27

Gonna implement this as a new ml-algorithm / function name: TEXT_SIMILARITY. Cross-encoders are one technique to do this, but not the only one. Essentially it's just defined as (text1, text2) -> similarity_score (as opposed to embedding models, which are (text) -> vector, where similarity is then an inner product of two vectors).
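A minimal Python sketch of the two shapes contrasted above, for illustration only. Every name here (`Embedder`, `PairScorer`, `toy_embed`, `toy_pair_score`, `rerank`) is hypothetical and not part of ml-commons, and the toy scoring functions merely stand in for real models. The point is the difference in signatures: an embedding model sees one text at a time, while a text-similarity model sees both texts together.

```python
from typing import Callable, List, Sequence, Tuple

# Hypothetical type aliases, named only for this illustration.
Embedder = Callable[[str], List[float]]    # (text) -> vector
PairScorer = Callable[[str, str], float]   # (text1, text2) -> similarity_score

# Tiny fixed vocabulary for the toy embedding (an assumption, not a real model).
VOCAB = ["search", "rerank", "model", "cat"]


def toy_embed(text: str) -> List[float]:
    """Toy stand-in for an embedding model: count occurrences of each vocabulary word."""
    tokens = text.lower().split()
    return [float(tokens.count(word)) for word in VOCAB]


def inner_product(a: Sequence[float], b: Sequence[float]) -> float:
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def embedding_similarity(embed: Embedder, text1: str, text2: str) -> float:
    """Embedding-style similarity: embed each text independently, then compare vectors."""
    return inner_product(embed(text1), embed(text2))


def toy_pair_score(text1: str, text2: str) -> float:
    """Toy stand-in for a cross-encoder: scores the pair jointly (Jaccard token overlap)."""
    a, b = set(text1.lower().split()), set(text2.lower().split())
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


def rerank(query: str, docs: Sequence[str], score: PairScorer) -> List[Tuple[str, float]]:
    """Score every (query, doc) pair and return the docs sorted by descending score."""
    scored = [(doc, score(query, doc)) for doc in docs]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)
```

Because `rerank` depends only on the `PairScorer` shape, any callable with that signature can be passed in, which is roughly what defining the function by its input/output shape rather than by the cross-encoder technique buys.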

Nov 03 '23 22:11 HenryL27

A bunch of requests have come in for remote reranking models too, so I'll make sure that connectors can deal with TextSimilarityDatasets appropriately as well

Nov 06 '23 21:11 HenryL27

Make sure local model and remote model can be switched smoothly.

Nov 17 '23 17:11 ylwu-amzn

> Make sure local model and remote model can be switched smoothly.

This is done in this PR https://github.com/opensearch-project/ml-commons/pull/1954

Feb 03 '24 01:02 ylwu-amzn

> Make sure local model and remote model can be switched smoothly.

@HenryL27 will check remote model and then we can close this issue.

Apr 09 '24 18:04 dhrubo-os
