bergamot-translator icon indicating copy to clipboard operation
bergamot-translator copied to clipboard

Quality scores when `responseOptions.qualityScores:true` but `skip-cost:true`

Open jelmervdl opened this issue 3 years ago • 6 comments

When responseOptions.qualityScores:true is used with a model that has skip-cost:true, the quality scores output is still there and there is no error message. However, they average around 20 instead of the -0.1…-0.9 ish?

I'd expected that it would fail on an ABORT, instead of return completely different scores.

jelmervdl avatar Feb 25 '22 10:02 jelmervdl

When responseOptions.qualityScores:true is used with a model that has skip-cost:true, the quality scores output is still there.

There is also a solution where the client does not try to do the above. In the short term, @abhi-agg is expected to know this caveat. Values being not within limits should be signal enough.

A neat way to solve the general problem is perhaps to leak model information (show-alignment, skip-cost) etc through Request to Response, and during Response construction use that to trigger ABORTS.

Use the following information:

https://github.com/browsermt/bergamot-translator/blob/fe3f3982debbd88e94f1909c313d1a611d2ff1c9/src/translator/request.h#L84

Pass required stuff indicating capability of alignment, quality along below:

https://github.com/browsermt/bergamot-translator/blob/fe3f3982debbd88e94f1909c313d1a611d2ff1c9/src/translator/request.cpp#L87

There is also a config UI problem: show-alignment, skip-cost being completely alien vocabulary to HTML or quality. I'm not particularly a fan of Config = std::string, the TranslationModelConfig from https://github.com/browsermt/bergamot-translator/blob/fe3f3982debbd88e94f1909c313d1a611d2ff1c9/doc/Unified_API.md was super hard to modify, so I just took the std::string proposal. On the plus side, the std::string based approach has made correspondences to marian's and switching configuration quite fast allowing experimentation.

jerinphilip avatar Feb 26 '22 11:02 jerinphilip

There is also a solution where the client does not try to do the above.

I know, but that feels like a you're holding it wrong type of solution.

I think I like most of your solution. If we could expose the (parsed?!) model config options through public interface of TranslationModel, and then pass reponseBuilder_ and eventually the QualityEstimator a reference to the model that was used, they could figure out themselves whether the model was okay and ABORT if that was not the case. QualityEstimator is specialised enough code that it is okay to specifically query skip-cost in there. It's not like it needs to be compatible with another Marian-like setup that doesn't use that parameter.

I'm pretty happy with having a std::string (or the Yaml datastructure that marian uses) as the configuration. That way bergamot-translator doesn't need code changes to be synced with every update to marian.

My initial idea was to figure out where QualityEstimator gets its values from from the History object, and then see whether I can make it detect that those values are uninitialised. Similar to how HTML detects that response.alignment is empty. I got as far as finding that skip-cost affects the mode or use parameter when building the graph, but could not yet figure out why QualityEstimator still has access to something that is supposedly skipped during construction of the graph.

jelmervdl avatar Feb 28 '22 11:02 jelmervdl

A neat way to solve the general problem is perhaps to leak model information (show-alignment, skip-cost) etc through Request to Response, and during Response construction use that to trigger ABORTS.

This information doesn't need to leak via Request because the translate API expects TranslationModel object as a parameter (same goes for Async version of translate API as well) and each TranslationModel object is created with a specific config that is stored internally. So you have that information already.

abhi-agg avatar Feb 28 '22 11:02 abhi-agg

I know, but that feels like a you're holding it wrong type of solution.

Do not disagree. However, setting the correct YAML + ResponseOptions is necessary at the client. Is a 2 line change. The C++ change proposed here to trigger an ABORT is a larger change. The ABORT is intended for a developer to know and set the right config, there's no recovery given the WASM exception state. Is this really what we want to wait on for implementing QE? A competent developer should be able to navigate and simplify all our lives, allowing an eventual fix?

This information doesn't need to leak via Request...

This is a feasible idea. For a checkConsistent(model, responseOptions) -> ABORT before queueing, it need to be written at 4 places - {translate, pivot} x {blocking,async}service. I believe the above is enough for this issue to be not deemed a "blocker". Please get started on implementing QE - such a fix will be brought in independently.

Since I'm assigned this, please treat this as intimation, that I'm not going to get around to solving this until ARM and separate graphs from workspaces are solved (particularly to help @abhi-agg schedule).

jerinphilip avatar Feb 28 '22 12:02 jerinphilip

Oh no this is in no way blocking. Just intended as a convenience feature and a way to prevent wrong output( as opposed to an error message) in case of a mistake

jelmervdl avatar Feb 28 '22 12:02 jelmervdl

There is also a solution where the client does not try to do the above. In the short term, @abhi-agg is expected to know this caveat.

I am completely aware of the caveats.

This is a feasible idea. For a checkConsistent(model, responseOptions) -> ABORT before queueing, it need to be written at 4 places - {translate, pivot} x {blocking,async}service. I believe the above is enough for this issue to be not deemed a "blocker". Please get started on implementing QE - such a fix will be brought in independently.

This issue is not blocking QE integration. I just left my opinion regarding the solution.

abhi-agg avatar Feb 28 '22 13:02 abhi-agg