Greg Tatum issues

Results 204 issues of


                                            Greg Tatum

Create toxicity and bias evaluations

Our models were trained on primarily web scraped data, subtitles, and smaller sets of curated and aligned datasets. This means that they can have bias and toxicity in them. We...

evals

Mock out the OpenAI API for the llm eval task to guard against regressions

Initially in `tests/test_final_eval.py` I added an optional `OPENAI_API_KEY` to fully run the test, or else the test gets skipped. It would be better to run it with some kind of...

evals

Inference engine ExpressionGraph investigation

I'm investigating the loading process for the inference engine under a native build. This is a Linux `perf` recording of a simple translation. It was done under a debug build...

inference

Hook up the graphviz code for the expression graph

The Marian model files are not self describing. They contain the data for the model, but not the operations that need to be performed to them to actually run them....

inference

Build the inference engine locally using the same codepaths as the Wasm engine.

This contains work from some other PRs, so I'm marking it as draft until I get the others landed. This is a prerequisite for #1197.

Use immutable memory references on Tensors to save a good chunk of memory

https://github.com/mozilla/translations/blob/2fe6b2b8c1ecfc0401f063f08d53744cdbde31ac/inference/marian-fork/src/graph/node_initializers.cpp#L204-L207 I'm pretty sure the Tensors can be loaded in as immutable data sources. However, they are loaded in through this lambda with a copy. There is some misdirection here...

inference

Investigate using AsyncService for the Wasm inference engine

I've had concerns over memory usage when doing parallelized translations. I did an investigation in #957 that looked into memory usage using Valgrind's dhat tool. Here is the breakdown of...

inference

English to Belarusian (en-be) failed to train a good teacher

Teacher: 82.63 (-6.86%) Student: 80.79 (-9.30%) Taskgroup: https://firefox-ci-tc.services.mozilla.com/tasks/groups/cvseSyr6STiCrwNCtpQDJQ I'm not sure why, but we should investigate.

language-coverage

Shortlisting is not correct for split vocabs

When using a shared vocab, the original source tokens are added to the shortlist as they are shared between the source and target. This allows a word from the source...

bug

English to Albanian (en-sq) has no flores evaluation and TedTalks COMET is low

We haven't shipped the finished model yet and the COMET score is quite low for the [77.48](https://firefox-ci-tc.services.mozilla.com/tasks/FQOGa_q3QWeJXUTpGm2VDA) evaluation. We should integrate Flores+ #1190 and pull evals. Then we should investigate...

language-coverage