Greg Tatum
Our models were trained primarily on web-scraped data, subtitles, and smaller sets of curated and aligned datasets. This means that they can contain bias and toxicity. We...
Initially in `tests/test_final_eval.py` I added an optional `OPENAI_API_KEY` to fully run the test, or else the test gets skipped. It would be better to run it with some kind of...
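The skip-unless-key behavior described here could look roughly like the sketch below. It uses the standard library's `unittest` so that it stays self-contained; the real test file may well use pytest instead. The `has_api_key` helper, the `FinalEvalTest` class, and the test body are hypothetical and are not taken from the repository.

```python
import os
import unittest


def has_api_key(env=os.environ) -> bool:
    """Return True when a non-empty OPENAI_API_KEY is present in env."""
    return bool(env.get("OPENAI_API_KEY"))


class FinalEvalTest(unittest.TestCase):  # hypothetical test case
    @unittest.skipUnless(has_api_key(), "OPENAI_API_KEY is not set")
    def test_final_eval(self):
        # Placeholder body: the real test would run the evaluation here.
        self.assertTrue(has_api_key())
```

With this shape the test is reported as skipped rather than failed when the key is absent, which keeps local runs and CI green without hiding that the evaluation did not run.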
I'm investigating the loading process for the inference engine under a native build. This is a Linux `perf` recording of a simple translation. It was done under a debug build...
The Marian model files are not self-describing. They contain the data for the model, but not the operations that need to be performed on them to actually run them....
This contains work from some other PRs, so I'm marking it as draft until I get the others landed. This is a prerequisite for #1197.
https://github.com/mozilla/translations/blob/2fe6b2b8c1ecfc0401f063f08d53744cdbde31ac/inference/marian-fork/src/graph/node_initializers.cpp#L204-L207 I'm pretty sure the Tensors can be loaded in as immutable data sources. However, they are loaded in through this lambda with a copy. There is some indirection here...
I've had concerns over memory usage when doing parallelized translations. I did an investigation in #957 that looked into memory usage using Valgrind's dhat tool. Here is the breakdown of...
Teacher: 82.63 (-6.86%)
Student: 80.79 (-9.30%)
Taskgroup: https://firefox-ci-tc.services.mozilla.com/tasks/groups/cvseSyr6STiCrwNCtpQDJQ

I'm not sure why, but we should investigate.
When using a shared vocab, the original source tokens are added to the shortlist as they are shared between the source and target. This allows a word from the source...
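As a rough illustration of the shortlist behavior described here (not Marian's actual implementation), the sketch below unions the lexical candidates for each source token with the source tokens themselves when the vocab is shared. The function name, the `lexical_candidates` mapping, and the integer token IDs are all assumptions made for the example.

```python
def build_shortlist(source_token_ids, lexical_candidates, shared_vocab):
    """Return the sorted target-token shortlist for one sentence.

    source_token_ids: token IDs of the source sentence.
    lexical_candidates: hypothetical mapping of source ID -> likely target IDs.
    shared_vocab: True when source and target share a single vocabulary.
    """
    shortlist = set()
    for token_id in source_token_ids:
        shortlist.update(lexical_candidates.get(token_id, ()))
    if shared_vocab:
        # With a shared vocab, a source ID is also a valid target ID,
        # so the source tokens themselves are added to the shortlist.
        shortlist.update(source_token_ids)
    return sorted(shortlist)
```

For example, `build_shortlist([5, 9], {5: [1, 2], 9: [2, 3]}, shared_vocab=True)` includes IDs 5 and 9 in the result, while the same call with `shared_vocab=False` does not.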
We haven't shipped the finished model yet and the COMET score of [77.48](https://firefox-ci-tc.services.mozilla.com/tasks/FQOGa_q3QWeJXUTpGm2VDA) in the evaluation is quite low. We should integrate Flores+ (#1190) and pull evals. Then we should investigate...