bergamot-translator
ARM Support for bergamot-translator matrix multiplies
We're going to do this the way it saves time in the long run.
Edit: Looks like I'm going to start with the browsermt fork first. We'll figure out marian-nmt/marian-dev later.
- [ ] https://github.com/jerinphilip/marian/pull/4
- [x] https://github.com/jerinphilip/marian/pull/5
The tasks involve the following:
- [ ] simde for the remaining mathfuncs
- [ ] GEMM for ARM.
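To make the GEMM task concrete, here is a minimal sketch of an int8 matrix multiply with int32 accumulation that uses NEON when the compiler defines `__ARM_NEON` and falls back to scalar code elsewhere. The names `dot_i8` and `gemm_i8`, and the choice to store the right-hand matrix transposed, are assumptions for illustration only; this is not the intgemm or MozIntGemm API, and it ignores quantization multipliers and bias.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>
#if defined(__ARM_NEON)
#include <arm_neon.h>
#endif

// Hypothetical helper: dot product of two int8 vectors, accumulated in int32.
inline int32_t dot_i8(const int8_t* a, const int8_t* b, std::size_t n) {
  int32_t sum = 0;
  std::size_t i = 0;
#if defined(__ARM_NEON)
  int32x4_t acc = vdupq_n_s32(0);
  for (; i + 8 <= n; i += 8) {
    // Multiply 8 int8 pairs, widening each product to int16.
    int16x8_t prod = vmull_s8(vld1_s8(a + i), vld1_s8(b + i));
    // Add adjacent int16 products pairwise into the four int32 lanes.
    acc = vpadalq_s16(acc, prod);
  }
  sum = vgetq_lane_s32(acc, 0) + vgetq_lane_s32(acc, 1) +
        vgetq_lane_s32(acc, 2) + vgetq_lane_s32(acc, 3);
#endif
  // Scalar tail; this is also the entire computation on non-NEON targets.
  for (; i < n; ++i) sum += int32_t(a[i]) * int32_t(b[i]);
  return sum;
}

// Hypothetical GEMM: C (M x N) = A (M x K) * B (K x N), where B is passed
// transposed as Bt (N x K, row-major) so that both operands are contiguous.
inline std::vector<int32_t> gemm_i8(const std::vector<int8_t>& A,
                                    const std::vector<int8_t>& Bt,
                                    std::size_t M, std::size_t K, std::size_t N) {
  std::vector<int32_t> C(M * N, 0);
  for (std::size_t m = 0; m < M; ++m)
    for (std::size_t n = 0; n < N; ++n)
      C[m * N + n] = dot_i8(&A[m * K], &Bt[n * K], K);
  return C;
}
```

Because the scalar loop handles whatever the vector loop leaves over, the same source compiles on x86 and ARM, which makes it usable as a reference to check a faster kernel against.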
Previously, we were looking at -DUSE_WASM_COMPATIBLE_SOURCE=on when targeting Mozilla ARM, but we don't want this on native builds (it disables threading, among other things). Enabling it also appears to have uncovered some x86-only code in faiss.
https://github.com/browsermt/marian-dev/blob/08b1544636fe13eaf1fbacb17c6fb050abfb8d42/src/3rd_party/faiss/VectorTransform.cpp#L135
More recent versions of faiss have ARM support, but the build fails in my android-ndk cross-compile because BLAS is absent.
We don't use faiss currently. We might in the future if we switch to maximum dot product instead of shortlists.
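One way x86-only code like this could be kept out of ARM builds is an architecture guard around the intrinsics, with a portable fallback. The sketch below assumes the usual compiler-predefined macros (`__x86_64__`, `__i386__`, `_M_X64`, `_M_IX86`, `__SSE__`); the macro `SKETCH_ARCH_X86` and the function `sum_f32` are hypothetical names, not anything from faiss or marian.

```cpp
#include <cstddef>

// Hypothetical guard macro: 1 only when compiling for an x86 target.
#if defined(__x86_64__) || defined(__i386__) || defined(_M_X64) || defined(_M_IX86)
#define SKETCH_ARCH_X86 1
#else
#define SKETCH_ARCH_X86 0
#endif

#if SKETCH_ARCH_X86 && defined(__SSE__)
#include <xmmintrin.h>  // SSE intrinsics, only pulled in on x86
#endif

// Sum of n floats: SSE on x86 when available, plain scalar code elsewhere.
inline float sum_f32(const float* x, std::size_t n) {
  float total = 0.0f;
  std::size_t i = 0;
#if SKETCH_ARCH_X86 && defined(__SSE__)
  __m128 acc = _mm_setzero_ps();
  for (; i + 4 <= n; i += 4) acc = _mm_add_ps(acc, _mm_loadu_ps(x + i));
  float lanes[4];
  _mm_storeu_ps(lanes, acc);
  total = lanes[0] + lanes[1] + lanes[2] + lanes[3];
#endif
  // Scalar tail; on ARM (or any non-SSE target) this loop does all the work.
  for (; i < n; ++i) total += x[i];
  return total;
}
```

On an ARM cross-compile the intrinsic header is never included, so the translation unit builds without any x86 dependency.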
We've hit Hello world on ARM.
$ cat $CONFIG
relative-paths: true
models:
- model.intgemm.alphas.bin
vocabs:
- vocab.deen.spm
- vocab.deen.spm
shortlist:
- lex.s2t.bin
- false
beam-size: 1
normalize: 1.0
word-penalty: 0
mini-batch: 64
maxi-batch: 1000
maxi-batch-sort: src
workspace: 2000
max-length-factor: 2.5
gemm-precision: int8Alpha
$ ./marian-decoder -c $CONFIG --log-level off <<< "Hello world!"
Hallo Welt!
git diff?
@XapaJIaMnu
https://github.com/jerinphilip/marian/pull/4 (it's in this issue's description). This attempt is experimental: I added an #ifdef USE_INTGEMM next to the WASM one, pulled in https://github.com/jerinphilip/MozIntGemm/, and have now got it to work.
Android NDK on CI is successfully building libmarian (there's a secondary issue of protobuf -> sentencepiece -> android logging causing problems, for which I have deferred a solution). On the Oracle Cloud ARM machine we have access to, all targets compile.
I am not optimistic that the source in its existing state (ifdefs that have to be tracked) is maintainable, and I am open to suggestions on how best to clean it up. I also see conversations about merging with upstream (marian-nmt/marian-dev). I will need to start tracking the other branches (int8, similar to int8Alpha) and fix those up as well.