David Dale

74 comments by David Dale

The Seamless project did not release a speech language identification model. However, you can use a speech LID model from a related project called MMS: https://github.com/facebookresearch/fairseq/blob/main/examples/mms/README.md#lid. In https://github.com/facebookresearch/seamless_communication/issues/325 I give...

Can you please explain once more what final goal you want to achieve, and what you would want the solution to look like?

> implement the word/sentence look-up without external dependencies

What do you mean by "without external dependencies"? Do you want to do the lookup in pure `numpy`, without the `gensim` and `compress_fasttext` packages?

> Of course, if you have pointers, that'd be great. For example, the hashing function is not clear in compress fasttext lookup.

What kind of pointers do you need? And...

> 1- Can you help me narrow down ft_cc.en.300_freqprune_50K_5K_pq_100.bin model from 300 dim to 100?

This model already has internal product-quantized vectors in 100 dimensions, as its name indicates.

> 2- Can you help me reproduce ft_cc.en.300_freqprune_50K_5K_pq_100.bin model? What are the steps to compress? compress-fasttext is not working for me.

I produced it with `compress-fasttext`. If it is not...

> I do not wish to mess with the function. But I want to know which hash function you are using. Can you provide a pointer to its code?

The function...
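For readers who want to see what such a hash looks like, here is a minimal pure-Python sketch of the 32-bit FNV-1a variant that the upstream fastText library applies to character n-grams. That `compress-fasttext` relies on this same function is an assumption here, not something the comment above confirms. The names `ft_hash` and `ngram_bucket`, and the bucket count used in the usage note, are hypothetical.

```python
def ft_hash(text: str) -> int:
    """32-bit FNV-1a over UTF-8 bytes, in the style of upstream fastText.

    Assumption: each byte is sign-extended before the XOR, mimicking the
    int8_t -> uint32_t cast in the fastText C++ code.
    """
    h = 2166136261  # FNV-1a 32-bit offset basis
    for byte in text.encode("utf-8"):
        if byte >= 128:
            byte |= 0xFFFFFF00  # sign-extend non-ASCII bytes to 32 bits
        h ^= byte
        h = (h * 16777619) & 0xFFFFFFFF  # multiply by the FNV prime, keep 32 bits
    return h


def ngram_bucket(ngram: str, num_buckets: int) -> int:
    """Map a character n-gram to a row index in a bucket matrix (hypothetical helper)."""
    return ft_hash(ngram) % num_buckets
```

With a hypothetical bucket count of 5000, `ngram_bucket("<wh", 5000)` gives the row of the n-gram matrix for that n-gram. For pure-ASCII input the sign extension never triggers, so the result matches standard FNV-1a.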

Hi Wafaa! Currently, Stopes is focused only on translation models, and ALTI+ was implemented only for seq2seq transformers, such as NLLB. We are not currently planning to adapt ALTI+ to...

Hi @zhenghuawang6! Can you please share some code and inputs that could reproduce this problem?

Hi @andysegura89! It looks like you are using a submitit launcher, which is expected to work with a Slurm cluster (this is what we are using by default in the...