Hossam Amer
If you can explain what TopKMD is doing in the old code, that'd be greatly appreciated.
Thanks, @byshiue. Just one question: beam_online_softmax_topk_stage1_kernel depends on the vocab-size parameter to index the right memory location in the log probs. Do beam_online_softmax_topk_stage2_kernelLauncher and/or batch_topk_kernel also depend on the...
I believe that should be the solution:

```python
new_vocab = ft_gensim.key_to_index
new_vectors = ft_gensim.vectors.unpack()
new_ngrams = ft_gensim.vectors_ngrams.unpack()
```

That being said, this code increases the size of the original model...
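For reference, one way to persist those unpacked arrays so that later lookup needs only numpy (no gensim at load time). The file name, array keys, and the stand-in arrays below are all hypothetical, just to make the sketch self-contained:

```python
import numpy as np

# Hypothetical stand-ins for the unpacked gensim arrays from the snippet above.
new_vocab = {"hello": 0, "world": 1}                     # word -> row index
new_vectors = np.random.rand(2, 4).astype(np.float32)    # in-vocab vectors
new_ngrams = np.random.rand(10, 4).astype(np.float32)    # n-gram bucket vectors

# Save everything in a form numpy can reload without gensim installed.
np.savez_compressed(
    "ft_arrays.npz",
    words=np.array(list(new_vocab)),  # unicode array, no pickling needed
    vectors=new_vectors,
    ngrams=new_ngrams,
)

# Reload and rebuild the word -> index mapping.
data = np.load("ft_arrays.npz")
vocab = {w: i for i, w in enumerate(data["words"])}
```

Note that `savez_compressed` stores plain dense arrays, so this trades the compressed model's small footprint for dependency-free loading, which matches the size increase mentioned above.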
My original goal is to: (1) take any language model from [here](https://fasttext.cc/docs/en/crawl-vectors.html) and compress it down to 2-3 MB using `prune_ft_freq`; (2) use this model and implement the word/sentence...
> > implement the word/sentence look-up without external dependencies > > What do you mean by "without external dependencies"? You want to do the lookup in pure `numpy`, without `gensim`...
> > Of course, if you have pointers, that'd be great. For example, the hashing function is not clear in the compress-fasttext lookup. > > What kind of pointers do...
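For context, fastText's n-gram hashing is the 32-bit FNV-1a hash over the UTF-8 bytes of each n-gram, with each byte passed through a signed int8 cast (which only matters for non-ASCII bytes). A pure-numpy lookup along the lines discussed here might look like the sketch below; the n-gram extraction is simplified (real fastText has edge cases around the full `<word>` token), so treat this as an illustration rather than the exact compress-fasttext implementation:

```python
import numpy as np

def fasttext_hash(s: str) -> int:
    """32-bit FNV-1a as used by fastText; bytes >= 128 are sign-extended."""
    h = 2166136261
    for b in s.encode("utf-8"):
        if b >= 128:                       # mimic fastText's int8 -> uint32 cast
            b = (b - 256) & 0xFFFFFFFF
        h = ((h ^ b) * 16777619) & 0xFFFFFFFF
    return h

def char_ngrams(word: str, minn: int = 3, maxn: int = 6):
    """Simplified n-gram extraction: substrings of '<word>' of length minn..maxn."""
    w = f"<{word}>"
    return [w[i:i + n]
            for n in range(minn, maxn + 1)
            for i in range(len(w) - n + 1)]

def word_vector(word, vocab, vectors, ngram_vectors):
    """In-vocab: direct row lookup. OOV: mean of the hashed n-gram bucket vectors."""
    if word in vocab:
        return vectors[vocab[word]]
    buckets = [fasttext_hash(g) % len(ngram_vectors) for g in char_ngrams(word)]
    return ngram_vectors[buckets].mean(axis=0)
```

In a pruned model the bucket count is much smaller than the original 2M, so the modulo here assumes the n-gram matrix was re-bucketed accordingly.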
Thanks @urialon for getting back. The model that I was using in the previous comment (sorry, I edited my post above) is the one given in the repo. That said, the...
Just want to update on the issue. Using the following did not result in the size issue:

```shell
MODEL=neulab/distilgpt2-finetuned-wikitext103
CUDA_VISIBLE_DEVICES=0 python -u run_clm.py \
  --model_name_or_path ${MODEL} \
  --dataset_name wikitext --dataset_config_name...
```
Hi Uri, I tried to construct the datastore with the WikiText validation set and the given DistilGPT-2 model, then ran kNN using the same set. The final perplexity scores are...
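For reference, the perplexities being compared come from the kNN-LM interpolation of the datastore distribution with the base LM distribution. A minimal numpy sketch, with a hypothetical interpolation weight (the repo's actual default may differ):

```python
import numpy as np

def knn_lm_probs(p_lm: np.ndarray, p_knn: np.ndarray, lam: float = 0.25) -> np.ndarray:
    """kNN-LM next-token distribution: lam * p_knn + (1 - lam) * p_lm."""
    return lam * p_knn + (1.0 - lam) * p_lm

def perplexity(token_probs: np.ndarray) -> float:
    """Perplexity = exp of the mean negative log-likelihood of the gold tokens."""
    return float(np.exp(-np.mean(np.log(token_probs))))
```

With lam = 0 this reduces to the base LM perplexity, which is one quick sanity check when the kNN and no-kNN scores come out unexpectedly close.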