ccdv-ai
Hi @monk1337 The loaded model has a maximum sequence length of 512 tokens. If you use: `model = BertModel.from_pretrained("bert-large-uncased", max_position_embeddings=1024)` the model won't load, because the checkpoint also...
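For reference, a minimal sketch of the situation (the `ignore_mismatched_sizes` workaround is an assumption on my part, not something tested on this checkpoint):

```python
from transformers import BertConfig, BertModel

# The bert-large-uncased checkpoint was trained with 512 position embeddings.
config = BertConfig.from_pretrained("bert-large-uncased")
print(config.max_position_embeddings)  # 512

# Overriding max_position_embeddings alone creates a shape mismatch with the
# stored position embedding weights, so loading fails. Ignoring the mismatch
# leaves the new, larger embedding matrix randomly initialized, so it would
# still need further training before being useful.
model = BertModel.from_pretrained(
    "bert-large-uncased",
    max_position_embeddings=1024,
    ignore_mismatched_sizes=True,
)
```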
Hi @Gimperion Can you share your transformers version and a snippet of the code you used?
I think I found the problem, @Gimperion. Something is wrong with the model and the tokenizer. The `` token has index 50264 while the model config states that "vocab_size":...
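A quick way to spot this kind of mismatch (the model path below is a placeholder, not the checkpoint from the issue):

```python
from transformers import AutoModel, AutoTokenizer

model_path = "roberta-base"  # placeholder, use the checkpoint from the issue
tokenizer = AutoTokenizer.from_pretrained(model_path)
model = AutoModel.from_pretrained(model_path)

# If a special token id (e.g. 50264) is >= the config vocab_size, the
# embedding lookup goes out of bounds at inference time.
print(len(tokenizer))           # tokenizer vocabulary size
print(model.config.vocab_size)  # embedding matrix size declared in the config
```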
Hi @shensmobile You can train the LSG model the same way as the other models. There are two ways to use it: 1. Fine-tune the base model, then convert it for the...
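As a rough sketch of the second option (paths are placeholders; LSG checkpoints rely on custom modeling code, hence `trust_remote_code=True`):

```python
from transformers import AutoModelForSequenceClassification, AutoTokenizer

# Load an already converted LSG checkpoint and fine-tune it directly.
model = AutoModelForSequenceClassification.from_pretrained(
    "my_lsg_model", trust_remote_code=True
)
tokenizer = AutoTokenizer.from_pretrained("my_lsg_model", trust_remote_code=True)
```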
The first warning is ignorable. It should work out of the box with this code:
```python
from lsg_converter import LSGConverter

# To convert a model
model_path = "myroberta_model"  # or whatever model...
```
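A slightly fuller sketch of how the converter is typically called; treat the exact method name and arguments (`convert_from_pretrained`, `num_global_tokens`) as assumptions based on the lsg_converter documentation rather than verified code:

```python
from lsg_converter import LSGConverter

model_path = "myroberta_model"  # or whatever model you fine-tuned

# Convert the checkpoint to its LSG variant with a longer maximum sequence length.
converter = LSGConverter(max_sequence_length=4096)
model, tokenizer = converter.convert_from_pretrained(model_path, num_global_tokens=7)
```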
@shensmobile For a given token, the maximum context is equal to `3*block_size + 2*sparse_block_size*sparsity_factor`. It's better to use the same size for blocks and sparse blocks for efficiency reasons. Using...
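As a worked example with hypothetical sizes:

```python
# Hypothetical values, using equal block sizes as suggested above.
block_size = 128
sparse_block_size = 128
sparsity_factor = 4

max_context = 3 * block_size + 2 * sparse_block_size * sparsity_factor
print(max_context)  # 384 + 1024 = 1408 tokens of context around a given token
```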
You can also try using fp16 instead of fp32. Gradient accumulation is fine. Changing the optimizer can also reduce memory; SGD is lighter than Adam, but convergence is slower. If you...
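Those suggestions map roughly onto `TrainingArguments` like this (values are placeholders; `optim="sgd"` is only available in recent transformers releases, so check your version):

```python
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="out",
    fp16=True,                      # half precision instead of fp32
    per_device_train_batch_size=2,
    gradient_accumulation_steps=8,  # effective batch size of 16
    optim="sgd",                    # lighter on memory than Adam, slower to converge
)
```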
The architecture in the config is `HF_ColBERT`, [see](https://huggingface.co/AdrienB134/ColBERTv1.0-bert-based-spanish-mmarcoES/blob/main/config.json). If this is a BERT model, change:
```
"architectures": ["HF_ColBERT"]
```
to
```
"architectures": ["bert"]
```
Or try:
```
model, tokenizer...
```
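If the underlying weights really are plain BERT, loading them with the BERT classes directly might also work (untested sketch):

```python
from transformers import BertModel, BertTokenizerFast

model_id = "AdrienB134/ColBERTv1.0-bert-based-spanish-mmarcoES"
model = BertModel.from_pretrained(model_id)
tokenizer = BertTokenizerFast.from_pretrained(model_id)
```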
Hi, I have no idea. If it can load HF models, it should be able to load lsg models.
@danielhanchen OK, this is something with sentencepiece. Some models are missing the tokenizer ".model" file, so the **fast** tokenizer can be loaded but not the **slow** one. And there is...
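A quick way to reproduce the difference (the model id is a placeholder for one of the affected repos):

```python
from transformers import AutoTokenizer

model_id = "some/affected-model"  # placeholder

# The fast tokenizer only needs tokenizer.json, so this works.
tok_fast = AutoTokenizer.from_pretrained(model_id, use_fast=True)

# The slow tokenizer needs the original sentencepiece ".model" file,
# so this raises an error when that file is missing from the repo.
tok_slow = AutoTokenizer.from_pretrained(model_id, use_fast=False)
```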