vokenization
PyTorch code for EMNLP 2020 Paper "Vokenization: Improving Language Understanding with Visual Supervision"
Hi, thanks for your interesting work. I encountered a problem when I tried to finetune the model. I loaded the released pretrained BERT_base model and finetuned it on GLUE...

Thanks for your great work! I notice that you use a non-linear layer with GELU, a LayerNorm operation, and a linear layer called decoder as the voken classification...
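The head described in the issue above can be sketched in PyTorch as follows. This is a minimal illustration under the assumptions stated in the issue (a dense layer with GELU, LayerNorm, then a linear decoder over the voken vocabulary); the class name and sizes are hypothetical, not taken from the repository's `vlm/model.py`.

```python
import torch
from torch import nn


class VokenClassificationHead(nn.Module):
    """Hypothetical sketch of the voken classification head described in the
    issue: a non-linear transform (Linear + GELU + LayerNorm) followed by a
    linear "decoder" that scores each voken in the vocabulary."""

    def __init__(self, hidden_size: int, voken_vocab_size: int):
        super().__init__()
        self.dense = nn.Linear(hidden_size, hidden_size)
        self.act = nn.GELU()
        self.layer_norm = nn.LayerNorm(hidden_size)
        self.decoder = nn.Linear(hidden_size, voken_vocab_size)

    def forward(self, hidden_states: torch.Tensor) -> torch.Tensor:
        # Transform each token's hidden state, then score every voken.
        x = self.layer_norm(self.act(self.dense(hidden_states)))
        return self.decoder(x)  # shape: (batch, seq_len, voken_vocab_size)


# Example: hidden size and voken vocabulary size are illustrative values.
head = VokenClassificationHead(hidden_size=768, voken_vocab_size=50000)
logits = head(torch.randn(2, 16, 768))
print(tuple(logits.shape))
```

The structure mirrors BERT's masked-LM prediction head, which may be why the authors reuse this pattern for voken prediction.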
I have two questions. (1) I notice that in your code https://github.com/airsplay/vokenization/blob/5601b799184ed54414872565f233e22c76f5f6f0/vlm/model.py#L238 , you design three loss functions: voken classification, voken regression, and voken contrastive. But you only report "voken...
Hi authors, thanks for sharing this nice work! I'm a big fan of it. I notice the paper reports results on the SQuAD dataset, but I did not find the relevant code in...
Hi, thank you for your great work. I'm trying to train a RoBERTa-based VLM model on my own dataset. I plan to use your pre-trained vokenizer provided [here](https://github.com/airsplay/vokenization#models). But...
Minor typos and grammatical fixes
Training of Epoch 0: GPU 0 will process 591616 data in 2311 iterations. 0%| | 0/2311 [00:31