Curya
I re-trained the MLE phase (on 8 V100 GPUs) using your released config file `configs/phase1/clipRN50_mle.yml`, but the performance is lower than reported in the paper (CIDEr: 106.5 vs. 110.3). Does the...
Hi, authors. Could you please provide the details of `language_evaluation`, which `eval_finecapeval.py` uses for the evaluation on FineCapEval?
Hello, I would like to ask what role the expanded vocabulary plays. https://github.com/pengxiao-song/LaWGPT/blob/main/resources/legal_vocab.txt contains duplicate tokens (for example `公正审判`, on lines 968 and 4137), so it has to be deduplicated before it can be merged with chinese-llama correctly.

```python
# Load custom vocabulary
new_tokens = open(VOC_PATH, "r").read().split("\n")
new_tokens = list(set(new_tokens))  # deduplicate
for token in new_tokens:
    if token not in llama_spm_tokens_set:
        new_token = model.ModelProto().SentencePiece()
        new_token.piece...
```
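One caveat with `list(set(...))` is that it does not keep the file order, so the ids assigned to the new tokens can differ between runs. A minimal sketch of an order-preserving alternative, assuming the vocabulary file holds one token per line; the helper name `load_unique_tokens` and the sample tokens are hypothetical and not part of the repository:

```python
def load_unique_tokens(lines):
    """Return tokens in first-seen order, dropping blank lines and duplicates."""
    stripped = (line.strip() for line in lines)
    # dict.fromkeys keeps the first occurrence of each key in insertion order
    return list(dict.fromkeys(tok for tok in stripped if tok))


# Hypothetical sample standing in for the lines of the vocabulary file
sample = ["公正审判", "合同", "公正审判", "", "侵权"]
unique = load_unique_tokens(sample)
```

This also drops the empty string that `split("\n")` produces when the file ends with a newline.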
Hi, authors. In your paper, you mentioned: > Specifically, we set $\epsilon$ to the mean $\ell_\infty$ norm of embedding differences between five captions that correspond to the same image. We...
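To make the question concrete, here is one possible reading of the quoted sentence, sketched in plain Python. It assumes the differences are taken over all unordered pairs of captions for a single image; the paper may define this differently, and the function name `mean_linf_distance` and the toy vectors are hypothetical:

```python
from itertools import combinations


def mean_linf_distance(caption_embeddings):
    """Mean l_inf norm of pairwise differences among one image's caption embeddings."""
    norms = [
        max(abs(a - b) for a, b in zip(u, v))  # l_inf norm of u - v
        for u, v in combinations(caption_embeddings, 2)
    ]
    return sum(norms) / len(norms)


# Toy 2-d "embeddings" for three captions (an image in the quote would have five)
toy = [[0.0, 1.0], [0.5, 1.0], [0.0, 3.0]]
epsilon = mean_linf_distance(toy)
```

Under this reading, $\epsilon$ for the whole dataset would be the average of this value over images, which is again an assumption.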
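Does the `--steps_per_epoch` parameter actually have no effect on training behavior? For Flux LoRA training, I can see that this parameter is passed into the dataset constructor, but it does not change how the data is loaded; it is only used as the reported dataset length, and the actual dataset length should be determined by the data that is actually read.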
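For reference, a reported length can still matter even when loading is unchanged, because a data loader typically iterates `len(dataset)` indices per epoch. A minimal sketch of that pattern, with a hypothetical class `FixedLengthDataset` that is not taken from the repository (whether the actual training code works this way is not confirmed here):

```python
class FixedLengthDataset:
    """Toy dataset whose reported length is steps_per_epoch, not the sample count."""

    def __init__(self, samples, steps_per_epoch):
        self.samples = samples
        self.steps_per_epoch = steps_per_epoch

    def __len__(self):
        # The loader sees this value, regardless of how many samples exist
        return self.steps_per_epoch

    def __getitem__(self, index):
        if index >= self.steps_per_epoch:
            raise IndexError(index)
        # Wrap around so any index maps onto a real sample
        return self.samples[index % len(self.samples)]


dataset = FixedLengthDataset(["a", "b", "c"], steps_per_epoch=7)
epoch = [dataset[i] for i in range(len(dataset))]
```

In this sketch the three underlying samples are read the same way as before, but one epoch now consists of seven steps.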