Results 11 issues of junphine

from transformers import AutoTokenizer
path = "/data/nlp/models/IDEA-CCNL/Randeng-T5-784M-QA-Chinese"
tokenizer = AutoTokenizer.from_pretrained(path, trust_remote_code=True)
mast_id = tokenizer.convert_tokens_to_ids("")  # mast_id: 2, but it should be among the last 100 entries of the vocabulary

I trained 3 models, but after averaging the weights, the model output is garbled!

When using this format: 'ans': {‘

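My understanding is that rotary position encoding should be applied after the query and key vectors have been computed. Why is it applied before that here? What exactly does input_sub refer to? It is passed as the x_pos argument of the rotary position encoding, and I found that in most cases it is always a zero vector, so the rotary position encoding has no effect.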
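To make the question concrete, here is a minimal, generic sketch of rotary position encoding in plain Python. It is not the repository's code: `rotate_pairs`, `dot`, and the `base` default are hypothetical names chosen for illustration, and the sketch assumes the common formulation in which the rotation is applied to the query and key vectors after they are computed. It shows that a position of zero leaves a vector unchanged (the behavior described for an all-zero `x_pos`), and that the query-key score depends only on the relative position.

```python
import math


def rotate_pairs(vec, position, base=10000.0):
    """Rotate consecutive pairs of `vec` by position-dependent angles (generic RoPE)."""
    dim = len(vec)
    assert dim % 2 == 0, "rotary encoding needs an even dimension"
    out = []
    for i in range(0, dim, 2):
        # Angle for this pair: position * base^(-i / dim), where i is the even element index.
        theta = position * base ** (-i / dim)
        c, s = math.cos(theta), math.sin(theta)
        x, y = vec[i], vec[i + 1]
        out.extend([x * c - y * s, x * s + y * c])
    return out


def dot(a, b):
    """Plain dot product of two equal-length lists."""
    return sum(x * y for x, y in zip(a, b))


# Hypothetical query and key vectors, standing in for the projected q and k.
q = [1.0, 2.0, 3.0, 4.0]
k = [0.5, -1.0, 2.0, 0.0]

# Position 0 gives a rotation angle of 0 for every pair, so the vector is unchanged.
unchanged = rotate_pairs(q, 0)

# The attention score depends only on the relative offset (here 5 - 3 == 2 - 0).
score_a = dot(rotate_pairs(q, 5), rotate_pairs(k, 3))
score_b = dot(rotate_pairs(q, 2), rotate_pairs(k, 0))
```

Under these assumptions, if every position passed in is zero, every rotation is the identity and no positional information reaches the attention scores.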

Hugging Face has been inaccessible for 2 months, but I want to load models with transformers.

Taylor (pt:300M, head:48, head_dim:16, seq_len:2048): 0.62it/s, GPU: 32G
CausalFullAttention (pt:310M, head:8, head_dim:96, seq_len:2048): 1.77it/s, GPU: 26G

@lucidrains I've seen e2-tts-pytorch; it is NAR text-to-speech, and it requires a duration predictor model. I want to use transfusion to implement TTS. Is that feasible? My implementation method:...

When using unet as an argument to pre_post_transformer_enc_dec, the function maybe_transition_to_modality_decoding parses out the modality_shape, which should then be multiplied by 2: modality_shape = tuple(map(lambda x: x * 2, modality_shape)). The shape flow:...

Add a dependency on spring-ai-alibaba-starter-store-analyticdb to reduce duplicated code. Fix the bug in AbstractDBConnectionPool where a new DataSource was created on every call, so connections were never pooled. ### Describe what this PR does / why we need it ### Does this pull request fix one issue? ### Describe how you did it ###...