junphine issues

Results: 11 issues of junphine

Support for configuring a fulltext index in queryEntities

T5-qa: the mast_id for <extra_id_0> is returned as 2?

from transformers import AutoTokenizer
path = "/data/nlp/models/IDEA-CCNL/Randeng-T5-784M-QA-Chinese"
tokenizer = AutoTokenizer.from_pretrained(path, trust_remote_code=True)
mast_id = tokenizer.convert_tokens_to_ids("<extra_id_0>")
# mast_id: 2; it should be among the last 100 entries of the vocabulary

How to merge weights?

I trained 3 models, but after averaging the weights, the model output is garbled!
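
For reference, the averaging step described above can be sketched as follows. This is a minimal illustration, not the repository's method: `average_state_dicts` is a hypothetical helper, and flat Python lists stand in for real tensors. Plain averaging is generally only expected to work when the checkpoints share parameter names, shapes, and a common starting point (for example, fine-tunes of the same base model). Whether that holds for the three models in this issue is not stated.

```python
def average_state_dicts(state_dicts):
    """Element-wise mean of checkpoints that share parameter names and sizes.

    Hypothetical helper: flat lists of floats stand in for real tensors.
    """
    if not state_dicts:
        raise ValueError("need at least one checkpoint")
    keys = state_dicts[0].keys()
    for sd in state_dicts[1:]:
        if sd.keys() != keys:
            raise ValueError("checkpoints have different parameter names")

    averaged = {}
    for name in keys:
        params = [sd[name] for sd in state_dicts]
        if len({len(p) for p in params}) != 1:
            raise ValueError(f"size mismatch for parameter {name!r}")
        # zip(*params) groups the i-th value of every checkpoint together.
        averaged[name] = [sum(vals) / len(vals) for vals in zip(*params)]
    return averaged


# Toy checkpoints (made-up values, for illustration only).
checkpoints = [
    {"linear.weight": [1.0, 2.0], "linear.bias": [0.0]},
    {"linear.weight": [3.0, 4.0], "linear.bias": [3.0]},
    {"linear.weight": [5.0, 6.0], "linear.bias": [6.0]},
]
merged = average_state_dicts(checkpoints)
print(merged)
```

If the models were trained independently from different initializations, their parameters need not be aligned with one another, which is one possible (unconfirmed) explanation for garbled output after averaging.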

Inconsistent output for the <ans> format with <mask>?

When using this format 'ans': {‘

Could you explain what the rotary position embedding in EmbeddingExt is for?

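My understanding is that the rotary position embedding should be applied after the query and key vectors have been obtained, so why is it applied before that here? What exactly does input_sub refer to? It is passed in as the x_pos argument of the rotary position embedding, and I found that in most scenarios it is always a zero vector, so the rotary position embedding has no effect.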
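
The observation about a zero position can be checked against the standard rotary formulation. The sketch below is not the EmbeddingExt code: `rotate_pairs` is a hypothetical helper that assumes the usual pairwise rotation with frequencies `base ** (-i / d)`. With position 0 every rotation angle is 0, so the vector comes back unchanged.

```python
import math


def rotate_pairs(x, pos, base=10000.0):
    """Rotate consecutive pairs of `x` by position-dependent angles.

    Hypothetical helper following the common rotary formulation; it is not
    taken from the repository discussed in the issue.
    """
    d = len(x)
    if d % 2 != 0:
        raise ValueError("vector length must be even")
    out = []
    for i in range(0, d, 2):
        theta = pos * base ** (-i / d)  # angle for this pair of dimensions
        c, s = math.cos(theta), math.sin(theta)
        x0, x1 = x[i], x[i + 1]
        out.extend([x0 * c - x1 * s, x0 * s + x1 * c])
    return out


q = [1.0, 2.0, 3.0, 4.0]
print(rotate_pairs(q, 0) == q)  # position 0: every angle is 0, so nothing changes
print(rotate_pairs(q, 3) == q)  # a non-zero position rotates the pairs
```

Under this formulation, an x_pos that is always zero would indeed make the embedding a no-op, which is consistent with the behaviour reported above.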

Could the transformers wrapper code be included as well?

Hugging Face has been inaccessible for two months, but I want to load the model with transformers.

Compared to CausalFullAttention, Taylor is slower to train and uses more GPU memory

Taylor (pt:300M head:48, head_dim:16, seq_len:2048) 0.62it/s GPU: 32G
CausalFullAttention (pt:310M head:8 head_dim:96 seq_len:2048) 1.77it/s GPU: 26G

Use transfusion to implement text to speech

@lucidrains I've seen e2-tts-pytorch; it is NAR text to speech, and a duration predictor model is required. I want to use transfusion to implement TTS. Is that feasible? My implementation method: ...

Please confirm something about the error of modality_shape size in sample

When using unet as an argument to pre_post_transformer_enc_dec, the function maybe_transition_to_modality_decoding parses out the modality_shape, which should be multiplied by 2: modality_shape = tuple(map(lambda x: x * 2, modality_shape)). The shape flow: ...
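
To make the proposed fix concrete, the one-line change quoted above can be run on its own. The wrapper name `double_modality_shape` and the example shape are assumptions for illustration; only the `tuple(map(...))` expression comes from the issue.

```python
def double_modality_shape(modality_shape):
    """Apply the doubling proposed in the issue to every dimension."""
    return tuple(map(lambda x: x * 2, modality_shape))


# Made-up shape, for illustration only.
print(double_modality_shape((8, 8)))
```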

nl2sql: add Spring Cache support to the LLMService call

Add the dependency spring-ai-alibaba-starter-store-analyticdb to reduce duplicated code. Fix the bug in AbstractDBConnectionPool where a DataSource was created on every call, so connections could not be pooled. ### Describe what this PR does / why we need it ### Does this pull request fix one issue? ### Describe how you did it ###...