MeJerry215

9 issues by MeJerry215

Regarding the type of `char` here: in most languages `char` can be treated as an int8, so normally the ASCII characters are represented by 0-127, and `char` does distinguish unsigned from signed. So I disagree with flatly stating that `char` is 2 bytes and unsigned by default. https://github.com/krahets/hello-algo/blob/7ca27c3df1bfc981fc7faa4528dadb410457d221/docs/chapter_data_structure/data_and_memory.md?plain=1#L24 Also, this passage does not distinguish sign-magnitude, ones' complement, and two's complement; I believe what a computer actually stores is the two's complement. For a default integer, when unsigned vs. signed is not specified, I lean toward the signed int32 type, in which case the negative numbers represented in two's complement would be...
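A quick way to see the two's-complement point is to inspect the raw bytes. A minimal sketch (standard library only; my own illustration, not from the linked doc) for a signed 32-bit int:

```python
import struct

# A negative integer's in-memory bytes are its two's-complement encoding.
n = -5
raw = struct.pack("<i", n)      # pack as signed little-endian int32
print(raw.hex())                # 'fbffffff' -> 0xFFFFFFFB
print(hex(n & 0xFFFFFFFF))      # 0xfffffffb, the two's complement of 5

# For signed int32, two's complement covers the range [-2**31, 2**31 - 1].
print(-2**31, 2**31 - 1)
```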

When exporting en2fr with `ls_fs_transformer_export.py`, I found that the layernorm parameters are missing. en2de:
```
dict_keys(['encoder.embed_tokens.para', 'encoder.layers.0.para', 'encoder.layers.1.para', 'encoder.layers.2.para', 'encoder.layers.3.para', 'encoder.layers.4.para', 'encoder.layers.5.para', 'encoder.layer_norm.weight', 'encoder.layer_norm.bias', 'decoder.embed_tokens.para', 'decoder.layers.0.para', 'decoder.layers.1.para', 'decoder.layers.2.para', 'decoder.layers.3.para', 'decoder.layers.4.para', 'decoder.layers.5.para', 'decoder.layer_norm.weight', 'decoder.layer_norm.bias', 'decoder.output_projection.clip_max'])
```
en2fr:
```
dict_keys(['encoder.embed_tokens.para', 'encoder.layers.0.para', 'encoder.layers.1.para',...
```
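One way to pin down exactly which parameters are absent is to diff the key sets of the two exports. A sketch, assuming both exports load as plain Python dicts; the file names below are placeholders, not the actual export paths:

```python
import torch

# Diff the key sets of the two exported checkpoints to confirm which
# parameters (e.g. encoder.layer_norm.*) are missing from en2fr.
en2de = torch.load("en2de_export.pt", map_location="cpu")  # placeholder path
en2fr = torch.load("en2fr_export.pt", map_location="cpu")  # placeholder path
for key in sorted(set(en2de) - set(en2fr)):
    print("missing in en2fr:", key)
```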

In fairseq every weight belongs to a single op, while in lightseq the parameters of a whole layer may be fused together. How can these fused per-layer parameters be mapped to the individual ones, so that I can directly reuse weights already trained in fairseq? For example, here is the weight info in lightseq:
```
dict_keys(['encoder.embed_tokens.para', 'encoder.layers.0.para', 'encoder.layers.1.para', 'encoder.layers.2.para', 'encoder.layers.3.para', 'encoder.layers.4.para', 'encoder.layers.5.para', 'encoder.layer_norm.weight', 'encoder.layer_norm.bias', 'decoder.embed_tokens.para', 'decoder.layers.0.para', 'decoder.layers.1.para', 'decoder.layers.2.para', 'decoder.layers.3.para', 'decoder.layers.4.para', 'decoder.layers.5.para', 'decoder.layer_norm.weight', 'decoder.layer_norm.bias', 'decoder.output_projection.clip_max'])
```
Taking decoder.layers.5, the expected corresponding fairseq weights would be:
```
['decoder.layers.5.self_attn.in_proj_weight', 'decoder.layers.5.self_attn.in_proj_bias', 'decoder.layers.5.self_attn.out_proj.weight',...
```
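For illustration only, a hedged sketch of what such a mapping could look like: flatten each per-op fairseq tensor and concatenate them into one fused tensor per layer. The actual concatenation order and any reshaping are defined by lightseq's export script (`ls_fs_transformer_export.py`), so the op list and helper below are hypothetical:

```python
import torch

# Hypothetical sketch: build a fused per-layer tensor from fairseq's per-op
# weights, analogous to lightseq's decoder.layers.5.para. The real ordering
# and layout come from ls_fs_transformer_export.py; this list is illustrative.
def fuse_layer(state_dict, prefix="decoder.layers.5."):
    op_names = [
        "self_attn.in_proj_weight",
        "self_attn.in_proj_bias",
        "self_attn.out_proj.weight",
        "self_attn.out_proj.bias",
        "self_attn_layer_norm.weight",
        "self_attn_layer_norm.bias",
        # ... encoder-attn, FFN, and final layer-norm weights would follow
    ]
    parts = [state_dict[prefix + name].flatten() for name in op_names]
    return torch.cat(parts)  # one flat tensor per layer
```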

**Describe the bug** The model has two inputs, input_ids and attention_mask. I tried `onnxsim input_model.onnx output_model.onnx --overwrite-input-shape "input_ids:1,128;attention_mask:1,128"`, which failed, and then `onnxsim input_model.onnx output_model.onnx --overwrite-input-shape "input_ids:1,128 attention_mask:1,128"`, which also failed. **Model** gpt2, exported from huggingface. How should this argument be written for multiple inputs?
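As a workaround sketch, the shapes can also be overridden per input from Python, assuming a recent onnx-simplifier whose `simplify()` accepts an `overwrite_input_shapes` dict (older versions used a differently named parameter):

```python
import onnx
from onnxsim import simplify

# Override both input shapes via the Python API instead of the CLI.
model = onnx.load("input_model.onnx")
model_simp, check = simplify(
    model,
    overwrite_input_shapes={"input_ids": [1, 128], "attention_mask": [1, 128]},
)
assert check, "simplified model failed the correctness check"
onnx.save(model_simp, "output_model.onnx")
```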

Only call `tl.dot` in a block with M = 16, N = 1, K = 128, PN = 16: since `tl.dot` requires M, N, K >= 16, padding the N...
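A minimal sketch of the padding idea (my own illustration, not the original test script): treat the length-1 N dimension as a zero-padded 16-wide tile so that `tl.dot` sees M = 16, N = 16, K-blocks >= 16, then keep only column 0 of the result.

```python
import torch
import triton
import triton.language as tl

@triton.jit
def matvec_kernel(a_ptr, x_ptr, out_ptr, M, K,
                  BLOCK_M: tl.constexpr, BLOCK_N: tl.constexpr, BLOCK_K: tl.constexpr):
    # out = A @ x with x of shape (K,); N = 1 is zero-padded to BLOCK_N = 16.
    pid = tl.program_id(0)
    offs_m = pid * BLOCK_M + tl.arange(0, BLOCK_M)
    offs_n = tl.arange(0, BLOCK_N)
    acc = tl.zeros((BLOCK_M, BLOCK_N), dtype=tl.float32)
    for k in range(0, K, BLOCK_K):
        offs_k = k + tl.arange(0, BLOCK_K)
        a = tl.load(a_ptr + offs_m[:, None] * K + offs_k[None, :],
                    mask=(offs_m[:, None] < M) & (offs_k[None, :] < K), other=0.0)
        x1d = tl.load(x_ptr + offs_k, mask=offs_k < K, other=0.0)
        # Place the real vector in column 0, zeros elsewhere -> (BLOCK_K, BLOCK_N).
        x = tl.where(offs_n[None, :] == 0, x1d[:, None], 0.0)
        acc += tl.dot(a, x)
    # Only column 0 holds real data; reduce it out and store.
    out = tl.sum(tl.where(offs_n[None, :] == 0, acc, 0.0), axis=1)
    tl.store(out_ptr + offs_m, out, mask=offs_m < M)

M, K = 16, 128
A = torch.randn(M, K, device="cuda")
x = torch.randn(K, device="cuda")
out = torch.empty(M, device="cuda")
matvec_kernel[(triton.cdiv(M, 16),)](A, x, out, M, K, BLOCK_M=16, BLOCK_N=16, BLOCK_K=32)
torch.testing.assert_close(out, A @ x, rtol=1e-3, atol=1e-3)
```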

TestScripts
```python
import torch
import triton
import triton.language as tl
import math
import sys
import matplotlib.pyplot as plt
import csv
import functools
import time

torch.manual_seed(42)


def median(lst):
    sorted_lst...
```

**Hugging Face model card**: WisdomShell/CodeShell-7B-Chat **Model Description**: CodeShell is a multilingual code LLM developed by the [Knowledge Computing Lab](http://se.pku.edu.cn/kcl/) of Peking University. CodeShell has 7 billion parameters and was trained...

In the given axolotl examples [examples/medusa](https://github.com/ctlllll/axolotl/tree/main/examples/medusa), I followed `vicuna_7b_qlora_stage1.yml` and `vicuna_7b_qlora_stage2.yml` to write my llama2 training config. However, I didn't get such a great performance improvement; below is my test...

I would like to know what features Triton will develop and support in the future, but I can't find any information on the homepage.