yinfan98

Results: 11 issues by yinfan98

Hi, I really love this work. Could I deploy it on openxlab?

Thanks for your contribution; we appreciate it a lot. The following instructions will make your pull request healthier and help it receive feedback more easily. If you do not understand...


### Motivation
When doing W8A8 quantization in the PyTorch engine, I found that the InternLM2 modeling code names its submodules differently: it uses `self.attention`, `self.feed_forward`, ...

```python
class InternLM2DecoderLayer(nn.Module):
    def __init__(self, config: InternLM2Config):
        super().__init__()
        self.hidden_size =...
```
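Since quantization passes often look up submodules by name, InternLM2's naming (`attention`, `feed_forward`) breaks code written for Llama-style names (`self_attn`, `mlp`). A minimal sketch of a name-translation shim, assuming a mapping table like the one below (the dict and helper are illustrative, not LMDeploy's actual API):

```python
# Hypothetical mapping from InternLM2 submodule names to Llama-style names.
# The left-hand names come from the issue's InternLM2DecoderLayer snippet;
# the mapping itself is an assumption for illustration.
INTERNLM2_TO_LLAMA_NAMES = {
    "attention": "self_attn",
    "feed_forward": "mlp",
}

def resolve_submodule_name(name: str) -> str:
    """Translate an InternLM2 submodule name to its Llama-style equivalent,
    falling back to the original name when no mapping exists."""
    return INTERNLM2_TO_LLAMA_NAMES.get(name, name)
```

A quantization pass can then call `resolve_submodule_name` before matching layer names, so one set of hooks covers both model families.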

Hi, can I export the result as a point cloud or mesh instead of a video?

### PR types
Others
### PR changes
Others
### Description
Add theta input support to the RoPE kernel.
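In RoPE, theta is the base of the geometric progression of rotation frequencies, so exposing it as a kernel input lets callers use non-default bases (e.g. for context extension). A minimal NumPy sketch of the computation such a kernel performs, with theta as an explicit parameter (function names are illustrative):

```python
import numpy as np

def rope_frequencies(head_dim, max_pos, theta=10000.0):
    # Inverse frequencies 1 / theta^(2i / d); theta is the base the
    # kernel now accepts as an input instead of hard-coding 10000.
    inv_freq = 1.0 / (theta ** (np.arange(0, head_dim, 2) / head_dim))
    freqs = np.outer(np.arange(max_pos), inv_freq)  # (max_pos, head_dim // 2)
    return np.cos(freqs), np.sin(freqs)

def apply_rope(x, cos, sin):
    # x: (seq, head_dim); rotate each (even, odd) pair of channels.
    x1, x2 = x[:, 0::2], x[:, 1::2]
    out = np.empty_like(x)
    out[:, 0::2] = x1 * cos - x2 * sin
    out[:, 1::2] = x1 * sin + x2 * cos
    return out
```

At position 0 the rotation is the identity, and at every position it preserves the vector norm, which is a quick sanity check for any RoPE implementation.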

contributor

### PR Category
Others
### PR Types
Improvements
### Description
Enhance `paddle.view` so it supports dynamic inference shapes.
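Supporting dynamic shapes in a view op mostly means inferring the one unknown dimension (conventionally written as -1) from the total element count at runtime. A minimal NumPy sketch of that inference step, under the assumption that at most one -1 appears (the helper name is illustrative, not Paddle's implementation):

```python
import numpy as np

def view_with_dynamic_dim(x, shape):
    # Sketch of the shape inference a view op must do when one
    # dimension is -1: fill it in so the element count matches.
    shape = list(shape)
    if shape.count(-1) == 1:
        known = 1
        for s in shape:
            if s != -1:
                known *= s
        shape[shape.index(-1)] = x.size // known
    return x.reshape(shape)
```

During dynamic inference the actual sequence length is only known at run time, which is why the -1 entry cannot be resolved when the graph is built.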


### PR Category
Inference
### PR Types
New features
### Description
Add an int4 quantization kernel and an int4-quantized permute kernel to Paddle.
### TL;DR
This adds a GPU kernel that performs int4 weight-only quantization and supports `weight_only_linear` (it also aligns with the dequantization interface; if you just want to try quantization and dequantization, you can run this code):

```python
import paddle
x = paddle.randn(shape=[4096, 2048], dtype=paddle.float16)
qt, scale =...
```
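The core of int4 weight-only quantization is a per-channel symmetric mapping into the signed 4-bit range [-8, 7] plus a scale for dequantization. A minimal NumPy sketch of that round trip, assuming per-row scales (this is illustrative, not the Paddle kernel itself):

```python
import numpy as np

def quantize_int4(w):
    # Per-output-channel symmetric quantization into the int4 range [-8, 7].
    # Scale maps the largest magnitude in each row to 7.
    scale = np.abs(w).max(axis=1, keepdims=True) / 7.0
    q = np.clip(np.round(w / scale), -8, 7).astype(np.int8)
    return q, scale

def dequantize_int4(q, scale):
    # Inverse mapping; the rounding error is bounded by scale / 2.
    return q.astype(np.float32) * scale
```

A real kernel would additionally pack two int4 values per byte and fuse the dequantization into the matmul, which is what makes weight-only quantization fast at inference time.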


Invited by @sanbuphy and the Baidu PaddlePaddle (Xiamen) AI Industry Empowerment Center to add InternVL2-8B support to the PaddleMIX suite. It is basically done, but accuracy has not been verified yet, so I am opening this as WIP first; I worked until dawn and am a bit sleepy 😪. TODO:
- [x] Define the model structure
- [x] Download the model
- [x] Convert the weights to Paddle format and contribute the conversion script
- [x] Fix a Tokenizer bug
- [x] Fix the preprocessing logic
- [x] Fix the model forward code
- [ ] Test dataset, dataloader...


Support int4 weight-only quantization for Llama3:
1. Define the weight-only layer in ModelParallel.py
2. Define ConvertWeightToOpmx.py and add quantization there
3. Update the dynamic/static modeling for the quantized model...