lanlin issues

Results: 16 issues of lanlin

fix import error in parallel build

When using a custom model file and building the engine in parallel, the original implementation raises an import error.

support jax wheels

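Mirror name: jax-wheels Upstream path: https://storage.googleapis.com/jax-releases/jax_cuda_releases.html Mirror description: https://github.com/google/jax Sync status on other mirrors in China: Sync method: Mirror size: Status tracking: - [ ] Sync approved - [ ] Sync script - [ ] Deployed to mirror source - [ ] First sync succeeded - [ ] Test script...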

new-mirror

Move Tensorkubes content to separate repository

Yesterday, I saw the complete example of tf on k8s, but now some files are missing from the "tensorkubes" branch. Could you send me the original complete files? thx~

[Bug] Ascend v0.7.2.post1: benchmarking the serving API intermittently hangs

### Checklist - [x] 1. I have searched related issues but cannot get the expected help. - [x] 2. The bug has not been fixed in the latest version. -...

when a model has layers with and without GPT plugin enabled, GptSession raises error

### System Info TensorRT-LLM: latest main branch built in the triton-trtllm container (23.12) GPU: V100 ### Who can help? @byshiue ### Information - [ ] The official example scripts -...

bug

triaged

Investigating

Generic Runtime

the first sleep call does not release memory when initialized

env: ``` ms-swift==3.10.0 transformers==4.57.1 ``` I added a print after the sleep call [here](https://github.com/modelscope/ms-swift/blob/main/swift/trainers/rlhf_trainer/rollout_mixin.py#L170) ``` ... context = self.offload_context if self.enable_offload else nullcontext with context(): self.engine = self._prepare_vllm_engine() if args.sleep_level > 0:...