lyc728 issues

Results: 20 issues of lyc728

Install Triton Inference Server without Docker containers

/usr/bin/ld: ../libtritonserver.so: undefined reference to `absl::lts_20220623::StartsWithIgnoreCase(absl::lts_20220623::string_view, absl::lts_20220623::string_view)'
/usr/bin/ld: ../libtritonserver.so: undefined reference to `absl::lts_20220623::numbers_internal::safe_strto64_base(absl::lts_20220623::string_view, long*, int)'
/usr/bin/ld: ../libtritonserver.so: undefined reference to `absl::lts_20220623::ParseTime(absl::lts_20220623::string_view, absl::lts_20220623::string_view, absl::lts_20220623::Time*, std::__cxx11::basic_string*)'
/usr/bin/ld: ../libtritonserver.so: undefined reference to `absl::lts_20220623::FormatTime[abi:cxx11](absl::lts_20220623::string_view,...

Triton's GPU memory footprint is twice that of TensorRT-LLM

In my test of qwen-72b, the model is loaded across four cards with weight_only_precision int4, and each card uses about 12G. However, when Triton runs inference, each card can use about 28G. May...
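As a rough sanity check on the figures above, the memory needed for the weights alone can be estimated from the parameter count and the precision. The sketch below assumes exactly 72 billion parameters, 4 bits per weight, and even sharding across four GPUs (all assumptions, not measurements). It ignores quantization scales, any layers left unquantized, activations, KV cache, and runtime buffers, so it only gives a rough floor to compare against the reported 12G and 28G.

```python
def weight_memory_gb(num_params: float, bits_per_weight: int, num_gpus: int) -> float:
    """Approximate per-GPU weight memory in GB (1 GB = 1e9 bytes).

    Assumes the weights are split evenly across GPUs and nothing else
    (scales, activations, KV cache) is counted.
    """
    total_bytes = num_params * bits_per_weight / 8
    return total_bytes / num_gpus / 1e9


# Assumed values: 72e9 parameters, int4 weights, four GPUs.
per_gpu = weight_memory_gb(72e9, 4, 4)
print(f"{per_gpu:.1f} GB per GPU for weights alone")
```

Anything a card uses beyond this estimate comes from something other than the quantized weights themselves.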

qwen-vl

Is there any code for fine-tuning qwen-vl? Do you have any plans to reproduce it in the near future?

qwen-vl

When doing SFT on qwen-vl to fine-tune a chat-style model, is the model fine-tuned in stage three the stage-two model, or the qwen-vl-chat model?

qwen-vl

Is it possible to do fine-tuning or stage-three training based on qwen-vl? If changes are needed, what should I watch out for?

Stage-three and stage-two fine-tuning of the Qwen-VL model

The technical report says the SFT data is 35K, but is there any further explanation of what this data consists of?

Stage-three and stage-two fine-tuning of the Qwen model

### Start Date
2024/3/21
### Implementation PR
Fine-tuning of qwen-vl
### Reference Issues
_No response_
### Summary
Which scripts are used for the stage-three and stage-two fine-tuning of qwen-vl?
### Basic Example
Do finetune/ds_config_zero2.json and finetune/ds_config_zero3.json correspond to stage two and stage three respectively?...
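One way to see what the two config files select is to read their `zero_optimization.stage` field. In DeepSpeed's config schema this field sets the ZeRO optimization stage (how optimizer state, gradients, and parameters are partitioned across GPUs), which is a separate notion from a model's training stages. A minimal sketch, assuming the files follow that schema; the inline JSON is a made-up stand-in, not the contents of the real files, and `zero_stage` is a hypothetical helper.

```python
import json


def zero_stage(config_text: str):
    """Return the ZeRO stage from a DeepSpeed-style JSON config, or None if unset."""
    config = json.loads(config_text)
    return config.get("zero_optimization", {}).get("stage")


# Hypothetical stand-in for the contents of a ds_config_zero2.json-style file.
example = '{"zero_optimization": {"stage": 2}}'
print(zero_stage(example))
```

To inspect a real file, pass its text instead, for example `zero_stage(open(path).read())`.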

question

dataset

assert len(datasets) > 0, 'datasets should not be an empty iterable' # type: ignore
AssertionError: datasets should not be an empty iterable
The data format is: ![image](https://user-images.githubusercontent.com/88084847/144958390-fca6f666-1d92-4eed-b0b0-6484bdcf37c3.png)

LayoutLLM data download error

Hello, I downloaded some of the datasets and got errors after extracting them. The files may be corrupted. Could you take a look?

How does Triton implement batch inference

In the [TensorRT-LLM](https://github.com/NVIDIA/TensorRT-LLM) build.py there is `parser.add_argument('--max_batch_size', type=int, default=10)`. However, when Triton calls the code via `client/inflight_batcher_llm_client.py`, it sends gRPC requests at the same time, accepts them and returns them. How does it...
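For intuition only, the general idea of server-side batching can be sketched as grouping whatever requests are pending into batches no larger than a configured limit. This is a toy illustration, not Triton's or TensorRT-LLM's actual scheduler: `form_batches` and the request names are hypothetical, and in-flight batching additionally adds and removes requests between generation steps, which this sketch does not model.

```python
from typing import List


def form_batches(pending: List[str], max_batch_size: int = 10) -> List[List[str]]:
    """Split pending requests into consecutive batches of at most max_batch_size."""
    if max_batch_size <= 0:
        raise ValueError("max_batch_size must be positive")
    return [
        pending[i:i + max_batch_size]
        for i in range(0, len(pending), max_batch_size)
    ]


# 23 hypothetical requests that arrived at roughly the same time.
requests = [f"req-{i}" for i in range(23)]
batches = form_batches(requests)
print([len(batch) for batch in batches])  # [10, 10, 3]
```

Under this simplified picture, clients can each send a single request while the server decides how to combine them.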

triaged