ianzangwill2 comments

Results: 6 comments by ianzangwill2

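How to improve inference performance

Does the 7 minutes include model loading time? -- No, it does not. I forgot to mention that I installed WSL2 on Windows 11 and run inference in Ubuntu inside WSL2. Which function takes the load_in_8bit=True parameter? Thanks.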

I ran a test; here are the results. Without load_in_8bit=True, model loading and inference are very slow: it takes 5-7 minutes to get a result, and the model's dtype is reported as torch.float32. With load_in_8bit=True, I get the following message: Overriding torch_dtype=None with `torch_dtype=torch.float16` due to requirements of `bitsandbytes` to enable model loading in mixed int8. Either pass torch_dtype=torch.float16 or don't pass this argument...
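For readers with the same question: when this thread was written, load_in_8bit was a keyword argument of from_pretrained() in Hugging Face transformers (newer releases prefer passing a BitsAndBytesConfig via quantization_config). The sketch below shows one way to pass it together with an explicit torch_dtype, so that the "Overriding torch_dtype=None" text no longer appears. As far as can be told, that text is an informational notice rather than the failure itself; the quoted log is truncated, so the actual error is not visible here. The helper names, the use of AutoModelForCausalLM, and the model path are assumptions for illustration, not something stated in the thread.

```python
from typing import Any, Dict


def build_load_kwargs(use_8bit: bool) -> Dict[str, Any]:
    """Return keyword arguments for from_pretrained() (hypothetical helper).

    The dtype is stored by name so this function can be inspected
    without torch installed.
    """
    if use_8bit:
        # int8 loading relies on bitsandbytes and accelerate and needs a CUDA GPU.
        # Passing float16 explicitly matches what bitsandbytes would pick anyway.
        return {
            "load_in_8bit": True,
            "torch_dtype": "float16",
            "device_map": "auto",
        }
    # Mirrors the slow case reported in the thread: full float32 precision.
    return {"torch_dtype": "float32"}


def load_model(model_path: str, use_8bit: bool = True):
    """Load tokenizer and model; model_path is a placeholder you must supply."""
    # Imported lazily so build_load_kwargs() works without these packages.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    kwargs = build_load_kwargs(use_8bit)
    # Turn the dtype name into the real torch dtype object.
    kwargs["torch_dtype"] = getattr(torch, kwargs["torch_dtype"])

    tokenizer = AutoTokenizer.from_pretrained(model_path)
    # Assumption: the checkpoint is a causal language model.
    model = AutoModelForCausalLM.from_pretrained(model_path, **kwargs)
    return tokenizer, model
```

Calling load_model("path/to/your/model") would then load the weights in int8; the path is a placeholder, since the thread does not name the checkpoint.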

How to improve inference performance

Thanks! I'll give it a try some day. In reply to Bo仔很忙 (2023/4/10 23:30), Re: [LianjiaTech/BELLE] How to improve inference performance (Issue #101): "If you're interested, you can try the re-implemented version. After int8 quantization it needs about 8 GB of GPU memory, and inference speed on a single GPU is acceptable. belle; it also re-implements chatglm and the original belle."

How to improve inference performance

I tried it twice, but it seems to use a lot of memory. With 30 GB of memory already allocated it still appeared to be loading, and then the OS killed the process. In reply to Bo仔很忙 (2023/4/11 1:30), Re: [LianjiaTech/BELLE] How to improve inference performance (Issue #101): "If you're interested, you can try the re-implemented version. After int8 quantization it needs about 8 GB of GPU memory, and inference speed on a single GPU is acceptable. belle; it also re-implements chatglm and the original belle."

How to improve inference performance

Are you using Windows + WSL, or native Linux? In reply to Bo仔很忙 (2023/4/17 11:17), Re: [LianjiaTech/BELLE] How to improve inference performance (Issue #101), who quoted "I tried it twice, but it seems to use a lot of memory. With 30 GB of memory already allocated it still appeared to be loading, and then the OS killed the process" and answered: "My machine has 32 GB of RAM and has no problem; I have not seen this happen. I'll check whether such an issue exists."

How to improve inference performance

Sorry, I've been busy lately and haven't replied. Are you calling the Python libraries directly on Windows? In reply to Bo仔很忙 (2023/4/18 10:19), Re: [LianjiaTech/BELLE] How to improve inference performance (Issue #101): "Are you using Windows + WSL, or native Linux?" ...