WenXIN-AI issues

Results: 7 issues of WenXIN-AI

I encountered a problem while determining the number of proofreading datasets

I attempted a preliminary GPTQ quantization using the official examples; the only change was replacing the model with a fine-tuned version of the Baichuan model. When I...

executeV2: Error Code 1: Cask (Cask Pooling Runner Execute Failure)

When I use the latest expression for loading ResNet-50, it runs successfully:
```
# Load the pretrained ResNet-50 model
model = models.resnet50(weights=ResNet50_Weights.IMAGENET1K_V1)
.......
context.execute_v2(bindings)
cuda.memcpy_dtoh(output_data, d_output)
```
output: ``` [[-1.99739981e+00 -8.73850882e-01 -2.61901110e-01...

triaged

Due to FlashAttention, inference cannot be performed on V100

### Description FlashAttention only supports Ampere GPUs or newer. ### Case Explanation Because of FlashAttention, inference cannot be performed on a V100.

badcase
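The constraint reported in this issue can be checked before a model is loaded. The following is a minimal sketch, not code from the repository: it assumes that "Ampere or newer" corresponds to a CUDA compute capability major version of 8 or higher (a V100 reports 7.0), and the function names `supports_flash_attention` and `pick_attention_backend` are hypothetical.

```python
# Assumption: "Ampere or newer" means compute capability major >= 8.
# A V100 (Volta) reports 7.0; an A100 (Ampere) reports 8.0.
AMPERE_MAJOR = 8


def supports_flash_attention(major: int, minor: int = 0) -> bool:
    """Return True if a GPU with this compute capability is Ampere or newer."""
    return major >= AMPERE_MAJOR


def pick_attention_backend(major: int, minor: int = 0) -> str:
    """Hypothetical helper: choose an attention implementation by capability."""
    return "flash" if supports_flash_attention(major, minor) else "standard"


if __name__ == "__main__":
    try:
        import torch  # optional: only needed to query a real device
    except ImportError:
        torch = None
    if torch is not None and torch.cuda.is_available():
        # get_device_capability returns a (major, minor) tuple
        major, minor = torch.cuda.get_device_capability(0)
        print(pick_attention_backend(major, minor))
```

With a check like this, a V100 would be routed to a non-FlashAttention code path instead of failing at inference time, provided the model offers such a path.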

can't load bin factory without pytorch

### Description
---------------------------------------------------------------------------
ModuleNotFoundError                       Traceback (most recent call last)
Cell In[5], line 1
----> 1 state_dict_1=flow.load('/home/aistudio/chinese-llama-2-7b/pytorch_model-00001-of-00002.bin')
File ~/external-libraries/oneflow/framework/check_point_v2.py:444, in load(path, global_src_rank, map_location, support_pytorch_format)
    441 i = _broadcast_py_object(None, global_src_rank)
    442 load...

stale
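A guard along these lines could surface the problem earlier with a clearer message. This is only a sketch: it assumes the `ModuleNotFoundError` refers to the `torch` package, which the issue title suggests but the truncated traceback does not confirm, and the helpers `module_available` and `require_torch_for_bin` are hypothetical names.

```python
import importlib.util


def module_available(name: str) -> bool:
    """Return True if a top-level module can be found without importing it."""
    return importlib.util.find_spec(name) is not None


def require_torch_for_bin(path: str) -> None:
    """Hypothetical pre-check before loading a PyTorch-format .bin checkpoint."""
    # Assumption: reading this checkpoint format needs the torch package.
    if path.endswith(".bin") and not module_available("torch"):
        raise RuntimeError(
            f"{path} looks like a PyTorch checkpoint, but torch is not installed"
        )
```

Calling `require_torch_for_bin(path)` before the load call would replace the deep traceback with a single actionable error.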

good!

If this model supports streaming output, it would be a great opportunity to assist virtual personal assistants

Open custom memory loading strategy for LLM nodes