Shengsheng Huang
> > Updated question here: when I run inference with the following GPU, how can I put the input ids on another GPU? > > Then what does this figure mean,...
Since the problem is solved, can we close this issue?
It seems you're using OpenVINO 2023.2; this example only supports OpenVINO 2022.3.
If you are using HuggingFace transformers to load your LLaMA model, you can refer to the llama2 example here: https://github.com/intel-analytics/BigDL/tree/main/python/llm/example/CPU/HF-Transformers-AutoModels/Model/llama2. If you are using customized code to load the LLaMA model,...
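As a minimal sketch of the pattern in the linked llama2 example: bigdl-llm mirrors the HuggingFace transformers API, so loading mainly means switching the import path and enabling INT4 with `load_in_4bit=True`. The model path below is a placeholder, and the prompt helper assumes the standard Llama-2 chat `[INST] ... [/INST]` wrapping; check the linked example for the exact prompt format it uses.

```python
# Sketch: loading a LLaMA 2 model through bigdl-llm's transformers-style API.
# Assumes bigdl-llm and transformers are installed; model path is a placeholder.

def build_llama2_prompt(user_message: str) -> str:
    """Wrap a user message in the Llama-2 chat [INST] ... [/INST] format."""
    return f"[INST] {user_message} [/INST]"

if __name__ == "__main__":
    # The key change vs. plain transformers is the import path of
    # AutoModelForCausalLM plus load_in_4bit=True for INT4 quantization.
    from bigdl.llm.transformers import AutoModelForCausalLM
    from transformers import LlamaTokenizer

    model_path = "meta-llama/Llama-2-7b-chat-hf"  # placeholder model id/path
    model = AutoModelForCausalLM.from_pretrained(model_path, load_in_4bit=True)
    tokenizer = LlamaTokenizer.from_pretrained(model_path)

    inputs = tokenizer(build_llama2_prompt("What is AI?"), return_tensors="pt")
    output = model.generate(inputs.input_ids, max_new_tokens=32)
    print(tokenizer.decode(output[0], skip_special_tokens=True))
```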
XLM-RoBERTa-large-XNLI can be loaded using the transformers API as shown in https://huggingface.co/joeddav/xlm-roberta-large-xnli#with-manual-pytorch. You can give it a quick try using the transformers API in bigdl-llm: simply change the...
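For reference, a minimal sketch of the zero-shot route described on that model card, using the plain transformers pipeline (the model id is real; the hypothesis template shown is the pipeline's default, "This example is {}."):

```python
# Sketch: zero-shot classification with joeddav/xlm-roberta-large-xnli via the
# standard transformers pipeline. Assumes transformers (and a model download)
# are available; the sample premise and labels are illustrative only.

def fill_hypothesis(template: str, label: str) -> str:
    """Instantiate the zero-shot hypothesis template for one candidate label."""
    return template.format(label)

if __name__ == "__main__":
    from transformers import pipeline

    classifier = pipeline("zero-shot-classification",
                          model="joeddav/xlm-roberta-large-xnli")
    result = classifier(
        "За кого вы голосуете в 2020 году?",        # multilingual premise
        candidate_labels=["Europe", "public health", "politics"],
        hypothesis_template="This example is {}.",  # the pipeline default
    )
    print(result["labels"][0])  # highest-scoring label
```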
How about adding Streaming LLM directly into bigdl-llm instead of making it another example? It could benefit other applications/examples.
I have created a new folder for the Chinese version. You can put your guide in this path: https://github.com/intel-analytics/bigdl-llm-tutorial/tree/main/Chinese_Version/ch_8_Applications.