yk1012664593
llama model init start

```
INFO 04-26 17:03:13 llm_engine.py:98] Initializing an LLM engine (v0.4.1) with config: model='/mnt/deep_learning_test/testsuite/dataset/llms_inference_llama7b-v2_accelerate/checkpoint/7B-V2/', speculative_config=None, tokenizer='/mnt/deep_learning_test/testsuite/dataset/llms_inference_llama7b-v2_accelerate/checkpoint/7B-V2/', skip_tokenizer_init=False, tokenizer_mode=auto, revision=None, tokenizer_revision=None, trust_remote_code=True, dtype=torch.float16, max_seq_len=4096, download_dir=None, load_format=LoadFormat.AUTO, tensor_parallel_size=1, disable_custom_all_reduce=False, quantization=None, enforce_eager=False, ...
```
> Does it happen every time, or only for certain prompts?
>
> Also:
>
> > If you hit a crash or hang, set `export VLLM_TRACE_FUNCTION=1` and every function call inside vllm will be recorded. Check those log files to determine which function crashed or hung.

Yes, this issue is inevitable. On the H20 GPU, every vllm version will hit it when running in float16 precision...
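As a minimal sketch of the tracing workflow suggested above: `VLLM_TRACE_FUNCTION` has to be set in the launching process before the engine starts, so the simplest pattern is to export it from Python before importing vllm. The model path in the comment is a placeholder, not the reporter's actual checkpoint.

```python
import os

# VLLM_TRACE_FUNCTION enables vLLM's function-call tracing; it must be
# set before the engine process starts so every vllm call is logged.
# The resulting trace files show which call crashed or hung.
os.environ["VLLM_TRACE_FUNCTION"] = "1"

# The engine would be created after this point, e.g. (requires vllm):
#   from vllm import LLM
#   llm = LLM(model="/path/to/7B-V2/", dtype="float16")  # hypothetical path
print(os.environ["VLLM_TRACE_FUNCTION"])
```

Note that setting the variable after the engine has already initialized has no effect, which is why the export belongs at the very top of the script or in the shell before launch.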