
onediff node fails to run on a 3090

Open Shiyao-Huang opened this issue 1 year ago • 5 comments

Describe the bug

OneDiff acceleration in ComfyUI does not run on a 3090. This affects both acceleration of the base model on its own and IPAdapter acceleration.

Your environment

OS : ubuntu20.04

nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2019 NVIDIA Corporation
Built on Sun_Jul_28_19:07:16_PDT_2019
Cuda compilation tools, release 10.1, V10.1.243


 Driver Version: 550.100        CUDA Version: 12.4 

>>> import torch
>>> print(torch.__version__)
2.2.1+cu121
>>> print(torch.version.cuda)
12.1
>>> print(torch.cuda.is_available())
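
The environment above reports two different CUDA versions: the system `nvcc` is release 10.1, while the installed PyTorch build targets 12.1. A minimal sketch for comparing the two from pasted text follows. The helper names `parse_nvcc_release` and `parse_cuda_version` are hypothetical and not part of OneDiff, OneFlow, or PyTorch. Pip wheels generally bundle their own CUDA runtime, so a mismatch here is worth noting but is not necessarily the cause of the crash.

```python
import re


def parse_nvcc_release(nvcc_output):
    """Return (major, minor) from `nvcc --version` text, or None if absent."""
    match = re.search(r"release\s+(\d+)\.(\d+)", nvcc_output)
    return (int(match.group(1)), int(match.group(2))) if match else None


def parse_cuda_version(version):
    """Turn a version string such as '12.1' into the tuple (12, 1)."""
    major, minor = version.split(".")[:2]
    return int(major), int(minor)


# Values copied from the report above.
NVCC_TEXT = "Cuda compilation tools, release 10.1, V10.1.243"
TORCH_CUDA = "12.1"  # what torch.version.cuda printed

toolkit = parse_nvcc_release(NVCC_TEXT)
runtime = parse_cuda_version(TORCH_CUDA)
print("system toolkit:", toolkit, "| torch build:", runtime, "| match:", toolkit == runtime)
```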

OneDiff git commit id

d71772f06a2933e73b3c78aeaf4f1bf6c04c9333

OneFlow version info if you have installed oneflow

Run python -m oneflow --doctor and paste it here.

/comfyUI# python -m oneflow --doctor
libibverbs not available, ibv_fork_init skipped
path: ['/root/miniconda3/envs/comfyUI_sy/lib/python3.9/site-packages/oneflow']
version: 0.9.1.dev20240717+cu121
git_commit: ec7b682
cmake_build_type: Release
rdma: True
mlir: True
enterprise: False

How To Reproduce

Steps to reproduce the behavior (code or script): after adding the IPA (IPAdapter) face-reference ComfyUI node, the first compilation fails with an error at the KSampler step.

The complete error message

Requested to load CLIPVisionModelProjection
Loading 1 new model
Requested to load SDXLClipModel
Loading 1 new model
Requested to load SDXL
Requested to load ControlNet
Loading 2 new models
  0%|          | 0/20 [00:00<?, ?it/s]
Stack trace (most recent call last) in thread 533306:
   Object "/root/miniconda3/envs/comfyUI_sy/lib/python3.9/site-packages/oneflow/../oneflow.libs/liboneflow-832d5503.so", at 0x7f96b38f94a7, in 
   Object "/root/miniconda3/envs/comfyUI_sy/lib/python3.9/site-packages/oneflow/../oneflow.libs/liboneflow-832d5503.so", at 0x7f96b38f8d1c, in 
   Object "/root/miniconda3/envs/comfyUI_sy/lib/python3.9/site-packages/oneflow/../oneflow.libs/liboneflow-832d5503.so", at 0x7f96b38f4598, in vm::ThreadCtx::TryReceiveAndRun()
   Object "/root/miniconda3/envs/comfyUI_sy/lib/python3.9/site-packages/oneflow/../oneflow.libs/liboneflow-832d5503.so", at 0x7f96b3896d34, in vm::EpStreamPolicyBase::Run(vm::Instruction*) const
   Object "/root/miniconda3/envs/comfyUI_sy/lib/python3.9/site-packages/oneflow/../oneflow.libs/liboneflow-832d5503.so", at 0x7f96b389a037, in vm::Instruction::Compute()
   Object "/root/miniconda3/envs/comfyUI_sy/lib/python3.9/site-packages/oneflow/../oneflow.libs/liboneflow-832d5503.so", at 0x7f96b392138f, in vm::FuseInstructionPolicy::Compute(vm::Instruction*)
   Object "/root/miniconda3/envs/comfyUI_sy/lib/python3.9/site-packages/oneflow/../oneflow.libs/liboneflow-832d5503.so", at 0x7f96b389a037, in vm::Instruction::Compute()
   Object "/root/miniconda3/envs/comfyUI_sy/lib/python3.9/site-packages/oneflow/../oneflow.libs/liboneflow-832d5503.so", at 0x7f96b38a1418, in vm::OpCallInstructionPolicy::Compute(vm::Instruction*)
   Object "/root/miniconda3/envs/comfyUI_sy/lib/python3.9/site-packages/oneflow/../oneflow.libs/liboneflow-832d5503.so", at 0x7f96b38a0bbc, in 
   Object "/root/miniconda3/envs/comfyUI_sy/lib/python3.9/site-packages/oneflow/../oneflow.libs/liboneflow-832d5503.so", at 0x7f96b38a3fc7, in vm::OpCallInstructionUtil::Compute(vm::OpCallInstructionPolicy*, vm::Stream*, bool, bool)
   Object "/root/miniconda3/envs/comfyUI_sy/lib/python3.9/site-packages/oneflow/../oneflow.libs/liboneflow-832d5503.so", at 0x7f96b38a2699, in vm::OpCallInstructionUtil::Compute(vm::OpCallInstructionPolicy*, vm::Stream*, bool, bool)::{lambda()#1}::operator()() const
   Object "/root/miniconda3/envs/comfyUI_sy/lib/python3.9/site-packages/oneflow/../oneflow.libs/liboneflow-832d5503.so", at 0x7f96b6e74eb9, in StatefulOpKernel::Compute(eager::CallContext*, ep::Stream*, user_op::OpKernel const*, user_op::OpKernelState*, user_op::OpKernelCache const*) const
   Object "/root/miniconda3/envs/comfyUI_sy/lib/python3.9/site-packages/oneflow/../oneflow.libs/liboneflow-832d5503.so", at 0x7f96b6fd1a18, in 
   Object "/root/miniconda3/envs/comfyUI_sy/lib/python3.9/site-packages/oneflow/../oneflow.libs/liboneflow-832d5503.so", at 0x7f96b3f8ac20, in Conv2dEngineMgr::GetConv2dEngine(ep::CudaStream*, Conv2dConfig const&, Conv2dArguement const&, std::string const&)
   Object "/root/miniconda3/envs/comfyUI_sy/lib/python3.9/site-packages/oneflow/../oneflow.libs/liboneflow-832d5503.so", at 0x7f96b3f92ae5, in CutlassConv2dEngine::Init(ep::CudaStream*, Conv2dConfig const&, Conv2dArguement const&, nlohmann::json_abi_v3_11_2::basic_json<std::map, std::vector, std::string, bool, long, unsigned long, double, std::allocator, nlohmann::json_abi_v3_11_2::adl_serializer, std::vector<unsigned> > const&)
   Object "/root/miniconda3/envs/comfyUI_sy/lib/python3.9/site-packages/oneflow/../oneflow.libs/liboneflow-832d5503.so", at 0x7f96b3f9142b, in 
   Object "/root/miniconda3/envs/comfyUI_sy/lib/python3.9/site-packages/oneflow/../oneflow.libs/liboneflow-832d5503.so", at 0x7f96ab08be79, in 

Aborted (Signal sent by tkill() 532452 0)
Aborted (core dumped)

Additional context


Shiyao-Huang avatar Jul 18 '24 09:07 Shiyao-Huang

Please check whether GPU memory filled up and caused an OOM.

strint avatar Jul 19 '24 07:07 strint

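Please check whether GPU memory filled up and caused an OOM.

Monitoring the NVIDIA GPU with watch, memory usage peaked at 13 GB.

After removing the other components and using only LoRA + SDXL, memory usage is 8 GB and the error still occurs.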

Shiyao-Huang avatar Jul 22 '24 11:07 Shiyao-Huang
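
To make the peak-memory observation above repeatable, one option is to poll `nvidia-smi` while the workflow runs and record the maximum. This is only a sketch: `parse_memory_used` and `peak_memory_mib` are hypothetical helper names, and the sampling count and interval are arbitrary assumptions. The `--query-gpu=memory.used` and `--format=csv,noheader,nounits` options make `nvidia-smi` print one number per GPU, in MiB.

```python
import subprocess
import time

# One line per GPU, value in MiB, with no header and no unit suffix.
QUERY = ["nvidia-smi", "--query-gpu=memory.used", "--format=csv,noheader,nounits"]


def parse_memory_used(output):
    """Parse the nvidia-smi query output into a list of MiB values."""
    return [int(line.strip()) for line in output.splitlines() if line.strip()]


def peak_memory_mib(samples=30, interval=1.0):
    """Poll nvidia-smi `samples` times and return the highest value seen on any GPU."""
    peak = 0
    for _ in range(samples):
        result = subprocess.run(QUERY, capture_output=True, text=True, check=True)
        peak = max([peak, *parse_memory_used(result.stdout)])
        time.sleep(interval)
    return peak
```

Calling `peak_memory_mib()` in a second terminal while the sampler runs gives a single figure to quote in the report. Polling can miss short spikes between samples, so a low reading does not fully rule out an OOM.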

Could you share the workflow? The current error message is not enough to pin down the cause, so we need to reproduce it.

strint avatar Jul 22 '24 12:07 strint

Could you share the workflow? The current error message is not enough to pin down the cause, so we need to reproduce it.

loraok.json

Shiyao-Huang avatar Jul 23 '24 00:07 Shiyao-Huang

OneDiff git commit id

d71772f06a2933e73b3c78aeaf4f1bf6c04c9333

Please use the onediff main branch, @Shiyao-Huang (judging by the commit id, your checkout is not on main). LoRA usage documentation: https://github.com/siliconflow/onediff/blob/main/onediff_comfy_nodes/docs/lora.md

ccssu avatar Jul 26 '24 07:07 ccssu
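
Whether a given commit is contained in main can be checked locally with `git merge-base --is-ancestor`, which exits with 0 when the commit is an ancestor of the branch and 1 when it is not. The sketch below wraps that call. The function names are hypothetical, and it assumes a local clone of the onediff repository whose remote is named `origin` and has been fetched recently.

```python
import subprocess


def ancestor_check_command(commit, branch="origin/main"):
    """Build the git command that tests whether `commit` is contained in `branch`."""
    return ["git", "merge-base", "--is-ancestor", commit, branch]


def is_commit_on_branch(commit, branch="origin/main", repo_dir="."):
    """Return True if `commit` is an ancestor of `branch` in the repo at `repo_dir`."""
    result = subprocess.run(
        ancestor_check_command(commit, branch),
        cwd=repo_dir,
        capture_output=True,
        text=True,
    )
    if result.returncode == 0:
        return True
    if result.returncode == 1:
        return False
    # Any other exit code means git itself failed (unknown commit, not a repo, ...).
    raise RuntimeError(result.stderr.strip() or "git merge-base failed")
```

For example, `is_commit_on_branch("d71772f06a2933e73b3c78aeaf4f1bf6c04c9333", repo_dir="path/to/onediff")` would report whether the commit from this issue is part of main; the path is a placeholder.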