
Running python gradio/app.py fails with an error saying only A100 GPUs are supported

Open liuchunyu524 opened this issue 10 months ago • 5 comments

Running on local URL: http://127.0.0.1:7860

To create a public link, set share=True in launch().
Exception in thread Thread-4 (_do_normal_analytics_request):
Traceback (most recent call last):
  File "/data2/parkliu/Open-Sora-main/venv/lib/python3.10/site-packages/httpx/_transports/default.py", line 69, in map_httpcore_exceptions
    yield
  File "/data2/parkliu/Open-Sora-main/venv/lib/python3.10/site-packages/httpx/_transports/default.py", line 233, in handle_request
    resp = self._pool.handle_request(req)
  File "/data2/parkliu/Open-Sora-main/venv/lib/python3.10/site-packages/httpcore/_sync/connection_pool.py", line 216, in handle_request
    raise exc from None
  File "/data2/parkliu/Open-Sora-main/venv/lib/python3.10/site-packages/httpcore/_sync/connection_pool.py", line 196, in handle_request
    response = connection.handle_request(
  File "/data2/parkliu/Open-Sora-main/venv/lib/python3.10/site-packages/httpcore/_sync/connection.py", line 99, in handle_request
    raise exc
  File "/data2/parkliu/Open-Sora-main/venv/lib/python3.10/site-packages/httpcore/_sync/connection.py", line 76, in handle_request
    stream = self._connect(request)
  File "/data2/parkliu/Open-Sora-main/venv/lib/python3.10/site-packages/httpcore/_sync/connection.py", line 122, in _connect
    stream = self._network_backend.connect_tcp(**kwargs)
  File "/data2/parkliu/Open-Sora-main/venv/lib/python3.10/site-packages/httpcore/_backends/sync.py", line 205, in connect_tcp
    with map_exceptions(exc_map):
  File "/home/parkliu/miniconda3/lib/python3.10/contextlib.py", line 153, in __exit__
    self.gen.throw(typ, value, traceback)
  File "/data2/parkliu/Open-Sora-main/venv/lib/python3.10/site-packages/httpcore/_exceptions.py", line 14, in map_exceptions
    raise to_exc(exc) from exc
httpcore.ConnectTimeout: timed out

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/home/parkliu/miniconda3/lib/python3.10/threading.py", line 1016, in _bootstrap_inner
    self.run()
  File "/home/parkliu/miniconda3/lib/python3.10/threading.py", line 953, in run
    self._target(*self._args, **self._kwargs)
  File "/data2/parkliu/Open-Sora-main/venv/lib/python3.10/site-packages/gradio/analytics.py", line 61, in _do_normal_analytics_request
    data["ip_address"] = get_local_ip_address()
  File "/data2/parkliu/Open-Sora-main/venv/lib/python3.10/site-packages/gradio/analytics.py", line 117, in get_local_ip_address
    ip_address = httpx.get(
  File "/data2/parkliu/Open-Sora-main/venv/lib/python3.10/site-packages/httpx/_api.py", line 198, in get
    return request(
  File "/data2/parkliu/Open-Sora-main/venv/lib/python3.10/site-packages/httpx/_api.py", line 106, in request
    return client.request(
  File "/data2/parkliu/Open-Sora-main/venv/lib/python3.10/site-packages/httpx/_client.py", line 827, in request
    return self.send(request, auth=auth, follow_redirects=follow_redirects)
  File "/data2/parkliu/Open-Sora-main/venv/lib/python3.10/site-packages/httpx/_client.py", line 914, in send
    response = self._send_handling_auth(
  File "/data2/parkliu/Open-Sora-main/venv/lib/python3.10/site-packages/httpx/_client.py", line 942, in _send_handling_auth
    response = self._send_handling_redirects(
  File "/data2/parkliu/Open-Sora-main/venv/lib/python3.10/site-packages/httpx/_client.py", line 979, in _send_handling_redirects
    response = self._send_single_request(request)
  File "/data2/parkliu/Open-Sora-main/venv/lib/python3.10/site-packages/httpx/_client.py", line 1015, in _send_single_request
    response = transport.handle_request(request)
  File "/data2/parkliu/Open-Sora-main/venv/lib/python3.10/site-packages/httpx/_transports/default.py", line 232, in handle_request
    with map_httpcore_exceptions():
  File "/home/parkliu/miniconda3/lib/python3.10/contextlib.py", line 153, in __exit__
    self.gen.throw(typ, value, traceback)
  File "/data2/parkliu/Open-Sora-main/venv/lib/python3.10/site-packages/httpx/_transports/default.py", line 86, in map_httpcore_exceptions
    raise mapped_exc(message) from exc
httpx.ConnectTimeout: timed out
Exception in thread Thread-6 (_do_normal_analytics_request):
Traceback (most recent call last):
  File "/data2/parkliu/Open-Sora-main/venv/lib/python3.10/site-packages/httpx/_transports/default.py", line 69, in map_httpcore_exceptions
    yield
  File "/data2/parkliu/Open-Sora-main/venv/lib/python3.10/site-packages/httpx/_transports/default.py", line 233, in handle_request
    resp = self._pool.handle_request(req)
  File "/data2/parkliu/Open-Sora-main/venv/lib/python3.10/site-packages/httpcore/_sync/connection_pool.py", line 216, in handle_request
    raise exc from None
  File "/data2/parkliu/Open-Sora-main/venv/lib/python3.10/site-packages/httpcore/_sync/connection_pool.py", line 196, in handle_request
    response = connection.handle_request(
  File "/data2/parkliu/Open-Sora-main/venv/lib/python3.10/site-packages/httpcore/_sync/connection.py", line 99, in handle_request
    raise exc
  File "/data2/parkliu/Open-Sora-main/venv/lib/python3.10/site-packages/httpcore/_sync/connection.py", line 76, in handle_request
    stream = self._connect(request)
  File "/data2/parkliu/Open-Sora-main/venv/lib/python3.10/site-packages/httpcore/_sync/connection.py", line 122, in _connect
    stream = self._network_backend.connect_tcp(**kwargs)
  File "/data2/parkliu/Open-Sora-main/venv/lib/python3.10/site-packages/httpcore/_backends/sync.py", line 205, in connect_tcp
    with map_exceptions(exc_map):
  File "/home/parkliu/miniconda3/lib/python3.10/contextlib.py", line 153, in __exit__
    self.gen.throw(typ, value, traceback)
  File "/data2/parkliu/Open-Sora-main/venv/lib/python3.10/site-packages/httpcore/_exceptions.py", line 14, in map_exceptions
    raise to_exc(exc) from exc
httpcore.ConnectTimeout: timed out

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/home/parkliu/miniconda3/lib/python3.10/threading.py", line 1016, in _bootstrap_inner
    self.run()
  File "/home/parkliu/miniconda3/lib/python3.10/threading.py", line 953, in run
    self._target(*self._args, **self._kwargs)
  File "/data2/parkliu/Open-Sora-main/venv/lib/python3.10/site-packages/gradio/analytics.py", line 61, in _do_normal_analytics_request
    data["ip_address"] = get_local_ip_address()
  File "/data2/parkliu/Open-Sora-main/venv/lib/python3.10/site-packages/gradio/analytics.py", line 117, in get_local_ip_address
    ip_address = httpx.get(
  File "/data2/parkliu/Open-Sora-main/venv/lib/python3.10/site-packages/httpx/_api.py", line 198, in get
    return request(
  File "/data2/parkliu/Open-Sora-main/venv/lib/python3.10/site-packages/httpx/_api.py", line 106, in request
    return client.request(
  File "/data2/parkliu/Open-Sora-main/venv/lib/python3.10/site-packages/httpx/_client.py", line 827, in request
    return self.send(request, auth=auth, follow_redirects=follow_redirects)
  File "/data2/parkliu/Open-Sora-main/venv/lib/python3.10/site-packages/httpx/_client.py", line 914, in send
    response = self._send_handling_auth(
  File "/data2/parkliu/Open-Sora-main/venv/lib/python3.10/site-packages/httpx/_client.py", line 942, in _send_handling_auth
    response = self._send_handling_redirects(
  File "/data2/parkliu/Open-Sora-main/venv/lib/python3.10/site-packages/httpx/_client.py", line 979, in _send_handling_redirects
    response = self._send_single_request(request)
  File "/data2/parkliu/Open-Sora-main/venv/lib/python3.10/site-packages/httpx/_client.py", line 1015, in _send_single_request
    response = transport.handle_request(request)
  File "/data2/parkliu/Open-Sora-main/venv/lib/python3.10/site-packages/httpx/_transports/default.py", line 232, in handle_request
    with map_httpcore_exceptions():
  File "/home/parkliu/miniconda3/lib/python3.10/contextlib.py", line 153, in __exit__
    self.gen.throw(typ, value, traceback)
  File "/data2/parkliu/Open-Sora-main/venv/lib/python3.10/site-packages/httpx/_transports/default.py", line 86, in map_httpcore_exceptions
    raise mapped_exc(message) from exc
httpx.ConnectTimeout: timed out
  0%| | 0/100 [00:00<?, ?it/s]
Traceback (most recent call last):
  File "/data2/parkliu/Open-Sora-main/venv/lib/python3.10/site-packages/gradio/queueing.py", line 527, in process_events
    response = await route_utils.call_process_api(
  File "/data2/parkliu/Open-Sora-main/venv/lib/python3.10/site-packages/gradio/route_utils.py", line 261, in call_process_api
    output = await app.get_blocks().process_api(
  File "/data2/parkliu/Open-Sora-main/venv/lib/python3.10/site-packages/gradio/blocks.py", line 1788, in process_api
    result = await self.call_function(
  File "/data2/parkliu/Open-Sora-main/venv/lib/python3.10/site-packages/gradio/blocks.py", line 1340, in call_function
    prediction = await anyio.to_thread.run_sync(
  File "/data2/parkliu/Open-Sora-main/venv/lib/python3.10/site-packages/anyio/to_thread.py", line 56, in run_sync
    return await get_async_backend().run_sync_in_worker_thread(
  File "/data2/parkliu/Open-Sora-main/venv/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 2144, in run_sync_in_worker_thread
    return await future
  File "/data2/parkliu/Open-Sora-main/venv/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 851, in run
    result = context.run(func, *args)
  File "/data2/parkliu/Open-Sora-main/venv/lib/python3.10/site-packages/gradio/utils.py", line 759, in wrapper
    response = f(*args, **kwargs)
  File "/data2/parkliu/Open-Sora-main/gradio/app.py", line 399, in run_inference
    samples = scheduler.sample(
  File "/data2/parkliu/Open-Sora-main/venv/lib/python3.10/site-packages/opensora/schedulers/iddpm/__init__.py", line 75, in sample
    samples = self.p_sample_loop(
  File "/data2/parkliu/Open-Sora-main/venv/lib/python3.10/site-packages/opensora/schedulers/iddpm/gaussian_diffusion.py", line 481, in p_sample_loop
    for sample in self.p_sample_loop_progressive(
  File "/data2/parkliu/Open-Sora-main/venv/lib/python3.10/site-packages/opensora/schedulers/iddpm/gaussian_diffusion.py", line 534, in p_sample_loop_progressive
    out = self.p_sample(
  File "/data2/parkliu/Open-Sora-main/venv/lib/python3.10/site-packages/opensora/schedulers/iddpm/gaussian_diffusion.py", line 429, in p_sample
    out = self.p_mean_variance(
  File "/data2/parkliu/Open-Sora-main/venv/lib/python3.10/site-packages/opensora/schedulers/iddpm/respace.py", line 95, in p_mean_variance
    return super().p_mean_variance(self._wrap_model(model), *args, **kwargs)
  File "/data2/parkliu/Open-Sora-main/venv/lib/python3.10/site-packages/opensora/schedulers/iddpm/gaussian_diffusion.py", line 286, in p_mean_variance
    model_output = model(x, t, **model_kwargs)
  File "/data2/parkliu/Open-Sora-main/venv/lib/python3.10/site-packages/opensora/schedulers/iddpm/respace.py", line 127, in __call__
    return self.model(x, new_ts, **kwargs)
  File "/data2/parkliu/Open-Sora-main/venv/lib/python3.10/site-packages/opensora/schedulers/iddpm/__init__.py", line 96, in forward_with_cfg
    model_out = model.forward(combined, timestep, y, **kwargs)
  File "/home/parkliu/.cache/huggingface/modules/transformers_modules/OpenSora-STDiT-v2-stage2/modeling_stdit2.py", line 208, in forward
    x = block(
  File "/data2/parkliu/Open-Sora-main/venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/parkliu/.cache/huggingface/modules/transformers_modules/OpenSora-STDiT-v2-stage2/layers.py", line 149, in forward
    x = x + self.cross_attn(x, y, mask)
  File "/data2/parkliu/Open-Sora-main/venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/parkliu/.cache/huggingface/modules/transformers_modules/OpenSora-STDiT-v2-stage2/layers.py", line 298, in forward
    x = xformers.ops.memory_efficient_attention(q, k, v, p=self.attn_drop.p, attn_bias=attn_bias)
  File "/data2/parkliu/Open-Sora-main/venv/lib/python3.10/site-packages/xformers/ops/fmha/__init__.py", line 196, in memory_efficient_attention
    return _memory_efficient_attention(
  File "/data2/parkliu/Open-Sora-main/venv/lib/python3.10/site-packages/xformers/ops/fmha/__init__.py", line 294, in _memory_efficient_attention
    return _memory_efficient_attention_forward(
  File "/data2/parkliu/Open-Sora-main/venv/lib/python3.10/site-packages/xformers/ops/fmha/__init__.py", line 310, in _memory_efficient_attention_forward
    op = _dispatch_fw(inp)
  File "/data2/parkliu/Open-Sora-main/venv/lib/python3.10/site-packages/xformers/ops/fmha/dispatch.py", line 98, in _dispatch_fw
    return _run_priority_list(
  File "/data2/parkliu/Open-Sora-main/venv/lib/python3.10/site-packages/xformers/ops/fmha/dispatch.py", line 73, in _run_priority_list
    raise NotImplementedError(msg)
NotImplementedError: No operator found for memory_efficient_attention_forward with inputs:
    query : shape=(1, 4608, 16, 72) (torch.bfloat16)
    key : shape=(1, 2, 16, 72) (torch.bfloat16)
    value : shape=(1, 2, 16, 72) (torch.bfloat16)
    attn_bias : <class 'xformers.ops.fmha.attn_bias.BlockDiagonalMask'>
    p : 0.0
cutlassF is not supported because:
    bf16 is only supported on A100+ GPUs
flshattF is not supported because:
    bf16 is only supported on A100+ GPUs
    requires a GPU with compute capability > 7.5
tritonflashattF is not supported because:
    attn_bias type is <class 'xformers.ops.fmha.attn_bias.BlockDiagonalMask'>
    bf16 is only supported on A100+ GPUs
    requires A100 GPU
smallkF is not supported because:
    dtype=torch.bfloat16 (supported: {torch.float32})
    max(query.shape[-1] != value.shape[-1]) > 32
    attn_bias type is <class 'xformers.ops.fmha.attn_bias.BlockDiagonalMask'>
    bf16 is only supported on A100+ GPUs
    unsupported embed per head: 72

liuchunyu524 avatar Apr 26 '24 03:04 liuchunyu524
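The final NotImplementedError in the log above says that every xformers attention kernel rejects bf16 on this GPU. A minimal sketch of a pre-flight check follows, assuming the cutoff is CUDA compute capability 8.0 (A100 is 8.0, V100 is 7.0). The names `supports_bf16`, `pick_dtype`, and `BF16_MIN_CAPABILITY` are hypothetical helpers for illustration, not part of Open-Sora or xformers.

```python
# Sketch only: decide whether bf16 is usable from the CUDA compute capability.
# Assumption: bf16 attention kernels need compute capability >= 8.0
# (A100 / Ampere or newer). V100 reports (7, 0), A100 reports (8, 0).

BF16_MIN_CAPABILITY = (8, 0)  # assumed cutoff, not read from xformers


def supports_bf16(capability):
    """Return True if a (major, minor) capability meets the assumed cutoff."""
    return tuple(capability) >= BF16_MIN_CAPABILITY


def pick_dtype(capability):
    """Return a dtype string for the config: bf16 if usable, otherwise fp16."""
    return "bf16" if supports_bf16(capability) else "fp16"


if __name__ == "__main__":
    try:
        import torch
    except ImportError:
        torch = None

    if torch is not None and torch.cuda.is_available():
        # torch.cuda.get_device_capability returns a (major, minor) tuple.
        cap = torch.cuda.get_device_capability(0)
        print(f"compute capability {cap}: use dtype = {pick_dtype(cap)!r}")
    else:
        # No GPU visible: show what the check would say for a V100.
        print("no CUDA device visible; V100 (7, 0) ->", pick_dtype((7, 0)))
```

Running this before launching the app would show up front whether the configured bf16 dtype has any chance of working on the installed GPU.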

Can this run on multiple V100 GPUs? Any help would be appreciated.

liuchunyu524 avatar Apr 26 '24 07:04 liuchunyu524

The inference command is: clear; export CUDA_VISIBLE_DEVICES=0; torchrun --standalone --nproc_per_node 1 scripts/inference.py configs/opensora/inference/16x256x256.py --ckpt-path /data2/parkliu/Open-Sora-main/pretrained_models/OpenSora-v1-HQ-16x256x256.pth --prompt-path ./assets/texts/t2v_samples.txt

liuchunyu524 avatar Apr 26 '24 08:04 liuchunyu524

same

HuangZiy avatar Apr 26 '24 10:04 HuangZiy

I found a solution

Comment out these two lines of code: [image]

Then set dtype = "fp32": [image]

HuangZiy avatar Apr 27 '24 12:04 HuangZiy

This issue is stale because it has been open for 7 days with no activity.

github-actions[bot] avatar May 05 '24 01:05 github-actions[bot]

Changing the dtype in inference/sample.py to fp16 could work:

dtype = "fp16" # use fp16 instead of bf16

wtjiang98 avatar Jul 16 '24 03:07 wtjiang98
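The two workarounds suggested in this thread, fp32 and fp16, both avoid bf16 but differ in memory use. As a rough sketch, the query tensor shape reported in the error log can be used to compare the per-tensor footprint. The helper `tensor_bytes` and the `BYTES_PER_ELEMENT` table are hypothetical names introduced here for illustration; the element sizes themselves are the standard ones for these formats.

```python
from math import prod

# Standard storage sizes in bytes for each floating-point format.
BYTES_PER_ELEMENT = {"fp32": 4, "fp16": 2, "bf16": 2}

# Query shape copied from the error message in the original post.
QUERY_SHAPE = (1, 4608, 16, 72)


def tensor_bytes(shape, dtype):
    """Return the raw storage size in bytes of a dense tensor."""
    return prod(shape) * BYTES_PER_ELEMENT[dtype]


if __name__ == "__main__":
    for dtype in ("fp32", "fp16"):
        size = tensor_bytes(QUERY_SHAPE, dtype)
        print(f"{dtype}: {size} bytes ({size / 2**20:.2f} MiB)")
```

fp16 keeps the same 2-byte footprint as bf16, while fp32 doubles it for every activation and weight. fp16 has a narrower exponent range than bf16, so outputs may not match bf16 runs exactly.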