
An unexpected error occurred: TypeError: Network request failed - on Macbook M1

DPG7332 opened this issue 3 years ago • 19 comments

Information:

  • chaiNNer version: 0.11.6
  • OS: macOS Monterey 12.5.1
  • Device: MacBook Air (M1, 2020)

Description: I'm using PyTorch Upscale with the UltraSharp 4k model, and I can successfully upscale a 256x256 image. However, when I try a larger image (2048x2048), the process runs for around 20 minutes and then an error message is displayed that reads: "An unexpected error occurred: TypeError: Network request failed."

Logs: main.log, renderer.log

[screenshot: chainner_error]

Thank you

DPG7332 avatar Sep 01 '22 02:09 DPG7332

[screenshot: system memory usage]

It appears you are running out of RAM while upscaling. Since you are on a Mac, upscaling via PyTorch takes place on the CPU and in RAM. I would recommend using NCNN instead and converting any models you want to use, since that will take advantage of your GPU. Though, NCNN has been having some issues on Mac as well, so YMMV. I plan on adding support for the Apple silicon ONNX runtime, so that will be an option at some point in the future as well, but for now I would recommend attempting to use NCNN.

joeyballentine avatar Sep 01 '22 05:09 joeyballentine

Thanks for the reply! I'll work on that. :) Looking forward to apple silicon support in the future.

DPG7332 avatar Sep 03 '22 15:09 DPG7332

It appears you are running out of RAM while upscaling.

Is that really a reason for chaiNNer to fail? I mean, wouldn't the system just use a swap file in the background and continue running, albeit more slowly?

RunDevelopment avatar Sep 12 '22 13:09 RunDevelopment

@RunDevelopment I based that assumption on the fact that it just crashed without warning and the screenshot shows very high RAM usage. I'm not actually 100% sure that's the reason.

joeyballentine avatar Sep 12 '22 13:09 joeyballentine

Getting the same error while trying to use SBER finetuned RealESRGAN_x2 model, converted from PTH to fp16 NCNN.

Log
[2022-09-12 17:48:19 +0200] [51092] [INFO] Iterating over frames in video file: C:\Users\WaveCut\Downloads\JUSTaFiLezqcixlbymv.mp4
[2022-09-12 17:48:19 +0200] [51092] [INFO] {'044186ff-1820-47b7-8136-dd8a35c2fba7': {'schemaId': 'chainner:image:resize_factor', 'id': '044186ff-1820-47b7-8136-dd8a35c2fba7', 'inputs': [{'id': '242ce50b-f483-41cb-93df-bd59e8296053', 'index': 0}, 50, 1], 'child': True, 'nodeType': 'regularNode', 'hasSideEffects': False, 'cacheOptions': {'shouldCache': True, 'maxCacheHits': None, 'clearImmediately': False}}, '242ce50b-f483-41cb-93df-bd59e8296053': {'schemaId': 'chainner:ncnn:upscale_image', 'id': '242ce50b-f483-41cb-93df-bd59e8296053', 'inputs': [{'id': '5db9f9ee-c5f8-4f8c-bdf2-439bd900ce31', 'index': 0}, {'id': '77e8126a-65e6-4c7a-900c-36631542d480', 'index': 0}, 0], 'child': True, 'nodeType': 'regularNode', 'hasSideEffects': False, 'cacheOptions': {'shouldCache': True, 'maxCacheHits': None, 'clearImmediately': False}}, '77e8126a-65e6-4c7a-900c-36631542d480': {'schemaId': 'chainner:image:simple_video_frame_iterator_load', 'id': '77e8126a-65e6-4c7a-900c-36631542d480', 'inputs': [None], 'child': True, 'nodeType': 'iteratorHelper', 'hasSideEffects': True, 'cacheOptions': {'shouldCache': True, 'maxCacheHits': None, 'clearImmediately': False}}, '82d52179-ccad-49b7-a86d-2768b966aa60': {'schemaId': 'chainner:image:view', 'id': '82d52179-ccad-49b7-a86d-2768b966aa60', 'inputs': [{'id': '242ce50b-f483-41cb-93df-bd59e8296053', 'index': 0}], 'child': True, 'nodeType': 'regularNode', 'hasSideEffects': True, 'cacheOptions': {'shouldCache': False, 'maxCacheHits': 0, 'clearImmediately': False}}, 'd0691ce0-d489-45ee-b8e9-195094aaf9bd': {'schemaId': 'chainner:image:simple_video_frame_iterator_save', 'id': 'd0691ce0-d489-45ee-b8e9-195094aaf9bd', 'inputs': [{'id': '044186ff-1820-47b7-8136-dd8a35c2fba7', 'index': 0}, 'C:\\Users\\WaveCut\\Downloads', 'musa', 'mp4'], 'child': True, 'nodeType': 'iteratorHelper', 'hasSideEffects': True, 'cacheOptions': {'shouldCache': False, 'maxCacheHits': 0, 'clearImmediately': False}}, '5db9f9ee-c5f8-4f8c-bdf2-439bd900ce31': {'schemaId': 
'chainner:ncnn:load_model', 'id': '5db9f9ee-c5f8-4f8c-bdf2-439bd900ce31', 'inputs': ['I:\\NN\\models\\sber_realesrgan_tuned\\RealESRGAN_x2.param', 'I:\\NN\\models\\sber_realesrgan_tuned\\RealESRGAN_x2.bin'], 'child': False, 'nodeType': 'regularNode', 'hasSideEffects': False, 'cacheOptions': {'shouldCache': True, 'maxCacheHits': None, 'clearImmediately': False}}}

[2022-09-12 17:48:19.295] [info] Backend: [51092] [INFO] Execution options: fp16: True, device: cuda:0

[2022-09-12 17:48:19.954] [error] Backend: find_blob_index_by_name onnx::Unsqueeze_703 failed

[2022-09-12 17:48:19.955] [error] Backend: find_blob_index_by_name onnx::Squeeze_712 failed

[2022-09-12 17:48:20.000] [error] Backend: parse layer_type failed

[2022-09-12 17:48:20.045] [error] Backend: load_model error at layer 1376, parameter file has inconsistent content.

[2022-09-12 17:48:21.183] [error] Python subprocess exited with code 3221225477 and signal null [2022-09-12 17:49:23.995] [info] Attempting to kill backend... [2022-09-12 17:49:23.995] [error] Error killing backend. [2022-09-12 17:49:24.034] [info] Cleaning up temp folders...

iamwavecut avatar Sep 12 '22 16:09 iamwavecut

I mean, it's the same in terms of non-verbosity: the backend just dies.

iamwavecut avatar Sep 12 '22 16:09 iamwavecut

I think there are two action items based on this discussion (besides fixing the actual issues):

  1. When the backend dies like this, we need to tell the user in a more verbose way.
  2. We need to have a way to restart the backend without closing chaiNNer. This actually would be useful for installing/updating dependencies as well. I figure we can just refactor how we handle the backend event handlers and whatnot and then trigger that in these cases, followed by a frontend refresh on button press.

joeyballentine avatar Sep 12 '22 16:09 joeyballentine
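As an aside, the exit code 3221225477 in the log above is 0xC0000005, a Windows access violation, i.e. a native crash rather than a Python exception. The first action item could look something like the following sketch. All names here are hypothetical, assuming a simple subprocess wrapper; this is not chaiNNer's actual backend-handling code:

```python
import subprocess
import sys

# 3221225477 in the log above is 0xC0000005, the Windows NTSTATUS code
# for an access violation (a crash inside native code such as NCNN).
ACCESS_VIOLATION = 0xC0000005


def describe_backend_exit(code: int) -> str:
    """Translate a backend exit code into a user-facing message
    (hypothetical helper; illustrates the idea, not chaiNNer's code)."""
    if code == 0:
        return "Backend exited normally."
    if code == ACCESS_VIOLATION:
        return ("Backend crashed with an access violation (0xC0000005) - "
                "likely a failure inside a native library.")
    return f"Backend exited unexpectedly with code {code}."


def run_backend(args: list) -> str:
    """Run the backend process and report how it ended."""
    proc = subprocess.run(args)
    return describe_backend_exit(proc.returncode)


if __name__ == "__main__":
    # Simulate a backend that dies with a nonzero exit code.
    print(run_backend([sys.executable, "-c", "import sys; sys.exit(7)"]))
```

Mapping well-known crash codes like 0xC0000005 to a human-readable message would at least tell the user the backend died, instead of surfacing only the follow-on "Network request failed" from the frontend.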

@iamwavecut Could you link the model used? Also, did this error occur when trying to convert, or when trying to load the model/upscale? And if the latter, how did you convert it?

theflyingzamboni avatar Sep 12 '22 16:09 theflyingzamboni

@iamwavecut Could you link the model used?

https://icedrive.net/s/VWT5tiYG53W9jCxwztB5yWaBtagk both original and converted fp16

Also, did this error occur when trying to convert, or when trying to load the model/upscale? And if the latter, how did you convert it?

The error occurs when I press the RUN button. The first popup is just a general error message (I believe it's about being unable to load the model or something), and right after it a second popup mentions a network error (I believe it's a pipeline status request), so the backend is already dead at that point.

The model was converted using recently introduced chaiNNer functionality.

iamwavecut avatar Sep 13 '22 09:09 iamwavecut

The model was converted using recently introduced chaiNNer functionality.

How were you able to get the pth model to convert using chaiNNer? I tried doing it myself, but I get this error when PyTorch attempts to export the model to ONNX (the intermediate step in converting to NCNN):

Exporting the operator pixel_unshuffle to ONNX opset version 14 is not supported. Please feel free to request support or submit a pull request on PyTorch GitHub.

The ONNX opset version we export with does not appear to support one of the operators in the pth model, so I don't know how you got it to convert in the first place.

theflyingzamboni avatar Sep 13 '22 15:09 theflyingzamboni
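The failing step can be reproduced in isolation. Below is a minimal sketch, assuming a local PyTorch install; `PixelUnshuffleNet` is a made-up toy model containing only the problematic operator, not the actual RealESRGAN architecture, and whether the export succeeds depends on the PyTorch version:

```python
import io

import torch
import torch.nn as nn


class PixelUnshuffleNet(nn.Module):
    """Toy model wrapping only the operator that failed to export."""

    def __init__(self):
        super().__init__()
        self.unshuffle = nn.PixelUnshuffle(downscale_factor=2)

    def forward(self, x):
        return self.unshuffle(x)


model = PixelUnshuffleNet().eval()
dummy = torch.randn(1, 3, 64, 64)

try:
    # Export to an in-memory buffer with the same opset chaiNNer used.
    # Older PyTorch versions raise here because pixel_unshuffle had no
    # ONNX symbolic; newer versions export it fine.
    torch.onnx.export(model, dummy, io.BytesIO(), opset_version=14)
    print("export succeeded")
except Exception as exc:
    print(f"export failed: {exc}")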

I believe that depends on the model. All (tens of) ESRGAN-based models converted successfully for me.

iamwavecut avatar Sep 13 '22 15:09 iamwavecut

@iamwavecut the RealESRGAN_x2 model uses pixelunshuffle though, at least the official one. Is the one you were converting an unofficial one?

joeyballentine avatar Sep 13 '22 15:09 joeyballentine

Yeah, that's the Sberbank-AI variant of RealESRGAN.

iamwavecut avatar Sep 13 '22 15:09 iamwavecut

x4 and x8 work OK, though.

iamwavecut avatar Sep 13 '22 15:09 iamwavecut

Just checked it out, and based on the arch in the repo, it should be using pixelunshuffle for the 2x and 1x scales, the same as the official ones.

joeyballentine avatar Sep 13 '22 15:09 joeyballentine

Which is why I'm wondering how it converted for you @iamwavecut. Given that the opset we export to ONNX with does not support that op, you never should have been able to get an ONNX model through chaiNNer with this particular model, never mind generating an NCNN model through chaiNNer.

theflyingzamboni avatar Sep 13 '22 16:09 theflyingzamboni

Oh, I forgot to mention something important: I'm using chaiNNer with a local Python installation, which is Python 3.9.6.

iamwavecut avatar Sep 13 '22 19:09 iamwavecut

That shouldn't matter

joeyballentine avatar Sep 13 '22 19:09 joeyballentine

Hi there, I receive the same error, also while upscaling using a PyTorch model, but with the Image File Iterator in this case. MacBook Pro (not M1). For anything more than a small handful of images, at some point it will quit the process with that error (the most I have managed is 4 in one go). RAM looks likely here too, I guess, as it's showing a red circle like the OP's.

nocturnal808 avatar Sep 22 '22 16:09 nocturnal808

The problem here is the tile size. I ran into the same issue on my M2 Pro with 32 GB of RAM, even with MPS.

@nocturnal808 @DPG7332 If you set the Tile Size manually from Auto to a lower value, e.g. 1024 (works flawlessly on my machine), you shouldn't see these issues anymore.

@joeyballentine When set to automatic, I assume PyTorch tries to figure it out by itself?

stonerl avatar Aug 13 '23 11:08 stonerl
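Lowering the tile size helps because the image is fed to the model one patch at a time, so peak memory is bounded by the tile rather than the full image. A minimal sketch of the idea, using NumPy and a stand-in "model" (real tiling also needs overlapping/padded tiles to avoid seams, which this omits):

```python
import numpy as np


def upscale_tiled(img: np.ndarray, upscale, scale: int, tile: int = 1024) -> np.ndarray:
    """Upscale an (H, W, C) image tile by tile, so only one tile-sized
    patch is in model memory at a time (sketch without seam handling)."""
    h, w, c = img.shape
    out = np.zeros((h * scale, w * scale, c), dtype=img.dtype)
    for y in range(0, h, tile):
        for x in range(0, w, tile):
            patch = img[y:y + tile, x:x + tile]
            # Write the upscaled patch into the matching output region.
            out[y * scale:(y + patch.shape[0]) * scale,
                x * scale:(x + patch.shape[1]) * scale] = upscale(patch)
    return out


# Stand-in "model": nearest-neighbour 2x upscale via repeat.
double = lambda p: p.repeat(2, axis=0).repeat(2, axis=1)

img = np.arange(12 * 10 * 3, dtype=np.float32).reshape(12, 10, 3)
# For this seam-free stand-in, tiled and whole-image results match.
assert np.array_equal(upscale_tiled(img, double, scale=2, tile=4), double(img))
```

With a real network the per-tile memory cost grows roughly with tile area, which is why dropping from Auto to 1024 keeps a 2048x2048 input within budget.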

I assume PyTorch tries to figure it out by itself?

Yes. We once did a few tests to empirically figure out a rough formula to estimate the VRAM a model needs to upscale an image of a certain size. The result was this function.

RunDevelopment avatar Aug 13 '23 12:08 RunDevelopment
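chaiNNer's actual estimate is the empirically fitted function linked above; purely as an illustration of the shape of such a heuristic, an auto tile-size picker might look like this (all constants here are hypothetical, not the fitted values):

```python
def estimate_tile_size(free_mem_bytes: int, scale: int,
                       bytes_per_pixel: float = 4 * 3,
                       overhead: float = 50.0) -> int:
    """Pick the largest power-of-two tile whose estimated memory cost
    fits in `free_mem_bytes`. The cost model is hypothetical: input
    plus upscaled output pixels, times an empirical multiplier for the
    network's intermediate activations."""
    for tile in (4096, 2048, 1024, 512, 256, 128):
        cost = tile * tile * bytes_per_pixel * (1 + scale * scale) * overhead
        if cost <= free_mem_bytes:
            return tile
    return 64  # fall back to a tiny tile rather than failing outright
```

Under these made-up constants, a 4x model would get tile 1024 with 32 GiB free but only 512 with 8 GiB. The danger, as this thread shows, is that an over-optimistic estimate picks a tile too large and the backend dies instead of degrading gracefully.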

Also, btw, the RAM indicator always showing a red circle was a bug that only recently got fixed.

joeyballentine avatar Aug 14 '23 12:08 joeyballentine