ncnn Add python binding for loading bin from memory.

This PR adds a simple python binding for loading a bin from memory. One already exists for loading a param from memory, and the function to load a bin from memory in c++ already exists, so all that's needed is this binding.

I've been using this exact binding in my ncnn_vulkan fork for a long time now, and it works perfectly fine.

I'm currently considering switching over to the main ncnn package, but I need a couple features (like this) before I am able to do so, so expect some more PRs from me.

Nov 22 '23 03:11 joeyballentine

Thank you for your submission, we really appreciate it. Like many open source projects, we ask that you sign our Contributor License Agreement before we can accept your contribution.
_{You have signed the CLA already but the status is still pending? Let us recheck it.}

Nov 22 '23 03:11 tencent-adm

Did you test this with both CPU and Vulkan inference?

Nov 23 '23 17:11 JeremyRand

@JeremyRand I have not tested with CPU. Why would that matter?

Nov 23 '23 17:11 joeyballentine

When I tried to cherry-pick your binding some months ago and tried it with CPU inference in chaiNNer, I got an immediate segfault. My understanding is that this is because the Pybind deallocates the memory as soon as the bind function returns, which causes a memory safety bug since ncnn doesn't make a copy of the data (since it assumes you're calling from C++ and will manage the memory yourself). I suspect that the reason you didn't see a segfault in Vulkan mode is because the memory management is different (it may make a copy of the data while it's uploading it to the GPU), but I didn't verify this hypothesis.

Nov 23 '23 18:11 JeremyRand

Interesting. Well, IMO that isn't enough reason to justify a binding not being there. If it doesn't work for CPU, just don't use it for CPU inference.

Sounds to me like it's a bug with the CPU version of the c++ code, and therefore is irrelevant to this PR and should be fixed separately

Nov 24 '23 18:11 joeyballentine

Yes, agreed that it's reasonable to have a binding if it works for Vulkan, since it's more efficient than making a copy. If it doesn't work for CPU, might be worth explicitly documenting that, but other than that I have no objection to the concept.

Nov 25 '23 07:11 JeremyRand

We may need to add parameters in load_model to distinguish whether ncnn needs to deeply copy the weight data. This will be available in next year's version :D

Dec 21 '23 06:12 nihui

Implementing this would benefit many projects. Please consider doing so

Apr 12 '24 20:04 Kim2091