MNIST Inference Web example not working
Describe the bug
The MNIST inference web example is not working. It appears to be trying to load pkg/mnist_inference_web.js, but no such file exists.
To Reproduce
cd examples/mnist-inference-web/
./build-for-web.sh wgpu
./run-server.sh
Expected behavior
I expect the example to work.
Desktop (please complete the following information):
- OS: macOS
- Browser: Chrome
- Version: 120.0.6099.216 (Official Build) (arm64)
Running ./build-for-web.sh ndarray appears to work, but it introduces new errors into the console:
mnist_inference_web.js:315 panicked at /Users/thekevinscott/code/burn/burn-core/src/record/memory.rs:39:85:
called `Result::unwrap()` on an `Err` value: Utf8 { inner: Utf8Error { valid_up_to: 38, error_len: Some(1) } }
Stack:
Error
at imports.wbg.__wbg_new_abda76e883ba8a5f (http://localhost:8000/pkg/mnist_inference_web.js:299:21)
at http://localhost:8000/pkg/mnist_inference_web_bg.wasm:wasm-function[187]:0x4c6cb
at http://localhost:8000/pkg/mnist_inference_web_bg.wasm:wasm-function[237]:0x51717
at http://localhost:8000/pkg/mnist_inference_web_bg.wasm:wasm-function[206]:0x4d414
at http://localhost:8000/pkg/mnist_inference_web_bg.wasm:wasm-function[107]:0x39ce6
at http://localhost:8000/pkg/mnist_inference_web_bg.wasm:wasm-function[125]:0x477a5
at http://localhost:8000/pkg/mnist_inference_web_bg.wasm:wasm-function[265]:0x51bbb
at __wbg_adapter_18 (http://localhost:8000/pkg/mnist_inference_web.js:75:10)
at real (http://localhost:8000/pkg/mnist_inference_web.js:60:20)
Uncaught RuntimeError: unreachable
at mnist_inference_web_bg.wasm:0x4c7de
at mnist_inference_web_bg.wasm:0x51717
at mnist_inference_web_bg.wasm:0x4d414
at mnist_inference_web_bg.wasm:0x39ce6
at mnist_inference_web_bg.wasm:0x477a5
at mnist_inference_web_bg.wasm:0x51bbb
at __wbg_adapter_18 (mnist_inference_web.js:75:10)
at real (mnist_inference_web.js:60:20)
Uncaught Error: recursive use of an object detected which would lead to unsafe aliasing in rust
at imports.wbg.__wbindgen_throw (mnist_inference_web.js:321:15)
at mnist_inference_web_bg.wasm:0x5229a
at mnist_inference_web_bg.wasm:0x522b6
at mnist_inference_web_bg.wasm:0x4516a
at mnist_inference_web_bg.wasm:0x477a5
at mnist_inference_web_bg.wasm:0x51bbb
at __wbg_adapter_18 (mnist_inference_web.js:75:10)
at real (mnist_inference_web.js:60:20)
I saw the same thing with ndarray; I was able to fix it by re-creating the model in the mnist example and copying model.bin into this example. Maybe it has something to do with the different CPU architectures the models are created on?
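For anyone else doing the same, regenerating the record from the training example is essentially a one-liner on the trained module; a sketch (assuming `model` and `Model` come from examples/mnist, and that the recorder matches the one used at load time):

use burn::module::Module;
use burn::record::{BinFileRecorder, FullPrecisionSettings};

// Writes "model.bin" (the recorder appends its extension); copy it into
// examples/mnist-inference-web/ afterwards.
fn export_record<B: burn::prelude::Backend>(model: Model<B>) {
    let recorder = BinFileRecorder::<FullPrecisionSettings>::new();
    model
        .save_file("model", &recorder)
        .expect("Failed to save model record");
}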
I cannot get wgpu to work; I see a different error, and also different errors in different browsers. I enabled the features for WebGPU and WebAssembly in Chrome and Brave, but I'm wondering whether there are some non-obvious options that also need to be enabled.
Error in Chrome: panicked at 'called Option::unwrap() on a None value', /Users/eric/Downloads/burn/burn-wgpu/src/compute/base.rs:120
Error in Brave: panicked at 'An home directory should exist', burn-compute/src/tune/tune_cache.rs:25
The home directory definitely exists, so I don't know what is going on there, especially since the ndarray example works. I tried enabling the Shared GPUImageDecodeCache browser flag, along with rasterization, but I still see the same error.
@ericcarmi Maybe we need to update the record. With the bin recorder, versions can be problematic. Testing with wgpu isn't trivial, since not all platforms support WebGPU.
We need to review whether the model changed. If the model does not match the record, you'll have a mismatch. I propose we store a string representation of the record as part of the metadata.
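To illustrate the metadata idea, a purely hypothetical shape (not an existing burn API):

// Hypothetical: store a textual description of the record next to the payload
// and compare it before deserializing, so mismatches fail with a clear message.
struct RecordFile {
    structure: String, // e.g. "Model { conv1: ConvBlock, conv2: ConvBlock, .. }"
    payload: Vec<u8>,
}

fn check_record(file: &RecordFile, expected: &str) -> Result<(), String> {
    if file.structure != expected {
        return Err(format!(
            "record mismatch: saved for `{}`, current model is `{}`",
            file.structure, expected
        ));
    }
    Ok(())
}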
Yeah, that was it. Retraining the model worked with the ndarray backend.
Unable to get the wgpu backend working on chrome following these steps: https://github.com/tensorflow/tfjs/issues/8065#issuecomment-1808785524
Isn't headless Chrome + Puppeteer a solution? https://developer.chrome.com/blog/supercharge-web-ai-testing Seems like a good start.
The example doesn't compile currently, since the import location and signature of init_async (for the Wgpu backend) changed.
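For reference until the fix lands, initialization looked roughly like this in recent versions (a sketch; the module path, graphics-API parameter, and options type are assumptions that vary between burn releases):

// Sketch only: exact path and signature depend on the burn version in use.
use burn::backend::wgpu::{init_async, AutoGraphicsApi, WgpuDevice};

async fn setup() {
    let device = WgpuDevice::default();
    // Pre-initializes the wgpu client; on wasm this must happen asynchronously
    // before the backend is first used.
    init_async::<AutoGraphicsApi>(&device, Default::default()).await;
}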
I created a PR to fix some issues with the wasm examples: #1824
Hi, @nathanielsimard @antimora
Thank you for the fix PR.
When I tried mnist-inference-web, I ran into the following issues with both wgpu and ndarray, so I am reporting them here along with a suggestion.
I tried re-creating model.bin by running the original mnist example and copying it over, but that did not fix the bug.
Desktop (please complete the following information):
- OS: macOS
- Browser: Chrome
- Version: 127.0.6533.100 (Official Build) (arm64)
Summary and Suggestion
I suggest using the ONNX model from https://github.com/tracel-ai/burn/tree/main/examples/onnx-inference instead of model.bin.
- The key point of this example seems to be fusing Burn and WASM to run the functionality in a browser. Whether it uses a binary model or an ONNX model does not seem important.
- The model.bin is created by https://github.com/tracel-ai/burn/tree/main/examples/mnist, according to the README. It looks hard and tedious to keep up with changes to the original mnist code and its binary model by hand-editing model.rs.
- When adopting the ONNX approach, we would not have to worry about the issue above, and it would likely be easier to prevent regressions (see the sketch after this list).
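To make the idea concrete, a minimal sketch of the build script, modeled on the existing onnx-inference example (the ONNX file path is an assumption borrowed from that example's layout):

// build.rs — generates Rust model code from the ONNX file at compile time.
use burn_import::onnx::ModelGen;

fn main() {
    ModelGen::new()
        .input("src/model/mnist.onnx") // assumed path, as in onnx-inference
        .out_dir("model/")
        .run_from_script();
}

The generated module would then be included from OUT_DIR, as the onnx-inference example does, so the web example no longer depends on a hand-maintained model.rs plus a pre-trained binary record.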
If the above content looks good to you, I will be happy to be in charge of the task.
Case webgpu
Reproducing commands:
% cd examples/mnist-inference-web
% ./build-for-web.sh wgpu
% ./run-server.sh
Case ndarray
Reproducing commands:
% cd examples/mnist-inference-web
% ./build-for-web.sh ndarray
% ./run-server.sh
Difference between original mnist model and mnist-inference-web model
Sorry, I could not go through all of the differences.
At the very least, the comment // Originally copied from the burn/examples/mnist package may be inaccurate and cause confusion, even excluding the training and validation sections.
% diff -c examples/mnist/src/model.rs examples/mnist-inference-web/src/model.rs
*** examples/mnist/src/model.rs Wed Jul 10 10:50:54 2024
--- examples/mnist-inference-web/src/model.rs Mon Aug 12 17:44:35 2024
***************
*** 1,9 ****
! use crate::data::MnistBatch;
use burn::{
! nn::{loss::CrossEntropyLossConfig, BatchNorm, PaddingConfig2d},
prelude::*,
- tensor::backend::AutodiffBackend,
- train::{ClassificationOutput, TrainOutput, TrainStep, ValidStep},
};
#[derive(Module, Debug)]
--- 1,10 ----
! #![allow(clippy::new_without_default)]
!
! // Originally copied from the burn/examples/mnist package
!
use burn::{
! nn::{BatchNorm, PaddingConfig2d},
prelude::*,
};
#[derive(Module, Debug)]
***************
*** 17,29 ****
activation: nn::Gelu,
}
- impl<B: Backend> Default for Model<B> {
- fn default() -> Self {
- let device = B::Device::default();
- Self::new(&device)
- }
- }
-
const NUM_CLASSES: usize = 10;
impl<B: Backend> Model<B> {
--- 18,23 ----
***************
*** 45,53 ****
conv1,
conv2,
conv3,
- dropout,
fc1,
fc2,
activation: nn::Gelu::new(),
}
}
--- 39,47 ----
conv1,
conv2,
conv3,
fc1,
fc2,
+ dropout,
activation: nn::Gelu::new(),
}
}
***************
*** 69,88 ****
self.fc2.forward(x)
}
-
- pub fn forward_classification(&self, item: MnistBatch<B>) -> ClassificationOutput<B> {
- let targets = item.targets;
- let output = self.forward(item.images);
- let loss = CrossEntropyLossConfig::new()
- .init(&output.device())
- .forward(output.clone(), targets.clone());
-
- ClassificationOutput {
- loss,
- output,
- targets,
- }
- }
}
#[derive(Module, Debug)]
--- 63,68 ----
***************
*** 111,129 ****
let x = self.norm.forward(x);
self.activation.forward(x)
...
Thank you.
@tiruka, we could use an ONNX model since it's more stable, but I feel it changes the nature of the example. We may have tons of existing references and documentation pointing at this web example. I also feel it would confuse others by making them think one needs ONNX to build for the web. And we already have image-classification-web, which uses an ONNX file.
I think the proper way forward is to add a test that loads model.bin (panicking on failure) and hook it up to our CI, so we catch breakage early.
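Something like the following could serve as that test (a sketch; the backend choice and path are assumptions, and the recorder must match the one the example uses at runtime):

#[cfg(test)]
mod tests {
    use crate::model::Model;
    use burn::backend::NdArray;
    use burn::module::Module;
    use burn::record::{BinBytesRecorder, FullPrecisionSettings, Recorder};

    #[test]
    fn model_bin_decodes_against_current_model() {
        type B = NdArray<f32>;
        let device = Default::default();
        let bytes = include_bytes!("../model.bin").to_vec();
        // Fails (panics) in CI as soon as model.bin and the Model definition diverge.
        let record = BinBytesRecorder::<FullPrecisionSettings>::default()
            .load(bytes, &device)
            .expect("model.bin must decode against the current Model");
        let _model = Model::<B>::new(&device).load_record(record);
    }
}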
@antimora Thank you for your reply.
> I think the proper way forward is to add a test that loads model.bin (panicking on failure) and hook it up to our CI, so we catch breakage early.
I understand your thinking and withdraw my proposal. I will be happy to cooperate on fixing the bug if necessary; just let me know.