bhack
I cannot share the whole repo. I'll try to describe the case so we can see whether a better user notification/failure could be injected. The problem is mainly related to a DDP-wrapped model...
@ZhuJiwei111 Do you have minimal DDP code that reproduces it? Mine was too large to isolate into a standalone version.
@Hprairie Can you take a look at this?
With `pytest -k`, the `original` and the new `custom` tests are working correctly. `compiled` is failing with:
```python
FAILED tests/ops/test_selective_scan.py::test_selective_scan[True-True-1-True-True-True-True-True-128-itype0-wtype0-compiled] - AssertionError: expected size 2==2, stride 64==8192 at dim=0; expected size 4==4,...
```
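For anyone reading along, the assertion above compares a tensor's size and stride dimension by dimension against another layout. Below is a minimal pure-Python sketch of that kind of check. The helper names (`contiguous_strides`, `size_stride_mismatches`) and the shape `(2, 4, 16)` are hypothetical and chosen only to reproduce the `64` vs `8192` numbers from the message. It is not the actual PyTorch implementation, and it does not diagnose this specific failure.

```python
from typing import List, Sequence, Tuple


def contiguous_strides(shape: Sequence[int]) -> Tuple[int, ...]:
    """Row-major (C-contiguous) strides, in elements, for a given shape."""
    strides = []
    running = 1
    for size in reversed(shape):
        strides.append(running)
        running *= size
    return tuple(reversed(strides))


def size_stride_mismatches(
    size_a: Sequence[int],
    stride_a: Sequence[int],
    size_b: Sequence[int],
    stride_b: Sequence[int],
) -> List[str]:
    """Report every dimension where the two layouts disagree.

    Strides of size-1 dimensions are ignored, since they do not affect
    which memory locations are addressed.
    """
    problems = []
    for dim, (sa, ta, sb, tb) in enumerate(zip(size_a, stride_a, size_b, stride_b)):
        if sa != sb or (sa > 1 and ta != tb):
            problems.append(f"size {sa}=={sb}, stride {ta}=={tb} at dim={dim}")
    return problems


# Hypothetical shape: a contiguous (2, 4, 16) tensor has stride 64 at dim 0.
shape = (2, 4, 16)
dense = contiguous_strides(shape)   # (64, 16, 1)
# Hypothetical layout: same shape, but viewed into a larger buffer.
strided = (8192, 16, 1)

problems = size_stride_mismatches(shape, dense, shape, strided)
print(problems)
```

One situation where two such layouts can disagree (an assumption, not a confirmed cause here) is when one side describes a freshly allocated contiguous tensor and the other describes a view into a larger buffer.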
> Also, I am curious if you have used `opcheck` to test to make sure that you have correctly hooked up the `custom_op`. At a first glance it looks mostly...
What do you think is causing the failure of the compiled test at https://github.com/state-spaces/mamba/pull/651#issuecomment-2551612669 ?
@Hprairie `opcheck` tests added. Let me know if you want to add more inputs.
Any news on this?
Now it is serialized without error or warning. But loading/deserializing it we get:
```python
W0920 16:48:06.206000 25925 site-packages/torch/utils/_sympy/interp.py:159] failed while executing pow_by_natural([VR[0, int_oo], VR[-1, -1]])
W0920 16:48:06.206000 25925 site-packages/torch/fx/experimental/symbolic_shapes.py:5227] failed...
```
> Now it is serialized without error or warning. But loading/deserializing we get:
>
> ```python
> W0920 16:48:06.206000 25925 site-packages/torch/utils/_sympy/interp.py:159] failed while executing pow_by_natural([VR[0, int_oo], VR[-1, -1]])
> W0920...
> ```