bhack

Results 1415 comments of bhack

I cannot share the whole repo. I'll try to describe the case, in case we could inject a better user notification/failure. The problem is mainly related to a DDP-wrapped model...

@ZhuJiwei111 Do you have minimal DDP code to reproduce it? Mine was too large to isolate into a standalone version.

@Hprairie Can you take a look at this?

With `pytest -k`, the `orignal` and the new `custom` tests pass. `compiled` fails with:

```python
FAILED tests/ops/test_selective_scan.py::test_selective_scan[True-True-1-True-True-True-True-True-128-itype0-wtype0-compiled] - AssertionError: expected size 2==2, stride 64==8192 at dim=0; expected size 4==4,...
```

> Also, I am curious if you have used `opcheck` to test to make sure that you have correctly hooked up the `custom_op`. At a first glance it looks mostly...

What do you think is causing the failure of the compiled test at https://github.com/state-spaces/mamba/pull/651#issuecomment-2551612669?

@Hprairie `opcheck` tests added. Let me know if you want to add more inputs.

Now it is serialized without error or warning. But when loading/deserializing it, we get:

```python
W0920 16:48:06.206000 25925 site-packages/torch/utils/_sympy/interp.py:159] failed while executing pow_by_natural([VR[0, int_oo], VR[-1, -1]])
W0920 16:48:06.206000 25925 site-packages/torch/fx/experimental/symbolic_shapes.py:5227] failed...
```

> Now it is serialized without error or warning. But loading/deserializing we get: > > ```python > W0920 16:48:06.206000 25925 site-packages/torch/utils/_sympy/interp.py:159] failed while executing pow_by_natural([VR[0, int_oo], VR[-1, -1]]) > W0920...