Benson Ma

Results 80 comments of Benson Ma

The original issue that forced us to use the cutlass fork remains:

```
2025-09-30T17:33:52.6122654Z /__w/FBGEMM/FBGEMM/fbgemm_gpu/../external/cutlass/tools/util/include/cutlass/util/mixed_dtype_utils.hpp(180): error: namespace "cutlass" has no member "int8_t"
2025-09-30T17:33:52.6124845Z static_assert(cute::is_same_v ||
```

Working around this with https://github.com/pytorch/FBGEMM/pull/4964, which points CUTLASS to https://github.com/jwfromm/cutlass/tree/v4.2.1-FBGEMM

Hi @JoheyHan, thanks for reporting this issue. We no longer support 1.1.0, as it has been superseded by 1.2.0 (and will soon be superseded by 1.3.0). Could you try and see...

Hi @AlienLiang23, we currently don't support Intel GPUs as far as I'm aware, but in any case, could you show us the instructions that you ran to perform the...

@AlienLiang23, based on the logs pasted, the actual error appears to be:

```
libcudart.so.12: cannot open shared object file: No such file or directory
```

Unfortunately, every loading error ends...
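To confirm whether a specific shared library such as `libcudart.so.12` can be resolved by the dynamic loader, it can help to try loading it directly, outside of fbgemm_gpu. The following is a minimal sketch using Python's standard `ctypes` module; the helper name `try_load` is hypothetical and is not part of fbgemm_gpu or PyTorch.

```python
import ctypes


def try_load(lib_name: str) -> tuple[bool, str]:
    """Try to load a shared library by name.

    Returns (True, "") on success, or (False, <loader error message>)
    if the dynamic loader cannot open the library.
    Hypothetical helper for illustration only.
    """
    try:
        ctypes.CDLL(lib_name)
        return True, ""
    except OSError as exc:
        # The message typically names the library that could not be opened.
        return False, str(exc)


if __name__ == "__main__":
    ok, message = try_load("libcudart.so.12")
    print("loaded" if ok else f"failed: {message}")
```

If the load fails here as well, the problem most likely lies with the CUDA runtime installation or the library search path rather than with the fbgemm_gpu package itself.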

Hi @fmo-mt, torch 2.7.0 is very old by now and is no longer supported in the latest fbgemm releases. Please try with torch 2.9 instead. The build instructions we work...

Hi @fmo-mt you can use the instructions [here](https://docs.pytorch.org/FBGEMM/fbgemm_gpu/development/TestInstructions.html).

@Jitterx69 Thanks for making this contribution! It looks like the PackAMatrixTest is failing on ARM. Could you look into this?

Hi @jiayus-nvidia, the default values parsed in generate_kernels.py appear to be broken. When I run the script, I see:

```
raise ValueError("ARBITRARY_NFUNC must be odd")
```

Could you update the...

Hi @jiayus-nvidia, it appears there are some undefined symbols:

```
E OSError: /home/ec2-user/miniconda/envs/build_binary/lib/python3.13/site-packages/fbgemm_gpu/experimental/hstu/fbgemm_gpu_experimental_hstu.so: undefined symbol: _Z13run_hstu_bwd_ILi90EN7cutlass12float_e4m3_tELi128ELb1ELb0ELb0ELb0ELb0ELb0ELb1ELi0EEvR15Hstu_bwd_paramsP11CUstream_st
```

Maybe the code generation step didn't generate all the template instantiations?
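One way to check this kind of report is to list the dynamic symbol table of the built `.so` with GNU `nm -D` and see whether the symbol is marked `U` (undefined) instead of `T` or `W` (defined). The sketch below assumes GNU binutils `nm` is on the `PATH`; the helper names `parse_nm` and `dynamic_symbols` are hypothetical and not part of the FBGEMM tooling.

```python
import subprocess


def parse_nm(nm_output: str) -> dict[str, str]:
    """Map symbol name -> nm type letter from `nm -D` text output.

    Defined symbols look like '<address> T name'; undefined ones
    look like 'U name' with no address column.
    """
    symbols = {}
    for line in nm_output.splitlines():
        parts = line.split()
        if len(parts) == 2:
            kind, name = parts
        elif len(parts) == 3:
            _, kind, name = parts
        else:
            continue  # skip blank or unrecognized lines
        # Drop any symbol-version suffix such as '@GLIBC_2.2.5'.
        symbols[name.split("@", 1)[0]] = kind
    return symbols


def dynamic_symbols(so_path: str) -> dict[str, str]:
    """Run `nm -D` on a shared object and parse the result."""
    result = subprocess.run(
        ["nm", "-D", so_path], check=True, capture_output=True, text=True
    )
    return parse_nm(result.stdout)
```

Looking up the mangled name from the error in the returned dictionary shows whether the library only references the symbol (`U`) or also defines it; a `U` entry for a function the library is expected to provide itself would be consistent with a missing template instantiation.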