侯奇 comments

Results: 45 comments of 侯奇

[QUESTION] Not supported on A6000?

> Thanks, > > I add 86 arguments to `/flux/src/cuda/op_registery.cu` line 36 like this: > > ```cuda-c++ > void > init_arch_tag() { > int major, minor; > cudaDeviceGetAttribute(&major, cudaDevAttrComputeCapabilityMajor, 0);...

[QUESTION] Not supported on A6000?

> Does this work? I tried modifying these parts, but it still reports errors after the changes. Could you provide more specific guidance on how to modify it? I run it...

[BUG] moe example run error

Fixed by https://github.com/bytedance/flux/pull/123. Closing this.

[ENHANCEMENT] Hello, is there a plan to support fp8?

> > [@zkyue](https://github.com/zkyue) Yes, fp8 support is on the way. And we will release it in future. > > Thank you for your reply. I am now looking to apply...

[QUESTION] Does FLUX support RoCE NICs?

It has not been tested on a RoCE NIC. This may be a problem with NVSHMEM. Can you run the NVSHMEM examples with nvshmrun on the RoCE NIC?

[BUG] Failing to build from source

Check the README.md and run install_deps.sh. There is a CUTLASS patch which helps. ``` git clone --recursive https://github.com/bytedance/flux.git && cd flux # Install dependencies bash ./install_deps.sh # For Ampere(sm80)...

[BUG] Failing to build from source

@ZSL98 could you please help?

[BUG] Failing to build from source

> The problem seems to be the improper gcc version. gcc10 and gcc12 both work but gcc11.4 fails. If you are using gcc11, please comment out the `using cute::operator BTW,...

[doc] remove pypi installation

Can we release the version compiled without NVSHMEM and let users compile with NVSHMEM themselves?

Question about profiler performance testing

It seems the SOL time is calculated without dividing by TP_SIZE, so it is 8x larger than it should be. This should be fixed later.
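
To illustrate the TP_SIZE point, here is a minimal sketch of a speed-of-light (SOL) time estimate for a GEMM split across tensor-parallel ranks. It is not the Flux profiler's code: the function name `gemm_sol_time_ms`, its parameters, and the even split of work across ranks are all assumptions made for illustration.

```python
def gemm_sol_time_ms(m, n, k, peak_tflops, tp_size=1):
    """Ideal time in ms for one device's share of an m x n x k GEMM.

    Hypothetical helper: assumes the FLOPs are divided evenly across
    tp_size tensor-parallel ranks and that each rank runs at peak_tflops.
    """
    if tp_size < 1:
        raise ValueError("tp_size must be >= 1")
    total_flops = 2.0 * m * n * k            # one multiply and one add per MAC
    per_device_flops = total_flops / tp_size  # the division the comment says is missing
    seconds = per_device_flops / (peak_tflops * 1e12)
    return seconds * 1e3


# Omitting the division (tp_size=1) inflates the estimate by a factor of tp_size.
full = gemm_sol_time_ms(4096, 4096, 4096, peak_tflops=100.0, tp_size=1)
sharded = gemm_sol_time_ms(4096, 4096, 4096, peak_tflops=100.0, tp_size=8)
ratio = full / sharded
```

With `tp_size=8`, the undivided figure is eight times the per-device figure, which would match the 8x discrepancy described above if the run used eight ranks.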