seungrokj

Results 5 comments of seungrokj

@amathews-amd @shajrawi @andyluo7 @mawong-amd @jeffdaily @liligwu @hongxiayang Please take a look at these when you're available

@fxmarty Can you please add one more triton.Config in https://github.com/huggingface/text-generation-inference/blob/b7e98ba635367daa23c5b1f4a73f51b1f061936a/server/text_generation_server/utils/flash_attn_triton.py#L261

```
triton.Config(
    {
        "BLOCK_M": 128,
        "BLOCK_N": 64,
        "waves_per_eu": 1,
        "PRE_LOAD_V": False,
    },
    num_stages=1,
    num_warps=4,
),
```

This will improve the...

Hi @james-banks, in the Python code snippet, the module name must use "_"; "-" is not allowed in a module name. ``` Python 3.9.19 (main, Mar 21 2024, 17:11:28) [GCC 11.2.0] :: Anaconda,...
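A minimal sketch of why the hyphen fails, using the hypothetical names `my-module` and `my_module` (they are not from the original snippet): Python parses `-` in an `import` statement as the minus operator, so the statement is a syntax error before any module lookup happens.

```python
# Hypothetical module names, for illustration only.
HYPHENATED = "my-module"
UNDERSCORED = "my_module"


def importable_with_import_statement(name: str) -> bool:
    """Return True if `import <name>` is syntactically valid Python.

    This only compiles the statement; it never executes the import,
    so the module does not need to exist.
    """
    try:
        compile(f"import {name}", "<check>", "exec")
    except SyntaxError:
        return False
    return True


if __name__ == "__main__":
    for name in (HYPHENATED, UNDERSCORED):
        print(name, importable_with_import_statement(name))
```

Renaming the file or package to use an underscore is usually the simplest fix.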

Hi @zhyncs, thank you for the quick reply! Can you elaborate a little on the differences between bench_one_bench and launch_server + bench_serving? Does bench_one_bench process batch*input_tokens differently from online serving?

In this screenshot, it would be better to include MI300 information; otherwise people will keep asking whether rocm/pytorch:latest-release supports gfx942. ![image](https://github.com/ROCm/ROCm-docker/assets/144636725/d64df205-bf74-42dd-840e-37b5a9dfe8a9)