Avinash Sharma

9 comments by Avinash Sharma

Yeah, I'll upload it here; I accidentally submitted the issue before uploading it :)

@benvanik I uploaded a zip containing the `batch_llama_3_8B.mlir` file

> @aviator19941 The failure is due to https://gist.github.com/pashu123/020217a35f1c643ed03b169ce41f68d9 (embedding kernel). It has a cast from fp16 -> fp32. Please double-check that it's a full fp16 model. Also, could you post...

@benvanik Thanks for the overview, it's very helpful. I will start with the NUMA node pinning APIs and topology queries task.

ReduceMaxAlongDimUnsignedInt test fails in this PR: https://github.com/llvm/torch-mlir/pull/3544

@vivekkhandelwal1 or @renxida do you have cycles to help with this? I got pulled into llama2 work. FYI: I'm bumping to https://github.com/llvm/llvm-project/commit/168ecd706904d6ce221dc5107da92c56aea7c8e9 today (merged here: https://github.com/iree-org/iree/pull/17978)

I was able to solve this error by removing the input sizes and only using the input file, i.e. using `--input='@input.0.bin'` instead of `--input='<shape>@input.0.bin'`. It seems like GPU doesn't support...
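For illustration, the two flag forms might look like this (the module file, entry function, and `2x3xf16` shape prefix are hypothetical placeholders, not taken from the original comment):

```shell
# Form that failed in the reported setup: explicit shape/type prefix
# plus a binary file reference.
iree-run-module --module=model.vmfb --function=main \
  --input='2x3xf16=@input.0.bin'

# Workaround from the comment: drop the shape prefix and pass only the
# binary file.
iree-run-module --module=model.vmfb --function=main \
  --input='@input.0.bin'
```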

> > I was able to solve this error by removing the input sizes and only using the input file, i.e. using `--input='@input.0.bin'` instead of `--input='<shape>@input.0.bin'`. It seems like GPU...