Avinash Sharma

9 comments by Avinash Sharma

Yeah, I'll upload it here; I accidentally submitted the issue before uploading it :)

@benvanik I uploaded a zip containing the `batch_llama_3_8B.mlir` file

> @aviator19941 The failure is due to https://gist.github.com/pashu123/020217a35f1c643ed03b169ce41f68d9 (embedding kernel). It has a cast from fp16 -> fp32. Please double-check that it's a full fp16 model. Also, could you post...

@benvanik Thanks for the overview, it's very helpful. I will start with the NUMA node pinning APIs and topology queries task.

ReduceMaxAlongDimUnsignedInt test fails in this PR: https://github.com/llvm/torch-mlir/pull/3544

@vivekkhandelwal1 or @renxida do you have cycles to help with this? I got pulled into llama2 work. FYI: I'm bumping to https://github.com/llvm/llvm-project/commit/168ecd706904d6ce221dc5107da92c56aea7c8e9 today (merged here: https://github.com/iree-org/iree/pull/17978)

I was able to solve this error by removing the input sizes and only using the input file, i.e. using `--input='@input.0.bin'` instead of `--input='<shape>@input.0.bin'`. It seems like GPU doesn't support...
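For illustration, the two flag forms might look like this (the module file, entry function, and `2x3xf16` shape prefix are hypothetical placeholders, not taken from the original comment):

```shell
# Form that failed in the reported setup: explicit shape/type prefix
# plus a binary file reference.
iree-run-module --module=model.vmfb --function=main \
  --input='2x3xf16=@input.0.bin'

# Workaround from the comment: drop the shape prefix and pass only the
# binary file.
iree-run-module --module=model.vmfb --function=main \
  --input='@input.0.bin'
```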

> > I was able to solve this error by removing the input sizes and only using the input file, i.e. using `--input='@input.0.bin'` instead of `--input='<shape>@input.0.bin'`. It seems like GPU...