Paul Fultz II
We should fuse this case:
```python
p = migraphx.program()
m = p.get_main_module()
x_0 = m.add_literal(migraphx.create_argument(migraphx.shape(type="float_type", lens=[1]), [1]))
x_1 = m.add_literal(migraphx.generate_argument(migraphx.shape(type="float_type", lens=[1, 1648]), 0))
x_2 = m.add_literal(migraphx.generate_argument(migraphx.shape(type="float_type", lens=[1, 1648]), 1))
x_3...
```
Since MLIR supports this fusion, we need to update `fuse_mlir` to find such patterns and fuse them. For now we should probably have a flag to enable the fusion until...
Gather the mxr/Python files from all the models, including the NCHW/NHWC, fp16/bf16/fp32, and rewrite_dot on/off variants. Measure the performance of each file with and without MLIR on all the different hardware. Make a list...
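One way to tabulate these measurements is sketched below. It is only an illustration: `Comparison`, `compare_files`, and the `measure(path, use_mlir)` callback are hypothetical names, and how a file is actually timed or how MLIR is toggled is left to the caller. The sketch would be run once per hardware target.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List


@dataclass
class Comparison:
    """Timings for one model file (hypothetical record type)."""
    path: str
    with_mlir_ms: float
    without_mlir_ms: float

    @property
    def speedup(self) -> float:
        # Greater than 1.0 means the MLIR run was faster than the non-MLIR run.
        return self.without_mlir_ms / self.with_mlir_ms


def compare_files(
    paths: Iterable[str],
    measure: Callable[[str, bool], float],
) -> List[Comparison]:
    """Time every file with and without MLIR.

    `measure(path, use_mlir)` is an assumed callback that returns the run
    time in milliseconds; it stands in for whatever benchmark is used.
    Results are sorted by speedup, so files where MLIR helps least come first.
    """
    results = [Comparison(p, measure(p, True), measure(p, False)) for p in paths]
    return sorted(results, key=lambda r: r.speedup)
```

Sorting by speedup puts the files where MLIR is slower than the baseline at the top, which are the ones that need attention first.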
There are two parts to this: 1. Use dynamic shapes to handle the past sequence length. 2. Similar to dynamic batching, we can make two different submodules for each GQA...
Cppcheck passes a pointer and size to `simplecpp::TokenList`, so we need those APIs enabled even when `string_view` and `span` are available.
This will disable building the CLI library when `BUILD_CLI=Off` is set, since dmake also requires filelister.
### Problem Description

Since MSVC implements [P0533R9](https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2021/p0533r9.pdf), the standard library math functions it covers become `constexpr` when C++23 is enabled. HIP treats `constexpr` functions as implicitly `__host__ __device__`, so when...