Benjamin Maxwell comments

Results: 109 comments of Benjamin Maxwell

[Codegen] Add vector transfer + slice foldings in GenericVectorization

Update: 1. The bad hoisting of reads/writes is real, so we may want to consider disabling this (for the CPU backend?) or at least having it off by default. However,...

[Codegen] Add vector transfer + slice foldings in GenericVectorization

So before this change we'd get: %15 = scf.for %arg4 = %c0 to %c352 step %c1 iter_args(%arg5 = %14) -> (tensor

[Codegen] Add vector transfer + slice foldings in GenericVectorization

Had a quick look at 2.: The `insert_slice(transfer_write)` fold does not apply because the transfer_write is masked. So, just looking at those two ops, it may not be a legal replacement....

[Codegen] Add vector transfer + slice foldings in GenericVectorization

Btw, I forgot to mention that when I took a look at the folds, I spotted at least one upstream bug, which I reported here: [llvm/llvm-project#101708](https://github.com/llvm/llvm-project/issues/101708)

[Codegen][CPU] Change AArch64 matmul tile sizes to (6, 16, 1)

@hanhanW Are there any aarch64 IREE benchmarks now? ([benchmarks:android-cpu](https://github.com/iree-org/iree/labels/benchmarks%3Aandroid-cpu) seems to no longer function)

[Codegen][CPU] Change AArch64 matmul tile sizes to (6, 16, 1)

The context for this change is that I discovered locally that if the `8, 16, 1` tile size actually gets used (and not resized), the backend ends up running out...

[CPU] Restrict how scalable flags are propagated

Closing this as we have an alternate solution that does not have a negative performance impact :slightly_smiling_face:

Add notice about the removal of `vector.reshape`

cc @joker-eph

Path renderer gobbles up gigabytes or ram

Not got a fix, but the issue is in `Painter::for_each_line_segment_on_cubic_bezier_curve`. It's asked for the line segments of this curve, `c0 [30303.033,-23565770000], c1 [37037.04,-23565756000], point [28282.83,-23565780000]` (and appends them all to...

Path renderer gobbles up gigabytes or ram

These really large values do not play nicely with the path-splitting error computation: the distance between adjacent `float` values is _really_ large here (2048, I think), which just means things go...
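
The 2048 figure can be checked with a short sketch. It assumes the coordinates are stored as IEEE 754 single-precision values (the comment only says `float`), and `float32_spacing` is a hypothetical helper written for this illustration, not part of the renderer:

```python
import struct


def float32_spacing(x: float) -> float:
    """Gap between x (rounded to float32) and the next larger float32.

    Assumes x is positive and finite; this is an illustration only.
    """
    # Round x to the nearest float32 and read its raw 32-bit pattern.
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    # Decode the rounded value and its successor back to Python floats.
    current = struct.unpack("<f", struct.pack("<I", bits))[0]
    # For positive finite floats, incrementing the bit pattern by one
    # yields the next representable value.
    next_up = struct.unpack("<f", struct.pack("<I", bits + 1))[0]
    return next_up - current


# y coordinate quoted in the comment above
print(float32_spacing(23565770000.0))  # 2048.0
# x coordinate quoted in the comment above
print(float32_spacing(30303.033))      # 0.001953125
```

Under that assumption, the y coordinates can only change in steps of 2048, while the x coordinates resolve to roughly 0.002, so any error estimate that relies on small differences in y has very little to work with.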