Ben Vanik
132GB of allocated device memory is a lot - just because you have that much physical memory does not mean that all of it can be allocated. We never even...
The problem when running up against physical memory limits is that it's not something you can reason about as a sum: you can almost never use all of the physical...
heh, yeah, that'll be a problem :P I'm going to bet that it's some hoisted initializers that are transposing every single parameter or something ridiculous (260=2*130, probably two copies of...
yeah, we suballocate, produce a max value, and then allocate that - if you pass --mlir-print-ir-before=iree-stream-schedule-allocation / --mlir-print-ir-after=iree-stream-schedule-allocation it'll be easier to see what's mapping to what
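To make the "suballocate, produce a max value, allocate that" idea concrete, here's a toy sketch. It is not the compiler's actual packing algorithm: the Transient type, the pack function, the greedy first-fit strategy, and the 64-byte alignment are all made up for illustration. The only point is that values whose lifetimes don't overlap can share offsets in one slab, so the single allocation is the max extent reached rather than the sum of all sizes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Transient:
    """Hypothetical description of one suballocated value."""
    name: str
    size: int   # bytes
    start: int  # first step at which the value is live
    end: int    # last step at which the value is live (inclusive)


def pack(transients, alignment=64):
    """Greedy first-fit packing; returns ({name: offset}, slab_size)."""

    def align(x):
        return (x + alignment - 1) // alignment * alignment

    placed = []   # (offset, size, transient) for everything placed so far
    offsets = {}
    # Place larger values first; ties keep their input order.
    for t in sorted(transients, key=lambda t: -t.size):
        # Byte ranges held by values that are live at the same time as t.
        busy = sorted(
            (o, o + s)
            for o, s, p in placed
            if not (p.end < t.start or t.end < p.start)
        )
        offset = 0
        for lo, hi in busy:
            if offset + t.size <= lo:
                break  # t fits in the gap before this range
            offset = max(offset, align(hi))
        placed.append((offset, t.size, t))
        offsets[t.name] = offset
    # The one real allocation is the max extent any suballocation reaches.
    slab_size = max((o + s for o, s, _ in placed), default=0)
    return offsets, slab_size


values = [
    Transient("a", 128, start=0, end=1),
    Transient("b", 64, start=2, end=3),   # never live together with "a"
    Transient("c", 64, start=0, end=3),   # overlaps both "a" and "b"
]
offsets, slab_size = pack(values)
print(offsets, slab_size)
```

Here "a" and "b" end up sharing offset 0 because they are never live together, so the slab is smaller than the 256 bytes you'd get by summing the sizes.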
Nice, you've found it - that's what I suspected. As you note, when models get this big (though I'd argue for anything deployed of any size) we need to be...
That's great news :) Thinking ahead to when cases worse than this arise, something we should do is have some analysis that forces stream partitioning to min-peak-memory when execution is...
(closing as stale)
isStructurallyEquivalentTo with the cache is what you'll want to use. Currently it has the #3996 TODO about symbols that would make it not work for this, but that could be...
Responding to your question first, then I'll take a look at the new code! > From what we want to do in this pass, it seems the best place to run...
generic is useful for people who want to run something on a machine that is not their own - that's why it's the default for things like clang/gcc - host...