torch-mlir [WIP] Use the selective libtorch build

This build path reduces the number of source files from ~1800 to ~1400. The selective build trims unused ops at link time, so most of the compile-time savings come not from fewer kernels but from fewer auxiliary files and slightly leaner dependencies (e.g. FBGEMM is no longer necessary). lightweight_dispatch_ops.yaml is composed of the ops that torch-mlir aims to support (i.e. those that have tablegen definitions) plus the bare minimum necessary to make the pytorch package imports happy (e.g. all of the ops in _meta_registrations.py).

Note there's actually a much leaner build available here by setting BUILD_LITE_INTERPRETER=1 and running build_libtorch.py instead of setup.py to actually build just libtorch (no python bindings).

This path only compiles ~850 source files.

Tests should pass...

Aug 30 '22 00:08 makslevental

I'd like to avoid having yet another file that needs to be updated as we add ops. Is the build time savings massive from this? I think once we have the right caching the difference between 1800 and 1400 won't matter much.

Aug 30 '22 18:08 silvasean

I'd like to avoid having yet another file that needs to be updated as we add ops. Is the build time savings massive from this?

Ya, I agree with this: the value is dubious given the cost of another file to maintain. My thinking was that the list of ops torch-mlir needs could be procedurally generated during this build step, e.g. using grep (that is in fact how I pulled them out).
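A minimal sketch of what such a generation step might look like, in Python rather than grep. It assumes the tablegen definitions contain entries of the form `Torch_Op<"aten.add.Tensor", ...>` and that the selective build wants names in the `aten::add.Tensor` style. The function names and the flat list output are hypothetical and are not the actual schema of lightweight_dispatch_ops.yaml.

```python
import re

# Assumption: each supported op appears in the tablegen source as
#   def Torch_AtenTanhOp : Torch_Op<"aten.tanh", [...
# The first dotted component is the namespace, the rest is the op name
# (plus an optional overload name).
OP_PATTERN = re.compile(r'Torch_Op<"([A-Za-z_0-9]+)\.([^"]+)"')


def extract_op_names(tablegen_text):
    """Return sorted, de-duplicated op names in 'namespace::name' form."""
    names = set()
    for match in OP_PATTERN.finditer(tablegen_text):
        namespace, rest = match.groups()
        names.add(f"{namespace}::{rest}")
    return sorted(names)


def render_op_list(op_names):
    """Render the names as a flat list, one '- name' entry per line.

    Hypothetical output layout; adapt it to whatever the build expects.
    """
    return "".join(f"- {name}\n" for name in op_names)


if __name__ == "__main__":
    sample = '''
    def Torch_AtenTanhOp : Torch_Op<"aten.tanh", [
    def Torch_AtenAddTensorOp : Torch_Op<"aten.add.Tensor", [
    '''
    print(render_op_list(extract_op_names(sample)), end="")
```

Running something like this as part of the build would mean the op list is derived from the tablegen definitions instead of being maintained by hand; the ops needed only to keep the pytorch imports happy would still have to come from somewhere else.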

I think once we have the right caching the difference between 1800 and 1400 won't matter much.

Indeed it won't, and in practice on these GHA instances, for whatever reason, that 1400 turns out to be about 1600.

On the other hand, the real benefit I see here is a lightweight/minimal libtorch.so that can be statically linked against by _jit_ir_importer.so; my understanding from conversations with @powderluv is that this is a wishlist item which enables smoother packaging for environments that can't stomach the weight of stock libtorch.so.

Aug 30 '22 19:08 makslevental

I think we can still statically link with the 1600 files (and the linker will auto prune). But that is a nice to have so we don't link against anything called libtorch.so

Aug 31 '22 02:08 powderluv

I think we can still statically link with the 1600 files (and the linker will auto prune).

It won't. Dispatcher registration code is generated and therefore kernels are not pruned (take a look at any of the build/aten/src/ATen/Register*.cpp files). You only get pruning if you take this path because exactly that step is skipped (op registration code gen).

Aug 31 '22 03:08 makslevental

I think the pruning was an additional benefit. The more important question is whether we can have all symbols resolved locally, so that we don't have to link against PyTorch's libtorch.so and can support a range of PyTorch versions (which we will still have to validate).

Aug 31 '22 22:08 powderluv