Can no longer run XLA lit tests
Issue type
Bug
Have you reproduced the bug with TensorFlow Nightly?
Yes
Source
source
TensorFlow version
TF 2.15
Custom code
No
Current behavior?
Previously, I could run the XLA unit tests via `bazel test //tensorflow/compiler/xla/...:all`.
However, in TF 2.15, after XLA was moved to `third_party/xla`, I am encountering issues.
I updated my command to `bazel test @local_xla//xla/...:all`. While most tests run successfully, some hardcoded paths appear to prevent the LLVM lit tests from running correctly. See the logs below. The lit configs probably need to be updated.
Standalone code to reproduce the issue
Check out TensorFlow, configure it, and run `bazel test @local_xla//xla/...:all`.
Relevant log output
================================================================================
FAIL: @local_xla//xla/mlir/backends/gpu/transforms/tests:gpu_memcpy.mlir.test (see /root/.cache/bazel/_bazel_root/a8fc6d0749b4f3c43761726a36e8ec4c/execroot/org_tensorflow/bazel-out/k8-opt/testlogs/external/local_xla/xla/mlir/backends/gpu/transforms/tests/gpu_memcpy.mlir.test/test.log)
[27,548 / 27,604] 348 / 447 tests, 216 failed; [Sched] Testing @local_xla//xla/mlir_hlo/tests:Dialect/mhlo/hlo-collapse-elementwise-map.mlir.test; 45s ... (55 actions, 2 running)
INFO: From Testing @local_xla//xla/mlir/backends/gpu/transforms/tests:gpu_memcpy.mlir.test:
==================== Test output for @local_xla//xla/mlir/backends/gpu/transforms/tests:gpu_memcpy.mlir.test:
Running test /root/.cache/bazel/_bazel_root/a8fc6d0749b4f3c43761726a36e8ec4c/execroot/org_tensorflow/bazel-out/k8-opt/bin/external/local_xla/xla/mlir/backends/gpu/transforms/tests/gpu_memcpy.mlir.test.runfiles/org_tensorflow/../local_xla/xla/mlir/backends/gpu/transforms/tests/gpu_memcpy.mlir.test xla/gpu_memcpy.mlir --config-prefix=runlit -v on GPU 0
lit.py: /root/.cache/bazel/_bazel_root/a8fc6d0749b4f3c43761726a36e8ec4c/external/llvm-raw/llvm/utils/lit/lit/discovery.py:137: warning: unable to find test suite for 'xla/gpu_memcpy.mlir'
lit.py: /root/.cache/bazel/_bazel_root/a8fc6d0749b4f3c43761726a36e8ec4c/external/llvm-raw/llvm/utils/lit/lit/discovery.py:276: warning: input 'xla/gpu_memcpy.mlir' contained no tests
error: did not discover any tests for provided path(s)
================================================================================
FAIL: @local_xla//xla/mlir_hlo/tests:Dialect/mhlo/hlo-collapse-elementwise-map.mlir.test (see /root/.cache/bazel/_bazel_root/a8fc6d0749b4f3c43761726a36e8ec4c/execroot/org_tensorflow/bazel-out/k8-opt/testlogs/external/local_xla/xla/mlir_hlo/tests/Dialect/mhlo/hlo-collapse-elementwise-map.mlir.test/test.log)
INFO: From Testing @local_xla//xla/mlir_hlo/tests:Dialect/mhlo/hlo-collapse-elementwise-map.mlir.test:
==================== Test output for @local_xla//xla/mlir_hlo/tests:Dialect/mhlo/hlo-collapse-elementwise-map.mlir.test:
Running test /root/.cache/bazel/_bazel_root/a8fc6d0749b4f3c43761726a36e8ec4c/execroot/org_tensorflow/bazel-out/k8-opt/bin/external/local_xla/xla/mlir_hlo/tests/Dialect/mhlo/hlo-collapse-elementwise-map.mlir.test.runfiles/org_tensorflow/../local_xla/xla/mlir_hlo/tests/Dialect/mhlo/hlo-collapse-elementwise-map.mlir.test -v external/local_xla/xla/mlir_hlo/tests/Dialect/mhlo/hlo-collapse-elementwise-map.mlir on GPU 0
lit.py: /root/.cache/bazel/_bazel_root/a8fc6d0749b4f3c43761726a36e8ec4c/external/llvm-raw/llvm/utils/lit/lit/TestingConfig.py:151: fatal: unable to parse config file '/root/.cache/bazel/_bazel_root/a8fc6d0749b4f3c43761726a36e8ec4c/execroot/org_tensorflow/bazel-out/k8-opt/bin/external/local_xla/xla/mlir_hlo/tests/Dialect/mhlo/hlo-collapse-elementwise-map.mlir.test.runfiles/org_tensorflow/external/local_xla/xla/mlir_hlo/tests/lit.site.cfg.py', traceback: Traceback (most recent call last):
File "/root/.cache/bazel/_bazel_root/a8fc6d0749b4f3c43761726a36e8ec4c/external/llvm-raw/llvm/utils/lit/lit/TestingConfig.py", line 139, in load_from_path
exec(compile(data, path, "exec"), cfg_globals, None)
File "/root/.cache/bazel/_bazel_root/a8fc6d0749b4f3c43761726a36e8ec4c/execroot/org_tensorflow/bazel-out/k8-opt/bin/external/local_xla/xla/mlir_hlo/tests/Dialect/mhlo/hlo-collapse-elementwise-map.mlir.test.runfiles/org_tensorflow/external/local_xla/xla/mlir_hlo/tests/lit.site.cfg.py", line 44, in <module>
lit_config.load_config(config, "xla/mlir_hlo/tests/lit.cfg.py")
File "/root/.cache/bazel/_bazel_root/a8fc6d0749b4f3c43761726a36e8ec4c/external/llvm-raw/llvm/utils/lit/lit/LitConfig.py", line 152, in load_config
config.load_from_path(path, self)
File "/root/.cache/bazel/_bazel_root/a8fc6d0749b4f3c43761726a36e8ec4c/external/llvm-raw/llvm/utils/lit/lit/TestingConfig.py", line 126, in load_from_path
f = open(path)
FileNotFoundError: [Errno 2] No such file or directory: 'xla/mlir_hlo/tests/lit.cfg.py'
@ddunl It seems like perhaps the lit configs need to be updated in order to run the XLA tests via TF? I see a few hardcoded paths like this: https://github.com/tensorflow/tensorflow/blob/fc347ca5597a0a2a58d4f0f344d1210afede2cc5/third_party/xla/xla/glob_lit_test.bzl#L54
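To illustrate the kind of path mismatch the traceback above points at, here is a minimal Python sketch. It is not the actual lit or Bazel logic: the helper names `find_lit_cfg` and `make_fake_runfiles` are hypothetical, and the toy directory layout (with `lit.cfg.py` sitting under `external/local_xla` inside the `org_tensorflow` runfiles root) is an assumption inferred from the location of `lit.site.cfg.py` in the log.

```python
import os
import tempfile


def find_lit_cfg(runfiles_root, rel_path, repo_prefixes=("", "external/local_xla")):
    """Return the first existing path for rel_path under runfiles_root, else None.

    repo_prefixes is a hypothetical search list: the main workspace root
    first, then the external-repo subdirectory (assumed layout).
    """
    for prefix in repo_prefixes:
        candidate = os.path.join(runfiles_root, prefix, rel_path)
        if os.path.isfile(candidate):
            return candidate
    return None


def make_fake_runfiles(base_dir):
    """Create a toy runfiles tree with lit.cfg.py under external/local_xla."""
    root = os.path.join(base_dir, "org_tensorflow")
    cfg_dir = os.path.join(root, "external", "local_xla", "xla", "mlir_hlo", "tests")
    os.makedirs(cfg_dir)
    with open(os.path.join(cfg_dir, "lit.cfg.py"), "w") as f:
        f.write("# placeholder lit config\n")
    return root


if __name__ == "__main__":
    # The hardcoded path from the traceback.
    rel = "xla/mlir_hlo/tests/lit.cfg.py"
    with tempfile.TemporaryDirectory() as tmp:
        root = make_fake_runfiles(tmp)
        # Opening the hardcoded path relative to the runfiles root fails...
        print(os.path.isfile(os.path.join(root, rel)))
        # ...but the file is found once the external-repo prefix is tried.
        print(find_lit_cfg(root, rel))
```

If the real layout matches this assumption, any path hardcoded as `xla/...` would need the `external/local_xla/` prefix (or an equivalent runfiles lookup) when XLA is consumed as an external repository from TF.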
I see. I think I probably deleted the transformations that kept this working, since these tests aren't run on CI anymore from the TF point of view. I'll try to fix this in the next two weeks or so (I'll be on vacation for a little bit soon, so I won't get to this as quickly as I normally could).
Thank you!
Hi @ddunl, I'm wondering if you've had a chance to take a look at this issue yet. Thanks!
@ddunl I managed to get these working with a combination of:
- My changes here: https://github.com/trevor-m/tensorflow/commit/c2fabfcb0e67df4f269483f61f1a443b853dded7
  Looks like there are a few paths that need to be modified to reflect the TF runfile structure: `MLIR_HLO_TOOLS_DIR`, used for the lit config template, and `XlaSrcRoot()`, used by the tests. Also, it looks like some string substitution is going awry in some of the .mlir files during the automated transfer from XLA->TF (copybara?).
- This commit: https://github.com/tensorflow/tensorflow/commit/767225e0d1acdb2ac5f478baba9a158f7c4b5ea0