build
build copied to clipboard
Internal build vs OSS misaligment
We had some c++ flags disaligment between c++ and copybara. See: https://github.com/tensorflow/tensorflow/pull/56276#issuecomment-1164500275
We had also many problem about failure reproducibility between OSS and internal test without forcing OSS with this to reproduce internal failures: https://github.com/tensorflow/tensorflow/pull/56276/commits/11dc383374ea0c34ea60033160d7de5f57aa5116
Also we had multiple rollbacks after merge.
/cc @cheshire any other note?
/cc @learning-to-play
P.s. NIT we Need to enforce: https://github.com/tensorflow/tensorflow/pull/56276#discussion_r915074019
P.s.s. the linking required many CPU cores at every iteration cause we had >50 targets to link (very resource intensive) for just running a single test after an edit. So we needed to do the linking in parallel with many core and with a speedup of LD Gold (https://github.com/tensorflow/build/issues/110).
Also I needed to keep the PR branch freezed without rebasing/merging for months cause a rebase will eventually invalidate the Bazel cache requiring a monster build time again. This is another problem for PR that are open for weeks or months and you could need to rebase to solve conflicts.
We had also a quite tricky issue filtering single test on the development cycle.
You was already notifed at: https://github.com/tensorflow/tensorflow/commit/0f9af911152880e36b2bfa8435e3746b9e329660
@mihaimaruseac I don't know if you could add something here for https://github.com/tensorflow/tensorflow/pull/57468#issuecomment-1231803670
Thanks
I believe https://github.com/tensorflow/tensorflow/commit/96b26a21e8d437745444a9f4b6a8d8e39689d083 is a huge step in fixing this, fixing most (if not all) such issues.
@cheshire Thanks, do you have a full list of the currently enabled warnings as error in copybara related jobs?
I believe https://github.com/tensorflow/tensorflow/commit/96b26a21e8d437745444a9f4b6a8d8e39689d083 is a huge step in fixing this, fixing most (if not all) such issues.
We are still suppressing all the warnings: https://github.com/tensorflow/tensorflow/blob/master/.bazelrc#L295-L301
# Suppress all C++ compiler warnings, otherwise build logs become 10s of MBs.
build:android --copt=-w
build:ios --copt=-w
build:linux --host_copt=-w
build:macos --copt=-w
build:windows --copt=/W0
build:windows --host_copt=/W0
With the mentioned commit we are just enabling the specific unused-result
but if we check this table that flag is one of the default error we have also with -Werror
.
So the point here is to understand what flags we have in the copybara builds as if we are using -Werror
there it has > 50 warning types.
@mihaimaruseac I don't know if you could add something here for tensorflow/tensorflow#57468 (comment)
Sadly, I don't think we can do much here. The issue there was that an internal file with internal code also needed to be updated. Using copybara instead of the internal file would have been too cumbersome.
Though, I also think that this type of breakages is small. It should only occur when you are adding new defines.