decision-forests
decision-forests copied to clipboard
Bazel won't compile on 0.2.1 due to missing absl::Time include
Hi all, thanks for all the effort on this project.
I'm here just to document some compilation errors on Linux Mint / Ubuntu 20.04, GCC 8, Python 3.8, JDK 11, Bazel 3.7.2, tag 0.2.1
:
ERROR: /home/alessandro/.cache/bazel/_bazel_alessandro/12a6f981b14c4d716385955ae691d270/external/ydf/yggdrasil_decision_forests/learner/distributed_decision_tree/load_balancer/BUILD:19:15: C++ compilation of rule '@ydf//yggdrasil_decision_forests/learner/distributed_decision_tree/load_balancer:load_balancer' failed (Exit 1): gcc failed: error executing command /usr/bin/gcc -U_FORTIFY_SOURCE -fstack-protector -Wall -Wunused-but-set-parameter -Wno-free-nonheap-object -fno-omit-frame-pointer -g0 -O2 '-D_FORTIFY_SOURCE=1' -DNDEBUG -ffunction-sections ... (remaining 43 argument(s) skipped)
In file included from external/ydf/yggdrasil_decision_forests/learner/distributed_decision_tree/load_balancer/load_balancer.cc:16:
external/ydf/yggdrasil_decision_forests/learner/distributed_decision_tree/load_balancer/load_balancer.h:270:9: error: 'Time' in namespace 'absl' does not name a type
absl::Time time_last_balancing_;
^~~~
external/ydf/yggdrasil_decision_forests/learner/distributed_decision_tree/load_balancer/load_balancer.cc: In member function 'yggdrasil_decision_forests::utils::StatusOr<bool> yggdrasil_decision_forests::model::distributed_decision_tree::LoadBalancer::AddWorkDurationMeasurement(const std::vector<yggdrasil_decision_forests::model::distributed_decision_tree::LoadBalancer::Measure>&)':
external/ydf/yggdrasil_decision_forests/learner/distributed_decision_tree/load_balancer/load_balancer.cc:190:17: error: 'time_last_balancing_' was not declared in this scope
(now - time_last_balancing_ >=
^~~~~~~~~~~~~~~~~~~~
external/ydf/yggdrasil_decision_forests/learner/distributed_decision_tree/load_balancer/load_balancer.cc:190:17: note: suggested alternative: 'num_measures_last_balancing_'
(now - time_last_balancing_ >=
^~~~~~~~~~~~~~~~~~~~
num_measures_last_balancing_
A simple
#include "absl/time/time.h"
in external/ydf/yggdrasil_decision_forests/learner/distributed_decision_tree/load_balancer/load_balancer.h
seems to solve that problem.
Another one arised:
external/ydf/yggdrasil_decision_forests/metric/metric.cc: In function 'void yggdrasil_decision_forests::metric::AddPrediction(const yggdrasil_decision_forests::metric::proto::EvaluationOptions&, const yggdrasil_decision_forests::model::proto::Prediction&, yggdrasil_decision_forests::utils::RandomEngine*, yggdrasil_decision_forests::metric::proto::EvaluationResults*)':
external/ydf/yggdrasil_decision_forests/metric/metric.cc:880:5: error: jump to case label [-fpermissive]
default:
^~~~~~~
In file included from external/ydf/yggdrasil_decision_forests/utils/logging.h:19,
from external/ydf/yggdrasil_decision_forests/metric/metric.h:32,
from external/ydf/yggdrasil_decision_forests/metric/metric.cc:16:
external/ydf/yggdrasil_decision_forests/utils/logging_default.h:72:28: note: crosses initialization of 'const absl::lts_20210324::Status status_for876'
const auto COMBINE_TERMS(status_for, __LINE__) = expr; \
^~~~~~~~~~
external/ydf/yggdrasil_decision_forests/utils/logging_default.h:66:33: note: in definition of macro 'COMBINE_TERMS_SUB'
#define COMBINE_TERMS_SUB(X, Y) X##Y // helper macro
^
external/ydf/yggdrasil_decision_forests/utils/logging_default.h:72:14: note: in expansion of macro 'COMBINE_TERMS'
const auto COMBINE_TERMS(status_for, __LINE__) = expr; \
^~~~~~~~~~~~~
external/ydf/yggdrasil_decision_forests/metric/metric.cc:876:7: note: in expansion of macro 'CHECK_OK'
CHECK_OK(uplift::AddUpliftPredictionImp(option, pred, rnd, eval));
(actually 2/3 of the exact same kind).
This is solved by properly encapsulating cases in respective scopes, in FinalizeEvaluation
, InitializeEvaluation
and AddPrediction
methods in metric.cc
.
switch (option.task()) {
case model::proto::Task::CLASSIFICATION: {
auto* eval_cls = eval->mutable_classification();
const int32_t num_classes =
label_column.categorical().number_of_unique_values();
if (option.classification().roc_enable()) {
// Compute the ROCs.
for (int label_value = 0; label_value < num_classes; label_value++) {
auto* roc = eval_cls->mutable_rocs()->Add();
BuildROCCurve(option, label_column, *eval, label_value, roc);
}
}
} break;
case model::proto::Task::REGRESSION: {
// Performs bootstrapping.
if (option.bootstrapping_samples() > 0) {
LOG(INFO) << "Computing rmse intervals of evaluation metrics with "
"bootstrapping.";
internal::UpdateRMSEConfidenceIntervalUsingBootstrapping(option, eval);
}
} break;
case model::proto::Task::RANKING: {
FinalizeRankingMetricsFromSampledPredictions(option, eval);
} break;
case model::proto::Task::CATEGORICAL_UPLIFT: {
CHECK_OK(uplift::FinalizeUpliftMetricsFromSampledPredictions(
option, label_column, eval));
} break;
default:
break;
}
and analogously in the other methods.
Then, another error.
external/ydf/yggdrasil_decision_forests/model/random_forest/random_forest.cc:548:10: required from here
external/ydf/yggdrasil_decision_forests/utils/compatibility.h:120:5: error: no matching function for call to 'std::vector<yggdrasil_decision_forests::model::proto::VariableImportance>::vector(std::remove_reference<const yggdrasil_decision_forests::utils::StatusOr<std::vector<yggdrasil_decision_forests::model::proto::VariableImportance> >&>::type)'
new (&data_) T(std::move(data));
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
This was a bit harder to understand, but it's (probably) solved by adding a .status()
in return value of RandomForestModel::GetVariableImportance
in external/ydf/yggdrasil_decision_forests/model/random_forest/random_forest.cc
.
The same error (and solution) was applied to the following
external/ydf/yggdrasil_decision_forests/model/gradient_boosted_trees/gradient_boosted_trees.cc:545:10: required from here
external/ydf/yggdrasil_decision_forests/utils/compatibility.h:120:5: error: no matching function for call to 'std::vector<yggdrasil_decision_forests::model::proto::VariableImportance>::vector(std::remove_reference<const yggdrasil_decision_forests::utils::StatusOr<std::vector<yggdrasil_decision_forests::model::proto::VariableImportance> >&>::type)'
new (&data_) T(std::move(data));
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
This error was quite a PITA.
external/ydf/yggdrasil_decision_forests/learner/distributed_gradient_boosted_trees/distributed_gradient_boosted_trees.cc:522:14: required from here
external/ydf/yggdrasil_decision_forests/utils/compatibility.h:120:5: error: no matching function for call to 'std::unique_ptr<yggdrasil_decision_forests::model::gradient_boosted_trees::GradientBoostedTreesModel>::unique_ptr(std::remove_reference<const absl::lts_20210324::Status&>::type)'
new (&data_) T(std::move(data));
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
the return value had to be a StatusOr
, while a Status
was being returned, so I had to wrap it and return a tils::StatusOr<std::unique_ptr<gradient_boosted_trees::GradientBoostedTreesModel>>(iter_status)
. Again,
external/ydf/yggdrasil_decision_forests/utils/distribute/implementations/grpc/grpc_worker.cc: In member function 'absl::lts_20210324::Status yggdrasil_decision_forests::distribute::{anonymous}::WorkerService::BlockingDoneOnWorker()':
external/ydf/yggdrasil_decision_forests/utils/status_macros_default.h:33:12: error: could not convert '_status' from 'grpc::Status' to 'absl::lts_20210324::Status'
return _status;
^~~~~~~
external/ydf/yggdrasil_decision_forests/utils/distribute/implementations/grpc/grpc_worker.cc:174:5: note: in expansion of macro 'RETURN_IF_ERROR'
An unnecessary AbslStatusToGrpcStatus
was called from a (already) GrpcStatus
.
That led to a complete build.
Also,
./tools/build_pip_package.sh simple
didn't work since it requires python3.9
. Modifying the bash script with 3.8 would have been fine, except from the fact I had a custom built TensorFlow. In this case I would simply recommend not to --force-reinstall
when you pip3 install dist/tensorflow_decision_forests-*-cp${PACKAGE}-cp${PACKAGE}*-linux_x86_64.whl
.
Finally, examples/minimal.py
is bad indented.
Traceback (most recent call last):
File "examples/minimal.py", line 33, in <module>
import tensorflow_decision_forests as tfdf
File "/home/alessandro/traiding/lib/python3.8/site-packages/tensorflow_decision_forests/__init__.py", line 53, in <module>
from tensorflow_decision_forests.tensorflow import check_version
File "/home/alessandro/traiding/lib/python3.8/site-packages/tensorflow_decision_forests/tensorflow/check_version.py", line 33
if tf_version is None:
^
IndentationError: expected an indented block
Eventually, I have the same error that I got installing via pip
, which is
WARNING:root:Failure to load the custom c++ tensorflow ops. This error is likely caused the version of TensorFlow and TensorFlow Decision Forests are not compatible.
Traceback (most recent call last):
File "examples/minimal.py", line 33, in <module>
import tensorflow_decision_forests as tfdf
File "/home/alessandro/traiding/lib/python3.8/site-packages/tensorflow_decision_forests/__init__.py", line 57, in <module>
from tensorflow_decision_forests import keras
File "/home/alessandro/traiding/lib/python3.8/site-packages/tensorflow_decision_forests/keras/__init__.py", line 49, in <module>
from tensorflow_decision_forests.keras import core
File "/home/alessandro/traiding/lib/python3.8/site-packages/tensorflow_decision_forests/keras/core.py", line 60, in <module>
from tensorflow_decision_forests.tensorflow import core as tf_core
File "/home/alessandro/traiding/lib/python3.8/site-packages/tensorflow_decision_forests/tensorflow/core.py", line 35, in <module>
from tensorflow_decision_forests.tensorflow.ops.training import api as training_op
File "/home/alessandro/traiding/lib/python3.8/site-packages/tensorflow_decision_forests/tensorflow/ops/training/api.py", line 29, in <module>
raise e
File "/home/alessandro/traiding/lib/python3.8/site-packages/tensorflow_decision_forests/tensorflow/ops/training/api.py", line 26, in <module>
tf.load_op_library(resource_loader.get_path_to_datafile("training.so"))
File "/home/alessandro/traiding/lib/python3.8/site-packages/tensorflow/python/framework/load_library.py", line 58, in load_op_library
lib_handle = py_tf.TF_LoadLibrary(library_filename)
tensorflow.python.framework.errors_impl.NotFoundError: /home/alessandro/traiding/lib/python3.8/site-packages/tensorflow_decision_forests/tensorflow/ops/training/training.so: undefined symbol: _ZN10tensorflow5ScopeD1Ev
and I have no idea why it doesn't find the linked library, since I just compiled it. Should I add something on the LD_LIBRARY_PATH
?
Thanks a lot for the alert and the search of solutions @AlessandroFlati ! :)
Those comments have been integrated in the commits:
- https://github.com/tensorflow/decision-forests/commit/26a2e6751438d5ab1acc72aea63059c90f18ceee
- https://github.com/google/yggdrasil-decision-forests/commit/b591941587143fc9c4ab748566e4e22d9fadd2bb
No problem, I know too well how hard it takes to retrieve such "structured" feedback on non-standard environments. :)
Maybe you have any suggestion for the missing symbol in the linked .so
library?
Can you give me some more details:
-
"I have the same error that I got installing via pip, which is" -> Do you mean that you see this error message both from the pip install version and from your own compilation of tf-df?
-
"undefined symbol: _ZN10tensorflow5ScopeD1Ev" -> Is that the entire message? Just to make sure: You have tf2.7.0 installed (the one compatible with tf-df 0.2.1)?
The answer is yes at all questions :grin: Actually, I have a custom built TF2.7.0, because of Cuda Capabilities 3.0 which were not natively supported (they actually were, but they forgot to update the CMake, I opened a similar issue in their case) - but with the manually built version of TF I manage to use all functionalities of TF2.7, with CUDA, CUDNN etc. - do you know by any chanche where is located that TensorflowScope symbol? Maybe I forgot to compile a specific submodule of Tensorflow, or I don't know..
Closing as obsolete, TF-DF, TF and bazel have changed significantly since. Feel free to open a new issue if you still have this problem.