javacpp
Parsing abseil span.h fails with NPE
I'm attempting to parse https://github.com/abseil/abseil-cpp/blob/master/absl/types/span.h via the Maven plugin, but it's failing with an NPE.
Exception:
[INFO] --- javacpp:1.5.4:parse (javacpp-parser) @ tensorflow-core-api ---
[INFO] Detected platform "linux-x86_64"
[INFO] Building platform "linux-x86_64"
[INFO] Targeting /windows/Users/jimne/Desktop/OtherStuff/tensorflow_java/tensorflow-core/tensorflow-core-api/src/gen/java/org/tensorflow/internal/c_api/global/tensorflow.java
[INFO] Parsing /home/rnett/.cache/bazel/_bazel_rnett/7f702631aa7a1fe60fd11c0ee4052a1e/external/org_tensorflow/tensorflow/core/util/port.h
[INFO] Parsing /home/rnett/.cache/bazel/_bazel_rnett/7f702631aa7a1fe60fd11c0ee4052a1e/external/org_tensorflow/tensorflow/c/tf_attrtype.h
[INFO] Parsing /home/rnett/.cache/bazel/_bazel_rnett/7f702631aa7a1fe60fd11c0ee4052a1e/external/org_tensorflow/tensorflow/c/tf_datatype.h
[INFO] Parsing /home/rnett/.cache/bazel/_bazel_rnett/7f702631aa7a1fe60fd11c0ee4052a1e/external/org_tensorflow/tensorflow/c/tf_status.h
[INFO] Parsing /home/rnett/.cache/bazel/_bazel_rnett/7f702631aa7a1fe60fd11c0ee4052a1e/external/org_tensorflow/tensorflow/c/tf_tensor.h
[INFO] Parsing /home/rnett/.cache/bazel/_bazel_rnett/7f702631aa7a1fe60fd11c0ee4052a1e/external/org_tensorflow/tensorflow/c/c_api.h
[INFO] Parsing /home/rnett/.cache/bazel/_bazel_rnett/7f702631aa7a1fe60fd11c0ee4052a1e/external/org_tensorflow/tensorflow/c/kernels.h
[INFO] Parsing /home/rnett/.cache/bazel/_bazel_rnett/7f702631aa7a1fe60fd11c0ee4052a1e/external/org_tensorflow/tensorflow/c/ops.h
[INFO] Parsing /home/rnett/.cache/bazel/_bazel_rnett/7f702631aa7a1fe60fd11c0ee4052a1e/external/org_tensorflow/tensorflow/c/eager/c_api.h
[INFO] Parsing /home/rnett/.cache/bazel/_bazel_rnett/7f702631aa7a1fe60fd11c0ee4052a1e/external/org_tensorflow/tensorflow/c/eager/gradients.h
[INFO] Parsing /home/rnett/.cache/bazel/_bazel_rnett/7f702631aa7a1fe60fd11c0ee4052a1e/external/com_google_absl/absl/types/span.h
[INFO] ------------------------------------------------------------------------
[INFO] BUILD FAILURE
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 9.359 s (Wall Clock)
[INFO] Finished at: 2020-11-27T18:17:35-08:00
[INFO] ------------------------------------------------------------------------
[ERROR] Failed to execute goal org.bytedeco:javacpp:1.5.4:parse (javacpp-parser) on project tensorflow-core-api: Execution javacpp-parser of goal org.bytedeco:javacpp:1.5.4:parse failed.: NullPointerException -> [Help 1]
[ERROR]
[ERROR] To see the full stack trace of the errors, re-run Maven with the -e switch.
[ERROR] Re-run Maven using the -X switch to enable full debug logging.
[ERROR]
[ERROR] For more information about the errors and possible solutions, please read the following articles:
[ERROR] [Help 1] http://cwiki.apache.org/confluence/display/MAVEN/PluginExecutionException
The only relevant Infos are:
.put(new Info("gtl::ArraySlice<AbstractTensorHandle*>").cppTypes("absl::Span<AbstractTensorHandle*>"))
.put(new Info("absl::Span<AbstractTensorHandleconst *>", "absl::Span<AbstractTensorHandle*>").pointerTypes("HandleList"))
(gtl::ArraySlice is just using ArraySlice = absl::Span<const T>;)
Yeah, that's pretty advanced C++. It's going to be hard to parse these things without Clang, see issue #51. For now though, in the case of libraries like TensorFlow, we don't usually need to access APIs using it, so we can skip those types this way:
.put(new Info("absl::optional", "absl::Span", "absl::LogSink", "TFLogSink", "std::initializer_list", "std::iterator").skip())
https://github.com/bytedeco/javacpp-presets/blob/master/tensorflow/src/main/java/org/bytedeco/tensorflow/presets/tensorflow.java#L580
After debugging, the NPE is thrown here: https://github.com/bytedeco/javacpp/blob/master/src/main/java/org/bytedeco/javacpp/tools/Parser.java#L487. At that point, type is null and token is int.
I'm trying to parse https://github.com/tensorflow/tensorflow/blob/master/tensorflow/c/eager/gradients.h, which uses it as input in a few places. Is there a way I can generate a pointer type for it without actually parsing it? I'm not actually familiar with absl; can I just pass a vector?
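For context on the "can I just pass a vector?" question: on the C++ side, a span is a non-owning (pointer, length) view, and a read-only span can typically be built implicitly from a container such as std::vector. The sketch below illustrates that idea only. MiniSpan, Handle, SumIds and SumIdsFromVector are hypothetical stand-ins written for this example, not absl or TensorFlow code, and the real absl::Span has a much richer interface.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for absl::Span<T>: a non-owning (pointer, length) view.
template <typename T>
class MiniSpan {
 public:
  MiniSpan(T* data, std::size_t size) : data_(data), size_(size) {}

  // Implicit conversion from a const vector. The view is only valid while
  // the vector it was built from is alive and not resized.
  template <typename U>
  MiniSpan(const std::vector<U>& v) : data_(v.data()), size_(v.size()) {}

  std::size_t size() const { return size_; }
  T& operator[](std::size_t i) const {
    assert(i < size_);  // debug-only bounds check
    return data_[i];
  }
  T* begin() const { return data_; }
  T* end() const { return data_ + size_; }

 private:
  T* data_;
  std::size_t size_;
};

// Placeholder for a handle type such as AbstractTensorHandle (assumption).
struct Handle {
  int id;
};

// Takes a view over pointers it may not reseat, analogous in shape to a
// parameter of type absl::Span<AbstractTensorHandle* const>.
int SumIds(MiniSpan<Handle* const> handles) {
  int total = 0;
  for (Handle* h : handles) total += h->id;
  return total;
}

// Usage: a std::vector of pointers is passed directly where the view is expected.
int SumIdsFromVector() {
  Handle a{1}, b{2}, c{3};
  std::vector<Handle*> v = {&a, &b, &c};
  return SumIds(v);  // the vector converts implicitly to the view
}
```

This only shows why a vector is acceptable at a C++ call site. It does not by itself give JavaCPP a Java type for the span parameter, which is the harder part of the question.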
There's no C API for that? Not even experimental? There's most likely something...
Not that I've been able to find; it only just made it into 2.4.0-rc3. I expect there will be one eventually, but I wanted to see if I could get it to work from C++.
Somehow, probably, but it won't be acceptable for SIG-JVM.
Yeah, I didn't expect it to be this complicated when I started. I'll most likely have to wait for a C API to actually add it.
You can try to map them to generic pointers with .cast().pointerTypes("Pointer").
Checking how the Python API uses pybind11 to access these classes would also help.
Hum, it looks like they are adding C++ to their "C API", so that means we'll need to map those C++ classes. @karllessard
I think the way to go about it is to manually write the wrapper class for absl::Span<AbstractTensorHandle* const>, naming it something like AbstractTensorHandleSpan, similar to what I had to do for std::string[] here:
https://github.com/bytedeco/javacpp-presets/blob/master/tensorflow/src/main/java/org/bytedeco/tensorflow/StringArray.java
We just need to declare the few methods that we find useful, and that's pretty reliable and easy to do. That's how pybind11 works as well, so we shouldn't run into any large issues doing it that way, if they intend their C++ API to be usable from Python that is.
It has to be a mistake that C++ ended up in the C API. @rnett, I would suggest raising this as an issue in https://github.com/tensorflow/tensorflow.
I think the custom wrapper class @saudet is suggesting could be an acceptable workaround but let's see what folks at Google have to say about it first.
Do you know what TensorFlow's intended demarcation is between the C API and the C++ API? There's lots of C++ in the .cc files, but it seems to be kept out of the header files for the most part. There's no TF_CAPI_EXPORT on any of the gradient stuff either, so these may just be internal APIs that aren't marked as such. abstract_context.h and the other abstract_ files have this issue too.
Yes, the implementation of that layer is still in C++, but the API itself, as defined in the headers, should be pure C and protected by extern "C" to make it explicit. At least that was the original intent of one of its principal authors at the time it was created (Asim Shankar), knowing that it would be easier for other languages to bind to a pure C API than to C++, hence the name.
Some of the new headers seem to respect this design properly (e.g., kernels.h), but you are right that others are not following the same rules (tape.h is another). I don't know why such files land in the ABI layer, unless Google has decided that the "C" in "C API" is no longer required. If that's the case, we need to know.
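To make the "pure C header, C++ implementation" design described above concrete, here is a minimal sketch of the pattern. The header part exposes only an opaque handle and C functions inside extern "C", and a list of handles crosses the boundary as a plain pointer plus a count rather than as a span. Every name here (Demo_Handle, Demo_NewHandle, Demo_DeleteHandle, Demo_SumIds, DemoRoundTrip) is hypothetical and invented for illustration. None of it is taken from TensorFlow.

```cpp
#include <cassert>
#include <cstddef>

// --- What a C-only header would expose (hypothetical names) ---
extern "C" {
// Opaque handle: callers never see the layout, only a pointer.
typedef struct Demo_Handle Demo_Handle;

Demo_Handle* Demo_NewHandle(int id);
void Demo_DeleteHandle(Demo_Handle* h);

// A list of handles is passed as (pointer, count), which any language
// binding can produce, instead of a C++ span or vector type.
int Demo_SumIds(Demo_Handle* const* handles, int num_handles);
}

// --- What would live in the .cc file: free to use C++ internally ---
struct Demo_Handle {
  int id;
};

Demo_Handle* Demo_NewHandle(int id) { return new Demo_Handle{id}; }

void Demo_DeleteHandle(Demo_Handle* h) { delete h; }

int Demo_SumIds(Demo_Handle* const* handles, int num_handles) {
  assert(handles != nullptr || num_handles == 0);
  int total = 0;
  for (int i = 0; i < num_handles; ++i) total += handles[i]->id;
  return total;
}

// Usage through the C surface only: create, call, destroy.
int DemoRoundTrip() {
  Demo_Handle* hs[2] = {Demo_NewHandle(4), Demo_NewHandle(5)};
  int total = Demo_SumIds(hs, 2);
  for (Demo_Handle* h : hs) Demo_DeleteHandle(h);
  return total;
}
```

A header written this way needs no C++ parsing at all on the binding side, which is the property the thread is asking the TensorFlow headers to keep.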