Adam Pocock
Can you provide the code where you construct the dataset with and without the train test split? And also ask the training dataset how big it is? I agree that...
Can you check whether the featureIDMap from the `MutableDataset` is the same with and without the splitter? And whether the problem still exists if you use `CARTJointRegressionTrainer`...
Ok, so that sounds a lot more like a bug in the tree implementation itself rather than an issue with the train test splitter. Which is good, because the splitter...
If you can export Google's Gemma models from Keras to a TF Saved Model they should be pretty good - https://www.kaggle.com/models/google/gemma-2.
https://www.tensorflow.org/guide/keras/serialization_and_saving#apis
I haven't written any Spark or Scala code since Spark 1.4 so I don't think I'll have much ability to review this PR.
Looks like the inputs to ONNX Runtime aren't being closed, so it's leaking native memory allocated by ORT. I think `inputIds`, `tokenTypeIds` and `attentionMask` should be closed when the result...
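As a rough illustration of the fix suggested above, here is a minimal sketch of the try-with-resources pattern in plain Java. `FakeNativeTensor` is a hypothetical stand-in written for this example, not an ONNX Runtime class; the assumption is that the real input tensors are `AutoCloseable` in the same way, so the same shape of code would release their native memory once the call finishes.

```java
final class CloseInputsSketch {
    /** Hypothetical stand-in for a native-backed tensor; NOT an ONNX Runtime class. */
    static final class FakeNativeTensor implements AutoCloseable {
        // Counts tensors that have been created but not yet closed.
        static int open = 0;
        final String name;

        FakeNativeTensor(String name) {
            this.name = name;
            open++;
        }

        @Override
        public void close() {
            // A real native-backed tensor would free its native buffer here.
            open--;
        }
    }

    /** Creates three inputs, reports how many are open mid-call, then closes them all. */
    static int runOnce() {
        try (FakeNativeTensor inputIds = new FakeNativeTensor("inputIds");
             FakeNativeTensor tokenTypeIds = new FakeNativeTensor("tokenTypeIds");
             FakeNativeTensor attentionMask = new FakeNativeTensor("attentionMask")) {
            // Inference would run here; the return value is computed before the closes happen.
            return FakeNativeTensor.open;
        }
        // All three resources are closed at this point, in reverse order, even on exceptions.
    }

    public static void main(String[] args) {
        int during = runOnce();
        System.out.println("open during call: " + during);
        System.out.println("open after call: " + FakeNativeTensor.open);
    }
}
```

If the inputs are created without a try-with-resources block (or an explicit `close()` in a `finally`), the counter in this sketch would keep growing with every call, which is the analogue of the native memory leak described above.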
`tf.nn.embedding_lookup` is a layer of Python over the top of a gather call, which we have in the core ops. There's a bit of complexity there as the thing it...
We have not added a wrapper around the gather ops yet.
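To make the gather relationship concrete, here is a minimal sketch in plain Java of what an embedding lookup computes in the simplest case: selecting rows of a table by integer id. It deliberately does not use the TensorFlow Java API; `GatherSketch` and `gather` are hypothetical names for this illustration, and the sketch ignores any extra handling the Python function layers on top.

```java
import java.util.Arrays;

final class GatherSketch {
    /**
     * Row-wise gather along axis 0: output row i is a copy of table[ids[i]].
     * This is the core of an embedding lookup over a single embedding table.
     */
    static float[][] gather(float[][] table, int[] ids) {
        float[][] out = new float[ids.length][];
        for (int i = 0; i < ids.length; i++) {
            // An out-of-range id throws ArrayIndexOutOfBoundsException here.
            out[i] = table[ids[i]].clone();
        }
        return out;
    }

    public static void main(String[] args) {
        // Three embeddings of dimension two.
        float[][] table = {
            {0.0f, 0.5f},
            {1.0f, 1.5f},
            {2.0f, 2.5f}
        };
        // Ids may repeat and appear in any order.
        float[][] rows = gather(table, new int[] {2, 0, 2});
        System.out.println(Arrays.deepToString(rows));
    }
}
```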
I loaded the `silero_vad.onnx` model from https://github.com/snakers4/silero-vad/blob/master/files/silero_vad.onnx successfully on Linux x64 using JDK 17.0.4 with ORT 1.18.0 & 1.16.0, and on macOS arm64 14.5 with ORT 1.18.0. The path you...