zoltar
Common library for serving TensorFlow, XGBoost and scikit-learn models in production.
We should add logging support in the same way we add instrumentation.
Add a new `zoltar-benchmark` subproject dedicated to this.
Apply the consequences of https://github.com/spotify/scio/pull/1238:
* inference on local models should be sync by default
* inference on remote models (ml-engine) should be async by default
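The two defaults above could be sketched as follows. This is only an illustration built on the JDK's `CompletableFuture`; `PredictDefaultsSketch`, `syncByDefault` and `asyncByDefault` are hypothetical names and not part of the Zoltar API.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CompletionStage;
import java.util.function.Function;

// Hypothetical sketch: none of these names are part of the Zoltar API.
final class PredictDefaultsSketch {

  /** Local model: run inference on the calling thread and return an already completed stage. */
  static <I, O> Function<I, CompletionStage<O>> syncByDefault(Function<I, O> localPredict) {
    return input -> CompletableFuture.completedFuture(localPredict.apply(input));
  }

  /** Remote model: hand the blocking call to another thread so the caller is not blocked. */
  static <I, O> Function<I, CompletionStage<O>> asyncByDefault(Function<I, O> remotePredict) {
    return input -> CompletableFuture.supplyAsync(() -> remotePredict.apply(input));
  }

  public static void main(String[] args) {
    // Stand-ins for a local model and a remote (ml-engine style) call.
    Function<Integer, CompletionStage<Integer>> local =
        PredictDefaultsSketch.<Integer, Integer>syncByDefault(x -> x * 2);
    Function<Integer, CompletionStage<Integer>> remote =
        PredictDefaultsSketch.<Integer, Integer>asyncByDefault(x -> x + 1);

    System.out.println(local.apply(21).toCompletableFuture().join());
    System.out.println(remote.apply(41).toCompletableFuture().join());
  }
}
```

Both variants return the same `CompletionStage` type, so callers would not need to change if the default for a given model type were flipped.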
Sparse vectors are commonly used to extract features, and we should make sure they work seamlessly. Let's at least add tests for both models (and maybe an example).
This should not be hardcoded here: https://github.com/spotify/zoltar/blob/master/examples/apollo-service-example/src/main/java/com/spotify/zoltar/examples/apollo/IrisPrediction.java#L64-L67 It might come from the feature spec, or whatever makes sense.
Right now, if there are assets in a TF saved model on GCS, they are downloaded locally to a tmp location. We should make it easy to retrieve assets...
Add support for startup checks. The flow would be:
* load the model
* run a user-defined check on the model
* allow prediction

This would be useful in canary...
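The load, check, predict flow could look roughly like this. It is a minimal sketch using only `java.util.function` types; `StartupCheckSketch` and `loadChecked` are hypothetical names, and the toy weight-vector "model" is an assumption made purely for illustration.

```java
import java.util.function.Function;
import java.util.function.Predicate;
import java.util.function.Supplier;

// Hypothetical sketch: these names are illustrative and not part of the Zoltar API.
final class StartupCheckSketch {

  /**
   * Loads the model, runs the user-defined check once, and only then returns
   * the prediction function. Throws if the check fails, so no prediction is
   * ever served from a model that did not pass the check.
   */
  static <M, I, O> Function<I, O> loadChecked(
      Supplier<M> loader,
      Predicate<M> startupCheck,
      Function<M, Function<I, O>> predictor) {
    M model = loader.get();                    // 1. load the model
    if (!startupCheck.test(model)) {           // 2. run the user-defined check
      throw new IllegalStateException("startup check failed");
    }
    return predictor.apply(model);             // 3. allow prediction
  }

  public static void main(String[] args) {
    // Toy "model": a weight vector. The check only verifies its dimensionality.
    Function<double[], Double> predict =
        StartupCheckSketch.<double[], double[], Double>loadChecked(
            () -> new double[] {0.5, 0.5},
            w -> w.length == 2,
            w -> x -> w[0] * x[0] + w[1] * x[1]);

    System.out.println(predict.apply(new double[] {2.0, 4.0}));
  }
}
```

Failing fast with an exception is one possible design. A real implementation might instead surface the failure through a health endpoint, so that a canary deployment can be rolled back automatically.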
We should add some convenience methods, maybe in `TensorFlowExtras`, to feed Featran `FloatSparseArray` and `DoubleSparseArray` into TensorFlow. https://github.com/spotify/featran/blob/master/java/src/main/scala/com/spotify/featran/java/JavaOps.scala#L126

The feeding code might look like this:

```java
runner
  .feed("input/raw_indices", Tensors.create(new long[]{0,...
```
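As a rough idea of what such a convenience method would have to prepare, the sketch below turns a sparse array into the three plain arrays a sparse input is typically described by: indices, values and dense shape. It deliberately uses no Featran or TensorFlow classes. `SparseFeedSketch` is a hypothetical name, and the `(int[] indices, float[] values, int length)` layout is an assumption about how a sparse array exposes its data.

```java
import java.util.Arrays;

// Hypothetical sketch: plain arrays stand in for Featran sparse arrays and TensorFlow tensors.
final class SparseFeedSketch {
  final long[] indices;     // positions of the non-zero entries, widened to long
  final float[] values;     // the non-zero entries themselves
  final long[] denseShape;  // shape of the equivalent dense vector

  private SparseFeedSketch(long[] indices, float[] values, long[] denseShape) {
    this.indices = indices;
    this.values = values;
    this.denseShape = denseShape;
  }

  /** Assumed input layout: parallel index/value arrays plus the dense length. */
  static SparseFeedSketch of(int[] indices, float[] values, int length) {
    if (indices.length != values.length) {
      throw new IllegalArgumentException("indices and values must have the same length");
    }
    long[] wide = new long[indices.length];
    for (int i = 0; i < indices.length; i++) {
      if (indices[i] < 0 || indices[i] >= length) {
        throw new IllegalArgumentException("index out of range: " + indices[i]);
      }
      wide[i] = indices[i];  // int -> long widening
    }
    return new SparseFeedSketch(wide, values.clone(), new long[] {length});
  }

  public static void main(String[] args) {
    SparseFeedSketch s = of(new int[] {0, 3}, new float[] {1.5f, 2.5f}, 5);
    System.out.println(Arrays.toString(s.indices));
    System.out.println(Arrays.toString(s.values));
    System.out.println(Arrays.toString(s.denseShape));
  }
}
```

Each of the three arrays would then be wrapped in a tensor and fed under whatever input names the model's graph defines.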