rikai
Parquet-based ML data format optimized for working with unstructured data
The Docker image was broken and too complicated to maintain. I simplified the build using two builder images (one for the jar, one for the Python wheels). I've also added some cleanup...
We should catch the error earlier. It might be a PySpark error-message logging problem.
Based on my bad experience in https://github.com/eto-ai/rikai/issues/612, I think schema verification for ModelType is necessary to make it easier to develop a ModelType.
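A minimal sketch of what such a check could look like, using plain Python types as a stand-in for Rikai's actual Spark schema machinery. `verify_output_schema` and its arguments are hypothetical names invented for illustration, not Rikai's API:

```python
def verify_output_schema(output: dict, expected: dict) -> list:
    """Return a list of mismatch messages for a model output.

    An empty list means the output matches the declared schema.
    `expected` maps field name -> Python type, standing in for a
    real Spark StructType comparison.
    """
    errors = []
    for field, expected_type in expected.items():
        if field not in output:
            errors.append(f"missing field: {field}")
        elif not isinstance(output[field], expected_type):
            errors.append(
                f"field {field!r}: expected {expected_type.__name__}, "
                f"got {type(output[field]).__name__}"
            )
    return errors

# A detection-style output that matches its declared schema:
schema = {"label": str, "score": float}
assert verify_output_schema({"label": "cat", "score": 0.9}, schema) == []

# A malformed output fails early with a readable message instead of a
# deep PySpark serialization error later in the pipeline:
assert verify_output_schema({"label": 1}, schema) == [
    "field 'label': expected str, got int",
    "missing field: score",
]
```

The point is fail-fast feedback while developing a ModelType: a wrong output shape is reported at registration time rather than mid-inference.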
For Rikai: + [Python notebooks now use the IPython kernel](https://docs.databricks.com/release-notes/runtime/11.0.html#id10)
https://spark.apache.org/releases/spark-release-3-3-0.html

Three highlights for Rikai:

+ Support complex types for Parquet vectorized reader ([SPARK-34863](https://issues.apache.org/jira/browse/SPARK-34863))
+ Provide a profiler for Python/Pandas UDFs ([SPARK-37443](https://issues.apache.org/jira/browse/SPARK-37443))
+ More comprehensive DS V2 push down capabilities...
We had trouble getting Rikai to work with Databricks 9.1 LTS (based on Apache Spark 3.1.2); see https://github.com/eto-ai/rikai/issues/522. Rikai works well with Databricks 10.4 LTS (based on Apache Spark 3.2.1).
Just tried https://github.com/eto-ai/rikai/blob/main/notebooks/Mojito.ipynb and it failed to load the YOLOv5 model. I'm not sure if it's my environment; this example notebook was working weeks ago.

```
spark.sql("""
CREATE MODEL yolov5m
OPTIONS (device="gpu",...
```
https://github.com/eto-ai/spark-video/issues/26

```scala
scala> Int.MaxValue / (1000 * 60 * 60)
val res5: Int = 596

scala> Long.MaxValue / (1000 * 60 * 60)
val res7: Long = 2562047788015
```
...
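The same arithmetic in Python shows why the integer width matters here: a signed 32-bit millisecond counter overflows after only 596 hours (about 24.8 days) of video, while a signed 64-bit counter is effectively unbounded for this purpose:

```python
MS_PER_HOUR = 1000 * 60 * 60          # 3_600_000 ms in an hour
INT_MAX = 2**31 - 1                   # Scala/Java Int.MaxValue
LONG_MAX = 2**63 - 1                  # Scala/Java Long.MaxValue

# Hours representable before a millisecond counter overflows:
print(INT_MAX // MS_PER_HOUR)   # 596
print(LONG_MAX // MS_PER_HOUR)  # 2562047788015
```

This mirrors the Scala REPL output above and is the reason millisecond video timestamps should be stored as a Long, not an Int.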