Jesse Lord
Could we add a `.to_filepaths()` method to a driver that returns only the file names and still uses the built-in caching? I can imagine a scenario where I can convince...
Yes, that would be perfect.
For us it would be a shared network drive. I was thinking about putting the shared cache directory in the catalog, but that might only work for my use case.
Thanks, this is very helpful. I will make two catalogs, but I am hoping that most users will only interact with the catalog of images. I will keep trying to...
The JSON was just a visual example. I am actually converting the data into Arrow arrays and then into Polars Series for a Polars DataFrame. I will dig into the stringify to see...
Yeah, that sounds great. I might have some time to help with this as well. From looking at the Livy API, I was also uncertain how to encode the bytes,...
Bumping. I am getting the same problem and the workarounds above aren't working for me. Specifically, my user home directory is always the `rootdir` regardless of the existence of `setup.py` and...
If you look in the `results/spark_perf_output__2016-08-03_11-22-03_logs/scheduling-throughput.err` text file, you will see the error output from Spark. This should help you isolate your specific error.
@TomAugspurger commented in #31 that it has something to do with a new xgboost update making it incompatible.
I was able to install the branch from #28 and it works for my use case. @TomAugspurger I would be interested in helping solve the CI problems but I don't...