delta-sharing
Performance improvement
Currently, a table with 200K rows and 100 columns takes 20-40 minutes to load fully using `load_as_pandas` from the Python package. In addition, filtering (predicate hints, etc.) does not work properly with this option. At the moment, the only way to improve performance is to partition the Delta tables monthly rather than daily.
@ckayay Sorry for the late response.
Have you tried to set predicateHints? What do you mean by it not working properly?
Have you tried running `OPTIMIZE` on the original table, so that it has fewer files and is easier for the client to load?
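
For reference, here is a minimal sketch of how a predicate hint might be passed from the Python connector. It assumes the installed `delta-sharing` version accepts a `jsonPredicateHints` argument on `load_as_pandas` (please check your version), and the profile path, share, schema, table, and `event_date` column names are hypothetical placeholders. The JSON shape follows the predicate format described in the Delta Sharing protocol.

```python
import json


def table_url(profile_path, share, schema, table):
    """Build the '<profile>#<share>.<schema>.<table>' URL the connector expects."""
    return f"{profile_path}#{share}.{schema}.{table}"


def date_equals_hint(column, iso_date):
    """Serialize an 'equal' predicate on a date column as a JSON hint string."""
    predicate = {
        "op": "equal",
        "children": [
            {"op": "column", "name": column, "valueType": "date"},
            {"op": "literal", "value": iso_date, "valueType": "date"},
        ],
    }
    return json.dumps(predicate)


def load_filtered(url, hint, limit=None):
    """Load a shared table, passing the hint so the server may skip files."""
    # Imported lazily; requires `pip install delta-sharing`.
    import delta_sharing

    # Assumption: this version of the connector supports jsonPredicateHints.
    return delta_sharing.load_as_pandas(url, limit=limit, jsonPredicateHints=hint)


# Hypothetical names, for illustration only.
url = table_url("config.share", "my_share", "my_schema", "my_table")
hint = date_equals_hint("event_date", "2022-01-15")
# df = load_filtered(url, hint)  # needs a real profile file and server
```

Note that hints are best effort: the server may still return files containing rows that do not match, so it is worth filtering the resulting DataFrame again on the client. A hint on the partition column is the case most likely to reduce the number of files returned.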