magellan
Optimize Spatial Join for Skew Data
Hi,
Consider this as a new feature request: "Optimize Spatial Join for Skew Data"
Example: Let's say we have the world's country boundaries as multi-polygons. We have 1 million (lat, lon) points and need to find out which country each point belongs to. In reality, more than 75% of the points are from a single country. When we join using "within", only one executor is allocated for that 75% of the data, as all these points are in one country. An optimizer for spatial joins on skewed data would help resolve this issue.
Thanks, Obaid
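To make the skew described above concrete, here is a minimal toy sketch in plain Python. It is not Magellan or Spark code. The country names, partition count, salt count, and the helper functions `base_partition` and `partition_sizes` are all hypothetical, and the "salted" partitioner is a simplified, deterministic stand-in for hashing a composite (country, salt) key.

```python
import zlib
from collections import Counter

NUM_PARTITIONS = 8  # assumed number of partitions/executors
SALT_BUCKETS = 8    # assumed number of salt values per key


def base_partition(country: str) -> int:
    """Partition chosen from the country key alone (stable hash)."""
    return zlib.crc32(country.encode("utf-8")) % NUM_PARTITIONS


def partition_sizes(assignments):
    """Count how many points land in each partition."""
    counts = Counter(assignments)
    return [counts.get(p, 0) for p in range(NUM_PARTITIONS)]


# Hypothetical data: 1,000 points, 75% of them in country "A".
countries = ["A"] * 750 + ["B"] * 100 + ["C"] * 100 + ["D"] * 50

# Keyed by country only: every "A" point goes to the same partition.
plain = partition_sizes(base_partition(c) for c in countries)

# Salted: a per-point salt shifts the partition, spreading each country out.
salted = partition_sizes(
    (base_partition(c) + i % SALT_BUCKETS) % NUM_PARTITIONS
    for i, c in enumerate(countries)
)

print("largest partition, keyed by country:", max(plain))
print("largest partition, salted:", max(salted))
```

In a real salted join, the polygon side would also have to be replicated once per salt value so that every salted point still meets its country's geometry. That trade-off is cheap here because the country table is small.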
This is a feature we are working on for 1.0.6. It's targeted for mid-November.
@harsha2010 Thanks. This will be great! Looking forward to the feature.
@harsha2010 Is this feature included in the latest version?