spatialhadoop2
Please analyse the function packInRectangles in Repartition.java
I am reading the code of src/edu/umn/cs/spatialHadoop/operations/Repartition.java, which builds the index.
Can anyone help me analyse the following function?
public static CellInfo[] packInRectangles(Path[] files,
Path outFile, OperationsParams params, Rectangle fileMBR)
throws IOException
I have no idea what work it does.
This function computes a set of rectangles that can be used to partition the input files. These rectangles (or cells) are meant to balance the load so that each cell is assigned a roughly equal number of records. It works by reading a random sample from the input files, bulk loading that sample into an in-memory R-tree using the sort-tile-recursive (STR) algorithm, and returning the boundaries of the leaf nodes of that R-tree. You can find more details in the SpatialHadoop paper [http://spatialhadoop.cs.umn.edu/publications/ICDE15_industrial_522.pdf], Section V.B.
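To make the STR step more concrete, here is a minimal, standalone sketch of one level of STR packing over a sample of 2D points. It is not the code in Repartition.java and uses no SpatialHadoop classes: the names `StrPackingSketch`, `Rect`, and `pack` are made up for illustration, the sample is assumed to be already in memory, and the cells are simply the tight bounding boxes of each group of points. The real function may choose the number of cells and adjust the cell boundaries differently.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Hypothetical illustration of one level of sort-tile-recursive (STR) packing.
class StrPackingSketch {

    /** Axis-aligned rectangle standing in for a partition cell (hypothetical type). */
    static final class Rect {
        final double x1, y1, x2, y2;

        Rect(double x1, double y1, double x2, double y2) {
            this.x1 = x1; this.y1 = y1; this.x2 = x2; this.y2 = y2;
        }

        @Override
        public String toString() {
            return "[" + x1 + "," + y1 + " - " + x2 + "," + y2 + "]";
        }
    }

    /**
     * Groups the sample points into roughly numCells groups of equal size
     * and returns the bounding box of each group. Each point is {x, y}.
     */
    static List<Rect> pack(double[][] sample, int numCells) {
        if (numCells <= 0) {
            throw new IllegalArgumentException("numCells must be positive");
        }
        List<Rect> cells = new ArrayList<>();
        int n = sample.length;
        if (n == 0) {
            return cells;
        }
        // Copy the outer array so the caller's ordering is left untouched.
        double[][] pts = sample.clone();

        int capacity = (int) Math.ceil((double) n / numCells);  // points per cell
        int leaves = (int) Math.ceil((double) n / capacity);    // cells actually needed
        int slices = (int) Math.ceil(Math.sqrt(leaves));        // vertical slices
        int sliceSize = slices * capacity;                      // points per slice

        // Step 1: sort all points by x and cut them into vertical slices.
        Arrays.sort(pts, Comparator.comparingDouble((double[] p) -> p[0]));
        for (int start = 0; start < n; start += sliceSize) {
            int sliceEnd = Math.min(start + sliceSize, n);
            // Step 2: inside each slice, sort by y and cut into runs of `capacity` points.
            Arrays.sort(pts, start, sliceEnd, Comparator.comparingDouble((double[] p) -> p[1]));
            for (int i = start; i < sliceEnd; i += capacity) {
                cells.add(boundingBox(pts, i, Math.min(i + capacity, sliceEnd)));
            }
        }
        return cells;
    }

    /** Tight bounding box of pts[from..to), where to > from. */
    private static Rect boundingBox(double[][] pts, int from, int to) {
        double x1 = Double.POSITIVE_INFINITY, y1 = Double.POSITIVE_INFINITY;
        double x2 = Double.NEGATIVE_INFINITY, y2 = Double.NEGATIVE_INFINITY;
        for (int i = from; i < to; i++) {
            x1 = Math.min(x1, pts[i][0]);
            y1 = Math.min(y1, pts[i][1]);
            x2 = Math.max(x2, pts[i][0]);
            y2 = Math.max(y2, pts[i][1]);
        }
        return new Rect(x1, y1, x2, y2);
    }

    public static void main(String[] args) {
        // A 4x4 grid of sample points, packed into 4 cells.
        double[][] sample = new double[16][];
        for (int i = 0; i < 16; i++) {
            sample[i] = new double[] {i % 4, i / 4};
        }
        for (Rect cell : pack(sample, 4)) {
            System.out.println(cell);
        }
    }
}
```

Because every cell receives about the same number of sample points, and the sample is random, each cell should also receive a roughly equal share of the full input, which is the load-balancing property described above.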