Provide the ability to supply KNN model instance with already computed Mosaic Chips / Grid Indices

Open kevin-courbet opened this issue 1 year ago • 0 comments

Is your feature request related to a problem? Please describe.

Currently, the KNN model instance always computes Mosaic chips for the candidates dataframe, as this excerpt of the source code shows:

    val candidatesDfIndexed = candidatesDf
        .withColumn(getCandidatesRowID, monotonically_increasing_id())
        .withColumn(
          hexRingNeighboursTf.rightGridCol(getCandidatesFeatureCol),
          grid_tessellateexplode(col(getCandidatesFeatureCol), getIndexResolution, keepCoreGeometries = false)
        )
        .checkpoint(true)

However, chips and grid indices may already have been computed for other use cases. In that case, recomputing them inside the model is a redundant and rather costly step that could be avoided.

Describe the solution you'd like

Some kind of API along the lines of knn.setCandidatesGridIndexCol("grid_index_res_9", 9) and knn.setCandidatesChipCol("chip", 9), where the model reads the chips and/or grid indices from existing columns in the supplied dataframes and thereby skips the expensive step of computing them.
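To make the request more concrete, here is a rough sketch of the decision the model could make once such setters exist. Everything in it is hypothetical: `PrecomputedGridCols`, `IndexingPlan`, and `KnnIndexingPlan.plan` are names invented for illustration and are not part of Mosaic. The sketch is plain Scala with no Spark dependency, and it assumes that a precomputed column is only reusable when its resolution matches the model's index resolution.

```scala
// Hypothetical sketch: none of these types exist in Mosaic.
// Each option holds (column name, resolution the column was computed at).
final case class PrecomputedGridCols(
    gridIndexCol: Option[(String, Int)] = None,
    chipCol: Option[(String, Int)] = None
)

// What the model would do for the candidates dataframe.
sealed trait IndexingPlan
case object Tessellate extends IndexingPlan                    // current behaviour
final case class ReuseColumn(name: String) extends IndexingPlan // skip tessellation

object KnnIndexingPlan {
  // Reuse a supplied column only if it is present in the dataframe schema
  // and was computed at the model's index resolution; otherwise tessellate.
  // Chips are checked before plain grid indices (an assumption of this sketch).
  def plan(
      availableCols: Set[String],
      indexResolution: Int,
      cfg: PrecomputedGridCols
  ): IndexingPlan =
    Seq(cfg.chipCol, cfg.gridIndexCol).flatten
      .collectFirst {
        case (name, res) if res == indexResolution && availableCols.contains(name) =>
          ReuseColumn(name)
      }
      .getOrElse(Tessellate)
}

object Demo {
  def main(args: Array[String]): Unit = {
    val cfg = PrecomputedGridCols(chipCol = Some(("chip", 9)))
    // Column present and resolution matches: tessellation can be skipped.
    println(KnnIndexingPlan.plan(Set("geom", "chip"), 9, cfg))
    // Resolution differs: fall back to computing chips as today.
    println(KnnIndexingPlan.plan(Set("geom", "chip"), 8, cfg))
  }
}
```

Falling back to tessellation when the column is missing or the resolution differs would keep today's behaviour as the default. Whether a plain grid index column is enough on its own, or whether the model needs the chip geometry as well, is something the maintainers would have to confirm.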

kevin-courbet · Apr 29 '23 11:04