flatbush
flatbush copied to clipboard
Generalize current filter function inside search.
Allow any arbitrary node function to be applied here, e.g. when node index shall be inserted somewhere else.
https://github.com/mourner/flatbush/blob/3a6f0f0cc58476b32659955ccedfb054f7857410/index.js#L239-L241
Can you expand on this with an example? Not sure I'm following...
Suppose you already have a boolean array place that stores the information which indices are inside a query box. In this case you would want to directly adjust this array and avoid a separate result array that is growing during search (and therefore is slower).
My proposal is therefore to have a generalized search function that does not return anything but rather works via dependency injection, meaning a leafNodeFunction (index, nodeMinX, nodeMinY, nodeMaxX, nodeMaxY) => void will be injected.
Ah, I see. In theory, the existing filterFn semantics can be used for this — e.g. if you return false in the function, it won't add the index to the array (but you can do custom handling of the index instead). Would that work? I'd just update the docs about it then.
Ah, I see. In theory, the existing
filterFnsemantics can be used for this — e.g. if you returnfalsein the function, it won't add the index to the array (but you can do custom handling of the index instead). Would that work? I'd just update the docs about it then.
In theory it could work, however I see one issue which is that the function would still return an (empty) array which can be quite confusing.
One idea is to have a separate seach_apply function which contains the logic and does not return anything (only generalizated custom handling via dependency injection). The existing search function would call this function and have the current behavior embedded in the custom function (push to array). Overall this would yield a consistent behavior.
hi @mourner just to provide some additional context. We had to implement a search_apply in https://github.com/bokeh/bokeh/pull/14340
search_apply(minX: number, minY: number, maxX: number, maxY: number, nodeFunction: (index: number, node: Rect) => void): void {
if (this._pos !== this._boxes.length) {
throw new Error("Data not yet indexed - call index.finish().")
}
let nodeIndex: number | undefined = this._boxes.length - 4
const queue = []
while (nodeIndex !== undefined) {
// find the end index of the node
const end = Math.min(nodeIndex + this.nodeSize * 4, upperBound(nodeIndex, this._levelBounds))
// search through child nodes
for (let pos: number = nodeIndex; pos < end; pos += 4) {
const nodeMinX = this._boxes[pos + 0]
const nodeMinY = this._boxes[pos + 1]
const nodeMaxX = this._boxes[pos + 2]
const nodeMaxY = this._boxes[pos + 3]
if (maxX < nodeMinX || maxY < nodeMinY || minX > nodeMaxX || minY > nodeMaxY) {
continue
}
const index = this._indices[pos >> 2] | 0
if (nodeIndex < this.numItems * 4) {
nodeFunction(index, {x0: nodeMinX, y0: nodeMinY, x1: nodeMaxX, y1: nodeMaxY})
} else {
queue.push(index) // node; add it to the search queue
}
}
nodeIndex = queue.pop()
}
}
which we used like this:
search_bounds(minX: number, minY: number, maxX: number, maxY: number): Rect {
const result = empty()
this.search_apply(minX, minY, maxX, maxY, (_index, node) => {
if (node.x0 >= minX && node.x0 < result.x0) {
result.x0 = node.x0
}
if (node.x1 <= maxX && node.x1 > result.x1) {
result.x1 = node.x1
}
if (node.y0 >= minY && node.y0 < result.y0) {
result.y0 = node.y0
}
if (node.y1 <= maxY && node.y1 > result.y1) {
result.y1 = node.y1
}
})
return result
}
However, we'd really love to not duplicate all the Flatbush code in search_apply and just be able to call searchFn. Would it be possible to add the node rect as an optional second argument to the search function?
@bryevdv interesting use case, I'll see what I can do! Most likely we can address both this issue and #45 by extending the filterFn application — e.g. passing it index, nodeMinX, nodeMinY, nodeMaxX, nodeMaxY as arguments. This should be faster than the code above because it avoids additional object allocation (for passing the bbox values).
In theory it could work, however I see one issue which is that the function would still return an (empty) array which can be quite confusing.
IMO it won't be a problem in practice — just needs well-written documentation.