flatbush icon indicating copy to clipboard operation
flatbush copied to clipboard

Generalize current filter function inside search.

Open muendlein opened this issue 6 months ago • 4 comments

Allow any arbitrary node function to be applied here, e.g. when node index shall be inserted somewhere else.

https://github.com/mourner/flatbush/blob/3a6f0f0cc58476b32659955ccedfb054f7857410/index.js#L239-L241

muendlein avatar May 09 '25 23:05 muendlein

Can you expand on this with an example? Not sure I'm following...

mourner avatar May 10 '25 14:05 mourner

Suppose you already have a boolean array place that stores the information which indices are inside a query box. In this case you would want to directly adjust this array and avoid a separate result array that is growing during search (and therefore is slower).

My proposal is therefore to have a generalized search function that does not return anything but rather works via dependency injection, meaning a leafNodeFunction (index, nodeMinX, nodeMinY, nodeMaxX, nodeMaxY) => void will be injected.

muendlein avatar May 10 '25 20:05 muendlein

Ah, I see. In theory, the existing filterFn semantics can be used for this — e.g. if you return false in the function, it won't add the index to the array (but you can do custom handling of the index instead). Would that work? I'd just update the docs about it then.

mourner avatar May 10 '25 21:05 mourner

Ah, I see. In theory, the existing filterFn semantics can be used for this — e.g. if you return false in the function, it won't add the index to the array (but you can do custom handling of the index instead). Would that work? I'd just update the docs about it then.

In theory it could work, however I see one issue which is that the function would still return an (empty) array which can be quite confusing. One idea is to have a separate seach_apply function which contains the logic and does not return anything (only generalizated custom handling via dependency injection). The existing search function would call this function and have the current behavior embedded in the custom function (push to array). Overall this would yield a consistent behavior.

muendlein avatar May 11 '25 20:05 muendlein

hi @mourner just to provide some additional context. We had to implement a search_apply in https://github.com/bokeh/bokeh/pull/14340

  search_apply(minX: number, minY: number, maxX: number, maxY: number, nodeFunction: (index: number, node: Rect) => void): void {
    if (this._pos !== this._boxes.length) {
      throw new Error("Data not yet indexed - call index.finish().")
    }

    let nodeIndex: number | undefined = this._boxes.length - 4
    const queue = []

    while (nodeIndex !== undefined) {
      // find the end index of the node
      const end = Math.min(nodeIndex + this.nodeSize * 4, upperBound(nodeIndex, this._levelBounds))

      // search through child nodes
      for (let pos: number = nodeIndex; pos < end; pos += 4) {
        const nodeMinX = this._boxes[pos + 0]
        const nodeMinY = this._boxes[pos + 1]
        const nodeMaxX = this._boxes[pos + 2]
        const nodeMaxY = this._boxes[pos + 3]

        if (maxX < nodeMinX || maxY < nodeMinY || minX > nodeMaxX || minY > nodeMaxY) {
          continue
        }

        const index = this._indices[pos >> 2] | 0

        if (nodeIndex < this.numItems * 4) {
          nodeFunction(index, {x0: nodeMinX, y0: nodeMinY, x1: nodeMaxX, y1: nodeMaxY})
        } else {
          queue.push(index) // node; add it to the search queue
        }

      }

      nodeIndex = queue.pop()
    }
  }

which we used like this:

 search_bounds(minX: number, minY: number, maxX: number, maxY: number): Rect {
    const result = empty()
    this.search_apply(minX, minY, maxX, maxY, (_index, node) => {
      if (node.x0 >= minX && node.x0 < result.x0) {
        result.x0 = node.x0
      }
      if (node.x1 <= maxX && node.x1 > result.x1) {
        result.x1 = node.x1
      }
      if (node.y0 >= minY && node.y0 < result.y0) {
        result.y0 = node.y0
      }
      if (node.y1 <= maxY && node.y1 > result.y1) {
        result.y1 = node.y1
      }
    })
    return result
  }

However, we'd really love to not duplicate all the Flatbush code in search_apply and just be able to call searchFn. Would it be possible to add the node rect as an optional second argument to the search function?

bryevdv avatar Aug 07 '25 18:08 bryevdv

@bryevdv interesting use case, I'll see what I can do! Most likely we can address both this issue and #45 by extending the filterFn application — e.g. passing it index, nodeMinX, nodeMinY, nodeMaxX, nodeMaxY as arguments. This should be faster than the code above because it avoids additional object allocation (for passing the bbox values).

In theory it could work, however I see one issue which is that the function would still return an (empty) array which can be quite confusing.

IMO it won't be a problem in practice — just needs well-written documentation.

mourner avatar Aug 08 '25 16:08 mourner