data-interval icon indicating copy to clipboard operation
data-interval copied to clipboard

lookup value and the/a piece-wise constant component around it

Open tilowiklundSensmetry opened this issue 4 years ago • 4 comments

I am sampling sections of a piece-wise constant function encoded as an IntervalMap, and for efficiency reasons it would be useful to have a lookup function that also reported the maximal (or just some non-trivial) interval around the query point on which the IntervalMap is constant. There is a quickly cooked up prototype here: https://github.com/tilowiklundSensmetry/data-interval/pull/1/commits/8e0d144a1c1eb76122a902a443100e8b9d84845e that I am guessing would give a (not necessarily maximal) such interval, but I am not certain enough about the semantics of IntervalMap to say whether it is correct.

tilowiklundSensmetry avatar Nov 08 '21 15:11 tilowiklundSensmetry

It's hard to specify the semantics of this function without refering to internal implementation details. E. g., the following function satisfies properties described in your haddock:

lookupAround :: Ord k => k -> IntervalMap k a -> Maybe (Interval k, a)
lookupAround k m = case lookup k m of 
  Nothing -> Nothing 
  Just a -> Just (Interval.singleton k, a)

Bodigrim avatar Nov 09 '21 21:11 Bodigrim

I don't think the ideal semantics are difficult to specify (the maximal interval containing the point on which the function is constant). For performance reasons this might be a bit tricky with the current implementation. Adding something like the following might be sufficient:

If m was created by at most n insertions to an empty map, then the cardinality of Set.fromList [ fst (lookAround x1 m), ..., fst (lookAround xk m) ] is O(n).

tilowiklundSensmetry avatar Nov 10 '21 08:11 tilowiklundSensmetry

I think returning the maximal interval is a better semantics, even if implementation is not straightforward.

Bodigrim avatar Nov 11 '21 20:11 Bodigrim

It is documented that IntervalMap keeps adjacent mappings unmerged even though they ​are connected and mapped to the same value, and toList already returns the result that depends on such representation.

https://github.com/msakai/data-interval/blob/6b62faf1dc4de5a8cad241a10eedf90fa4c2b825/src/Data/IntervalMap/Base.hs#L121-L126

Therefore I feel it is okay for lookupAround to return the result that depends on such representation.

Also, if we implement a function that merges such adjacent mappings (which requires Eq constraint to the value type), returning the maximal interval can be achieved by simply combining the function and lookupAround.

msakai avatar Nov 17 '21 12:11 msakai