dlist icon indicating copy to clipboard operation
dlist copied to clipboard

Documented time bounds are incorrect for persistent use

Open treeowl opened this issue 2 years ago • 3 comments

The usual assumption for pure data structures in Haskell is that the documented time bounds may be amortized, but that amortization must hold up under persistent use. That is not true for dlist. snoc and head are both documented as being O(1), and yet the following program runs far longer than I would ever want to wait. Swapping in Data.Sequence for DList lets this run quickly.

import qualified Data.DList as D
import Data.DList (DList)
import Data.Foldable

blob :: DList Int
blob = foldl' D.snoc mempty [1..10^7]

main :: IO ()
main = print $ sum [ D.head (blob `D.snoc` a) | a <- [1..10^6 :: Int]]

I think the best approach is likely to document that DLists are intended to be used in an affine/linear fashion, and that the bounds are only valid in that context.

treeowl avatar Nov 23 '21 00:11 treeowl

There's a lot of relevant discussion in Reflection Without Remorse: Revealing a hidden sequence to speed up monadic reflection, though the paper goes in a different direction.

treeowl avatar Nov 23 '21 00:11 treeowl

Hrm... affine use is putting too strong a point on it. It's perfectly fine to concat . replicate n a DList. But they should generally only be "inspected" once.

treeowl avatar Nov 23 '21 00:11 treeowl

Yep.

To unwrap example:

(also decreased magnitude one time, so it is faster/easier to play & get code responses)

import qualified Data.DList as D
import Data.DList (DList)
import qualified Data.Sequence as S
import Data.Sequence (Seq)
import Data.Foldable

-- DList
blobD :: DList Int
blobD = foldl1 (D.snoc) mempty [1..10^6]

mainD :: IO ()
mainD = print $ sum [ D.head (D.snoc blobD a) | a <- [1..10^5 :: Int]]

-- Seq
blobS :: Seq Int
blobS = foldl1 (S.|>) mempty [1..10^6]

mainS :: IO ()
mainS = print $ fmap sum [ S.take 1 (blobS S.|> a) | a <- [1..10^5 :: Int]]

The First would do not terminate in meaningful time, but the second terminates fast.

foldr would be the same.

Anton-Latukha avatar Dec 17 '21 16:12 Anton-Latukha