roc
roc copied to clipboard
AssocList for dictionaries
Work-in-progress implementation of an association list, which can become the foundation of a 'pure Roc' Dict implementation, as well as being an example of a custom datastructure.
Fixes (at least partially) #3245
- [x] Basic implementation of most of the interface.
- [x] ~~
union,intersection,difference.~~insertAll. - [x] A runnable example to manually test whether the code is correct.
- [x] Actual tests.
Question: Currently AssocList is an alias of a normal List [Pair k v].
Alternatively we could wrap it with an extra tag to hide that it in a list.
Maybe it makes more sense to do this when wrapping it as a Dict (because there for sure the internals need to be opaque as we want to change it in the future), which is why I have not done it here right now.
This does immediately beg the question: Should we have a separate e.g. len in the AssocList module or tell people to just use List.len?
Should we have a separate e.g.
lenin theAssocListmodule or tell people to just useList.len?
Let's do the AssocList.len function. š
Alternatively we could wrap it with an extra tag to hide that it in a list.
This is what opaque types are for! I haven't written a tutorial on them yet, but basically here's how to use the feature:
AssocList k v := List [Pair k v]
The := is similar to Haskell's newtype; this makes AssocList a nominal type, which means that unlike a type alias, if you try to compare an AssocList k v to a List [Pair k v], you'll get a type mismatch.
You can create instances of it by applying @AssocList, and you can destructure them by using @AssocList in a pattern match. For example:
insertFresh : AssocList k v, k, v -> AssocList k v
insertFresh = \@AssocList list, k, v ->
List.append list (Pair k v)
|> @AssocList
This @ syntax is only allowed in the same scope where the := was used to define the AssocList opaque type. So since you defined it at the top level of the AssocList module, other modules won't be able to destructure an AssocList with @AssocList or create new instances with it, even if you expose the AssocList type to them.
FYI, the reason CI is failing is that we have a check that everything in the examples directory has to have a test verifying its behavior in some way; that way, we don't accidentally end up with examples that don't work, which is a frustrating experience for beginners. š
I can show you how to add that test when you're feeling this is ready to merge!
I have a question about union, intersection and difference.
union as bs: For each key that exists in eitherasorbs, store it with its value in the result. On conflict, keep the value frombs.intersection as bs: For each key that exists in bothasandbs, store this key with the value fromas(?) in the result.difference as bs: For each key that exists inasbut not inbs, store this key with the value fromasin the result.
Are these the expected semantics?
I can show you how to add that test when you're feeling this is ready to merge!
I would love to have some guidance for this. :blush:
I've hit a snag by the way: To support O(n * log(n)) union/intersection/difference it is necessary to sort the elements inside the AssocList.
This is however not possible unless Pair a b can be sorted. I presume that this problem would be solved once we introduce the Hash constraint. But what should we do until that part is ready?
I see two options:
- Do not implement
union/intersection/differencefor now - Use different signatures for these functions, constrained on e.g.
Num, which will change in the future. - Use a naive
O(n²)algorithm that works for all datatypes (that support==).
Are these the expected semantics?
Yes, exactly! šÆ
Use a naive O(n²) algorithm that works for all datatypes
This is fine for now - we can revisit when we get to the version that actually uses a hashmap under the hood. š
Ping me on Zulip and we can chat about adding the test!
Actually, here's a simplification: let's do what this package does here, namely:
- Don't include
intersectionanddiffin the API after all - Rename
uniontoinsertAll
All right, I have implemented insertAll. I took the liberty to include a normalizeWith to allow people to compare AssocLists for equality if they really want/need to, requiring them to come up with a sorter function themselves (just like List.sortWith).
@rtfeldman I am encountering a couple of weird bugs in Roc itself, which make it difficult to make the example more involved at this time.
The current example code wants you to build two AssocLists by entering pairs of lines. The Roc runtime is mismanaging the recursion, however.
Should I make a separate issue for this problem or not?
Example input:
1
a
2
b
3
c
4
d
5
e
6
f
Program invocation:
cat examples/assoc-list/example_input.txt | cargo run -- run examples/assoc-list/main.roc
Expected output:
Finished dev [unoptimized + debuginfo] target(s) in 0.27s
Running `target/debug/roc run examples/assoc-list/main.roc`
šØ Rebuilding host...
This example program takes the AssocList interface for a spin.
Input pairs of lines.
Each pair will become an association in the first AssocList.
Finish by inputting an empty line.
Fist association list:
1 => a
2 => b
3 => c
Input pairs of lines.
Each pair will become an association in the second AssocList.
Finish by inputting an empty line.
Second association list:
4 => d
5 => e
6 => f
Actual output:
Finished dev [unoptimized + debuginfo] target(s) in 0.27s
Running `target/debug/roc run examples/assoc-list/main.roc`
šØ Rebuilding host...
This example program takes the AssocList interface for a spin.
Input pairs of lines.
Each pair will become an association in the first AssocList.
Finish by inputting an empty line.
Fist association list:
1 => a
Input pairs of lines.
Each pair will become an association in the second AssocList.
Finish by inputting an empty line.
Second association list:
4 => d
5 => e
6 => f
2 => b
Input pairs of lines.
Each pair will become an association in the second AssocList.
Finish by inputting an empty line.
thread 'main' panicked at 'called `Option::unwrap()` on a `None` value', src/lib.rs:116:45
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
fatal runtime error: failed to initiate panic, error 5
Aborted (core dumped)
A separate issue would be great! Ideally with a smaller reproduction case. š
This PR is as good as finished. However, it is blocked by the bug #3351 right now. (The tests are failing because of this bug's symptoms)
@rtfeldman This is currently still blocked by the List.walk memory corruption problem that breaks the test example.
The only function that is affected by this issue, is AssocList.remove.
Maybe it makes sense to comment out that function for now, allowing the rest of AssocList to already be used?
Maybe it makes sense to comment out that function for now, allowing the rest of AssocList to already be used?
Yeah I like that idea! š
to replace current Set, we will need union and difference. These are used in our examples (e.g. AStar)
to replace current
Set, we will needunionanddifference. These are used in our examples (e.g.AStar)
union is implemented under the name insertAll (indicating that it has a bad time complexity; see discussion above).
difference could be implemented similarly on top of remove as, say, removeAll (with similarly bad time complexity). However, this is blocked by #3351.
PR has been unblocked as the issue was resolved!
@Qqwy can we close this since your AssocList implementation landed in-line for the stdlib Dict?
Yes! I would do it myself but I'm out of town right now and only have access to my PC tomorrow evening again, so feel free to do it earlier
On October 8, 2022 7:34:00 PM UTC, Ayaz @.> wrote: @. can we close this since your AssocList implementation landed in-line for the stdlib Dict?
-- Reply to this email directly or view it on GitHub: https://github.com/roc-lang/roc/pull/3258#issuecomment-1272384119 You are receiving this because you were mentioned.
Message ID: @.***> -- Met vriendelijke groet / Sincerely, ~W-M
Have fun out of town!