core-libraries-committee
`dup` and `arr2`
Hi,
Two functions from one of Yampa's auxiliary modules have become generally useful (they were made public and they have found use in games, other applications, and other libraries):
dup :: a -> (a, a)
dup x = (x, x)
arr2 :: Arrow a => (b -> c -> d) -> a (b, c) d
arr2 = arr . uncurry
Would you consider adding them to base?
I'd say the logical place for dup would be Data.Tuple and for arr2 would be Control.Arrow.
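A minimal sketch of how the two functions read in use, relying only on what Control.Arrow exports today. The definitions of dup and arr2 are the ones quoted above; sumPair, square, halveSum and halve are hypothetical names made up for illustration.

```haskell
import Control.Arrow (Arrow, Kleisli (..), arr, (>>>))

-- Definitions as quoted in the proposal.
dup :: a -> (a, a)
dup x = (x, x)

arr2 :: Arrow a => (b -> c -> d) -> a (b, c) d
arr2 = arr . uncurry

-- At the function arrow, arr2 behaves like uncurry.
sumPair :: (Int, Int) -> Int
sumPair = arr2 (+)

-- dup followed by arr2 feeds two copies of the input to a binary function.
square :: Int -> Int
square = dup >>> arr2 (*)

-- arr2 also lifts into other arrows, here Kleisli arrows over Maybe.
-- halve is a hypothetical helper that fails on odd numbers.
halveSum :: Kleisli Maybe (Int, Int) Int
halveSum = arr2 (+) >>> Kleisli halve
  where
    halve n
      | even n = Just (n `div` 2)
      | otherwise = Nothing
```

The Kleisli case is where arr2 saves the most typing, since the plain-function case can already use uncurry directly.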
Looks like you proposed adding dup already long ago https://mail.haskell.org/pipermail/libraries/2018-October/029051.html, has something changed?
EDIT: Why dup :: a -> (a, a) and not dup :: Arrow a => a b (b, b)? You are proposing arr2 as an arrow-lifted uncurry, after all.
It would be nice if this proposal summarized the previous discussion (and rehashed questions & answers already asked and answered).
https://hackage.haskell.org/package/extra-1.8/docs/Data-Tuple-Extra.html has dupe :: a -> (a, a) used by e.g. ghcide.
On Sat, 23 Nov 2024, Oleg Grenrus wrote:
https://hackage.haskell.org/package/extra-1.8/docs/Data-Tuple-Extra.html has dupe :: a -> (a, a).
https://hackage.haskell.org/package/utility-ht-0.0.17.2/docs/Data-Tuple-HT.html#v:double
Looks like you proposed adding dup already long ago https://mail.haskell.org/pipermail/libraries/2018-October/029051.html, has something changed?
I don't see any conclusion in that thread.
I don't think this passes the Fairbairn threshold for base
I don't think this passes the Fairbairn threshold for base
Why? It appears these are duplicated across many libraries, have been used for years and are primitives.
Many functions in base are not very "exciting"
For one thing, GHC.IO.Device.dup :: IODevice a => a -> IO a already exists in base, so we will have to bikeshed the name already.
There is nothing blocking arr2 in my opinion.
I don't think that's a blocker. GHC.IO.Device is not a common import.
Sorry, it's been 6 years and I'd completely forgotten about that thread. Here's a summary of the discussion back then, together with my comments.
Name
- Dan Burton indicated that "there is precedent in the arrow literature for calling this function "dup". For example, on page 55 of this paper: http://homepages.inf.ed.ac.uk/wadler/papers/arrows-jfp/arrows-jfp.pdf"
- Edward Kmett indicated that "dup as the name for this operation has a ton of precedent in languages like forth, and its length is comparable to the already smashed fst and snd."
- Carter Schonwald commented "dup is nice, but i dont really care what we call it as long as its not annoying or dumbe :)"
See also comments below regarding location.
Signature / implementation
There are several possible implementations of duplicating transformations:
dup :: Arrow a => a b (b,b)
dup = id &&& id
dup :: a -> (a, a)
dup = join (,)
dup :: a -> (a, a)
dup x = (x, x)
The nice thing about the first one is that it works for all arrows, not just functions. We can still use it with functions, but we no longer need to arr dup all over the place.
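A sketch comparing the three variants side by side. One caveat, as far as I can tell: with the Prelude's id, `id &&& id` only typechecks at the function arrow, so the arrow-generic signature needs the identity arrow instead (returnA here, or id from Control.Category). The names dupA, dupJoin, dupFn, agree and positiveTwice are hypothetical, chosen only so the three versions can coexist in one module.

```haskell
import Control.Arrow (Arrow, Kleisli (..), returnA, (&&&), (>>>))
import Control.Monad (join)

-- Arrow-generic version. returnA is the identity arrow; the Prelude's id
-- would pin the arrow type to plain functions.
dupA :: Arrow a => a b (b, b)
dupA = returnA &&& returnA

-- Function-only versions, as listed in the thread.
dupJoin :: a -> (a, a)
dupJoin = join (,)

dupFn :: a -> (a, a)
dupFn x = (x, x)

-- At the function arrow all three agree.
agree :: Int -> Bool
agree n = dupA n == dupJoin n && dupJoin n == dupFn n

-- With the arrow-generic version there is no need to write `arr dup`
-- when composing inside another arrow.
positiveTwice :: Kleisli Maybe Int (Int, Int)
positiveTwice = Kleisli positive >>> dupA
  where
    positive n = if n > 0 then Just n else Nothing
```

The trade-off is that the generic signature can make type inference less direct at call sites that only ever use plain functions.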
Location
Obviously this would be partly dependent on the signature. If it works on arrows, then it makes sense to add it to Control.Arrow. That also avoids the name clash with existing code.
@mixphix : For one thing, GHC.IO.Device.dup :: IODevice a => a -> IO a already exists in base, so we will have to bikeshed the name already.
@treeowl : GHC.IO.Device is not a common import.
I'm not sure how uncommon it is. Here's a search: https://github.com/search?q=ghc.io.device+dup+language%3AHaskell&type=code. I don't know how to search among all packages on hackage.
Note that GHC.IO.Device also defines a custom read function that is not compatible with the one in the Read class.
If both dup and arr2 are placed in Control.Arrow, which is not exported by the Prelude, then I'd argue that it's a safe change. Here's what GitHub finds for Haskell code that mentions both GHC.IO.Device and Control.Arrow: https://github.com/search?q=ghc.io.device+Control.Arrow+language%3AHaskell&type=code . The only one that looks like it could cause an issue is base-orphans.
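For code that does need both names, a qualified import is one way to sidestep the clash. The sketch below uses a local dup as a stand-in for the proposed export, since Control.Arrow does not export dup today; duplicateDevice and pair are hypothetical names.

```haskell
import qualified GHC.IO.Device as Device

-- Stand-in for the proposed export (an assumption for this sketch).
dup :: a -> (a, a)
dup x = (x, x)

-- The existing class method stays reachable under its qualifier.
duplicateDevice :: Device.IODevice a => a -> IO a
duplicateDevice = Device.dup

-- The unqualified name now refers unambiguously to the tuple version.
pair :: (Int, Int)
pair = dup 1
```

A module that imports GHC.IO.Device unqualified alongside another dup would only hit an ambiguity error where it uses the bare name dup, which is what an impact assessment would need to look for.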
EDIT (fix attributions for quotes).
@treeowl : For one thing, GHC.IO.Device.dup :: IODevice a => a -> IO a already exists in base, so we will have to bikeshed the name already.
@mixphix : GHC.IO.Device is not a common import.
You've mixed up the attributions for the quotes. @mixphix is the one who said the name clash might be a problem; I'm the one who pointed out that it's not a common import.
@treeowl My bad. Sorry about that. Fixed in the original comment.
Gentle nudge.
I don't know how to search among all packages on hackage.
https://hackage-search.serokell.io/?q=%5Cbdup%5Cb
There are quite a few mentions, although more analysis is needed to determine whether any of them import Data.Tuple / Control.Arrow at the same time. We use https://github.com/haskell/clc-stackage for impact analysis.
dup sounds good to me, indeed the name follows DUP in Forth and such.
arr2 seems to be a non-descriptive name to me, I would not have guessed what it does. A better name could convince me to add it, but as it stands arr . uncurry is not that long and much more readable.
arr2 seems to be a non-descriptive name to me
I think it's quite common for functions that operate on larger tuples to have a number indicating the arity. For example, in https://hackage.haskell.org/package/raft-0.3.7.2/docs/Data-Tuple-Util.html we have fst3, first3, fst4, first4. The package extra defines similar functions (up to arity 3). The same happens in regex-tdfa, MissingH, haskoin-core. The library tuple uses a similar naming scheme (the specific function names differ).
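To make the naming convention concrete, here is a small sketch. fst3 is written in the style of the tuple helpers cited above, arr2 is the proposed function, and arr3 is a hypothetical three-argument analogue included for illustration only; it is not part of this proposal.

```haskell
import Control.Arrow (Arrow, arr)

-- Tuple helper in the cited style: the suffix is the tuple size.
fst3 :: (a, b, c) -> a
fst3 (x, _, _) = x

-- The proposed function: the suffix is the number of arguments lifted.
arr2 :: Arrow a => (b -> c -> d) -> a (b, c) d
arr2 = arr . uncurry

-- Hypothetical analogue for three arguments (illustration only).
arr3 :: Arrow a => (b -> c -> d -> e) -> a (b, c, d) e
arr3 f = arr (\(x, y, z) -> f x y z)
```

Read this way, the 2 in arr2 plays the same role as the 3 in fst3: it tells the reader the arity without spelling out the tupling.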
What are the next steps?
@ivanperez-keera please refer to https://github.com/haskell/core-libraries-committee/blob/main/PROPOSALS.md#the-how
It's up to the proposer to decide on the exact content of the proposal and to prepare an impact assessment. If I may suggest, given that the proposal touches frequently imported modules, you likely need a draft implementation first, because otherwise it would be impossible to prepare an impact assessment.
Dear CLC members, any non-binding opinions on dup and arr2? @tomjaguarpaw @hasufell @mixphix @velveteer @parsonsmatt @angerman
They seem fine to me and I can see how they help people who use arrows. I agree that dup doesn't seem to pass the Fairbairn threshold versus \x -> (x, x), but I have no objection to it being exported in Control.Arrow specifically if arrow-using people like it. Indeed, I wouldn't be surprised if the impact assessment suggests that it will cause less breakage to export them both from Control.Arrow, and leave Data.Tuple alone.
These seem like useful arrow-related functions, I think exporting them from Control.Arrow (with dup at its arrow type Arrow arr => arr a (a, a)) would be fine.
I'm for an export from Control.Arrow, as @tomjaguarpaw and @mixphix also seem to prefer.
Since we have a new set of CLC members, I'd like to give them the opportunity to add their (non-binding) opinions: @TeofilC @Daniel-Diaz @ChickenProp @noughtmare @mpscholten @doyougnu
@ivanperez-keera can you:
- prepare a GHC MR (deciding on exact function signature/implementation and from where to export these new functions)
- prepare an impact assessment (due to possible identifier clashes)
That would allow us to move to a vote soon.
I think dup fits perfectly in Control.Arrow as it is. The name arr2 doesn't seem intuitive to me, since the function is closer to uncurry, but if there's precedent and people already use it with this name I guess that's fine.
My preference is also for dup to be in Control.Arrow. I am not yet convinced by arr2 (maybe if there are lots of uses of it, or if it got a better name).
I'm in favor of both dup and arr2 in Control.Arrow if it does not break too much existing code. In particular, I think the name arr2 is intuitive for a function that lifts functions of two arguments into an arrow; the implicit uncurrying should be expected by people who work with arrows.
I'm in agreement with @noughtmare and @tomjaguarpaw for both dup and arr2 in Control.Arrow
👍 for dup and arr2 in Control.Arrow
+1. Mild preference for dup being Arrow a => a b (b, b) rather than a -> (a, a).
I've basically-never used arrows myself, but it sounds like these would be broadly useful for people who do, and the names feel fine to me. In particular, I sometimes have trouble remembering which is which between curry and uncurry[^1], and I can (low confidence) imagine that arr2 ("turn a two-argument function into an arrow") would give me less pause than arr . uncurry ("turn a thing into an arrow, and that thing is, uh... a function that takes a tuple instead of two arguments, I think?"). If I saw arr . uncurry a lot that wouldn't be an issue, but if I saw it a lot I'd also want a shorter name for it.
[^1]: I think when I first heard the names, I sort of cached them in my brain as "curry takes a normal function and turns it into kind of a weird one, and uncurry does the reverse". But now I think a -> b -> c is normal and (a, b) -> c is kind of weird, but those handles haven't fully swapped.
@ivanperez-keera do you need assistance with any of the steps?
@ivanperez-keera I'm closing the proposal as abandoned in 2 weeks if there is no further progress. You can still come back to it afterwards and re-open it.
Hi,
I'd appreciate a bit of help, yes. This is the first time I've done this.
@ivanperez-keera
The steps to open an MR are outlined here: https://github.com/haskell/core-libraries-committee/blob/main/PROPOSALS.md#the-how (scroll a little bit down, then under point "5. Implement the proposal")
Here's also a previous base MR adding a new function that should give you a better idea: https://gitlab.haskell.org/ghc/ghc/-/merge_requests/13755
Regarding impact assessment, I must confess I've never done one myself, but the steps are documented here: https://github.com/haskell/core-libraries-committee/blob/main/PROPOSALS.md#impact-assessments
If there's anything unclear, we can try to work through it step by step.