
Expose the new primops `isByteArrayWeaklyPinned#` and `isMutableByteArrayWeaklyPinned#` from GHC.Exts

Open AndreasPK opened this issue 1 year ago • 9 comments

We will add these new primops to GHC in 9.12 (https://gitlab.haskell.org/ghc/ghc/-/merge_requests/13144) and likely backport them to 9.10 as well.

I expect them to be stable and used widely going forward and would like to avoid people depending on ghc-internal for this feature. Therefore I propose we expose it from GHC.Exts in line with most other primops.

For background on the issue see https://gitlab.haskell.org/ghc/ghc/-/issues/22255

For details about what these primops do see the changes to docs/users_guide/exts/ffi.rst in https://gitlab.haskell.org/ghc/ghc/-/merge_requests/13144
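For readers who have not used the existing pinnedness primops that this proposal mirrors, here is a minimal sketch of querying `isMutableByteArrayPinned#`, which `GHC.Exts` already exports. The helper name `allocAndCheck` is made up for illustration. The new weakly-pinned primops are not used here, since where they should be imported from is exactly what this issue asks.

```haskell
{-# LANGUAGE MagicHash, UnboxedTuples #-}
module Main (main) where

import GHC.Exts
  ( Int (I#)
  , isMutableByteArrayPinned#
  , isTrue#
  , newByteArray#
  , newPinnedByteArray#
  )
import GHC.ST (ST (ST), runST)

-- | Hypothetical helper: allocate a fresh byte array of the given size,
-- either explicitly pinned or not, and ask the existing primop whether
-- the result is pinned.
allocAndCheck :: Bool -> Int -> Bool
allocAndCheck explicitlyPinned (I# n) = runST (ST (\s0 ->
  case explicitlyPinned of
    True -> case newPinnedByteArray# n s0 of
      (# s1, mba #) -> (# s1, isTrue# (isMutableByteArrayPinned# mba) #)
    False -> case newByteArray# n s0 of
      (# s1, mba #) -> (# s1, isTrue# (isMutableByteArrayPinned# mba) #)))

main :: IO ()
main = do
  print (allocAndCheck True 16)   -- small array allocated as pinned
  print (allocAndCheck False 16)  -- small array allocated without pinning
```

The sizes are deliberately small: the behaviour for large arrays is the subject of the linked GHC issue, so the sketch avoids relying on it.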

AndreasPK avatar Sep 02 '24 12:09 AndreasPK

I'm afraid I've forgotten the latest status, according to the CLC, of the GHC.* hierarchy in base, and particularly the status of GHC.Exts. Did we make a decision on this? According to @bgamari's Base stability spreadsheet the "Desired visibility" of GHC.Exts is "see document", but I don't know what document that is.

And for reference, here is a big previous discussion: https://github.com/haskell/core-libraries-committee/issues/146

tomjaguarpaw avatar Sep 02 '24 13:09 tomjaguarpaw

I assume it's the document in the footnote, which merely raises the question:

## 4. The question of `GHC.Exts`

Historically `GHC.Exts` has been the primary entry-point for users wanting access to all of the primitives that GHC exposes (e.g. primitive types, operations, and other magic). This widely-used module poses a conundrum since, while many of these details are quite stable (e.g. `Int#`), a few others truly are exposing implementation details which cannot be safely used in a GHC-version-agnostic way (e.g. `mkApUpd0#`, `unpackClosure#`, `threadStatus#`). There are at least two ways by which this might be addressed:

 * Export only the subset of primops that we can stabilize (e.g. things like `Int#`, `Weak#`, `newArray#`, etc.) in `GHC.Exts`, leaving the rest to only be exposed via `GHC.Prim` (which should not be used by end-users), or
 * Declare the entirety of `GHC.Exts` to be unstable and export the stable subset from another namespace (e.g. `Word#` and its operations could be exposed by `GHC.Unboxed.Word`)

What eventually was voted on was this comment: https://github.com/haskell/core-libraries-committee/issues/146#issuecomment-1591871779, according to which the following applies to GHC.Exts:

The API of this module is unstable and not meant to be consumed by general public. If you absolutely must depend on it, make sure to use a tight upper bound, e. g., base < 4.X, not just base < 5, because the interface can change rapidly without much warning.

All that being said, none of that really makes a clear proposal about how these or other future primops should be handled with regard to GHC.Exts.

AndreasPK avatar Sep 02 '24 15:09 AndreasPK

@AndreasPK IMO, there is ghc-experimental, so you can leave out GHC.Exts, and expose new primops in ghc-experimental. (I personally am fine depending on ghc-prim if I need primops; so IMHO they don't need to be re-exported anywhere).

phadej avatar Sep 02 '24 16:09 phadej

@AndreasPK IMO, there is ghc-experimental, so you can leave out GHC.Exts, and expose new primops in ghc-experimental. (I personally am fine depending on ghc-prim if I need primops; so IMHO they don't need to be re-exported anywhere).

Personally I would prefer a wider design/agreement on how to expose primops via ghc-internal/ghc-experimental/base using a sensible module structure over use of GHC.Exts. But without time to design and implement such a proposal, exposing them via GHC.Exts seems like the next best option to me.


That being said, I'm not averse to adding it to ghc-experimental instead. But as I understand it the purpose of ghc-experimental is:

ghc-experimental, initially empty, depends on base. Functions and data types here are intended to have their ultimate home in base, but while they are settling down they are subject to much weaker stability guarantees. Example: new type families and type constructors for tuples, https://github.com/ghc-proposals/ghc-proposals/pull/475.

So if the CLC wants to stop exposing any primops from base (maybe even deprecating existing ones in the process) that wouldn't be unreasonable but then it seems unclear if these primops should still be exposed via ghc-experimental or simply remain relegated to ghc-internal.

But then we also don't want to encourage dependencies on ghc-internal. Making for no obvious solution from my perspective.

In light of all these complexities and unanswered questions, the simplest and most consistent solution to me would be to expose those via GHC.Exts, in line with existing primops, and hope for someone to draw up a design for primops in the future.

Ultimately the decision lies with the CLC. If the proposal is rejected however I would still welcome concrete suggestions on how to best expose these primops in particular or primops in general.

AndreasPK avatar Sep 03 '24 12:09 AndreasPK

I agree that the most sensible solution is to export it from GHC.Exts as the module hierarchy currently stands, if they are to be so stable and widely-used.

Would that those interested in adding primops be the ones drawing up a design for primops...

mixphix avatar Sep 03 '24 18:09 mixphix

I'm afraid I've forgotten the latest status, according to the CLC, of the GHC.* hierarchy in base, and particularly the status of GHC.Exts. Did we make a decision on this?

No, there was no particular CLC decision on GHC.Exts. I recall that @adamgundry was recently arguing against re-exporting all new primops from GHC.Exts by default?..

In this particular case the primops are of primary interest for packages like bytestring, so I'd strongly support re-exporting them from base instead of ghc-internal / ghc-experimental. Whether it's GHC.Exts or something more specific like a hypothetical GHC.Pinnedness could be up for discussion, if anyone feels strongly enough about it. Otherwise, if there is no huge interest to bikeshed, I'd suggest defaulting to GHC.Exts.

Bodigrim avatar Sep 03 '24 20:09 Bodigrim

I recall that @adamgundry was recently arguing against re-exporting all new primops from GHC.Exts by default?..

In the long term we should move away from the monolithic GHC.Exts, because it is an unstable and unnecessary point of coupling between base and GHC internals, as well as being too large to be easily understood by users. And in the meantime, yes, we certainly shouldn't expose all new primops by default.

That said, there's clearly work to be done to come up with a new design that answers the questions raised in this thread, and that won't happen overnight. Meanwhile, if specific new primops are sufficiently widely useful and stable enough to be desirable in base, following the status quo by adding them to GHC.Exts seems reasonable.

adamgundry avatar Sep 04 '24 07:09 adamgundry

If anyone is in favor of re-exporting these primops from base but has strong objections against doing it via GHC.Exts, please voice your concerns and suggestions until early next week. If there are none, I'll ask @AndreasPK to prepare an MR which we can vote on.

Bodigrim avatar Sep 07 '24 10:09 Bodigrim

I think new primops pop up regularly enough (this, https://github.com/haskell/core-libraries-committee/issues/203, https://github.com/haskell/core-libraries-committee/issues/188) that it would be great to figure out the new primop export design soon, before the next primop is added.

I opened a GHC issue, https://gitlab.haskell.org/ghc/ghc/-/issues/25242, as I think it's more of a GHC than a CLC issue.

phadej avatar Sep 07 '24 12:09 phadej

Oops, sorry, this fell through the cracks.

Dear CLC members, let's vote on the proposal to expose the new primops isByteArrayWeaklyPinned# and isMutableByteArrayWeaklyPinned# from GHC.Exts as implemented in https://gitlab.haskell.org/ghc/ghc/-/merge_requests/13351/diffs. While in future things might drift to other conventions, historically all new primops were exposed through GHC.Exts, so this proposal is business-as-usual.

@tomjaguarpaw @mixphix @velveteer @hasufell @parsonsmatt @angerman


+1 from me. Using these primops allows us to leverage implicit pinnedness, which allows for zero-copy ByteArray / ByteString operations and is a major win in many scenarios. The primops restore capabilities lost after https://gitlab.haskell.org/ghc/ghc/-/merge_requests/9254 (which was done for a good reason!). It would be a shame to depend on ghc-internal / ghc-experimental to access them, they are no more experimental or internal than isByteArrayPinned# / isMutableByteArrayPinned#, which are (and always were) in GHC.Exts.
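To make the zero-copy point concrete, here is a rough sketch of the "copy only if not pinned" pattern. It uses only `isByteArrayPinned#` and other primops already exported from `GHC.Exts`, plus `Data.Array.Byte` (assumed to be available, i.e. a reasonably recent base). The names `isPinned` and `ensurePinned` are made up for illustration. As described above, a weakly-pinned check would let the first guard also accept implicitly pinned arrays and skip the copy for them.

```haskell
{-# LANGUAGE MagicHash, UnboxedTuples #-}
module Main (main) where

import Data.Array.Byte (ByteArray (ByteArray))
import GHC.Exts
  ( copyByteArray#
  , fromList
  , isByteArrayPinned#
  , isTrue#
  , newPinnedByteArray#
  , sizeofByteArray#
  , unsafeFreezeByteArray#
  )
import GHC.ST (ST (ST), runST)

-- | Hypothetical helper: boxed wrapper around the existing primop.
isPinned :: ByteArray -> Bool
isPinned (ByteArray ba) = isTrue# (isByteArrayPinned# ba)

-- | Hypothetical helper: return the array unchanged when it is already
-- pinned (the zero-copy path), otherwise copy it into a pinned array.
ensurePinned :: ByteArray -> ByteArray
ensurePinned arr@(ByteArray ba)
  | isPinned arr = arr
  | otherwise = runST (ST (\s0 ->
      case sizeofByteArray# ba of
        n -> case newPinnedByteArray# n s0 of
          (# s1, mba #) -> case copyByteArray# ba 0# mba 0# n s1 of
            s2 -> case unsafeFreezeByteArray# mba s2 of
              (# s3, pinned #) -> (# s3, ByteArray pinned #)))

main :: IO ()
main = do
  let original = fromList [1, 2, 3, 4] :: ByteArray
      pinned = ensurePinned original
  print (isPinned pinned)     -- the result can be handed to foreign code
  print (pinned == original)  -- contents are preserved by the copy
```

The fewer arrays that take the `otherwise` branch, the fewer copies are made, which is where a more permissive pinnedness check pays off.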

Bodigrim avatar Oct 03 '24 22:10 Bodigrim

+1

parsonsmatt avatar Oct 03 '24 22:10 parsonsmatt

+1

velveteer avatar Oct 04 '24 02:10 velveteer

I am not a voter, but I think this is bad.

We don't need to bikeshed a new design, we just need to reexport things elsewhere. Pinned and unpinned arrays is a concept I do not think ever belonged in base because it is too tied to the RTS --- for example, it is unclear if it would apply to e.g. a Web Assembly backend using the built-in GC.

Really we should default to not adding things to base. Needing a CLC proposal is an added friction that should help with it.

That said, there's clearly work to be done to come up with a new design that answers the questions raised in this thread, and that won't happen overnight.

This strikes me as FUD. Create a new library, put in ghc-experimental, I don't care. Just stick it somewhere and move on with life. This is supposed to be an easier and more regret-free process than putting things in base: erring on putting things in base should never be a way to avoid "new design that answers [..] questions", that's utterly backwards.

If we are like "well, everything else like this is already in base, so fuck it, let's put this in there too", we're going to keep on making that decision, keep on sticking things in base, and never end up with the separation of concerns. We have to fight path-dependency and break old habits.

Ericson2314 avatar Oct 04 '24 02:10 Ericson2314

It would be a shame to depend on ghc-internal / ghc-experimental to access them, they are no more experimental or internal than isByteArrayPinned# / isMutableByteArrayPinned#, which are (and always were) in GHC.Exts.

So some functions are in the wrong spot, so these functions should also be in the wrong spot?

Also, we don't have to limit ourselves to just ghc-internal / ghc-experimental, library names are cheap! If it is neither experimental nor internal, just make an unboxed-arrays library or something.

We should never say "this should go in base because x and y other libraries are worse". Being in base is the highest level of stability, and should therefore only be justified in positive terms, not negative terms.

Ericson2314 avatar Oct 04 '24 02:10 Ericson2314

-1


I think we need to start being stricter. Just because the module already exists and is named GHC.WhatEver doesn't mean we need to keep adding to it.

Move it elsewhere.

hasufell avatar Oct 04 '24 02:10 hasufell

Weak +1 for exposing more and more primops. -1 for exposing them from base while we try to slim down base conceptually; that seems to run counter to that idea?

angerman avatar Oct 04 '24 04:10 angerman

To clarify this for @hasufell, who asked me to kindly stick to one integer.

-1 as proposed.

angerman avatar Oct 04 '24 06:10 angerman

Also, we don't have to limit ourselves to just ghc-internal / ghc-experimental, library names are cheap! If it is neither experimental nor internal, just make an unboxed-arrays library or something.

@Ericson2314 Library names may be cheap, but designing, publishing and (indefinitely) maintaining new libraries is not. Somebody has to maintain them, and GHC HQ is sufficiently stretched already that it isn't reasonable to add maintenance burden alongside GHC releases. That's why for things like primops that are tightly coupled to GHC version, I think it makes sense for the GHC-maintained user-facing export to be in ghc-experimental. Anyone is then welcome to make up a new library that selectively re-exports parts of ghc-experimental to provide a more stable interface across GHC versions.
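As a rough illustration of what such a selectively re-exporting library could look like, here is a minimal sketch. The module name `Data.Array.Byte.Pinnedness` and the function `isPinned` are hypothetical. The import uses `isByteArrayPinned#` from `GHC.Exts` purely as a stand-in so that the sketch compiles today. A real shim for the new primops would import them from wherever GHC ends up exposing them, and that is the only line that would need to change between GHC versions.

```haskell
{-# LANGUAGE MagicHash #-}
-- Hypothetical shim module: clients see only a boxed, Bool-returning
-- function and never import the primop or its home package directly.
module Data.Array.Byte.Pinnedness
  ( isPinned
  ) where

import Data.Array.Byte (ByteArray (ByteArray))
-- Stand-in import; swap this line when the underlying primop moves.
import GHC.Exts (isByteArrayPinned#, isTrue#)

-- | Report whether the byte array is pinned, hiding the unboxed types.
isPinned :: ByteArray -> Bool
isPinned (ByteArray ba) = isTrue# (isByteArrayPinned# ba)
```

The design choice is that version-specific churn is absorbed by one small package rather than by every downstream user.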

See also discussion on the role of ghc-experimental at https://gitlab.haskell.org/ghc/ghc/-/issues/25326.

(This is to some extent orthogonal to the question of whether these primops are useful enough to be worth exposing from base, on which I have no clear opinion, so apologies for intruding on the vote.)

adamgundry avatar Oct 04 '24 07:10 adamgundry

-1


I agree with @hasufell and @angerman

tomjaguarpaw avatar Oct 04 '24 07:10 tomjaguarpaw

Somebody has to maintain them, and GHC HQ is sufficiently stretched already that it isn't reasonable to add maintenance burden alongside GHC releases.

Somebody has to maintain base, too. And I think it is safe to say that GHC.Exts is as maintained as it is documented. How long would import GHC.Exts be in an average file if it was made explicit? I think it is worth investing time into sorting this out for the sake of GHC itself, let alone for the stability of the rest of the Haskell ecosystem. For that reason I'm -1.

mixphix avatar Oct 04 '24 13:10 mixphix

@adamgundry

Library names may be cheap, but designing, publishing and (indefinitely) maintaining new libraries is not. Somebody has to maintain them, and GHC HQ is sufficiently stretched already that it isn't reasonable to add maintenance burden alongside GHC releases.

You are saying that 1 big library is much less work than N smaller libraries 1/N as large. That...really ought not to be true! That's really bad if it is true!

That's why for things like primops that are tightly coupled to GHC version, I think it makes sense for the GHC-maintained user-facing export to be in ghc-experimental

Sure, I have no problem with that at all. The raw blanket reexport of all primops is indeed highly experimental, and doesn't belong in base. Until it is removed, we'll never be able to achieve the goal of having the same base version across two compilers, liberating us from bound-bumping hell.

But the rest of this discussion was saying that ghc-experimental was not good enough, because these new primops are more stable, bytestring doesn't want to depend on ghc-experimental, etc. It's just in response to that assertion that I say it belongs in a new library.

Ericson2314 avatar Oct 04 '24 15:10 Ericson2314

I think it makes sense to put these primops in ghc-experimental. If they are in ghc-experimental then anyone in this thread is free to make their own libraries to provide stable interfaces for these primops if they want to use them in their own packages. This insanity of expecting GHC maintainers to perform all these tasks has to stop.

At the moment there are multiple proposals where GHC contributors have tried to persist the status quo. In these threads, the CLC has decided that they don't like the status quo anymore. How are GHC developers supposed to know which one this is before starting to work on a patch?

Perhaps the CLC could do some work to write some proposals which remove and deprecate the unstable parts of base which they don't want to maintain. It's my personal view that at the moment the whole process is hugely demotivating GHC development. It would be much better if the CLC could communicate beforehand which parts of the base API they don't want to keep maintaining rather than people repeatedly discovering this after performing a reasonable amount of work.

mpickering avatar Oct 04 '24 17:10 mpickering

If they are in ghc-experimental then anyone in this thread is free to make their own libraries to provide stable interfaces for these primops if they want to use them in their own packages.

Fine with me, we can see just how much bytestring maintainers don't want to depend on ghc-experimental when deciding who foots the bill.

It's my personal view that at the moment the whole process is hugely demotivating GHC development.

I wanted splitting base to be motivating, not demotivating, for both GHC dev and the CLC, and I agree that that doesn't appear to be happening yet. Something should change.

Perhaps the CLC could do some work to write some proposals which remove and deprecate the unstable parts of base which they don't want to maintain. [...] It would be much better if the CLC could communicate beforehand which parts of the base API they don't want to keep maintaining rather than people repeatedly discovering this after performing a reasonable amount of work.

I am quite sympathetic to this. So long as base exports a bunch of unstable crap, we'll never reach the goal of not having annoying base bumps every GHC release. But if we could proactively deprecate all those things, then after N releases we have the opportunity of ripping off the band-aid, and never dealing with base version churn bullshit ever again.

I strongly encourage everyone to do the big deprecation, then the big breaking change, and get us out of the shitty status quo of dozens of people spending hundreds of hours doing hackage revisions every GHC release. It's absolutely worth the one-time pain of moving out a bunch of modules.

And yes, we'll also end up with clear guidelines of what goes where going forward too, avoiding a lot of confusion in these CLC threads going forwards.

Ericson2314 avatar Oct 04 '24 18:10 Ericson2314

I am quite sympathetic to this. So long as base exports a bunch of unstable crap, we'll never reach the goal of not having annoying base bumps every GHC release. But if we could proactively deprecate all those things, then after N releases we have the opportunity of ripping off the band-aid, and never dealing with base version churn bullshit ever again.

FWIW, agree with this. IMHO base is too large and is the cause of too much churn. My stability assessment (specifically the "desired visibility" column) was an attempt at identifying the kernel of base; something that we could in principle support across GHC versions, with other functionality living in other (hopefully topical and therefore readily versioned) packages. However, a great deal of work is needed to get there.

bgamari avatar Oct 04 '24 18:10 bgamari

@mpickering, I definitely don't want GHC developers to feel demotivated! Can you help me understand how the CLC can help them feel more motivated?

As far as I understand it, when GHC developers want to add a feature to GHC they can do so without involving the CLC. If their feature requires something to be exposed from a library then, as you suggest, they can do that too, from a variety of libraries controlled by GHC HQ, including ghc-prim, and now ghc-internal and ghc-experimental too. There's no need to involve the CLC in that process either. They can also develop stable packages through which to expose these library features, or any community member can do so, or any community member can propose the features for inclusion in base at a later date.

You elaborate:

anyone in this thread is free to make their own libraries to provide stable interfaces for these primops if they want to use them in their own packages. This insanity of expecting GHC maintainers to perform all these tasks has to stop.

Could you please clarify what you mean by "all these tasks"? Do you mean the task of creating stable packages for functionality exposed from a GHC package? I must have missed people suggesting that GHC maintainers "perform all these tasks". Could you link to the posts in question?

My immediate suggestion to GHC developers would be: if, in the first instance, you expose new functionality through packages under the purview of GHC HQ then there's no way that CLC can stand in your way. Is that a good enough first step? After that is an established expectation then we can think about how to improve things further.

Or have I missed something that means that that particular suggestion is unworkable?

tomjaguarpaw avatar Oct 04 '24 19:10 tomjaguarpaw

How are GHC developers supposed to know which one this is before starting to work on a patch?

That's a general problem with committee driven workflow.

You simply can't know beforehand and the proposal process requires an up-front implementation.

However, you're free to inquire informally beforehand if something takes a lot of time to implement. But note however that those are still non-binding agreements.

A different angle was https://github.com/haskell/core-libraries-committee/issues/141 with the end goal of developing holistic policies for exactly that purpose.

Perhaps the CLC could do some work to write some proposals which remove and deprecate the unstable parts of base which they don't want to maintain.

As of now, the CLC isn't receiving any funding to do larger work and we are all volunteers.

I expect GHC developers to be on board with the base split and assign resources to drive this forward, including classification of stability of modules.

If you lack resources, please contact the HF.

When we all discussed the base split, I thought it was clear that this will not come for free. And during that discussion we also agreed that we'd rather vote on a per-module basis, afair. So no one needs to come up with an exhaustive up-front list that will drown in bikeshedding.

In the end, we are aiming for a shared goal: less intertwined GHC and base is less work for CLC and for GHC HQ.

hasufell avatar Oct 05 '24 03:10 hasufell

If you lack resources, please contact the HF.

In theory CLC could also ask the HF to provide more resources.

So long as base exports a bunch of unstable crap, we'll never reach the goal of not having annoying base bumps every GHC release.

Stable base and reinstallable base are largely orthogonal goals.

I strongly encourage everyone to do the big deprecation, then the big breaking change, and get us out of the shitty status quo of dozens of people spending hundreds of hours doing hackage revisions every GHC release. It's absolutely worth the one-time pain of moving out a bunch of modules.

Sorry, I've heard the pitch "let's wage the war to end all wars" quite a few times, and I'm not buying it. There is a never-ending list of "one-time pains", which will cause nothing but resentment and discontent.

Ultimately the decision lies with the CLC. If the proposal is rejected however I would still welcome concrete suggestions on how to best expose these primops in particular or primops in general.

Judging from https://gitlab.haskell.org/ghc/ghc/-/commit/39497eeda74fc7f1e7ea89292de395b16f69cee2, I take it that we settled on ghc-experimental to expose them.

Bodigrim avatar Oct 05 '24 07:10 Bodigrim

Stable base and reinstallable base are largely orthogonal goals.

They are only orthogonal if we do a huge amount of shimming.

war to end all wars

I don't want to do anything dramatically high budget. I just want to not be forced on a breaking change to base every GHC release. This is not an unreasonable request, and actually I don't care how we get there. Shim modules / remove modules, it doesn't matter, so long as it gets done.

But I am skeptical it will get done by shimming every last shitty GHC.* module because the Haskell community is not that wealthy right now.

I want the CLC, and you in particular, to be open to some breaking changes of module removal, because the costs of breaking some packages are not necessarily higher than the costs of everyone else dealing with base version bump for packages that aren't actually using a part of base that changed. I'm pretty sure the costs of the former are in fact lower.

Ericson2314 avatar Oct 05 '24 15:10 Ericson2314

But I am skeptical it will get done by shimming every last shitty GHC.* module because the Haskell community is not that wealthy right now.

It's either we are wealthy enough both to break things and provide shims, or we are poor - in which case it would be wise to do neither.

I want the CLC, and you in particular, to be open to some breaking changes of module removal, because the costs of breaking some packages are not necessarily higher than the costs of everyone else dealing with base version bump for packages that aren't actually using a part of base that changed. I'm pretty sure the costs of the former are in fact lower.

I'm not sure why you are singling me out, because in fact I'm on the less conservative part of the CLC spectrum. I'm personally quite happy to remove things from base, provided that there is a (short) deprecation period and a proposer has prepared patches for affected packages.

That said, judging from my experience as a Hackage trustee, I disagree with your costs analysis here. Assuming nothing was broken for real, even if a maintainer does not have two minutes per year to make a revision, a client can say --allow-newer and carry on. Pure version bump is never a blocker, but breakage is.

Bodigrim avatar Oct 06 '24 21:10 Bodigrim

Thanks everyone for the lively discussion!

For something like base I think there is a natural tension between a desire for more features and the desire to avoid breaking changes and major versions. Personally I think there is a wide range between bare bones but stable, and feature rich but unstable that can be seen as reasonable.

The decision made here gives a good idea of where on this spectrum the CLC thinks base should be. And while it's far closer to the "slim but stable" end than I would have expected, I think it's still in the reasonable range.

Given that isByteArrayPinned# was stable for over 5 years, and that the primops in question in this proposal would most likely have been just as stable, I think it's fair to treat this vote not just as a vote on these particular primops, but also as guidance for how high the bar for stability is for anything new to make its way into base.

However there is a vast amount of features currently in base falling short of this bar, and there will be new features doing so in the future. At least for those features somewhat closely coupled to GHC I opened https://gitlab.haskell.org/ghc/ghc/-/issues/25326 and would welcome feedback on how GHC should do things from its side there.


Sure, I have no problem with that at all. The raw blanket reexport of all primops is indeed highly experimental, and doesn't belong in base. Until it is removed, we'll never be able to achieve the goal of having the same base version across two compilers, liberating us from bound-bumping hell.

At least technically when adding primops additional exports only require minor version bumps. There is no reason why we couldn't have base-x.y.z without newPrimop# and base-x.y.(z+1) with newPrimop#. This would allow newer versions of ghc to still work with older versions of base without issue.

Or am I misunderstanding your point? In terms of stability most primops have been very stable over time. And I agree that primops that are expected to be unstable in their behaviour or interface shouldn't be exported from base.
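To illustrate the minor-version-bump point from the client side, here is a sketch of how downstream code can guard its use of a primop on the base version that first exports it. It uses `isByteArrayPinned#` as the example. The `4.10.0` threshold is written from memory of when that primop appeared and should be treated as a placeholder, and `queryPinned` and `probe` are hypothetical names.

```haskell
{-# LANGUAGE CPP, MagicHash, UnboxedTuples #-}
module Main (main) where

import GHC.Exts (ByteArray#, newByteArray#, unsafeFreezeByteArray#)
import GHC.ST (ST (ST), runST)
#if MIN_VERSION_base(4,10,0)
-- Only import the primop when this base version exports it.
import GHC.Exts (isByteArrayPinned#, isTrue#)
#endif

-- | Hypothetical helper: Nothing means this base cannot answer.
queryPinned :: ByteArray# -> Maybe Bool
#if MIN_VERSION_base(4,10,0)
queryPinned ba = Just (isTrue# (isByteArrayPinned# ba))
#else
queryPinned _ = Nothing
#endif

-- | Allocate a small unpinned array and query it.
probe :: Maybe Bool
probe = runST (ST (\s0 ->
  case newByteArray# 8# s0 of
    (# s1, mba #) -> case unsafeFreezeByteArray# mba s1 of
      (# s2, ba #) -> (# s2, queryPinned ba #)))

main :: IO ()
main = print probe
```

Because the export is purely additive, code written against the older base keeps compiling unchanged, and only code that wants the new primop needs the guard.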


As far as I understand it, when GHC developers want to add a feature to GHC they can do so without involving the CLC. If their feature requires something to be exposed from a library then, as you suggest, they can do that too, from a variety of libraries controlled by GHC HQ, including ghc-prim, and now ghc-internal and ghc-experimental too. There's no need to involve the CLC in that process either. They can also develop stable packages through which to expose these library features, or any community member can do so, or any community member can propose the features for inclusion in base at a later date.

My immediate suggestion to GHC developers would be: if, in the first instance, you expose new functionality through packages under the purview of GHC HQ then there's no way that CLC can stand in your way. Is that a good enough first step? After that is an established expectation then we can think about how to improve things further.

I think for new features that’s reasonable. After all there are no existing users that could be inconvenienced! But it becomes more difficult for changes like this one.

I proposed to expose those primops via base because this makes it easy for people depending on isByteArrayPinned# via base already to work around some of the issues with this primop that have been recently discovered.

This has been rejected and that is fine. But a lot of the arguments left me with the impression that the rejection had less to do with how base is, but with how base should be in the future.

So now ghc maintainers and users of base alike have to find new ways to work around this. Does someone create a package that gives a more stable interface? Should packages just depend on ghc-experimental? Something else? Who has the time to do any of that? All that causes additional friction justified by the premise that it will be better when those primops will be removed or reorganized in base at some point by someone with no concrete plan in place.

It would probably be better to have a ghc-all-array-things or similar package, and to deprecate all these primops in base. But implementing a well designed library takes time and competes with a lot of other tasks for ghc maintainers.

This isn’t helped by expectations about how many resources GHC HQ can contribute to such efforts, which don’t seem to be in line with what’s currently possible. With expectations on various sides at odds with each other, this inevitably causes frustration on all sides.

AndreasPK avatar Oct 07 '24 14:10 AndreasPK