vector icon indicating copy to clipboard operation
vector copied to clipboard

Export constructors for vectors from unsafe module

Open Shimuuar opened this issue 7 months ago • 4 comments

This is proof of concept implementation of #535 for primitive vectors.

Idea work out nicely. There's one small complication. It makes sense to define unsafe function working on representation of a vector but we get name clashes: both mutable and pure primitive vectors defined unsafeCast so it pprobab;y makes sense to keep such function where they're currently defined

Reviews are very welcome.

Fixes #535, Fixes #524, Fixes #517, Fixes #506, Fixes #492, Fixes #171

Shimuuar avatar Sep 03 '25 12:09 Shimuuar

both mutable and pure primitive vectors defined unsafeCast

I think it would make sense to add an Data.Primitive.Unsafe and Data.Primitive.Mutable.Unsafe modules to separate the concerns. This is what I was suggesting in #361 which was scrapped. But, considering that this idea is coming up again and again, maybe it is a good idea after all?

lehins avatar Sep 03 '25 19:09 lehins

Yes it seems in the end library authors need to provide access to internals and it's better to do it systematic way. I hoped to get away with single Unsafe module but looks like 2 are needed.

Shimuuar avatar Sep 05 '25 06:09 Shimuuar

I implemented unsafe modules for everything except Unboxed. It's quite a bit different from other vectors and still require some cleanup. Haddocks are not ready either

General idea is to move definition of data type and all function that work directly on vector representation into Unsafe module. It has nice side effect of separating code specific to vector type from endless specializations.

At the moment all function safe and unsafe are simply reexported from safe modules. Deprecation warning is only applied to pattern synonym.

Shimuuar avatar Sep 18 '25 18:09 Shimuuar

Now PR is mostly done. Implementation follows rules below:

  • For all vector types except unboxed. Two Unsafe modules are added. They provide access to constuctors. All function what work on internal representation are defined there. Exported constructors from safe modules are replaced with deprecated pattern synonyms.
  • For unboxed vectors. Data.Vector.Unboxed.Base renamed to Data.Vector.Unboxed.Unsafe. All constructors for data instances which are safe to expose are exposed (both mutable and immutable). These are newtype wrappers over vector representation. Exports of unsafe constructors

Remaining questions

  1. Right now only constructors are deprecated. Do we want to deprecate unsafe functions as well? (unsafe{From,To}ForeignPtr0, unsafeCoerceMVector, unsafeCast, etc) now they are simply reexported from "safe" modules.

  2. Renaming Data.Vector.Unboxed.BaseData.Vector.Unboxed.Unsafe is breaking change I think. However I want to rework size hints in bundles. And that will be breaking change anyway

  3. New modules are not documented properly yet.

Shimuuar avatar Oct 03 '25 12:10 Shimuuar

I think PR is ready.

Shimuuar avatar Jan 25 '26 13:01 Shimuuar

@lehins Do you have any comments? If no I'm going to merge this PR next weekend

Shimuuar avatar Jan 31 '26 15:01 Shimuuar

Sorry, I did want to review this PR. I was travelling this week, so didn't have time to do this. Give me a few days and I will make sure to review it

lehins avatar Jan 31 '26 22:01 lehins

No problem. I worked on PR for 4 month on and off. It can wait.

On second thought dropping D.V.Unboxed.Base was too risque. It's better to deprecate that module and reexport content of new D.V.Unboxed.Unsafe

Shimuuar avatar Feb 01 '26 07:02 Shimuuar

@lehins Yes I moved some safe functions to Unsafe module. I found that when all function which work on vector representation directly and not via Vector/MVector type classes are defined in one place, it improves readability. And there's little harm in exporting safe function from unsafe module. Similarly Unsafe module for unboxed vector has a lot of stuff that's quite safe.

Real question is what should go into Unsafe modules and what should be eventually removed from safe ones?

  • Right now all function that work on vector representation directly are in defined in unsafe module. I think it's right choice
  • Should we deprecate reexport of unsafe functions?
  • Should we move unsafe indexing function (unsafeIndex etc) to unsafe modules and deprecate it as well?
  • If so should we add D.V.Generic.Unsafe as well for unsafe indexing?

What do you think about creating another one directional pattern synonym that would allow continue pattern matching on vectors in a safe manner? Can't think of any good names for it though.

What about PrimVector/PrimMVector etc?

Shimuuar avatar Feb 10 '26 10:02 Shimuuar

Yes I moved some safe functions to Unsafe module. I found that when all function which work on vector representation directly and not via Vector/MVector type classes are defined in one place, it improves readability. And there's little harm in exporting safe function from unsafe module.

I do not agree with this at all. This will make people wrongfully believe that those functions are unsafe. And I can certainly see harm in it!

Real question is what should go into Unsafe modules and what should be eventually removed from safe ones?

Anything that is unsafe should IMHO be eventually moved: unsafeIndex, unsafeFreeze, etc. It should not be only about access to constructors, it should be about safe vs unsafe functionality.

Right now all function that work on vector representation directly are in defined in unsafe module. I think it's right choice

Totally disagree. There is nothing wrong in providing a safe function that itself uses unsafe constructor.

Should we deprecate reexport of unsafe functions?

Yes, I believe so.

Should we move unsafe indexing function (unsafeIndex etc) to unsafe modules and deprecate it as well?

Yes, I believe so as well

If so should we add D.V.Generic.Unsafe as well for unsafe indexing?

I do believe that would be a good idea for consistency and safety sake

What about PrimVector/PrimMVector etc?

Doesn't sound too bad. Something like this?

  • PrimVector/PrimMVector
  • StorableVector/StorableMVector`
  • BoxedLazyVector/BoxedLazyMVector
  • BoxedStrictVector/BoxedStrictMVector

If these seem too long maybe we could shorten them:

  • PVector/PMVector
  • SVector/SMVector`
  • BLVector/BLMVector
  • BSVector/BSMVector

We could also provide matching type synonyms, so users could just import the type synonyms and use Generic interface if they so choose.

lehins avatar Feb 10 '26 10:02 lehins

It turns that export of Vector(Vector) does not pick pattern defined in module, it packs imported constructor instead. But when module export synonym exporting Mvector(MVector) works fine. It seems only way to keep import ... MVector(..) working is to define deprecated patterns in separate module.

Also haddock doesn't link properly 'U.Vector' and requires full qualification.

Shimuuar avatar Feb 10 '26 11:02 Shimuuar

I do not agree with this at all. This will make people wrongfully believe that those functions are unsafe. And I can certainly see harm in it!

So we disagree on this but your approach is valid as well and I don't feel to strong about his. Let do it your way. This PR is uncomfortably big already so I propose following plan:

  1. Move safe functions back to safe module and merge this PR. It's good intermediate point
  2. Next PR implements deprecation of unsafe conversion functions
  3. Third PR move unsafe indexing into Unsafe modules and deprecates their use from safe modules

We could also provide matching type synonyms, so users could just import the type synonyms and use Generic interface if they so choose.

BTW exactly this was requested in #240 but I think it's better to do this separately

Shimuuar avatar Feb 10 '26 12:02 Shimuuar

I 100% agree with this plan: https://github.com/haskell/vector/pull/540#issuecomment-3877255972

lehins avatar Feb 10 '26 12:02 lehins

OK step 1 is done. It turns out that supporting both qualified imports and import MVector(..)/MVector(MVector) requires quite a bit of jumping through flaming hoops

These patterns are used to deprecate import of wildly unsafe
constructors from safe modules. There are two way in which they
could be imported by user:
 > import Data.Vector...Mutable qualified
 > import Data.Vector...Mutable (MVector(..))
and we want to support both.

But there's a problem. When pattern is exported separately
 > , MU.MVector
 > , pattern MVector

`import ... MVector(..)` doesn't work. It can't find constructor.
Separate pattern doesn't count.

And if constructor is in scope then following export
 > , MU.MVector(MVector)

will always export constructor even if it only visible qualified
and there's pattern with same name.

So there's only one way: use `MVector(MVector)` export, never allow
unsafe constructor in scope in safe modules if they need to export
pattern instead.

This requires defining pattern and all safe functions which make
use of constructor should be defined in separate unexported module.

Shimuuar avatar Feb 25 '26 15:02 Shimuuar

@lehins could you take a look at latest changes? In the end ensuring proper backward compatibility turned out to be quite ugly (but necessary).

Shimuuar avatar Feb 28 '26 11:02 Shimuuar

Fields accessors also double as documentation. Adding them was right. I'll check that every public vector type except unboxed vectors have them.

@lehins thank you for reviews. They're most useful

P.S. There 500 responses in github's CI in addition to genuine error. Where github is going? In any case I'll fix everything later.

Shimuuar avatar Mar 03 '26 19:03 Shimuuar