ocaml Deprecating the representation of a type alias

Hi,

I propose a new form of deprecation warning, intended to make it easier to nicely turn a concrete type into an abstract type. Concretely, given for instance:

type t = string
val open_ : t -> in_channel

the module author may want instead:

type t
val from_string : string -> t
(* ... and maybe more functions to create `t`s *)
val open_ : t -> in_channel

but going from one api to the other would break things without warning, if the author can't update all callers. Obviously, the author can document "this is deprecated" and wait, but as we've seen with the introduction of the "deprecated" warning a while ago, a compiler warning is much more effective.

So what I'm proposing is this intermediate step:

type t = string [@@deprecated_repr]
val from_string : string -> t
(* ... maybe more functions for create `t`s *)
val open_ : t -> in_channel

which would trigger warnings in the callers that rely on the knowledge that t is string, but not in callers that treat t opaquely like in open_ (from_string "a"). After waiting a bit, the author could then remove the = string [@@deprecated_repr], and either expose a new concrete representation or leave the type abstract.

An example where this sort of transition could make sense is the Digest module. This also works if the type t doesn't exist, i.e., given:

val open_ : string -> in_channel

you could switch to:

type t = string [@@deprecated_repr]
val open_ : t -> in_channel

Implementation-wise, the change is actually not very complicated. Taking unification as an example, the compiler implements unification along these lines: "if either type is a variable, occur check and unify the variable to the type, otherwise expand head type constructors recursively then unify by analyzing all cases". So given type t = int [@@deprecated_repr] type u = t, the trunk compiler unifies t and u by expanding them both to int, and then int unifies with int. With the current change, unifying t and u expands them both only to t, and t unifies with t, thus no type error and no warning. Whereas unifying int and u expands them both to int, thus no type error, BUT to expand u to int, the typer had to expand t to int, and that triggers the warning, since that's exactly what won't be possible once t is opaque.

This is one reason I consider this change to be in a draft state: someone that knows more about the typer might be able to poke holes in the implementation. Or e.g. the expansion change probably changes typing a bit in unimportant ways: I suspect that given type _ t = string, int t and string t unify but not anymore if the type is marked as [@@deprecated_repr]. I think that would be fine, it just needs to be documented.

I also think there is more risk than usual to have spammy warnings, so I think this should be a new warning, so it can be specifically disabled if something goes awry. Before working on this kind of polish, I'd like to know if compiler devs agree on the big picture first.

An alternative implementation would be to forget about emitting warnings entirely. Instead type t = string [@@deprecated_repr] could be treated as type t when a command line flag is passed. Implementation wise, this would touch less of the typer, and maybe be good enough considering dune treats warnings as errors in the dev profile anyway. But errors would surface as errors with no good way to tell the users that a @@deprecated_repr is involved (since the typer currently over-expands type constructors, which is what the modifications described above address).

Finally, I tried some real examples. I backported the change to 5.3, put some [@@deprecated_repr] on some Cmdliner types, built a few libraries with limited dependencies, and I observed no real issue but got notified in what looked like exactly the expected places (with some sucky errors messages, which are another thing that would need to be improved):

# dune-release

File "bin/help.ml", line 252, characters 20-77:
252 |       let conv, _ = Cmdliner.Arg.enum (List.rev_map (fun s -> (s, s)) topics) in
                          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Error (alert deprecated): implicitly converting between a type and its deprecated representation

# ocp-indent

File "src/indentArgs.ml", line 48, characters 25-41:
48 |     Arg.(value & opt_all config_converter []
                              ^^^^^^^^^^^^^^^^
Alert deprecated: implicitly converting between a type and its deprecated representation

File "src/indentArgs.ml", line 89, characters 21-36:
89 |     Arg.(value & opt range_converter (None,None)
                          ^^^^^^^^^^^^^^^
Alert deprecated: implicitly converting between a type and its deprecated representation

and more such examples in ocp-index and mirage-dns.

In fact, the warnings were more useful than I expected. My expectation was that warnings would be noisy because if two types are treated interchangeably, there might be a lot of back and forth between a type constructor and its representation, and so I thought I'd need to silence all but the first warning. But at least in this case, of the 14 warnings I saw, every single one pointed to a call site that needed to be tweaked.

May 26 '25 09:05 v-gb

@garrigue: do you have an opinion on the high-level approach?

May 28 '25 13:05 nojb

Thanks for submitting this! There are at least two things to think about before this could be merged:

It changes the way expansion works, and expansion is a pretty central operation in the type checker. I have added this PR to the agenda of an occasional meeting of type-checker folks to see if the change here is the right one.
It also changes the surface language. We can have a design debate here to see if this is a good change. (For example, should the text printed be customizable?)

So, what do folks here think about this as a language change?

May 28 '25 13:05 goldfirere

I have not yet understood all the details, but this seems that this more a partial compatibility mode than a conservative change. That is, expand_delaying_deprecated_repr is only use in some specific cases, and most of the time expansion is disabled for those types. So it seems a bit weak if the goal is to avoid breaking existing code.

This said, there are already mechanisms that could help support this partially.

unify has already some support for "slow" expansion, attempting to expand the two sides until a common type is found. However, this is only for non-parameteric types. As you observed yourself, for non-parametric types, one would also need to look at (weak) injectivity. Or go with you assumption that, for those types, it is reasonable to assume it. Indeed, this is comparable to private types in that respect.
You use a specific global variable to propagate information, but now unification has its own environment.

Enabling warnings outside unification seems more challenging, as a lot of work would have to be duplicated. (Am I right that in your code we only get a failure in that case ?)

Jun 05 '25 11:06 garrigue

I have not yet understood all the details, but this seems that this more a partial compatibility mode than a conservative change. That is, expand_delaying_deprecated_repr is only use in some specific cases, and most of the time expansion is disabled for those types. So it seems a bit weak if the goal is to avoid breaking existing code.

This should be an essentially conservative change: adding the attribute makes most code type the same as before, but potentially with warnings.

Due to warnings not being emitted when no location is provided, it's fairly likely that there are operations not covered by tests that should trigger warnings but wouldn't, which means the warning is probably partially implemented.

Putting aside the corner case of constructors with phantom parameters that I mentioned initially, adding the attribute could cause changes on use of ~through_deprecated_repr:false, but these cases should be unlikely to be problematic. From memory, it's things like the typer peaking through types to figure out where to warn on let _ = f x (wouldn't warn anymore, which seems fine) or whether in let () = f g, optional parameters of g need to be omitted (wouldn't omit parameters if the type of g is a type constructor annotated with @@deprecated_repr, which could break code). I suspect it'd be possible to improve some cases with an expand function that returns an optional warning in addition to the expanded type, so callers can emit the warning only when they act on the expanded type.

Personally, I'm more worried about warning spam, if the compiler is repeatedly expanding types for whatever reason.

unify has already some support for "slow" expansion, attempting to expand the two sides until a common type is found. However, this is only for non-parameteric types. As you observed yourself, for non-parametric types, one would also need to look at (weak) injectivity. Or go with you assumption that, for those types, it is reasonable to assume it. Indeed, this is comparable to private types in that respect.

What is this slow expansion you are referring to? this maybe? If you have:

type t1
type t2 = t1
type t3 = t2
type t4 = t3

and you unify a t_i and t_j, is it somehow able to expand only the type with the larger index?

You use a specific global variable to propagate information, but now unification has its own environment.

Maybe that's possible, although the global variable is for the expand function rather than for unify/moregen/equal, so I'd need to look into it.

Enabling warnings outside unification seems more challenging, as a lot of work would have to be duplicated. (Am I right that in your code we only get a failure in that case ?)

I'm not sure what you have in mind when you say "outside unification". The tests show warnings being emitted during type and signature subtyping, for instance. If there's a function of this nature that I haven't updated, the code would type as normal, but there could be missing or spurious warnings, depending on the situation (missing is more likely, because no warning location should be provided in that case, which disables warnings). I think it'd be possible to warn in the let () = f g example above, which shouldn't require a ton of duplication.

Jun 05 '25 14:06 v-gb