fpm Prevent Name Collisions Between Packages

In #86, it became clear that we'd like a solution to prevent name collisions of modules between packages. Starting this thread here to discuss solutions.

Aug 17 '20 18:08 everythingfunctional

My preferred solution would just be that all module names in a library must start with the name of the library. Thus, you may have a module with exactly the same name as the library. This is likely to be a common design: organize your library however you like, and expose the public API via a module with the same name as the library.

Aug 17 '20 19:08 everythingfunctional

I agree, I think this is the way to go.

Structuring the module and file names as we did so far helps emulate namespaces and subpackages, but I don't see a good reason to enforce it. The user can still do it if they prefer.

Aug 17 '20 19:08 milancurcic

Here is the "minimal" proposal:

All module names must start with the library name. In particular, the module name should be either:

the library name followed by an underscore and a suffix, such as stdlib_* or toml_*, or
just the library name, such as stdlib or toml

Do we all agree with this "minimal" proposal?

I do. If we all do, we have something to solve our immediate problem, which is name collisions between packages. This gives us time to discuss naming conventions, which I think is important to have, but that's a separate issue.
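The "minimal" rule is small enough to express as a single check. The following is only an illustrative Python sketch, not fpm code; the function name is hypothetical, and it assumes names are compared case-insensitively because Fortran identifiers are case-insensitive.

```python
def is_valid_module_name(module_name: str, library_name: str) -> bool:
    """Check the "minimal" rule: the module name is the library name itself,
    or it starts with the library name followed by an underscore."""
    # Fortran identifiers are case-insensitive, so compare in lower case.
    mod = module_name.lower()
    lib = library_name.lower()
    return mod == lib or mod.startswith(lib + "_")
```

For example, `is_valid_module_name("stdlib_io", "stdlib")` holds, while `is_valid_module_name("io", "stdlib")` does not.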

Aug 17 '20 20:08 certik

I agree, this is a good solution.

Aug 18 '20 07:08 LKedward

Sorry to bring this back up, but there are newer discussions about module name collisions (including on Fortran Discourse).

I've forked fpm and started a "module namespacing" branch to address this issue: https://github.com/perazz/fpm/tree/namespaced-modules

Things that need to be discussed:

What coding style should I follow? I see fpm does not use many object-oriented facilities, most likely for compiler compatibility. Would having a module_t type that extends the current simple string_t used for modules make sense in fpm, or is that too advanced?
My idea is to first provide a backward-compatible module_t class (already implemented), then extend it with more facilities for conflict resolution. The actual rules that resolve the conflicts remain undefined.

Nov 22 '22 11:11 perazz

> What coding style should I follow? I see fpm does not use many object-oriented facilities, most likely for compiler compatibility. Would having a module_t type that extends the current simple string_t used for modules make sense in fpm, or is that too advanced?

I'd say it's fine to use inheritance, but I'm not sure I understand why module_t would extend string_t.

> My idea is to first provide a backward-compatible module_t class (already implemented), then extend it with more facilities for conflict resolution. The actual rules that resolve the conflicts remain undefined.

What would it mean to do "conflict resolution"? If you just want to report it to the user, I'm pretty sure we already do that. If you want to somehow be able to compile the projects anyway, I'm not sure how you would do that since the compiler will (eventually) see the conflict no matter what you try to do differently at compile time.

Nov 22 '22 13:11 everythingfunctional

> but I'm not sure I understand why module_t would extend string_t.

The idea was that fpm could automatically add prefixes or other unique identifiers to the plain module name. That's why extending string_t seems reasonable to me; it is also a good way to keep module_t backward compatible.

> What would it mean to do "conflict resolution"?

I guess there's no consensus yet on what direction fpm should take, but I think that, with prefixed module names, fpm could at least generate one ghost version of each package in which all names are unique, so the user could fall back to that version if they want to avoid name conflicts. Think about how gfortran mangles routine names:

___myModule_MOD_myRoutine

that could extend to something like

___fpmPackage_FPM_myModule_MOD_myRoutine

in other words, each fpm module could be prefixed by

packageName_FPM_

or some other unique identifier
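As a rough illustration of this prefixing idea, here is a minimal Python sketch. The function name and the `separator` parameter are hypothetical and nothing here is fpm code. The length check is an added assumption on my side: Fortran 2003 and later limit names to 63 characters, so a long package name plus a long module name could overflow.

```python
MAX_NAME_LENGTH = 63  # maximum length of a name since Fortran 2003


def prefixed_module_name(package_name, module_name, separator="_FPM_"):
    """Build the package-qualified module name, e.g. fpmPackage_FPM_myModule."""
    name = f"{package_name}{separator}{module_name}"
    # A prefixed name is useless if the compiler rejects it for being too long.
    if len(name) > MAX_NAME_LENGTH:
        raise ValueError(
            f"'{name}' is {len(name)} characters long; "
            f"the limit is {MAX_NAME_LENGTH}"
        )
    return name
```

With the names from the example above, `prefixed_module_name("fpmPackage", "myModule")` gives `fpmPackage_FPM_myModule`.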

Nov 22 '22 13:11 perazz

That's an interesting idea, but I think it may just move the ambiguity problem into fpm, not necessarily solve it. In order to achieve this, first, you'll need fpm to understand the source code much more than it currently does. Second, you'll need to make copies of the source code with the module names changed in both:

The module definition, and
The places the module is used

At that point, when you see a use statement, how will you know which package it comes from, and thus which "mangled" name to change it to? Especially if it is a module name that exists in multiple dependencies.
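To make the lookup problem concrete, here is a small Python sketch under stated assumptions: `package_modules` is a hypothetical mapping from package name to the module names that package defines (not fpm's internal model), and the `_FPM_` separator is taken from the proposal above. The lookup works only while exactly one package defines the module.

```python
class AmbiguousModuleError(LookupError):
    """Raised when more than one package defines the requested module."""


def resolve_use(module_name, package_modules):
    """Map the module named in a use statement to its package-prefixed form."""
    wanted = module_name.lower()  # Fortran names are case-insensitive
    providers = sorted(
        pkg
        for pkg, modules in package_modules.items()
        if wanted in {m.lower() for m in modules}
    )
    if not providers:
        raise LookupError(f"no package defines module '{module_name}'")
    if len(providers) > 1:
        # This is the case in question: the use statement alone does not
        # say which package was meant.
        raise AmbiguousModuleError(
            f"module '{module_name}' is defined by: {', '.join(providers)}"
        )
    return f"{providers[0]}_FPM_{module_name}"
```

If both `pkg_a` and `pkg_b` define a module `utils`, `resolve_use("utils", ...)` can only raise; some extra information (for example, which dependency the using package declared) would be needed to pick one.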

Nov 22 '22 14:11 everythingfunctional

Yes to both. The ambiguity would be moved to the registry level: there could still be a collision if someone wanted to publish a package whose package name and module names both match those of a package already in the main fpm registry. (One could go one level further up and make the unique module name registry+package+module, but at this point my head is spinning.)

If we restrict the renaming to modules only, a full language parser is not needed. It could be prescribed that, to be accepted into the official fpm registry, a package must use only module names in the "unique" representation. That could be checked and enforced easily.
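A sketch of what such a check might look like, in Python, using a regular expression instead of a parser. Everything here is hypothetical: the function name, the `_FPM_` prefix convention from earlier in the thread, and the simplifying assumptions that each module statement sits on its own line with no continuation lines or preprocessor tricks.

```python
import re

# Matches a line of the form "module <name>", optionally followed by a comment.
# Lines such as "module procedure foo", "module function f(x)" or
# "end module foo" do not match.
MODULE_DEF = re.compile(
    r"^[ \t]*module[ \t]+([a-z]\w*)[ \t]*(?:!.*)?$",
    re.IGNORECASE | re.MULTILINE,
)


def nonconforming_modules(source: str, package_name: str) -> list[str]:
    """Return the modules defined in `source` that lack the package prefix."""
    prefix = f"{package_name}_FPM_".lower()
    return [
        name
        for name in MODULE_DEF.findall(source)
        if not name.lower().startswith(prefix)
    ]
```

An empty result would mean every module defined in the file already uses the prefixed form.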

Nov 22 '22 14:11 perazz