Should PAUSE index .pod files?
This came up here: https://github.com/CPAN-API/cpan-api/issues/336
Moose's metadata has 'provides' entries for .pod files. metacpan doesn't index .pod files, so old versions of some packages were still being retained in its index, dating from when the names were represented by .pm files.
- what should PAUSE do when it sees a .pod file?
- should it behave any differently when that .pod file is in 'provides' metadata?
- which also raises the question: should PAUSE attempt to verify the contents of 'provides' data?
On Fri, 29 Aug 2014 15:31:09 -0700, Karen Etheridge said:
> what should PAUSE do when it sees a .pod file?
AFAICS, the PAUSE indexer has no reason to look into .pod files.
> should it behave any differently when that .pod file is in 'provides' metadata?
Provides metadata is treated as an override for what PAUSE's own indexer would calculate. PAUSE never did a perfect job on indexing and people needed a way to override PAUSE's stupidity. PAUSE usually has no reason to disapprove of the contents of the provides metadata. If there are such reasons, we need to adjust.
> which also raises the question: should PAUSE attempt to verify the contents of 'provides' data?
See above. But it all depends on how much hassle other parties have when we are too permissive.
andreas
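To make the "override" idea concrete, here is a minimal sketch in Perl. It is not PAUSE's actual code: the function name `packages_to_index` and the flat package-to-file hashes are hypothetical, and treating an empty 'provides' as "fall back to the scan" is an assumption made only for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper (not PAUSE's real code): choose the package => file
# map an indexer would work from.
#   $scanned  : hashref of packages found by scanning .pm files
#   $provides : hashref taken from 'provides' metadata, or undef if absent
sub packages_to_index {
    my ($scanned, $provides) = @_;

    # Assumption: when 'provides' has entries, it replaces the scan result
    # wholesale, as in the "override" behaviour described above.
    return { %$provides } if $provides && %$provides;

    # Otherwise fall back to whatever the scan found.
    return { %$scanned };
}

# Illustrative data only.
my $scanned  = { 'Moose' => 'lib/Moose.pm' };
my $provides = {
    'Moose'         => 'lib/Moose.pm',
    'Moose::Manual' => 'lib/Moose/Manual.pod',
};

my $index = packages_to_index($scanned, $provides);
print "$_ => $index->{$_}\n" for sort keys %$index;
```

With this shape, a .pod entry only reaches the index if the author lists it in 'provides'; the scan itself never has to open .pod files.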
This one bit me today. I was looking at some DBIx::Class documentation on metacpan and followed a link to https://metacpan.org/pod/DBIx::Class::Manual::SQLHackers, which apparently isn't indexed and is therefore not available on metacpan.
I took this up on the #metacpan IRC channel, where HAARG pointed out this as the underlying issue.
So here's another argument in favor of indexing .pod files: The index is apparently used to determine what "exists" on CPAN, and if one module refers to something that doesn't "exist" in this way, things break. 🙂
The Makefile.PL for DBIx-Class-Manual-SQLHackers (which uses Module::Install) has the following:
    # This is an ANDK sanctioned experiment
    provides 'DBIx::Class::Manual::SQLHackers' => { file => undef };
This results in the following in META.yml:
    provides:
      DBIx::Class::Manual::SQLHackers:
        file: ~
That path doesn't look right. I wonder if @ribasushi wanted to end up with
    provides:
      DBIx::Class::Manual::SQLHackers:
        file: lib/DBIx/Class/Manual/SQLHackers.pod
I think that in his comment from 2014, @andk was saying that if you give the path to a .pod file in provides, then PAUSE will look at it (even though the code that identifies files to consider indexing doesn't look at .pod files)? But still the regular rule would apply, that the file must contain a package statement for it to be indexed.
If specified in provides, the referenced file does not need a package statement. For generated packages for example there is a convention of pointing it to the generating .PL script or META.yml. It's likely that the file must exist in the distribution however.
provides metadata is required to specify a file. CPAN::Meta will refuse to generate meta files with a missing file in provides. PAUSE also rejects this. If a file name is specified but does not exist, it will still be indexed by PAUSE.
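The two cases above (no file given versus a file that is not shipped) can be told apart with a small check. This is only a sketch: `provides_problems` is a hypothetical helper, not part of CPAN::Meta or PAUSE, and the file list passed in is illustrative.

```perl
use strict;
use warnings;

# Hypothetical check (not CPAN::Meta's or PAUSE's code): report 'provides'
# entries that give no file, and entries whose file is not shipped.
#   $provides   : hashref of package => { file => ..., version => ... }
#   $dist_files : arrayref of paths contained in the distribution
sub provides_problems {
    my ($provides, $dist_files) = @_;
    my %shipped = map { $_ => 1 } @$dist_files;
    my @problems;

    for my $pkg (sort keys %$provides) {
        my $file = $provides->{$pkg}{file};
        if (!defined $file || $file eq '') {
            # The case described above as rejected outright.
            push @problems, "$pkg: no file given";
        }
        elsif (!$shipped{$file}) {
            # The case described above as still being indexed; only reported.
            push @problems, "$pkg: file '$file' not in distribution";
        }
    }
    return @problems;
}

# The entry from the Makefile.PL quoted earlier, with an illustrative file list.
my @problems = provides_problems(
    { 'DBIx::Class::Manual::SQLHackers' => { file => undef } },
    [ 'Makefile.PL', 'META.yml' ],
);
print "$_\n" for @problems;
```

Running a check like this before release would flag the `file => undef` entry, which per the comment above is the form that gets rejected.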
At least on the PAUSE side, including appropriate provides data will get a dist indexed. I'm not certain how metacpan reacts to a dist without any .pm files though. And it still presents a problem for installers like cpan and cpanm because they won't be able to compare the indexed version against a local version, since that relies on finding the corresponding .pm file.
For indexing, one main module matching the distribution name must be specified and you must have permissions for it. It should be fine if this is specified via provides. (for PAUSE)
But yes, you won't be able to usefully version the documentation files, nor will CPAN installers be able to tell they're installed at all (only the perldoc tool will).
My $0.02: this is not a PAUSE problem but a metacpan problem. PAUSE shouldn't index .pod files, as they are not useable code. But metacpan has its own indexer logic and could make links to pod files available with the normal links we use for indexed .pm files.
> PAUSE shouldn't index .pod files, as they are not useable code.
On the face of it I agree.
> But metacpan has its own indexer logic and could make links to pod files available with the normal links we use for indexed .pm files.
This would suffice for simple cases but leaves open the question of who owns the DBIx::Class::Manual::SQLHackers name if a corresponding POD file appears in multiple distributions. How should MetaCPAN resolve links in that case? Resolving this very question is what the index is for, so employing it for documentation pages does make some sense after all.
This isn’t an accident either: it follows from the fact that documentation page names follow the exact same allocation rules as module names. Running perldoc DBIx::Class::Manual::SQLHackers can only lead to one particular .pod or .pm file within a perl installation, same as use DBIx::Class can only load one particular .pm file.
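The shared namespace can be illustrated with a short Perl sketch. `find_by_name` is a hypothetical helper, and the order in which extensions are tried for documentation is an assumption; the real perldoc lookup may differ in its details.

```perl
use strict;
use warnings;
use File::Spec;

# Hypothetical helper: return the first file matching a module or
# documentation name under the given directories, trying the listed
# extensions in order. Returns undef when nothing matches.
sub find_by_name {
    my ($name, $dirs, $exts) = @_;
    my @parts = split /::/, $name;

    for my $dir (grep { !ref } @$dirs) {    # skip @INC hooks (code refs)
        for my $ext (@$exts) {
            my $path = File::Spec->catfile($dir, @parts) . ".$ext";
            return $path if -f $path;
        }
    }
    return;
}

my $name = 'DBIx::Class::Manual::SQLHackers';

# What "use" could load: only a .pm file.
my $code = find_by_name($name, \@INC, ['pm']);

# What a documentation lookup could show: a .pod or a .pm file
# (the order tried here is an assumption).
my $docs = find_by_name($name, \@INC, ['pod', 'pm']);

printf "code: %s\n", $code // 'not found';
printf "docs: %s\n", $docs // 'not found';
```

Both lookups turn the same `Foo::Bar` name into the same relative path and differ only in the extension, which is why two distributions shipping the same documentation name collide in the same way two distributions shipping the same module would.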