perl5 icon indicating copy to clipboard operation
perl5 copied to clipboard

Add MANIFEST.SKIP file

Open Leont opened this issue 2 years ago • 11 comments

This replaces the lists of files to skip in Porting/manicheck, t/porting/manifest.t and Porting/makerel with a centralized and standardized MANIFEST.SKIP file.

This would also allow us to add a make manifest target, but I left that for a follow-up PR.

Leont avatar Mar 11 '22 12:03 Leont

This seems a good idea, as it's already a mechanism familiar to CPAN authors, being the standard way those dists usually work.

leonerd avatar Mar 11 '22 12:03 leonerd

need to sync with #19513 effort IMO we should merge this one first

atoomic avatar Mar 11 '22 22:03 atoomic

After spending more time reading the PR for MANIFEST.SKIP and MANIFEST.DEV from #19523 and #19513

Here is my thought:

  • I prefer the syntax of MANIFEST.SKIP this makes it so much simpler for example to skip the .gitignore files everywhere
  • The PR with the MANIFEST.DEV is updating a few extra tools

I m currently working on a prototype to reconcile both branches and keep the best from each

atoomic avatar Mar 11 '22 22:03 atoomic

After spending more time reading the PR for MANIFEST.SKIP and MANIFEST.DEV from #19523 and #19513

Here is my thought:

* I prefer the syntax of `MANIFEST.SKIP` this makes it so much simpler for example to skip the `.gitignore` files everywhere

* The PR with the MANIFEST.DEV is updating a few extra tools

I m currently working on a prototype to reconcile both branches and keep the best from each

I don't think this branch needs anything from that other branch to be complete. I do think this branch is asking for a follow up, but I'm not sure yet what that follow up should look like.

But most of all, I think we need to agree on requirements before changing any behavior. Because arguing about an implementation when we're actually disagreeing about requirements leads to very muddy discussions.

Leont avatar Mar 12 '22 00:03 Leont

I agree this branch can be safely merged as its own. IMO we try to easily control what is ship and what is not packaged in the tarball.

This is a good start and we can do some incremental changes

I'm all in to press the merge button :-) there

atoomic avatar Mar 12 '22 00:03 atoomic

On Sat, 12 Mar 2022 at 01:08, Leon Timmermans @.***> wrote:

After spending more time reading the PR for MANIFEST.SKIP and MANIFEST.DEV from #19523 https://github.com/Perl/perl5/pull/19523 and #19513 https://github.com/Perl/perl5/pull/19513

Here is my thought:

  • I prefer the syntax of MANIFEST.SKIP this makes it so much simpler for example to skip the .gitignore files everywhere

  • The PR with the MANIFEST.DEV is updating a few extra tools

I m currently working on a prototype to reconcile both branches and keep the best from each

I don't think this branch needs anything from that other branch to be complete. I do think this branch is asking for a follow up, but I'm not sure yet what that follow up should look like.

But most of all, I think we need to agree on requirements before changing any behavior. Because arguing about an implementation when we're actually disagreeing about requirements leads to very muddy discussions

The requirements my PR was focused on was related to the request made by Nicolas R in response to my adding Porting/updateAUTHORS.pl that he thinks that it and other Porting tools that can only be used inside of a git checkout should not be included in the final tarball.

The patch you have posted addresses the fact that we have a list of exceptions in some places which essentially function like a MANIFEST.SKIP.

The two problems are similar but not identical.

The reason I chose to implement MANIFEST.dev is because we have a variety of places where we use the MANIFEST to get a list of all the files that we need to process. Originally I thought harness was one of them, but I was mistaken, that does not depend on all of our tests being in the MANIFEST, as (these days?) it uses File::Find. However it does include various other tests.

I have no problem with this patch replacing the hard coded lists of exceptions. However it does not address what my patch was trying to address and only touches on part of the problem.

So lets say we decide that the shipped tarball should not include any of t/porting or Porting/, and to be clear I am not saying we should do this, I haven't done a file by file review to see if it is reasonable to do so, but let us imagine it is reasonable and try your branch. So that means we apply this patch:

$ git diff diff --git a/MANIFEST b/MANIFEST index 981b156cc8..79fb98b041 100644 --- a/MANIFEST +++ b/MANIFEST @@ -5405,84 +5405,6 @@ pod/rofftoc Generate a table of contents in troff format pod/splitman Splits perlfunc into multiple man pages pod/splitpod Splits perlfunc into multiple pod pages Policy_sh.SH Hold site-wide preferences between Configure runs. -Porting/acknowledgements.pl Generate perldelta acknowledgements text -Porting/add-package.pl Add/Update CPAN modules that are part of Core -Porting/add-pod-file Utility to add new pod/.pod file to core distribution -Porting/bench.pl Run benchmarks against t/perf/benchmarks -Porting/bisect.pl A tool to make bisecting easy -Porting/bisect-example.sh Example script to use with git bisect run -Porting/bisect-runner.pl Tool to be called by git bisect run -Porting/bump-perl-version bump the perl version in relevant files -Porting/check-cpan-pollution Check for commits that may wrongly touch CPAN distros -Porting/check83.pl Check whether we are 8.3-friendly -Porting/checkansi.pl Check source code for ANSI-C violations -Porting/checkAUTHORS.pl Check that the AUTHORS file is complete -Porting/checkcfguse.pl Check that config symbols are being used -Porting/checkcfgvar.pl Check that config scripts define all symbols -Porting/checkpodencoding.pl Check POD encoding -Porting/checkURL.pl Check whether we have working URLs -Porting/checkVERSION.pl Check whether we have $VERSIONs -Porting/cmpVERSION.pl Compare whether two trees have changed modules -Porting/config.sh Sample config.sh -Porting/config_H Sample config.h -Porting/config_h.pl Reorder config_h.SH after metaconfig -Porting/core-cpan-diff Compare core distros with their CPAN equivalents -Porting/core-team.json Membership of the Perl Core Team -Porting/corecpan.pl Reports outdated dual-lived modules -Porting/corelist.pl Generates data for Module::CoreList -Porting/corelist-diff Tool to produce corelist diffs -Porting/corelist-perldelta.pl Generates data perldelta from Module::CoreList -Porting/deparse-skips.txt List of test files to ignore/skip for deparse tests. -Porting/docs-team-charter.pod Perl Documentation Team charter -Porting/epigraphs.pod the release epigraphs used over the years -Porting/exec-bit.txt List of files that get +x in release tarball -Porting/exercise_makedef.pl Brute force testing for makedef.pl -Porting/expand-macro.pl A tool to expand C macro definitions in the Perl source -Porting/findrfuncs Find reentrant variants of functions used in an executable -Porting/git-deltatool Mark commits for perldelta in git notes -Porting/git-find-p4-change Find the change for a p4 change number -Porting/git-make-p4-refs Output git refs for each p4 change number, suitable for appending to .git/packed-refs -Porting/GitUtils.pm Generate the contents of a .patch file -Porting/Glossary Glossary of config.sh variables -Porting/harness-timer-report.pl Analyze the timings from the test harness -Porting/how_to_write_a_perldelta.pod Bluffer's guide to writing a perldelta. -Porting/leakfinder.pl Hacky script for finding memory leaks -Porting/Maintainers Program to pretty print info in Maintainers.pl -Porting/Maintainers.pl Information about maintainers -Porting/Maintainers.pm Library to pretty print info in Maintainers.pl -Porting/make-rmg-checklist Generates a checklist 4 the release manager -Porting/make_dot_patch.pl Make a .patch file from a git WD -Porting/make_snapshot.pl Make a tgz snapshot of our tree with a .patch file in it -Porting/makemeta Create the top-level META.yml -Porting/makerel Release making utility -Porting/manicheck Check against MANIFEST -Porting/manifest_lib.pl Library for checking and sorting the MANIFEST -Porting/manisort Sort the MANIFEST -Porting/mksample Generate Porting/config_H and Porting/config.sh -Porting/new-perldelta.pl Generate a new perldelta -Porting/newtests-perldelta.pl Generate Perldelta stub for newly added tests -Porting/perldelta_template.pod Template for creating new perldelta.pod files -Porting/perlgov-team-update Tool to update perlgov from perl-core-teaml -Porting/perlhist_calculate.pl Perform calculations to update perlhist -Porting/pod_lib.pl Code for handling generated pods -Porting/pod_rules.pl generate lists of pod files for Makefiles -Porting/podtidy Reformat pod using Pod::Tidy -Porting/pumpkin.pod Guidelines and hints for Perl maintainers -Porting/README.pod Outline of contents of Porting directory -Porting/README.y2038 Perl notes for the 2038 fix -Porting/release_announcement_template.txt -Porting/release_managers_guide.pod Release Manager's Guide -Porting/release_schedule.pod Schedule for future releases -Porting/rt_list_patches A tool to help you apply patches from RT -Porting/security_template.pod Template for vulnerability announcements -Porting/sort_perldiag.pl Keep our diagnostics orderly -Porting/sync-with-cpan Sync with CPAN -Porting/timecheck.c Test program for the 2038 fix -Porting/timecheck2.c Test program for the 2038 fix -Porting/todo.pod Perl things to do -Porting/updateAUTHORS.pl Tool to automatically update AUTHORS and .mailmap from git log data -Porting/valgrindpp.pl Summarize valgrind reports -Porting/vote_admin_guide.pod Perlgov Vote Administrator guide pp.c Push/Pop code pp.h Push/Pop code defs pp_ctl.c Push/Pop code for control flow @@ -6071,47 +5993,6 @@ t/perf/optree.t Test presence of some op optimisations t/perf/speed.t See if optimisations are keeping things fast t/perf/taint.t See if optimisations are keeping things fast (taint issues) t/perl.supp Perl valgrind suppressions -t/porting/args_assert.t Check that all PERL_ARGS_ASSERT macros are used -t/porting/authors.t Check that all authors have been acknowledged -t/porting/bench.t Check Porting/bench.pl runs ok -t/porting/bench/badhash a test file for t/porting/bench.t -t/porting/bench/badname a test file for t/porting/bench.t -t/porting/bench/badversion.json a test file for t/porting/bench.t -t/porting/bench/callsub.json a test file for t/porting/bench.t -t/porting/bench/callsub2.json a test file for t/porting/bench.t -t/porting/bench/oddentry a test file for t/porting/bench.t -t/porting/bench/ret0 a test file for t/porting/bench.t -t/porting/bench/synerr a test file for t/porting/bench.t -t/porting/bench_selftest.t run Porting/bench.pl's selftest facility -t/porting/bincompat.t Check that {non_,}bincompat_options are ordered -t/porting/checkcase.t Check whether we are case-insensitive-fs-friendly -t/porting/checkcfgvar.t Check that all config.sh-like files are good -t/porting/cmp_version.t Test whether all changed module files have their VERSION bumped -t/porting/copyright.t Check that copyright years match -t/porting/corelist.t Check that Module-CoreList has perl versions for the current perl -t/porting/customized.dat Data file for porting/customized.t -t/porting/customized.t Check all CUSTOMIZED files are as they should be -t/porting/diag.t Test completeness of perldiag.pod -t/porting/dual-life.t Check that dual-life bins are in utils/ -t/porting/exec-bit.t Check that exec-bit bins are identified -t/porting/extrefs.t Check perl headers don't make extern refs -t/porting/filenames.t Check the MANIFEST for filename portability. -t/porting/FindExt.t Test win32/FindExt.pm -t/porting/globvar.t Check that globvar.sym is sane -t/porting/known_pod_issues.dat Data file for porting/podcheck.t -t/porting/libperl.t Check libperl.a sanity -t/porting/maintainers.t Test that Porting/Maintainers.pl is up to date -t/porting/manifest.t Test that this MANIFEST file is well formed -t/porting/pending-author.t Check if any pending commit would break tests -t/porting/perlfunc.t Test that Functions_pm.PL can parse perlfunc.pod -t/porting/pod_rules.t Check that various pod lists are consistent -t/porting/podcheck.t Test the POD of shipped modules is well formed -t/porting/re_context.t Check assumptions made by save_re_context() -t/porting/readme.t Check that all files in Porting/ are mentioned in Porting/README.pod -t/porting/regen.t Check that regen.pl doesn't need running -t/porting/ss_dup.t Check that sv.c:ss_dup handles everything -t/porting/test_bootstrap.t Test that the instructions for test bootstrapping aren't accidentally overlooked. -t/porting/utils.t Check that utility scripts still compile t/re/alpha_assertions.t See if things like '(*postive_lookahed:...) work properly t/re/anyof.t See if bracketed char classes [...] compile properly t/re/begin-once.t Checking that /o freeze a variable in a RegExp diff --git a/MANIFEST.SKIP b/MANIFEST.SKIP index 62e0f88f57..374cc4e315 100644 --- a/MANIFEST.SKIP +++ b/MANIFEST.SKIP @@ -4,3 +4,6 @@ .git_patch ^.git\b ^.github/ + +^t/porting/ +^Porting/

Now lets see what would happen if run make test_porting in the git checkout:

cd t && ./perl -Ilib -I. harness porting/*.t ../lib/diagnostics.t porting/FindExt.t ......... ok porting/args_assert.t ..... ok porting/authors.t ......... ok porting/bench.t ........... ok porting/bench_selftest.t .. ok porting/bincompat.t ....... ok porting/checkcase.t ....... ok porting/checkcfgvar.t ..... [skipping not-expected 'Porting/config.sh'] porting/checkcfgvar.t ..... Failed 2/18 subtests (less 1 skipped subtest: 15 okay) porting/cmp_version.t ..... ok porting/copyright.t ....... ok porting/corelist.t ........ ok porting/customized.t ...... ok porting/diag.t ............ ok porting/dual-life.t ....... ok porting/exec-bit.t ........ 1/? # Failed test 71 - Everything in Porting/exec-bit.txt has +x in repo at porting/exec-bit.t line 66

Files missing exec bit:

../Porting/Maintainers.pl

../Porting/add-package.pl

../Porting/bench.pl

../Porting/bisect-example.sh

../Porting/bisect-runner.pl

../Porting/bisect.pl

../Porting/check83.pl

../Porting/checkAUTHORS.pl

../Porting/checkURL.pl

../Porting/checkVERSION.pl

../Porting/checkansi.pl

../Porting/checkcfguse.pl

../Porting/checkcfgvar.pl

../Porting/checkpodencoding.pl

../Porting/cmpVERSION.pl

../Porting/config_h.pl

../Porting/corecpan.pl

../Porting/corelist-perldelta.pl

../Porting/corelist.pl

../Porting/expand-macro.pl

../Porting/findrfuncs

../Porting/harness-timer-report.pl

../Porting/make_dot_patch.pl

../Porting/make_snapshot.pl

../Porting/makerel

../Porting/mksample

../Porting/newtests-perldelta.pl

../Porting/perlhist_calculate.pl

../Porting/sort_perldiag.pl

../Porting/sync-with-cpan

../Porting/updateAUTHORS.pl

../Porting/valgrindpp.pl

porting/exec-bit.t ........ Failed 1/71 subtests porting/extrefs.t ......... ok porting/filenames.t ....... ok porting/globvar.t ......... ok porting/libperl.t ......... ok porting/maintainers.t ..... ok porting/manifest.t ........ ok porting/pending-author.t .. ok porting/perlfunc.t ........ ok porting/pod_rules.t ....... ok porting/podcheck.t ........ ok porting/re_context.t ...... ok porting/readme.t .......... Can't get contents of Porting/ directory. porting/readme.t .......... Dubious, test returned 255 (wstat 65280, 0xff00) No subtests run porting/regen.t ........... ok porting/ss_dup.t .......... ok porting/test_bootstrap.t .. ok porting/utils.t ........... ok ../lib/diagnostics.t ...... ok

Test Summary Report

porting/checkcfgvar.t (Wstat: 0 Tests: 16 Failed: 0) Parse errors: Bad plan. You planned 18 tests but ran 16. porting/exec-bit.t (Wstat: 0 Tests: 71 Failed: 1) Failed test: 71 porting/readme.t (Wstat: 65280 Tests: 0 Failed: 0) Non-zero exit status: 255 Parse errors: No plan found in TAP output Files=32, Tests=43923, 113 wallclock secs ( 6.68 usr 0.13 sys + 110.74 cusr 3.80 csys = 121.35 CPU) Result: FAIL makefile:867: recipe for target 'test_porting' failed make: *** [test_porting] Error 1

So the outcome here would be that we would lose reasonably valuable data from MANIFEST about what those tests and Porting scripts do and are for, and we would have failing tests. I also think it would be moderately unsatisfactory in at least one other way: we have devs who do development work on platforms where we do not have git. Without a MANIFEST file that enumerates all the files they SHOULD have in the tarball they would use to develop from they would have no way to know if their tarball actually contained everything, which is one of the purposes of the existing MANIFEST file, we have tests and code that does exactly this check already. They might be able to validate it is the tarball that some tooling intended to create using a message digest, but there wouldn't be an automated way to tell if that tooling was producing the correct list without comparing the list to what git says, which would be problematic on a system that does not support git.

Now of course you can code up solutions for all the test failures and tooling breakage that come from this, and for instance replace the code that reads manifest with a File::Find call. In my PR I did a review of all the code that I could find that tries to open up MANIFEST and process it as a list of files and replace it with something else. But that work still needs to be done before this patch meets the intention of my patch. But even if you did that there is still the material loss of data about the files and their content and intent.

So I am -1 on this patch alone as a solution to the problem of excluding arbitrary files from the MANIFEST. For a limited set of git dotfiles sure it is fine, but as far as being a solution to the larger problem that started this discussion debate it alone is not. I also feel that losing the descriptions of the files, and the enumeration of them would be a backwards step, both are valuable, especially the description.

I could imagine us doing both, applying your patch and my patch with suitable adjustments, but this PR alone doesn't solve the problem we started out discussing and which mine tries to solve.

cheers, Yves

-- perl -Mre=debug -e "/just|another|perl|hacker/"

demerphq avatar Mar 12 '22 07:03 demerphq

I agree with @demerphq both PRs are solving two different problems.

  • #19523 simplifies the way we can easily exclude common noise
  • #19513 provides a mechanism to avoid shipping not necessary files as part of the tarball

To be honest I would like the best of the two. I'm not sure it's easily doable but definitively worth the try. As @demerphq just showed above moving an explicit list of files to some rules... is going to require some unit tests adjustments.

I do not think we should use File::Find for listing porting files but probably prefer a git ls-files with the manifest.skip

For example we could provide some shared helper to list all the porting files: get_porting_files, .... Then tools and test could rely on that to get the list of porting/* files This could be a no go if we need to check some generated files not git versioned. But I'm not sure such files exist.

##
## list of all files from MANIFEST and ignored by MANIFEST.SKIP
##

# note: need to temporary chdir to the root directory 
sub get_files_from_all_manifests {

    local $ExtUtils::Manifest::Quiet = 1;

    require ExtUtils::Manifest;

    my @from_manifest;# = keys %{ ExtUtils::Manifest::maniread("MANIFEST") };
    my @from_manifest_skip;

    my $skip = ExtUtils::Manifest::maniskip( "MANIFEST.SKIP" );

    my @ls_files = `git ls-files --full-name`;
    die q[Fail to run git ls-files] if $?;
    chomp(@ls_files);

    foreach my $f ( @ls_files ) {
        next unless $skip->($f);
        push @from_manifest_skip, $f;
    }

    my %uniq = map { $_ => 1 } @from_manifest, @from_manifest_skip;

    return ( sort keys %uniq );
}

##
## list of Porting files listed in MANIFEST or ignored by MANIFEST.SKIP
##

sub get_porting_files {
    return grep { qr{^Porting} } get_files_from_all_manifests();
}

Sum-up: Could we use MANIFEST.SKIP to exclude all the internal / tools and at the same time rely on it for our need to check the porting files (test & tools)? I'm not sure, but I think it worth the try. We would have cleaner syntax and easy way to exclude files. We could at the same time simplify code trying to extract content from the MANIFEST.

atoomic avatar Mar 12 '22 15:03 atoomic

Note: https://github.com/Perl/perl5/pull/19542 provides one suggestion to merge both ideas from https://github.com/Perl/perl5/pull/19513 and https://github.com/Perl/perl5/pull/19523

atoomic avatar Mar 17 '22 17:03 atoomic

What should be done about this PR?

khwilliamson avatar Jun 08 '22 15:06 khwilliamson

IMO we should close this PR, but I'm sure other persons can have mixed feelings.

atoomic avatar Jun 08 '22 16:06 atoomic

What should be done about this PR?

I think it's up to the PSC to decide what to do with this.

Leont avatar Jun 10 '22 15:06 Leont

What should be done about this PR?

I think it's up to the PSC to decide what to do with this.

This p.r. for adding a MANIFEST.SKIP file to the core distribution has languished since July and has a small merge conflict. It received one reviewer's Approval but then attracted dissents. @leont, if you want to move this forward, can you resolve the merge conflict? After that, I propose we give the PSC two weeks to make a ruling on this p.r. @rjbs, @book, @leonerd

jkeenan avatar Jan 17 '23 00:01 jkeenan