Remove tests files and other config-related files from the gem packages
Current rubygems guidelines suggest to only include runtime files as a best practice: https://github.com/rubygems/guides/issues/90. Additionally, the test-files contain insecure Gemfile.lock files which can sometimes trigger other vulnerability scanners when they scan bundler-audit.
In this MR I propose to remove the following:
- files starting with
.in the root of the repository (includes GitHub workflows and issue templates, RSpec configuration, Yard settings) Gemfileas it is only needed for development and not used in runtimegemspec.ymlwhich is used to construct gemspec and not used after thatspec/includes test files and fixtures
To consider:
Rakefilereferencesrspecandyardwhich are development-only dependencies.bundler-audit.gemspecsince it is only used in development to produce metadata, added to the gem package as a YAML-serializedGem::Specification- ~~
ChangeLog.mdprobably is not looked at ever and only takes space.~~
Prior art
Since RuboCop added test_files to deprecated gemspec attributes, the directive was removed in https://github.com/rubysec/bundler-audit/pull/370 (see discussion in https://github.com/rubocop/rubocop/issues/10675).
File list difference
To test, I built the gem with rake build and compared the listing before and after the change:
tar -xOzf pkg/bundler-audit-0.9.2.gem data.tar.gz | tar -tzf -
Full diff:
--- before.txt 2024-09-12 06:48:59
+++ after.txt 2024-09-12 06:48:38
@@ -1,21 +1,10 @@
-.document
-.github/FUNDING.yml
-.github/ISSUE_TEMPLATE/bug-report.md
-.github/ISSUE_TEMPLATE/feature-request.md
-.github/workflows/ruby.yml
-.gitignore
-.rspec
-.rubocop.yml
-.yardopts
COPYING.txt
ChangeLog.md
-Gemfile
README.md
Rakefile
bin/bundle-audit
bin/bundler-audit
bundler-audit.gemspec
-gemspec.yml
lib/bundler/audit.rb
lib/bundler/audit/advisory.rb
lib/bundler/audit/cli.rb
@@ -34,37 +23,3 @@
lib/bundler/audit/scanner.rb
lib/bundler/audit/task.rb
lib/bundler/audit/version.rb
-spec/advisory_spec.rb
-spec/audit_spec.rb
-spec/bundle/insecure_sources/Gemfile
-spec/bundle/insecure_sources/Gemfile.lock
-spec/bundle/secure/Gemfile
-spec/bundle/secure/Gemfile.lock
-spec/bundle/unpatched_gems/Gemfile
-spec/bundle/unpatched_gems/Gemfile.lock
-spec/bundle/unpatched_gems_with_dot_configuration/.bundler-audit.yml
-spec/bundle/unpatched_gems_with_dot_configuration/Gemfile
-spec/bundle/unpatched_gems_with_dot_configuration/Gemfile.lock
-spec/cli/formats/json_spec.rb
-spec/cli/formats/junit_spec.rb
-spec/cli/formats/text_spec.rb
-spec/cli/formats_spec.rb
-spec/cli_spec.rb
-spec/configuration_spec.rb
-spec/database_spec.rb
-spec/fixtures/advisory/CVE-2020-1234.yml
-spec/fixtures/advisory/not_a_hash.yml
-spec/fixtures/config/bad/empty.yml
-spec/fixtures/config/bad/ignore_contains_a_non_string.yml
-spec/fixtures/config/bad/ignore_is_not_an_array.yml
-spec/fixtures/config/valid.yml
-spec/fixtures/lib/bundler/audit/cli/formats/bad.rb
-spec/fixtures/lib/bundler/audit/cli/formats/good.rb
-spec/integration_spec.rb
-spec/report_spec.rb
-spec/results/insecure_source_spec.rb
-spec/results/result_spec.rb
-spec/results/unpatched_gem_spec.rb
-spec/scanner_spec.rb
-spec/spec_helper.rb
-spec/task_spec.rb
Closes #361
Some discussion about this with @postmodern https://ruby.social/@postmodern/112890522679233029
Just realized that excluding gemspec.yml and keeping gemspec might not have too much sense, since the gemspec depends on the yml file (and not used in runtime), so probably might be excluded as well.
Some examples in the wild (see Rubygems top 10 https://rubygems.org/stats):
- activerecord includes changelog, license, readme, examples, and code
- bundler includes changelog, license, readme, code, and gemspec (latter with a comment "include the gemspec itself because warbler breaks w/o it")
- I am curious how warbler is affected by gemspec, given most of the gems do not include it
- aws-sdk-core includes license, changelog, version, code, RBS signatures, and a cert bundle
- rack bundles license, readme, specification, and code, and interesting, includes readme, changelog, and contribution guide in extra_rdoc_files
- diff-lcs is the one that does not follow pattern. It uses hoe to generate gemspec (which I admit, haven't seen in the last decade). It includes most of the files, including specs. It does not though include the gemspec file