logstash icon indicating copy to clipboard operation
logstash copied to clipboard

Remove duplicate gems when producting logstash artifacts

Open donoghuc opened this issue 2 months ago • 20 comments

Release notes

Removal of duplicated gems in logstash artifacts.

What does this PR do?

Bundler is used to manage a gem environment that is shipped with logstash artifacts. By default, bundler will install newer/duplicate gems than shipped with ruby distributions (in logstash's case jruby). Duplicate gems in the shipped environment can cause issues with code loading with ambiguous gem specs or gem activation issues. This commit adds a step to compute the duplicate gems managed with bundler (and therefore direct/transitive dependencies of logstash/plugins) and removes copies shipped with jruby. Note that there are two locations to do the deduplication at. Both the stdlib gems as well as what jruby refers to as "bundled" gems. The existing pattern for excluding files from artifacts is used to implement the deduplication. Note that for the standard lib gems only remove duplicate gemspec files as removal of the code itself triggers noisy warning from ruby and code loading problems.

Why is it important/What is the impact to the user?

In some cases security scanners would pick up vendored/standard lib gems which typically trail in version shipped with the jruby distrubuted with logstash artifacts. While the newer code was loaded for logstash (and therefore not a practical threat) the scanner would still produce noise and require justifications. By removing old/duplicated gems we remove the false positives on the scanners.

How to test this PR locally

Build a container artifact and look for duplicated gems:

➜  logstash git:(deduplicate-gem-env) ✗ ARCH="aarch64" rake artifact:docker
Using system java: /Users/cas/.jenv/shims/java
Skipping bundler install...
Building logstash-core using gradle
./gradlew assemble
To honour the JVM settings for this build a single-use Daemon process will be forked. For more on this, please refer to https://docs.gradle.org/8.11.1/userguide/gradle_daemon.html#sec:disabling_the_daemon in the Gradle documentation.
Daemon will be stopped at the end of the build

> Task :downloadJRuby UP-TO-DATE
Download https://repo1.maven.org/maven2/org/jruby/jruby-dist/9.4.13.0/jruby-dist-9.4.13.0-bin.tar.gz

BUILD SUCCESSFUL in 4s
33 actionable tasks: 2 executed, 31 up-to-date
[plugin:install-default] Installing default plugins
Installing logstash-codec-avro, logstash-codec-cef, logstash-codec-collectd, logstash-codec-dots, logstash-codec-edn, logstash-codec-edn_lines, logstash-codec-es_bulk, logstash-codec-fluent, logstash-codec-graphite, logstash-codec-json, logstash-codec-json_lines, logstash-codec-line, logstash-codec-msgpack, logstash-codec-multiline, logstash-codec-netflow, logstash-codec-plain, logstash-codec-rubydebug, logstash-filter-aggregate, logstash-filter-anonymize, logstash-filter-cidr, logstash-filter-clone, logstash-filter-csv, logstash-filter-date, logstash-filter-de_dot, logstash-filter-dissect, logstash-filter-dns, logstash-filter-drop, logstash-filter-elastic_integration, logstash-filter-elasticsearch, logstash-filter-fingerprint, logstash-filter-geoip, logstash-filter-grok, logstash-filter-http, logstash-filter-json, logstash-filter-kv, logstash-filter-memcached, logstash-filter-metrics, logstash-filter-mutate, logstash-filter-prune, logstash-filter-ruby, logstash-filter-sleep, logstash-filter-split, logstash-filter-syslog_pri, logstash-filter-throttle, logstash-filter-translate, logstash-filter-truncate, logstash-filter-urldecode, logstash-filter-useragent, logstash-filter-uuid, logstash-filter-xml, logstash-input-azure_event_hubs, logstash-input-beats, logstash-input-couchdb_changes, logstash-input-dead_letter_queue, logstash-input-elasticsearch, logstash-input-exec, logstash-input-file, logstash-input-ganglia, logstash-input-gelf, logstash-input-generator, logstash-input-graphite, logstash-input-heartbeat, logstash-input-http, logstash-input-http_poller, logstash-input-jms, logstash-input-pipe, logstash-input-redis, logstash-input-stdin, logstash-input-syslog, logstash-input-tcp, logstash-input-twitter, logstash-input-udp, logstash-input-unix, logstash-input-elastic_serverless_forwarder, logstash-integration-jdbc, logstash-integration-kafka, logstash-integration-logstash, logstash-integration-rabbitmq, logstash-integration-snmp, logstash-integration-aws, logstash-output-csv, logstash-output-elasticsearch, logstash-output-email, logstash-output-file, logstash-output-graphite, logstash-output-http, logstash-output-lumberjack, logstash-output-nagios, logstash-output-null, logstash-output-pipe, logstash-output-redis, logstash-output-stdout, logstash-output-tcp, logstash-output-udp, logstash-output-webhdfs
Installation successful
[artifact:archives] Building tar.gz/zip of default plugins for OS: linux, arch: arm64
Adding duplicate gems to exclude path: base64, bigdecimal, cgi, date, ffi, fileutils, jar-dependencies, jruby-openssl, json, logger, net-http, net-imap, net-pop, net-protocol, net-smtp, psych, racc, rake, rexml, ruby2_keywords, timeout, uri
Full exclude_paths list:
 - **/*.gem
 - **/test/files/slow-xpath.xml
 - **/logstash-*/spec
 - bin/bundle
 - bin/rspec
 - bin/rspec.bat
 - vendor/**/gems/*/test/**/*
 - vendor/**/gems/*/spec/**/*
 - vendor/**/gems/**/Gemfile.lock
 - vendor/**/gems/**/Gemfile
 - vendor/jruby/lib/ruby/gems/shared/gems/jar-dependencies-*/**/*
 - vendor/jruby/lib/ruby/gems/shared/gems/jar-dependencies-*
 - vendor/jruby/lib/ruby/gems/shared/specifications/jar-dependencies-*.gemspec
 - vendor/jruby/lib/ruby/gems/shared/gems/net-imap-*/**/*
 - vendor/jruby/lib/ruby/gems/shared/gems/net-imap-*
 - vendor/jruby/lib/ruby/gems/shared/specifications/net-imap-*.gemspec
 - vendor/jruby/lib/ruby/gems/shared/gems/net-pop-*/**/*
 - vendor/jruby/lib/ruby/gems/shared/gems/net-pop-*
 - vendor/jruby/lib/ruby/gems/shared/specifications/net-pop-*.gemspec
 - vendor/jruby/lib/ruby/gems/shared/gems/net-smtp-*/**/*
 - vendor/jruby/lib/ruby/gems/shared/gems/net-smtp-*
 - vendor/jruby/lib/ruby/gems/shared/specifications/net-smtp-*.gemspec
 - vendor/jruby/lib/ruby/gems/shared/gems/racc-*/**/*
 - vendor/jruby/lib/ruby/gems/shared/gems/racc-*
 - vendor/jruby/lib/ruby/gems/shared/specifications/racc-*.gemspec
 - vendor/jruby/lib/ruby/gems/shared/gems/rake-*/**/*
 - vendor/jruby/lib/ruby/gems/shared/gems/rake-*
 - vendor/jruby/lib/ruby/gems/shared/specifications/rake-*.gemspec
 - vendor/jruby/lib/ruby/gems/shared/gems/rexml-*/**/*
 - vendor/jruby/lib/ruby/gems/shared/gems/rexml-*
 - vendor/jruby/lib/ruby/gems/shared/specifications/rexml-*.gemspec
 - vendor/jruby/lib/ruby/gems/shared/specifications/default/base64-*.gemspec
 - vendor/jruby/lib/ruby/stdlib/base64.rb
 - vendor/jruby/lib/ruby/stdlib/base64/**/*
 - vendor/jruby/lib/ruby/stdlib/base64
 - vendor/jruby/lib/ruby/gems/shared/specifications/default/bigdecimal-*.gemspec
 - vendor/jruby/lib/ruby/stdlib/bigdecimal.rb
 - vendor/jruby/lib/ruby/stdlib/bigdecimal/**/*
 - vendor/jruby/lib/ruby/stdlib/bigdecimal
 - vendor/jruby/lib/ruby/gems/shared/specifications/default/cgi-*.gemspec
 - vendor/jruby/lib/ruby/stdlib/cgi.rb
 - vendor/jruby/lib/ruby/stdlib/cgi/**/*
 - vendor/jruby/lib/ruby/stdlib/cgi
 - vendor/jruby/lib/ruby/gems/shared/specifications/default/date-*.gemspec
 - vendor/jruby/lib/ruby/stdlib/date.rb
 - vendor/jruby/lib/ruby/stdlib/date/**/*
 - vendor/jruby/lib/ruby/stdlib/date
 - vendor/jruby/lib/ruby/gems/shared/specifications/default/ffi-*.gemspec
 - vendor/jruby/lib/ruby/stdlib/ffi.rb
 - vendor/jruby/lib/ruby/stdlib/ffi/**/*
 - vendor/jruby/lib/ruby/stdlib/ffi
 - vendor/jruby/lib/ruby/gems/shared/specifications/default/fileutils-*.gemspec
 - vendor/jruby/lib/ruby/stdlib/fileutils.rb
 - vendor/jruby/lib/ruby/stdlib/fileutils/**/*
 - vendor/jruby/lib/ruby/stdlib/fileutils
 - vendor/jruby/lib/ruby/gems/shared/specifications/default/jar-dependencies-*.gemspec
 - vendor/jruby/lib/ruby/stdlib/jar-dependencies.rb
 - vendor/jruby/lib/ruby/stdlib/jar-dependencies/**/*
 - vendor/jruby/lib/ruby/stdlib/jar-dependencies
 - vendor/jruby/lib/ruby/gems/shared/specifications/default/jruby-openssl-*.gemspec
 - vendor/jruby/lib/ruby/stdlib/jruby-openssl.rb
 - vendor/jruby/lib/ruby/stdlib/jruby-openssl/**/*
 - vendor/jruby/lib/ruby/stdlib/jruby-openssl
 - vendor/jruby/lib/ruby/gems/shared/specifications/default/json-*.gemspec
 - vendor/jruby/lib/ruby/stdlib/json.rb
 - vendor/jruby/lib/ruby/stdlib/json/**/*
 - vendor/jruby/lib/ruby/stdlib/json
 - vendor/jruby/lib/ruby/gems/shared/specifications/default/logger-*.gemspec
 - vendor/jruby/lib/ruby/stdlib/logger.rb
 - vendor/jruby/lib/ruby/stdlib/logger/**/*
 - vendor/jruby/lib/ruby/stdlib/logger
 - vendor/jruby/lib/ruby/gems/shared/specifications/default/net-http-*.gemspec
 - vendor/jruby/lib/ruby/stdlib/net-http.rb
 - vendor/jruby/lib/ruby/stdlib/net-http/**/*
 - vendor/jruby/lib/ruby/stdlib/net-http
 - vendor/jruby/lib/ruby/gems/shared/specifications/default/net-protocol-*.gemspec
 - vendor/jruby/lib/ruby/stdlib/net-protocol.rb
 - vendor/jruby/lib/ruby/stdlib/net-protocol/**/*
 - vendor/jruby/lib/ruby/stdlib/net-protocol
 - vendor/jruby/lib/ruby/gems/shared/specifications/default/psych-*.gemspec
 - vendor/jruby/lib/ruby/stdlib/psych.rb
 - vendor/jruby/lib/ruby/stdlib/psych/**/*
 - vendor/jruby/lib/ruby/stdlib/psych
 - vendor/jruby/lib/ruby/gems/shared/specifications/default/racc-*.gemspec
 - vendor/jruby/lib/ruby/stdlib/racc.rb
 - vendor/jruby/lib/ruby/stdlib/racc/**/*
 - vendor/jruby/lib/ruby/stdlib/racc
 - vendor/jruby/lib/ruby/gems/shared/specifications/default/ruby2_keywords-*.gemspec
 - vendor/jruby/lib/ruby/stdlib/ruby2_keywords.rb
 - vendor/jruby/lib/ruby/stdlib/ruby2_keywords/**/*
 - vendor/jruby/lib/ruby/stdlib/ruby2_keywords
 - vendor/jruby/lib/ruby/gems/shared/specifications/default/timeout-*.gemspec
 - vendor/jruby/lib/ruby/stdlib/timeout.rb
 - vendor/jruby/lib/ruby/stdlib/timeout/**/*
 - vendor/jruby/lib/ruby/stdlib/timeout
 - vendor/jruby/lib/ruby/gems/shared/specifications/default/uri-*.gemspec
 - vendor/jruby/lib/ruby/stdlib/uri.rb
 - vendor/jruby/lib/ruby/stdlib/uri/**/*
 - vendor/jruby/lib/ruby/stdlib/uri
[artifact:tar] building build/logstash-9.3.0-SNAPSHOT-linux-aarch64.tar.gz
[docker] Building docker image
➜  logstash git:(deduplicate-gem-env) ✗ docker image ls
REPOSITORY                                 TAG              IMAGE ID       CREATED          SIZE
docker.elastic.co/logstash/logstash-full   9.3.0-SNAPSHOT   fa5a1591bf02   54 seconds ago   1.48GB
docker.elastic.co/logstash/logstash        9.3.0-SNAPSHOT   fa5a1591bf02   54 seconds ago   1.48GB
python                                     3                671d8548cfc6   2 weeks ago      1.61GB
➜  logstash git:(deduplicate-gem-env) ✗ docker run -it fa5a1591bf02 /bin/bash
bash-5.1$ find / -name *rexml*
/usr/share/logstash/vendor/bundle/jruby/3.1.0/gems/rexml-3.4.4
/usr/share/logstash/vendor/bundle/jruby/3.1.0/gems/rexml-3.4.4/doc/rexml
/usr/share/logstash/vendor/bundle/jruby/3.1.0/gems/rexml-3.4.4/lib/rexml
/usr/share/logstash/vendor/bundle/jruby/3.1.0/gems/rexml-3.4.4/lib/rexml/rexml.rb
/usr/share/logstash/vendor/bundle/jruby/3.1.0/gems/rexml-3.4.4/lib/rexml.rb
/usr/share/logstash/vendor/bundle/jruby/3.1.0/gems/logstash-filter-xml-4.3.2/lib/logstash/filters/xml/patch_rexml.rb
/usr/share/logstash/vendor/bundle/jruby/3.1.0/gems/aws-sdk-core-3.234.0/lib/aws-sdk-core/xml/parser/rexml_engine.rb
/usr/share/logstash/vendor/bundle/jruby/3.1.0/specifications/rexml-3.4.4.gemspec
/usr/share/logstash/vendor/jruby/lib/ruby/gems/shared/gems/rss-0.2.9/lib/rss/rexmlparser.rb
find: ‘/root’: Permission denied
find: ‘/var/cache/ldconfig’: Permission denied
find: ‘/proc/tty/driver’: Permission denied
bash-5.1$ find / -name *uri*
/sys/kernel/security
/sys/module/spurious
/usr/lib64/security
/usr/share/logstash/vendor/bundle/jruby/3.1.0/gems/twitter-6.2.0/lib/twitter/entity/uri.rb
/usr/share/logstash/vendor/bundle/jruby/3.1.0/gems/rexml-3.4.4/lib/rexml/security.rb
/usr/share/logstash/vendor/bundle/jruby/3.1.0/gems/http-cookie-1.1.0/lib/http/cookie/uri_parser.rb
/usr/share/logstash/vendor/bundle/jruby/3.1.0/gems/addressable-2.8.7/lib/addressable/uri.rb
/usr/share/logstash/vendor/bundle/jruby/3.1.0/gems/sequel-5.97.0/lib/sequel/plugins/blacklist_security.rb
/usr/share/logstash/vendor/bundle/jruby/3.1.0/gems/sequel-5.97.0/lib/sequel/plugins/whitelist_security.rb
/usr/share/logstash/vendor/bundle/jruby/3.1.0/gems/tzinfo-data-1.2025.2/lib/tzinfo/data/definitions/Indian/Mauritius.rb
/usr/share/logstash/vendor/bundle/jruby/3.1.0/gems/tzinfo-data-1.2025.2/lib/tzinfo/data/definitions/Europe/Zurich.rb
/usr/share/logstash/vendor/bundle/jruby/3.1.0/gems/uri-1.0.4
/usr/share/logstash/vendor/bundle/jruby/3.1.0/gems/uri-1.0.4/lib/uri
/usr/share/logstash/vendor/bundle/jruby/3.1.0/gems/uri-1.0.4/lib/uri.rb
/usr/share/logstash/vendor/bundle/jruby/3.1.0/gems/rack-session-2.1.1/security.md
/usr/share/logstash/vendor/bundle/jruby/3.1.0/gems/rack-protection-4.2.1/lib/rack/protection/content_security_policy.rb
/usr/share/logstash/vendor/bundle/jruby/3.1.0/gems/http-3.3.0/lib/http/uri.rb
/usr/share/logstash/vendor/bundle/jruby/3.1.0/gems/elasticsearch-api-8.19.1/lib/elasticsearch/api/actions/security
/usr/share/logstash/vendor/bundle/jruby/3.1.0/gems/elasticsearch-api-8.19.1/lib/elasticsearch/api/namespace/security.rb
/usr/share/logstash/vendor/bundle/jruby/3.1.0/gems/mustermann-3.0.4/bench/uri_parser_object.rb
/usr/share/logstash/vendor/bundle/jruby/3.1.0/gems/mustermann-3.0.4/bench/capturing.rb
/usr/share/logstash/vendor/bundle/jruby/3.1.0/gems/nio4r-2.7.4-java/ext/libev/ev_iouring.c
/usr/share/logstash/vendor/bundle/jruby/3.1.0/specifications/uri-1.0.4.gemspec
/usr/share/logstash/vendor/jruby/lib/ruby/gems/shared/specifications/default/open-uri-0.3.0.gemspec
/usr/share/logstash/vendor/jruby/lib/ruby/stdlib/rubygems/vendor/uri
/usr/share/logstash/vendor/jruby/lib/ruby/stdlib/rubygems/vendor/uri/lib/uri
/usr/share/logstash/vendor/jruby/lib/ruby/stdlib/rubygems/vendor/uri/lib/uri.rb
/usr/share/logstash/vendor/jruby/lib/ruby/stdlib/rubygems/vendor/optparse/lib/optparse/uri.rb
/usr/share/logstash/vendor/jruby/lib/ruby/stdlib/rubygems/security.rb
/usr/share/logstash/vendor/jruby/lib/ruby/stdlib/rubygems/security
/usr/share/logstash/vendor/jruby/lib/ruby/stdlib/rubygems/security_option.rb
/usr/share/logstash/vendor/jruby/lib/ruby/stdlib/rubygems/s3_uri_signer.rb
/usr/share/logstash/vendor/jruby/lib/ruby/stdlib/rubygems/uri.rb
/usr/share/logstash/vendor/jruby/lib/ruby/stdlib/rubygems/uri_formatter.rb
/usr/share/logstash/vendor/jruby/lib/ruby/stdlib/bundler/vendor/uri
/usr/share/logstash/vendor/jruby/lib/ruby/stdlib/bundler/vendor/uri/lib/uri
/usr/share/logstash/vendor/jruby/lib/ruby/stdlib/bundler/vendor/uri/lib/uri.rb
/usr/share/logstash/vendor/jruby/lib/ruby/stdlib/bundler/vendored_uri.rb
/usr/share/logstash/vendor/jruby/lib/ruby/stdlib/bundler/uri_credentials_filter.rb
/usr/share/logstash/vendor/jruby/lib/ruby/stdlib/bundler/uri_normalizer.rb
/usr/share/logstash/vendor/jruby/lib/ruby/stdlib/open-uri.rb
/usr/share/logstash/vendor/jruby/lib/ruby/stdlib/optparse/uri.rb
/usr/share/logstash/lib/pluginmanager/pack_fetch_strategy/uri.rb
/usr/share/logstash/logstash-core/lib/logstash/util/safe_uri.rb

Related issues

  • https://github.com/elastic/logstash/issues/17873

donoghuc avatar Oct 23 '25 00:10 donoghuc

:robot: GitHub comments

Expand to view the GitHub comments

Just comment with:

  • run docs-build : Re-trigger the docs validation. (use unformatted text in the comment!)

github-actions[bot] avatar Oct 23 '25 00:10 github-actions[bot]

This pull request does not have a backport label. Could you fix it @donoghuc? 🙏 To fixup this pull request, you need to add the backport labels for the needed branches, such as:

  • backport-8./d is the label to automatically backport to the 8./d branch. /d is the digit.
  • If no backport is necessary, please add the backport-skip label

mergify[bot] avatar Oct 23 '25 00:10 mergify[bot]

Exhaustive tests https://buildkite.com/elastic/logstash-exhaustive-tests-pipeline/builds/2745

donoghuc avatar Oct 23 '25 00:10 donoghuc

Updated exhaustive tests: https://buildkite.com/elastic/logstash-exhaustive-tests-pipeline/builds/2757

donoghuc avatar Oct 23 '25 22:10 donoghuc

Moving back to draft form. Need to track down a gem loading issue. Somehow removal of psych is breaking at least the plugin manager. Need to trace where the GEM_HOME/GEM_PATH are getting set to point to the bundled gems.

donoghuc avatar Oct 23 '25 23:10 donoghuc

After further investigation it seems that removing stdlib gems is going to be more trouble than its worth. Digging in to the example failures we see a case where logstash code does something like require 'yaml'. This returns true because something in the ruby internals already assumes psych is loaded. Logstash then blows up with unitialized constant errors. If we manually activate psych we still get a warning emitted from ruby itself saying it thinks we've deleted a standard gem.

WIth that in mind I did validate that when we do have a bundled gem that is the code that is loaded/used during logstash exectuion (this sounds obvious, but wanted to double check).

I think we can safely remove the duplicate "bundled" gems still, but not move forward with the removal of the standard lib gems. Practically, I imagine that CVEs in the standard lib gems wont last too long as they are shipped with the interpreter. We still have the ability to mitigate by shipping newer versions in the lag time between being able to take up latest jruby.

I am curious in this comment https://github.com/elastic/logstash/issues/17873#issuecomment-3139425107 @jsvd were you indicating to remove just the gemspecs from the stdlib location?

donoghuc avatar Oct 24 '25 20:10 donoghuc

Exhaustive tests: https://buildkite.com/elastic/logstash-exhaustive-tests-pipeline/builds/2797

donoghuc avatar Oct 27 '25 22:10 donoghuc

@donoghuc after some tests, I agree that we can't delete all of stdlib and there will have to be a compromise between just deleting gemspecs and some ruby deletions as well.

Testing locally I got to this diff of rubyUtils.gradle https://gist.github.com/jsvd/4869daf70ea740f6a9f24eaba9b55a62

With it it is possible to run ./gradlew installDefaultGems and ruby tests pass.

Uncommenting excludes of **/lib/ruby/stdlib starts breaking down Logstash.

The benefit of tackling this issue at the rubyUtils.gradle vs artifacts.rake is that the code and gemspecs are excluded as soon as possible, vs only. excluding for packaging.

jsvd avatar Oct 30 '25 16:10 jsvd

Thanks for looking. I see the benefit. I will see if we can dynamically compute in rubyUtils.gradle the paths to exclude like i've done in the artifacts.rake. probably maintaining a static list will be too hard.

donoghuc avatar Oct 30 '25 17:10 donoghuc

@jsvd how would you feel about moving the deduplication logic to the installDefaultGems task (which calls in to the plugin:install-default rake task). The reason I would like to do it there is that this resolves the full gem set for logstash and all its bundled plugins. The reason this is important is that at this point we will have the full ruby gem env and we can do duplicate analysis (compute the duplicated gems and remove those that are safe to). As far as testing, this would pair nicely with https://github.com/elastic/logstash/pull/18330 which aims to ensure the installDefaultGems task is called as a dependency before any test suite is run. I think ideally we would programatically do the duplicate detection which will allow us to not have to manage a static list anywhere. I think that fundamentally adds a dependency of generating the full gem env so we can get a complete list of duplicates.

Let me know if you think that is acceptable.

donoghuc avatar Oct 30 '25 22:10 donoghuc

I am not opposed to doing it at that stage, my rationale for doing it asap was to ensure that anything on top of the vendored jruby HAD to live off what was available and there was no confusion between what was read from vendored jruby vs vendored gems. that said, we know that much of stdlib can't be removed at all because bundler needs it.

So we can aim at installDefaultGems-time then, and 100% on not having a static list, my rubyUtils.gradle gist was just a way to validate the concept, even if it had to be hardcoded, we'd have to find a way to auto generate that later.

jsvd avatar Oct 31 '25 17:10 jsvd

/run exhaustive tests

donoghuc avatar Nov 03 '25 22:11 donoghuc

run exhaustive tests

donoghuc avatar Nov 03 '25 22:11 donoghuc

I think that the failing fips unit tests are due to the container running installDefault gems but then that not being an explicit depednecy on running the unit tests. This related PR should solve that: https://github.com/elastic/logstash/pull/18330/files i'll confirm, but if so i will likely fold that in to this PR.

donoghuc avatar Nov 03 '25 23:11 donoghuc

The tests are revealing some real issues. I would have thought that this commit https://github.com/elastic/logstash/pull/18340/commits/527b84fd93b1eb47d4f39e9405f0ee394e12cf08 would fix the genInstaller tests. I assumed that using a separate gem_home there was the source of loading the wrong psych. However that does not appear to be the case! I will have to dig deeper in to what is going on. I do think regardless that the commit there is an improvement and related to this PR. I'm open to splitting it out though if requested.

donoghuc avatar Nov 05 '25 00:11 donoghuc

@donoghuc after a ridiculous amount of time digging into rubygems and bundler, I can see why prefer-local is giving us problems, looking at the changelog for 2.6.4 (JRuby 9.4.13.0 ships with 2.6.3):

2.6.4 (February 17, 2025)

  • https://github.com/rubygems/rubygems/pull/8484
  • https://github.com/rubygems/rubygems/pull/8412

We can see in this simple example how bundler 2.6.3 still pulls json, while 2.6.4 uses the one with JRuby: https://gist.github.com/jsvd/74b703967cb89c35dd49a359fc1a5e82

jsvd avatar Nov 27 '25 23:11 jsvd

I ended up with a rather complicated changeset: https://github.com/elastic/logstash/commit/00ee944e6a50fcaca5a8ca7a311be66dfb36de5a

It reduces the total gemspec to 479, from 485 of our PR18340~1 (i.e. without prefer-local) and from 513 in main.

The downside is that we default to using all that comes with JRuby, so typically older gems.

jsvd avatar Nov 28 '25 18:11 jsvd

Nice find! That is very interesting. Glad its been addressed already upstream...

The downside is that we default to using all that comes with JRuby, so typically older gems.

I'm not actually convinced that is a "downside" I think that friction (us having to explicitly manage gems we need to ship newer versions of in our gemspec/Gemfile) will help us make sure we are 1. staying up to date on jruby and 2. making sure to check that jruby is shipping newer gems.

I think ideally we would keep the logic to use --prefer-local AND the logic to clean up duplicated gems (the ones resulting in the case where jruby includes an old version that we explicitly need a newer one).

How we achieve the former is a bit of an open question in my mind. Would you want us to take up the patches you included in your exploritory commit? Or would you suggest we wait until we get a jruby version that includes those? Or something else entirely?

donoghuc avatar Dec 01 '25 23:12 donoghuc

Based on an in person discussion with @jsvd We will work on adding the --prefer-local optimization once we get to a newer jruby. For now, in the main branch we will move forward with the deduplication strategy we have worked out so far. The progress on --prefer-local is captured with https://github.com/elastic/logstash/pull/18451 when we are ready to take that on in the future. I have reverted https://github.com/elastic/logstash/pull/18340/commits/fb0d726a2a79f64a96dbf82b4e88af2d67cd55a2 to move this along :)

donoghuc avatar Dec 03 '25 00:12 donoghuc

:green_heart: Build Succeeded

History

  • :broken_heart: Build #3936 failed 334a24413aa96ba87993e8da53072ea636460d21
  • :green_heart: Build #3906 succeeded fb0d726a2a79f64a96dbf82b4e88af2d67cd55a2
  • :broken_heart: Build #3887 failed 4ee007550123349c92523327cc634c627d1ab41d
  • :green_heart: Build #3843 succeeded 336135a62ef3290fa97a2b2c58303b2ee2ddd4c6
  • :broken_heart: Build #3838 failed 527b84fd93b1eb47d4f39e9405f0ee394e12cf08

elasticmachine avatar Dec 10 '25 02:12 elasticmachine

@mergifyio backport 8.19 9.1 9.2

github-actions[bot] avatar Dec 10 '25 18:12 github-actions[bot]