Clean up object files in gems with extensions
I've just noticed that gems with compiled extensions use a lot of disk space, most of which is object files that, as far as I understand, are not needed at runtime.
For example, sassc-2.4.0 uses 114.3 MiB on my Linux computer, of which the extension code is 110.9 MiB, but the important files (as far as I understand) are just libsass, at 1.7 MB, and libsass.so, at 3.2 MB. That is about 108 MB of extra space, which can also slow down deployments, bundle generation, and CI runs that cache gems, or simply waste space on developer machines that duplicate gems across environments (a per-project GEM_HOME).
Here are my current environment details:
$ gem env version
3.0.3
What are your thoughts on this?
I've just run find $GEM_HOME -iname '*.o' -print0 | xargs -0 rm as a test; nothing seems to have broken, and it saved about 150 MB of space on a tiny Rails project.
I think this is not really a RubyGems problem; it should be handled by the compiled-extension logic on the gem side.
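To measure the saving before deleting anything, the same search can be sketched in Ruby. This is only an illustration: the helper names object_files and clean_object_files are made up for this example, and it assumes that every file ending in .o under the given directory is a disposable build artifact.

```ruby
require "find"

# Collect every regular file under root whose extension is .o
# (case-insensitive, like find -iname '*.o').
def object_files(root)
  Find.find(root).select do |path|
    File.file?(path) && File.extname(path).casecmp?(".o")
  end
end

# Returns [file_count, total_bytes]. Files are deleted only when
# dry_run is false, so the default call is a harmless report.
def clean_object_files(root, dry_run: true)
  files = object_files(root)
  bytes = files.sum { |path| File.size(path) }
  files.each { |path| File.delete(path) } unless dry_run
  [files.size, bytes]
end
```

Calling clean_object_files(ENV.fetch("GEM_HOME")) reports how much space the object files take; passing dry_run: false removes them.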
Is https://github.com/sass/sassc-ruby/issues/200 the same problem?
I'm not sure. The uploaded gem (and the downloaded gem) is pretty small because it does not include the compiled code; the code is compiled by RubyGems during installation. I don't know the details of the compilation procedure, but it is invoked by RubyGems.
I think that RubyGems could add a cleanup step to that procedure (instead of making every gem out there implement it). Maybe that is not possible; in that case, the responsibility should be better documented, and maybe a hook could be provided for gem developers to implement it (I could not find any documentation about this).
As far as I understand, the only hook that gem developers can use to clean up object files is the install make task in the Makefile that is bundled in the gem (or generated by mkmf).
On the other hand, I think that the sassc case just adds visibility to the issue; it happens with every gem that has extensions.
If that cleanup step cannot be implemented by RubyGems, it might be possible to:
- Add a list of generated files (and clean up the rest), as there are already many lists of files in the gemspec file.
- Implement that cleanup in mkmf and related gems, so that gems that use them get that cleaning.
Not sure where the code implementing this should live, but automatically cleaning up generated .o files after installation sounds like a good idea to me.
@deivid-rodriguez in theory it could be possible to add another clean at the end here: https://github.com/rubygems/rubygems/blob/3e0641a6c8eff4d9a5d26856b1ac7417168b4cc0/lib/rubygems/ext/builder.rb#L42. But the impact of this change is unknown to me.
Running make clean might also remove the important files (sassc and sassc.so in our example).
@eloyesp but those should already be copied to the proper place during install, AFAIK.
[retro@retro ext (master=)]❤ make install
/usr/bin/install -c -m 0755 libsass.so /home/retro/.rubies/ruby-2.5.7/lib/ruby/site_ruby/2.5.0/x86_64-linux/sassc
Just ran into the same sass issue, and after tracing the installation procedure all the way through gem install ended up here. In my experience, it is generally fine to clean or even distclean the source directories after the binary program/library has been built and installed.
I don't foresee any clear bad consequences from doing this, so I'd say feel free to propose a PR if interested.
It seems that calling clean does not break anything, but I've only tested it locally with a Rails app, modifying that single line. The gain was about 90 MB.
- ['clean', '', 'install'].each do |target|
+ ['clean', '', 'install', 'clean'].each do |target|
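The ordering in that one-line change can be sketched in isolation. This is not the RubyGems implementation: run_make_targets and its runner parameter are hypothetical names introduced here, and the sketch assumes that a failing target should abort the build.

```ruby
# Targets in the proposed order: the trailing "clean" runs only after
# "install" has copied the built library out of the build directory.
TARGETS = ["clean", "", "install", "clean"].freeze

# Hypothetical helper: builds the make command line for each target and
# passes it to runner, raising on the first command that fails.
# The empty target stands for a plain `make` (the default target).
def run_make_targets(make_program: "make", targets: TARGETS,
                     runner: ->(cmd) { system(*cmd) })
  targets.each do |target|
    cmd = [make_program, target].reject(&:empty?)
    raise "#{cmd.join(' ')} failed" unless runner.call(cmd)
  end
end
```

Injecting the runner keeps the sketch testable without a Makefile: passing a lambda that records each command shows the sequence make clean, make, make install, make clean.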
Yeah, I have seen a similar gain: yesterday I added a find /usr/local/bundle/gems -name '*.o' -delete to the Dockerfile of an application, and the resulting image size went down by ~250 MB.
Going to add a +1 on this. The size of these artifacts can become significant (one Rust gem we have has 1 GB+ of artifacts).
Ideally we would perform a distclean
I was looking into https://github.com/mastodon/mastodon/issues/21151 and ran into this issue. I was looking at some other apps I have worked on, and they all do a variation of find /usr/local/bundle/gems -name '*.o' -delete.
I think it would be pretty beneficial to have a way to clean these. Maybe it could be an opt-in option, and bundle install --deployment could enable it by default?
What about building this option into mkmf somehow? That way the gem author would be able to decide on post-install actions.
Ended up doing something similar to what @simi suggests for rb_sys/mkmf — https://github.com/oxidize-rb/rb-sys/blob/49f89667f3074f80381d8cd95cb0d8baec69ba00/gem/lib/rb_sys/mkmf.rb#L136
An alternative here would be to support some type of command for Bundler: like bundle autoclean. This would cover the most common use case of removing bloat for Docker images, CI caches, etc.
While working on #4017 I've noticed that we should first fix some install issues, that is, make sure that build files are properly placed, as currently Ruby is actually using the build artifacts instead of the installed files. Adding bundle autoclean without merging 1163afd8ceff1d3f13f408b262c4ed3c1cc262b3 might break certain gems.
Also, adding autoclean at the bundler level would have less impact and more complexity (bundler is already more complex than rubygems).