Provide details on inefficient/wasted space

Open vertex-github opened this issue 6 years ago • 7 comments

It would be really helpful if dive could provide more insight into how it classifies wasted/inefficient space. I have images that dive claims have 250 MB of wasted space, and I'd love it if dive could tell me why it considers the space wasted (the files and/or source layers and their respective sizes).

vertex-github • Oct 14 '19 13:10

The lower left pane in the UI has a bit more detail than just the final count. Here's an example dive output, only the lower left pane:

[Image Details]─────────────────────────────────────────────
Total Image size: 363 MB                                    
Potential wasted space: 6.6 MB                              
Image efficiency score: 98 %                                
                                                            
Count   Total Space  Path                                   
    4        2.3 MB  /var/cache/debconf/templates.dat       
    2        1.0 MB  /var/cache/debconf/templates.dat-old   
    4        815 kB  /var/log/dpkg.log                      
    2        562 kB  /var/cache/apt/pkgcache.bin            
    4        452 kB  /var/lib/dpkg/status                   
    2        425 kB  /var/cache/apt/srcpkgcache.bin         
    3        365 kB  /var/lib/dpkg/status-old               
    2        322 kB  /var/log/lastlog                       
    2         70 kB  /var/log/tallylog                      
...

In the above example, of the 6.6 MB of wasted space, dive reports /var/cache/debconf/templates.dat as the worst offender: it shows up in 4 layers and accounts for 2.3 MB of the total wasted space. The second worst offender is /var/cache/debconf/templates.dat-old... (and so on).
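As a rough illustration of how a Count / Total Space table like the one above could be derived (a sketch only, not dive's actual implementation): assume each layer is a mapping from path to size in bytes, and report every path that appears in more than one layer. The `layers` input and the `repeated_paths` helper are hypothetical names made up for this example.

```python
from collections import defaultdict


def repeated_paths(layers):
    """Return (count, total_bytes, path) for paths present in more than one layer.

    `layers` is a hypothetical input: a list of dicts mapping path -> size in
    bytes, one dict per image layer. This is not dive's real data model.
    """
    count = defaultdict(int)
    total = defaultdict(int)
    for layer in layers:
        for path, size in layer.items():
            count[path] += 1
            total[path] += size
    rows = [(count[p], total[p], p) for p in count if count[p] > 1]
    # Largest cumulative size first, matching the ordering in the pane above.
    rows.sort(key=lambda row: row[1], reverse=True)
    return rows


# Made-up layers, purely for illustration.
layers = [
    {"/var/lib/dpkg/status": 100, "/etc/hostname": 10},
    {"/var/lib/dpkg/status": 120},
    {"/var/log/dpkg.log": 50},
    {"/var/log/dpkg.log": 70, "/var/lib/dpkg/status": 130},
]

for n, size, path in repeated_paths(layers):
    print(f"{n:5d}  {size:8d} B  {path}")
```

Under this reading, a file counts as "wasted" simply because it was written in several layers, not because it matches a list of known cruft patterns.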

Unfortunately, the UI currently has no way to show output that has scrolled off the pane. However, if you run your dive command with --json output.json, you can capture the full list of files and details in the exported output.json file.
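A minimal sketch for pulling that list out of the exported file, assuming the export has an `image` object containing a `fileReference` array whose entries carry `count`, `sizeBytes`, and `file` keys. Those key names are an assumption about the export schema and may differ between dive versions, so check them against your own output.json first. `wasted_files` and `load_report` are hypothetical helpers, and the sample values are made up.

```python
import json


def wasted_files(report):
    """List (count, size_bytes, path) rows from a parsed dive JSON export.

    Assumed schema (verify against your own file):
      report["image"]["fileReference"] -> [{"count", "sizeBytes", "file"}, ...]
    """
    entries = report.get("image", {}).get("fileReference", [])
    rows = [(e["count"], e["sizeBytes"], e["file"]) for e in entries]
    # Largest contributors first.
    rows.sort(key=lambda row: row[1], reverse=True)
    return rows


def load_report(path):
    """Parse an exported JSON file from disk."""
    with open(path, encoding="utf-8") as fh:
        return json.load(fh)


# Inline sample in the assumed shape; swap in load_report("output.json").
sample = json.loads("""
{
  "image": {
    "fileReference": [
      {"count": 2, "sizeBytes": 1000, "file": "/var/log/lastlog"},
      {"count": 4, "sizeBytes": 5000, "file": "/var/lib/dpkg/status"}
    ]
  }
}
""")

for count, size, path in wasted_files(sample):
    print(f"{count:5d}  {size:10d} B  {path}")
```

On a real export, `wasted_files(load_report("output.json"))` would give the complete list, including the rows that scroll off the pane.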

There is plenty of room for improvement for the UI, especially the image details pane. I'm open to suggestions for improvement!

wagoodman • Nov 05 '19 05:11

Maybe just a simple header to indicate what the file list is below? I didn't realise what that file listing was. At first glance it appeared to be "biggest files in the image", maybe?

twirrim • Jul 26 '20 23:07

How does it know these are needless files? Is this based on heuristics, i.e., a set of file patterns known to be cruft (or probably not needed in a final build/image)?

hansbogert • Dec 20 '20 12:12

> How does it know these are needless files? Is this based on heuristics, i.e., a set of file patterns known to be cruft (or probably not needed in a final build/image)?

@wagoodman Could you please elaborate on this? How do I see this list in the JSON file?

Suggestion for UI improvement: Use SHIFT+TAB (or CTRL) to tab into this pane and then scroll with Arrow keys. TAB should bring you back to the normal two panes.

KUGA2 • Dec 01 '21 11:12

I'd also like to see more info on this - the metrics are pretty ambiguous and their names aren't clear. I also suspect that they may be mis-counting some files as "wasted space" when they're not, but that's unavoidable. Knowing how the metrics are counted will help users figure out what they should set the metric thresholds at to account for inaccuracy, and also where to look for inefficiencies that might be missed by the scan.

HildaHay • Jul 26 '23 19:07