
cpio+reproducible is breaking hardlinks

Open Itxaka opened this issue 1 month ago • 10 comments

Describe the bug: MakeReproducible removes the NLink values from records, which breaks any hard links.

To Reproduce
Steps to reproduce the behavior:
1. Start with a directory that contains hard links (try Fedora and /usr/sbin/mkfs.ext{2,3,4}, which link to mke2fs, for example).
2. Create a cpio archive of that directory, passing each record through the MakeReproducible method.
3. Extract the cpio archive and check the hard links: they all come out with size 0 and no longer work as links.

Expected behavior: As with GNU cpio, those links should still work after archiving.

Additional context

Test case to reproduce:

Source main.go:

package main

import (
	"fmt"
	"os"
	"path/filepath"
	"strings"

	"github.com/u-root/u-root/pkg/cpio"
)

func main() {
	archiver, err := cpio.Format("newc")
	if err != nil {
		fmt.Print(err)
		os.Exit(1)
	}

	cpioFileName := "/tmp/test.cpio"
	cpioFile, err := os.Create(cpioFileName)
	if err != nil {
		fmt.Print(err)
		os.Exit(1)
	}
	defer cpioFile.Close()

	rw := archiver.Writer(cpioFile)
	cr := cpio.NewRecorder()

	if err = os.Chdir("/tmp/test_cpio"); err != nil {
		fmt.Print(err)
		os.Exit(1)
	}

	// Walk through the source directory and add files to the cpio archive
	err = filepath.Walk(".", func(filePath string, fileInfo os.FileInfo, err error) error {
		if err != nil {
			return err
		}

		if strings.Contains(filePath, "initramfs.cpio") {
			return nil
		}

		rec, err := cr.GetRecord(filePath)
		if err != nil {
			return fmt.Errorf("getting record of %q failed: %w", filePath, err)
		}

		if err := rw.WriteRecord(cpio.MakeReproducible(rec)); err != nil {
			return fmt.Errorf("writing record %q failed: %w", filePath, err)
		}

		return nil
	})
	if err != nil {
		fmt.Print(err)
		os.Exit(1)
	}

	if err := cpio.WriteTrailer(rw); err != nil {
		fmt.Print(err)
		os.Exit(1)
	}
}

Directory that is going to be archived:

$ ls -ltrahi /tmp/test_cpio 
total 2,0M
16782 drwxr-xr-x  2 itxaka itxaka   80 jun 26 15:53 .
16783 -rw-r--r--  2 itxaka itxaka 1,0M jun 26 15:53 test2
16783 -rw-r--r--  2 itxaka itxaka 1,0M jun 26 15:53 test1

The output file is perfectly reproducible, as expected: no matter how many times it is recreated, the sha256sum is always the same.

But extracting it produces broken hardlinks:

$ cpio -i < /tmp/test.cpio
2049 blocks
$ ls -ltrahi                
total 1,0M
    1 drwxrwxrwt 22 root   root    620 jun 26 15:57 ..
16808 -rw-r--r--  1 itxaka itxaka    0 jun 26 15:58 test2
16807 -rw-r--r--  1 itxaka itxaka 1,0M jun 26 15:58 test1
16806 drwxr-xr-x  2 itxaka itxaka   80 jun 26 15:58 .

As you can see, test2 is now an empty broken file.

Using a custom MakeReproducible that does not set r.NLink = 0 makes the extraction work as expected.
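
To illustrate the idea, here is a minimal, self-contained sketch. It deliberately does not use the real cpio package: record is a local stand-in type, and apart from NLink (the only field named in this report) its fields are assumptions picked to represent host-specific metadata. The function name makeReproducibleKeepLinks is hypothetical as well.

```go
package main

import "fmt"

// record is a local stand-in for the metadata of a cpio record.
// Only NLink is named in the report; the other fields are assumed
// examples of metadata that varies between hosts or runs.
type record struct {
	Name  string
	NLink uint64
	MTime uint64
	UID   uint64
	GID   uint64
}

// makeReproducibleKeepLinks zeroes the assumed host-specific fields
// but leaves NLink untouched, so the hard-link count survives.
func makeReproducibleKeepLinks(r record) record {
	r.MTime = 0
	r.UID = 0
	r.GID = 0
	return r
}

func main() {
	in := record{Name: "test2", NLink: 2, MTime: 1719410000, UID: 1000, GID: 1000}
	fmt.Printf("%+v\n", makeReproducibleKeepLinks(in))
}
```

The link count depends only on the layout of the source tree, not on when or by whom the archive is built, so keeping it should not make the output vary between runs of the same tree.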

The archives are still reproducible, no matter how many times they are recreated:

$ sha256sum /tmp/test.cpio && rm /tmp/test.cpio && go run main.go
108d920cb415029573b72b14e61ae03b31b5d99820d877f23c09134e38f9e43f  /tmp/test.cpio

$ sha256sum /tmp/test.cpio && rm /tmp/test.cpio && go run main.go
108d920cb415029573b72b14e61ae03b31b5d99820d877f23c09134e38f9e43f  /tmp/test.cpio

And when extracting the cpio archive, the hardlinks are still valid:

$ ls -ltrahi  
total 2,0M
    1 drwxrwxrwt 22 root   root    620 jun 26 16:05 ..
16843 -rw-r--r--  2 itxaka itxaka 1,0M jun 26 16:05 test2
16843 -rw-r--r--  2 itxaka itxaka 1,0M jun 26 16:05 test1
16806 drwxr-xr-x  2 itxaka itxaka   80 jun 26 16:05 .

Itxaka, Jun 26 '24 14:06