libarchive

archive_entry_pathname returns a nullptr

Open hgrossauer opened this issue 1 year ago • 6 comments

Basic Information

  • Version of libarchive: 3.6.2
  • How you obtained it: build from source
  • Operating system and version: CentOS-7
  • What compiler and/or IDE you are using (include version): gcc (GCC) 11.1.1 20210428 (Red Hat 11.1.1-1)

Description of the problem you are seeing:

  • What did you do?
r = archive_read_next_header(m_ina, &m_entry);
if(r != ARCHIVE_OK){ /* error handling and bail out */ }
std::string filename = archive_entry_pathname(m_entry);
  • What did you expect to happen?

Get a valid pointer.

  • What actually happened?

SIGSEGV because archive_entry_pathname returns a nullptr

Some gdb output:

(gdb) p r
$8 = 0
(gdb) p *m_ina
$7 = {magic = 14594245, state = 4, vtable = 0x87e2a0 <archive_read_vtable>, archive_format = 327680, archive_format_name = 0x7fea780029b0 "ZIP 2.0 (uncompressed)", file_count = 1,
  archive_error_number = 0, error = 0x0, error_string = {s = 0x7fea78101700 "Xar not supported on this platform", length = 0, buffer_length = 64}, current_code = 0x7fea78000e90 "ANSI_X3.4-1968",
  current_codepage = 4294967295, current_oemcp = 4294967295, sconv = 0x7fea78002940, read_data_block = 0x0, read_data_offset = 0, read_data_output_offset = 0, read_data_remaining = 0,
  read_data_is_posix_read = 0 '\000', read_data_requested = 0}
(gdb) p *m_entry
$9 = {archive = 0x0, stat = 0x0, stat_valid = 0, ae_stat = {aest_atime = 0, aest_atime_nsec = 0, aest_ctime = 0, aest_ctime_nsec = 0, aest_mtime = 1696927792, aest_mtime_nsec = 0, aest_birthtime = 0,
    aest_birthtime_nsec = 0, aest_gid = 0, aest_ino = 0, aest_nlink = 0, aest_size = 0, aest_uid = 0, aest_dev_is_broken_down = 0, aest_dev = 0, aest_devmajor = 0, aest_devminor = 0,
    aest_rdev_is_broken_down = 0, aest_rdev = 0, aest_rdevmajor = 0, aest_rdevminor = 0}, ae_set = 92, ae_fflags_text = {aes_mbs = {s = 0x0, length = 0, buffer_length = 0}, aes_utf8 = {s = 0x0,
      length = 0, buffer_length = 0}, aes_wcs = {s = 0x0, length = 0, buffer_length = 0}, aes_mbs_in_locale = {s = 0x0, length = 0, buffer_length = 0}, aes_set = 0}, ae_fflags_set = 0,
  ae_fflags_clear = 0, ae_gname = {aes_mbs = {s = 0x0, length = 0, buffer_length = 0}, aes_utf8 = {s = 0x0, length = 0, buffer_length = 0}, aes_wcs = {s = 0x0, length = 0, buffer_length = 0},
    aes_mbs_in_locale = {s = 0x0, length = 0, buffer_length = 0}, aes_set = 0}, ae_hardlink = {aes_mbs = {s = 0x0, length = 0, buffer_length = 0}, aes_utf8 = {s = 0x0, length = 0, buffer_length = 0},
    aes_wcs = {s = 0x0, length = 0, buffer_length = 0}, aes_mbs_in_locale = {s = 0x0, length = 0, buffer_length = 0}, aes_set = 0}, ae_pathname = {aes_mbs = {
      s = 0x7fea78104380 "GEM??_eSyStep_OC_IODD_V1.0.0.1/", length = 31, buffer_length = 252}, aes_utf8 = {s = 0x0, length = 0, buffer_length = 0}, aes_wcs = {s = 0x0, length = 0, buffer_length = 0},
    aes_mbs_in_locale = {s = 0x0, length = 0, buffer_length = 0}, aes_set = 0}, ae_symlink = {aes_mbs = {s = 0x0, length = 0, buffer_length = 0}, aes_utf8 = {s = 0x0, length = 0, buffer_length = 0},
    aes_wcs = {s = 0x0, length = 0, buffer_length = 0}, aes_mbs_in_locale = {s = 0x0, length = 0, buffer_length = 0}, aes_set = 0}, ae_uname = {aes_mbs = {s = 0x0, length = 0, buffer_length = 0},
    aes_utf8 = {s = 0x0, length = 0, buffer_length = 0}, aes_wcs = {s = 0x0, length = 0, buffer_length = 0}, aes_mbs_in_locale = {s = 0x0, length = 0, buffer_length = 0}, aes_set = 0}, ae_sourcepath = {
    aes_mbs = {s = 0x0, length = 0, buffer_length = 0}, aes_utf8 = {s = 0x0, length = 0, buffer_length = 0}, aes_wcs = {s = 0x0, length = 0, buffer_length = 0}, aes_mbs_in_locale = {s = 0x0, length = 0,
      buffer_length = 0}, aes_set = 0}, encryption = 0 '\000', mac_metadata = 0x0, mac_metadata_size = 0, digest = {md5 = '\000' <repeats 15 times>, rmd160 = '\000' <repeats 19 times>,
    sha1 = '\000' <repeats 19 times>, sha256 = '\000' <repeats 31 times>, sha384 = '\000' <repeats 47 times>, sha512 = '\000' <repeats 63 times>}, acl = {mode = 16893, acl_head = 0x0, acl_p = 0x0,
    acl_state = 0, acl_text_w = 0x0, acl_text = 0x0, acl_types = 0}, xattr_head = 0x0, xattr_p = 0x0, sparse_head = 0x0, sparse_tail = 0x0, sparse_p = 0x0, strmode = '\000' <repeats 11 times>,
  ae_symlink_type = 0}

hgrossauer, Mar 13 '24 08:03

The m_entry structure here does not hold a valid filename because ae_pathname.aes_set == 0. So it makes sense that you would get NULL when querying the pathname. Some archive formats can have entries that lack filenames; can you provide details of the particular archive that was being read?

kientzle, Mar 18 '24 03:03

In the meantime I have “fixed” my code to expect and handle this behavior. Personally, I would prefer that it return a pointer to an empty string instead of a nullptr. Or maybe the behavior, whatever it is, should be documented; I haven’t found anything regarding this.

hgrossauer, Mar 18 '24 12:03
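
For readers hitting the same crash: constructing a std::string directly from a null `const char*` is undefined behavior, which is consistent with the SIGSEGV reported above. A minimal sketch of a guard follows. The helper name `pathname_or_empty` is hypothetical and not part of libarchive; it simply applies the "empty string instead of nullptr" preference on the caller's side.

```cpp
#include <string>

// Hypothetical helper (not a libarchive function): converts a possibly-null
// C string into a std::string, mapping nullptr to the empty string.
std::string pathname_or_empty(const char* pathname) {
    return pathname != nullptr ? std::string(pathname) : std::string();
}

// Intended use with the code from the report (assumes m_entry is valid):
//   std::string filename = pathname_or_empty(archive_entry_pathname(m_entry));
```

Whether an empty name should then be skipped or treated as an error is left to the caller.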

Pull Requests to update the documentation are greatly appreciated.

kientzle, Mar 19 '24 15:03

@kientzle Do we need to free(...) the string returned by archive_entry_pathname(entry) after working with it?

vadimkantorov, Aug 30 '24 17:08

Do we need to free(...) the string returned by archive_entry_pathname(entry) after working with it?

No. The archive_entry keeps a pointer to all the strings it returns.

kientzle, Aug 31 '24 05:08
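
Since the entry owns the returned string, the practical consequence is that the caller should not free it, and should copy it if the name is needed after the entry is reused or freed. The sketch below assumes exactly that and nothing more about libarchive internals. The helper `remember_name` is hypothetical and takes a plain `const char*` so that it stands alone.

```cpp
#include <string>
#include <vector>

// Hypothetical helper: stores an owned copy of a borrowed C string.
// The borrowed pointer is never freed here; a null pointer is skipped.
void remember_name(std::vector<std::string>& names, const char* borrowed) {
    if (borrowed != nullptr) {
        names.emplace_back(borrowed);  // copies the characters
    }
}

// Intended use inside a read loop (assumes a valid entry pointer):
//   remember_name(names, archive_entry_pathname(entry));
```

Storing the raw pointer instead of a copy would tie the stored value to the lifetime of the entry that returned it.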

Some archive formats can have entries that lack filenames. So it makes sense that you would get NULL when querying the pathname.

Having said that, it looks like this principle is not systematically and uniformly applied. For example, for raw archives that don't contain any indication of the original filename, archive_entry_pathname() returns the fixed string "data" instead of a null pointer.

fdegros, Feb 24 '25 00:02
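
Given that a missing name may show up as a null pointer in one format and as a fixed placeholder in another, a caller that always needs some name can apply its own uniform policy. The following is a sketch under that assumption only. The function `name_or_placeholder` and the `unnamed_<index>` scheme are hypothetical, chosen for illustration, and are not libarchive behavior.

```cpp
#include <cstddef>
#include <string>

// Hypothetical policy: treat a null or empty pathname as "missing" and
// synthesize a placeholder from the entry's position in the archive.
std::string name_or_placeholder(const char* pathname, std::size_t index) {
    if (pathname != nullptr && pathname[0] != '\0') {
        return std::string(pathname);
    }
    return "unnamed_" + std::to_string(index);
}

// Intended use (assumes the caller counts entries itself):
//   std::string name = name_or_placeholder(archive_entry_pathname(entry), i);
```

A name that the library itself supplies, such as the "data" string mentioned above for raw archives, passes through unchanged.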