archive_entry_pathname returns a nullptr
Basic Information
Version of libarchive: 3.6.2
How you obtained it: build from source
Operating system and version: CentOS-7
What compiler and/or IDE you are using (include version): gcc (GCC) 11.1.1 20210428 (Red Hat 11.1.1-1)
Description of the problem you are seeing:
- What did you do?
r = archive_read_next_header(m_ina, &m_entry);
if(r != ARCHIVE_OK){ /* error handling and bail out */ }
std::string filename = archive_entry_pathname(m_entry);
- What did you expect to happen?
Get a valid pointer.
- What actually happened?
SIGSEGV because archive_entry_pathname returns a nullptr
Some gdb output:
(gdb) p r
$8 = 0
(gdb) p *m_ina
$7 = {magic = 14594245, state = 4, vtable = 0x87e2a0 <archive_read_vtable>, archive_format = 327680, archive_format_name = 0x7fea780029b0 "ZIP 2.0 (uncompressed)", file_count = 1,
archive_error_number = 0, error = 0x0, error_string = {s = 0x7fea78101700 "Xar not supported on this platform", length = 0, buffer_length = 64}, current_code = 0x7fea78000e90 "ANSI_X3.4-1968",
current_codepage = 4294967295, current_oemcp = 4294967295, sconv = 0x7fea78002940, read_data_block = 0x0, read_data_offset = 0, read_data_output_offset = 0, read_data_remaining = 0,
read_data_is_posix_read = 0 '\000', read_data_requested = 0}
(gdb) p *m_entry
$9 = {archive = 0x0, stat = 0x0, stat_valid = 0, ae_stat = {aest_atime = 0, aest_atime_nsec = 0, aest_ctime = 0, aest_ctime_nsec = 0, aest_mtime = 1696927792, aest_mtime_nsec = 0, aest_birthtime = 0,
aest_birthtime_nsec = 0, aest_gid = 0, aest_ino = 0, aest_nlink = 0, aest_size = 0, aest_uid = 0, aest_dev_is_broken_down = 0, aest_dev = 0, aest_devmajor = 0, aest_devminor = 0,
aest_rdev_is_broken_down = 0, aest_rdev = 0, aest_rdevmajor = 0, aest_rdevminor = 0}, ae_set = 92, ae_fflags_text = {aes_mbs = {s = 0x0, length = 0, buffer_length = 0}, aes_utf8 = {s = 0x0,
length = 0, buffer_length = 0}, aes_wcs = {s = 0x0, length = 0, buffer_length = 0}, aes_mbs_in_locale = {s = 0x0, length = 0, buffer_length = 0}, aes_set = 0}, ae_fflags_set = 0,
ae_fflags_clear = 0, ae_gname = {aes_mbs = {s = 0x0, length = 0, buffer_length = 0}, aes_utf8 = {s = 0x0, length = 0, buffer_length = 0}, aes_wcs = {s = 0x0, length = 0, buffer_length = 0},
aes_mbs_in_locale = {s = 0x0, length = 0, buffer_length = 0}, aes_set = 0}, ae_hardlink = {aes_mbs = {s = 0x0, length = 0, buffer_length = 0}, aes_utf8 = {s = 0x0, length = 0, buffer_length = 0},
aes_wcs = {s = 0x0, length = 0, buffer_length = 0}, aes_mbs_in_locale = {s = 0x0, length = 0, buffer_length = 0}, aes_set = 0}, ae_pathname = {aes_mbs = {
s = 0x7fea78104380 "GEM??_eSyStep_OC_IODD_V1.0.0.1/", length = 31, buffer_length = 252}, aes_utf8 = {s = 0x0, length = 0, buffer_length = 0}, aes_wcs = {s = 0x0, length = 0, buffer_length = 0},
aes_mbs_in_locale = {s = 0x0, length = 0, buffer_length = 0}, aes_set = 0}, ae_symlink = {aes_mbs = {s = 0x0, length = 0, buffer_length = 0}, aes_utf8 = {s = 0x0, length = 0, buffer_length = 0},
aes_wcs = {s = 0x0, length = 0, buffer_length = 0}, aes_mbs_in_locale = {s = 0x0, length = 0, buffer_length = 0}, aes_set = 0}, ae_uname = {aes_mbs = {s = 0x0, length = 0, buffer_length = 0},
aes_utf8 = {s = 0x0, length = 0, buffer_length = 0}, aes_wcs = {s = 0x0, length = 0, buffer_length = 0}, aes_mbs_in_locale = {s = 0x0, length = 0, buffer_length = 0}, aes_set = 0}, ae_sourcepath = {
aes_mbs = {s = 0x0, length = 0, buffer_length = 0}, aes_utf8 = {s = 0x0, length = 0, buffer_length = 0}, aes_wcs = {s = 0x0, length = 0, buffer_length = 0}, aes_mbs_in_locale = {s = 0x0, length = 0,
buffer_length = 0}, aes_set = 0}, encryption = 0 '\000', mac_metadata = 0x0, mac_metadata_size = 0, digest = {md5 = '\000' <repeats 15 times>, rmd160 = '\000' <repeats 19 times>,
sha1 = '\000' <repeats 19 times>, sha256 = '\000' <repeats 31 times>, sha384 = '\000' <repeats 47 times>, sha512 = '\000' <repeats 63 times>}, acl = {mode = 16893, acl_head = 0x0, acl_p = 0x0,
acl_state = 0, acl_text_w = 0x0, acl_text = 0x0, acl_types = 0}, xattr_head = 0x0, xattr_p = 0x0, sparse_head = 0x0, sparse_tail = 0x0, sparse_p = 0x0, strmode = '\000' <repeats 11 times>,
ae_symlink_type = 0}
The m_entry structure here does not hold a valid filename because ae_pathname.aes_set == 0. So it makes sense that you would get NULL when querying the pathname. Some archive formats can have entries that lack filenames; can you provide details of the particular archive that was being read?
In the meantime I have “fixed” my code to expect and handle this behavior. Personally, I would prefer that it return a pointer to an empty string instead of a nullptr. Either way, the behavior should be documented; I haven’t found anything about it.
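For anyone hitting the same crash, here is a minimal sketch of one way to guard the call site. The helper name string_or_default and its fallback parameter are made up for illustration and are not part of libarchive. It only assumes what this thread establishes: archive_entry_pathname can return a null pointer.

```cpp
#include <string>

// Hypothetical helper (not part of libarchive): turn a possibly-null
// C string into a std::string without ever passing nullptr to the
// std::string constructor, which is undefined behavior.
// The result is a copy, so it does not depend on the source buffer.
std::string string_or_default(const char* s, const std::string& fallback = "") {
    return s != nullptr ? std::string(s) : fallback;
}
```

With this in place, the line from the report could become std::string filename = string_or_default(archive_entry_pathname(m_entry)); and an entry without a pathname yields an empty string instead of a SIGSEGV. Whether an empty string is the right fallback depends on what the caller does with nameless entries.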
Pull Requests to update the documentation are greatly appreciated.
@kientzle Do we need to free(...) the string returned by archive_entry_pathname(entry) after working with it?
No. The archive_entry keeps a pointer to all the strings it returns.
Some archive formats can have entries that lack filenames. So it makes sense that you would get NULL when querying the pathname.
Having said that, it looks like this principle is not systematically and uniformly applied.
For example, for raw archives that don't contain any indication of the original filename, archive_entry_pathname() returns the fixed string "data" instead of a null pointer.