`Subject.get_best_query`'s `with_filenames` argument is not honored
Since the latest libdnf update, hawkey.Subject.get_best_query does not honor the with_filenames argument. Filenames are still matched even if with_filenames is explicitly set to False.
import dnf
import dnf.subject
import fedrq.config
import hawkey
def get_base() -> dnf.Base:
"""
Use fedrq to generate a Base object with filelists loaded
.
This can also be done manually.
"""
base = (
fedrq.config.get_config(backend="dnf", load_filelists="always")
.get_rq(branch="rawhide")
.base
)
assert "filelists" in base.conf.optional_metadata_types
assert isinstance(base, dnf.Base)
return base
def main() -> None:
base = get_base()
print("Hawkey version:", hawkey.VERSION)
print("dnf version:", dnf.VERSION)
path = "/usr/share/ansible"
# Filenames are not enabled for the query
subject = dnf.subject.Subject(path).get_best_query(base.sack, with_filenames=False)
# Query length should be 0
print("Query length:", len(subject))
# for package in subject:
# print(package)
if __name__ == "__main__":
main()
$ python with_filelists.py
Hawkey version: 0.73.0
dnf version: 4.19.0
Query length: 0
$ python with_filelists.py
Hawkey version: 0.73.3
dnf version: 4.21.1
Query length: 3
This is breaking fedrq's unit tests that check that our wrapper of the Subject API returns the correct results when with_filenames is disabled.
After a bit of experimenting - this regression is a side effect of the patch https://github.com/rpm-software-management/libdnf/pull/1670.
The string in question /usr/share/ansible is found during search in provides
https://github.com/rpm-software-management/libdnf/blob/fbd34472e5edea0a3ed88ae14a47666e23538e4a/libdnf/sack/query.cpp#L2751-L2757
This causes the query is returned even before the string was searched in file lists.
It looks like POOL_FLAG_ADDFILEPROVIDESFILTERED causes that when filelists metadata are loaded, they are always added to provides. @kontura can you please take a look?
The change in behavior with POOL_FLAG_ADDFILEPROVIDESFILTERED is unintended. I changed the way libsolv sets up the lazy fileprovides lookup so that it works like before. See https://github.com/openSUSE/libsolv/commit/2150865d616ec454375de2c3bcd244c42c56e782
I don't know if this is relevant to the slowness issue.
The change in behavior with POOL_FLAG_ADDFILEPROVIDESFILTERED is unintended. I changed the way libsolv sets up the lazy fileprovides lookup so that it works like before. See openSUSE/libsolv@2150865
@mlschroe the linked commit doesn't seem to affect this particular issue.
Though while looking into this I extracted the following snippet:
#include "solv/pool.h"
#include "solv/repo.h"
static const char * input = "/usr/lib/file";
int main() {
Id p, pp;
Solvable * s;
Pool * pool = pool_create();
pool_set_flag(pool, POOL_FLAG_ADDFILEPROVIDESFILTERED, 1);
Repo * repo = repo_create(pool, "repo");
Repodata * data = repo_add_repodata(repo, 0);
// add pkg with provide
s = pool_id2solvable(pool, repo_add_solvable(repo));
s->name = pool_str2id(pool, "foo", 1);
s->evr = pool_str2id(pool, "1-2", 1);
s->arch = pool_str2id(pool, "x86_64", 1);
Id reldep = pool_rel2id(pool, s->name, s->evr, REL_EQ, 1);
s->provides = repo_addid_dep(repo, s->provides, reldep, 0);
// add "/usr/lib/file" file to pkg
Id did = repodata_str2dir(data, "/usr/lib", 1);
repodata_add_dirstr(data, s - pool->solvables, SOLVABLE_FILELIST, did, "file");
repodata_internalize(data);
pool_addfileprovides(pool);
pool_createwhatprovides(pool);
// Get all packages providing `input`
Id path_id = pool_str2id(pool, input, 1);
FOR_PROVIDES(p, pp, path_id) {
s = pool->solvables + p;
printf("FOR_PROVIDES: %s\n", pool_solvable2str(pool, s));
}
pool_free(pool);
return 0;
}
It is odd to me that while the POOL_FLAG_ADDFILEPROVIDESFILTERED is set /usr/lib/file is found as a provide but without the flag its not.
I understand the flag changes how the file entries (that would not be included in the primary file) are added but I would expect the end result to be the same.
I did find another approach that seems to work regardless of the POOL_FLAG_ADDFILEPROVIDESFILTERED:
int flags;
Queue q;
queue_init(&q);
flags = SELECTION_PROVIDES;
selection_make(pool, &q, input, flags);
flags = SELECTION_FILELIST|SELECTION_SUBTRACT;
selection_make(pool, &q, input, flags);
Queue pq;
queue_init(&pq);
selection_solvables(pool, &q, &pq);
for (int j = 0; j < pq.count; j++) {
printf("provides wihtout filelists: %s\n", pool_solvable2str(pool, pool_id2solvable(pool, pq.elements[j])));
}
queue_free(&q);
Yes, this happens because the path_id is added after the pool_createwhatprovides(() call. Fixed now in commit https://github.com/openSUSE/libsolv/commit/081580de9f964aae61d25e1787c37c7ab8b4fa69
Thanks, that fixes this issue.