pybind11 icon indicating copy to clipboard operation
pybind11 copied to clipboard

Support frozenset, tuple as dict keys

Open ecatmur opened this issue 3 years ago • 7 comments

Description

https://github.com/pybind/pybind11/discussions/3836

Add frozenset as a pybind11 type. Allow type_caster<const T> to be (explicitly) selected and specialized, defaulting to (inherit from) type_caster<T>. Use type_caster<const K> for std::set, std::map etc. key converter. Specialize type_caster<const C> for std::set, std::vector etc. Add tests.

Suggested changelog entry:

std::map<std::set<int>, int> now converts to/from Dict[FrozenSet[int], int]; likewise std::set<std::vector<int>> to/from Set[Tuple[int, ...]].

ecatmur avatar Apr 19 '22 17:04 ecatmur

The frozen set casters are a decent idea, but I don't know how I feel about the tuple ones considering we have std::tuple to fulfill that role already. Thoughts @rwgk @henryiii ?

Skylion007 avatar Apr 19 '22 19:04 Skylion007

The frozen set casters are a decent idea, but I don't know how I feel about the tuple ones considering we have std::tuple to fulfill that role already. Thoughts @rwgk @henryiii ?

Skylion007 avatar Apr 19 '22 19:04 Skylion007

The frozen set casters are a decent idea, but I don't know how I feel about the tuple ones considering we have std::tuple to fulfill that role already.

I think the problem is that historically (and currently) Python's tuple has a bit of a mixed role: it's the container you go for when you need an immutable homogeneous sequence, as well as being the heterogeneous sequence type. For example, in typing you can write Tuple[int, ...] as well as Tuple[int, str].

FWIW, when PEP 351 was rejected (over 15 years ago!) Raymond Hettinger made the argument that list->tuple and set->frozenset are the only two transformations you actually ever need: https://mail.python.org/pipermail/python-dev/2005-October/057586.html That suggests a simpler approach would be to adjust list->tuple and set->frozenset after casting before insertion into dict (or set), avoiding the hairiness of type_caster<type const> and transparently supports all types casting to list and set (including third party stuff) at a small runtime cost. I'll propose that as an alternative.

ecatmur avatar Apr 20 '22 00:04 ecatmur

I tried the branch and get multiple errors during compilation on my Mac:

error: ambiguous partial specializations of 'type_caster' const_name("FrozenSet[", "Set[") + key_conv::name

Oh, sorry. Maybe #3887 would work? It's a simpler approach but the observable behavior should be the same.

Thus I was unable to test it further, however I have some observations.

  1. Personally I do not support an idea of the implicit cast from non-const std::set to frozenset. As I mentioned in #3836 -- there should be a clear line between set and it's immutable version as it might confuse Python/pybind users. Ideally functions should only accept dicts coming (or getting out) in form of std::map<const std::set<...>, ...>. Is there a way to restrict that?

There isn't really such a thing as std::map<const std::set<...>, ...> - it isn't possible to get a non-const reference to the key in a std::map (not without an illegal const_cast, anyway) so no-one would bother putting const on the key. The value_type is std::pair<const Key, Value> so the keys are const as long as they're in the map.

So there isn't a conversion from non-const std::set to frozenset, it's from const std::set to frozenset.

  1. I have also noticed that tests lack some of essential scenarios. What happens in case when code mix const and non-cost types? What happens when arguments are passed by copy?

I'm only implementing this for the caster interface, not the binder interface, so everything is being passed by copy already and there's no difference between const and non-const at the top level.

  1. Following @Skylion007 concern about tuples as I have similar thoughts. std::tuple is already filling that niche, there is no need to expand in that area. As Python tuple will be converted to std::vector it will not only allow appending, but actually allow to modify it's contents -- I do not think this is an expected behaviour. If a user wants to do it, just use py::tuple and manually transfer the data to the mutable container.

We already allow passing tuple to C++ functions that take a std::vector (etc.) - there's one or two tests that show that already, and I've just added some more.

If Python code inserts a tuple key into a dict[tuple[str, ...], int] and passes it to C++ code as a std::map<std::vector<std::string>, int> then the C++ code can't modify the std::vectors (since they're const within the map) and if it removes them, inserts modified keys and passes the map back to Python then they'll be completely distinct objects. There's no way for a std::vector to alias a tuple and allows modifying it.

ecatmur avatar Apr 21 '22 01:04 ecatmur

I looked up and down both PRs quite a bit, but didn't get a chance yet to fully appreciate all details. It looks very interesting: not a lot of code, but makes something work that currently doesn't.

Which of the PRs do you prefer? I'd like to focus on the one that you think is best.

Another idea: could you please transfer the additional tests for tuple and frozenset that already work on current HEAD to a separate PR, even if it's a tiny PR? That will make it more clear here what already works and what doesn't. The more immediately improved test coverage is a nice plus.

rwgk avatar Apr 23 '22 01:04 rwgk

I looked up and down both PRs quite a bit, but didn't get a chance yet to fully appreciate all details. It looks very interesting: not a lot of code, but makes something work that currently doesn't.

Which of the PRs do you prefer? I'd like to focus on the one that you think is best.

#3887. While this one (#3886) is more idiomatic C++, #3887 is more Pythonic and that's appropriate for code that affects the Python side of things - it's automatically creating a Python API, so should stick to types that would appear naturally in Python programs. The clever recursive stuff would work in the STL binder API, perhaps.

Another idea: could you please transfer the additional tests for tuple and frozenset that already work on current HEAD to a separate PR, even if it's a tiny PR? That will make it more clear here what already works and what doesn't. The more immediately improved test coverage is a nice plus.

Sure: #3900 is for tuple tests. #3901 splits out the core frozenset functionality (any_set as a common base type, and std::set cast from any_set).

ecatmur avatar Apr 24 '22 21:04 ecatmur

Let's focus on merging #3901 for now.

Skylion007 avatar Apr 27 '22 15:04 Skylion007