ordered-set
ordered-set copied to clipboard
Support item assignment
Since OrderedSet
supports indexing, it's natural to assume that assignments and del
would work as well, but they don't:
>>> s = OrderedSet(['foo'])
>>> s[0]
'foo'
>>> s[0] = 'bar'
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: 'OrderedSet' object does not support item assignment
>>> del s[0]
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: 'OrderedSet' object doesn't support item deletion
There's an easy workaround for del
(s.remove(s[0])
), but not for item assignments. As far as I can tell, there is no way to reorder items at all, so it's impossible to replace an element with another one (while preserving its index).
@rspeer if you would support the addition of this feature, I would like to implement it.
This would require changing the behavior of multiple other methods.
- If you delete an item, what happens to the indices of other items? Is there an index whose value is None or a sentinel value now? Do we hide that value from iteration? How does this affect slicing?
- If you assign an index to a value that's already in the set, what happens? Do we delete the value at the index you assigned? Do we delete the old index of that value so it can move to the new index?
I really don't understand these questions. Python already has a mutable ordered data structure - list
. All the design decisions have already been made. If you delete an item, of course the indices of the following items should change. Why is that even a question? An ordered set is not a mapping, where the user explicitly associates a value with a key. An ordered set is essentially a list with O(1) membership testing.
Besides, it's already possible to remove items from an ordered set. All I'm asking for is a way to delete items by index instead of value.
You know, there's a way to get an open source developer to consider your feature, and the above comment is not it.
Please think through the following case:
>>> oset = OrderedSet(['a', 'b', 'c'])
>>> oset[1] = 'a'
>>> print(oset.index('a')) # 0 or 1?
>>> print(oset[0]) # what value is here now?
>>> print(len(oset)) # should be 2, but how did we get to this state without constructing a new OrderedSet?
You know, there's a way to get an open source developer to consider your feature, and the above comment is not it.
Why not? I explained why I don't understand your questions, and answered them to the best of my ability. What did I do wrong?
I'll admit that I overlooked this problem:
If you assign an index to a value that's already in the set, what happens?
In this case, I'd just throw an exception.
>>> print(len(oset)) # should be 2, but how did we get to this state without constructing a new OrderedSet?
Again, I don't understand the question. Why would you need to construct a new OrderedSet?
The part that I objected to was "Why is that even a question?" followed by an attempt to explain to me what my library does.
The requirements are clearer now:
-
__delitem__
should rebuild the mapping from items to indices in place, likediscard
does - Assigning a value that already exists in the set should raise an error
Looking at the code, I notice that I merged an implementation of pop()
that allows popping from somewhere besides the end of the list. It doesn't rebuild the mapping, so that's broken. This shows that I need to implement __delitem__()
correctly, and then pop()
should use it.
I still don't really understand what I did to offend you, and I was about to inquire about it in more detail, but ultimately I've decided it would only end up derailing this conversation further. I'm sorry I came across as rude; I have to admit I was a bit miffed when the first response to my 6 month old problem was some... let's say, odd questions.
So to wrap this up, let's review the requirements:
-
__delitem__
should remove an item by index, just likediscard()
removes an item by value. This is, I think, the only sane way to implement it. Replacing the item with some sentinel likeNone
would be unexpected and unlike any other container that exists in python. -
__setitem__
is more interesting. There are essentially 3 cases:- Assigning a unique value, like
{'a', 'b'}[0] = 'c'
. - Assigning an existing value to a smaller index, like
{'a', 'b'}[0] = 'b'
. - Assigning an existing value to a larger index, like
{'a', 'b'}[1] = 'a'
.
I think the only reasonable thing to do in case of (iii) is to throw an exception. A possible alternative would be to swap the two elements, but I don't think it's a good idea for an assignment to change the indices of other items. Plus, assignments usually overwrite values, so that's another reason why swapping the items would be unexpected. Another option would be to delete the replaced item, resulting in
{'a'}
. But it would be weird for an assignment to make the container shorter. Pretty much the same logic also applies to (ii). So (i) should result in{'c', 'b'}
, while (ii) and (iii) should throw an exception. - Assigning a unique value, like
I agree with @Aran-Fey's breakdown. __delitem__
seems straightforward. As for __setitem__
, 2.ii and 2.iii doing anything other than raising an exception (ValueError
) would be surprising, but we can probably support 2.i without much difficulty or confusion. Since membership checking is O(1), checking if an exception must be raised for __setitem__
would be efficient.
It might be simpler to not support __setitem__
, but if nothing else we can probably move forwards with __delitem__
.
I think the only reasonable thing to do with __setitem__
given an existing item is to raise an error. That makes __setitem__
actually the easier case, so I can support it too.
@twiddli https://github.com/rspeer/ordered-set/issues/91#issuecomment-1428531221
Hi, would you mind supporting this too #79?
Hi, Would you be interested in making a PR for adding these features?
@twiddli #91 (comment)
Hi, would you mind supporting this too #79?
Hi, Would you be interested in making a PR for adding these features?
In my fork of yours, I managed to implement __delitem__
fairly easily, but __setitem__
seems more complicated because you rely on python's stable insertion order of dicts during __iter__
. Changing that might affect the iteration performance, so I'm not sure if it's something you'd like.
@twiddli #91 (comment)
Hi, would you mind supporting this too #79?
Hi, Would you be interested in making a PR for adding these features?
In my fork of yours, I managed to implement
__delitem__
fairly easily, but__setitem__
seems more complicated because you rely on python's stable insertion order of dicts during__iter__
. Changing that might affect the iteration performance, so I'm not sure if it's something you'd like.
If you wish so, you may implement __delitem__
in StableSet
and then, you might or may not need to re-implement on its descendent OrderedSet
(it might not be necessary, if the elements list are being updated through another function that you might have used).
__setitem__
- you can also implement this on OrderedDict
alone, which might make more sense.
Please note that I wanted to clean and reorder the commit history before making https://github.com/rspeer/ordered-set/pull/92 but I didn't do this before I published my fork (oops) so your commit history on master is different then mine (sorry about that, I'm not intending to force push to master again).
So please either recreate your feature branch on the updated master
branch (you can use cherry pick), or make a PR onto old_master
and I fill cherry pick your commit onto the new master
myself after its merged.
Thanks :)