lucene icon indicating copy to clipboard operation
lucene copied to clipboard

FilterIndexInput and FilterIndexOutput override the behavior of the instance that they decorate

Open blerer opened this issue 1 month ago • 9 comments

Description

FilterIndexInput and FilterIndexOutput do not always delegate calls to the instances that they decorate which can cause them to override the behavior of the instances they are delegating to. For example, it is possible in your own IndexInput implementation to override readMapOfStrings but if an instance of your classe is decorated by a FilterIndexInput instance when readMapOfStrings is called the logic that will be called is the one of DataInput.readMapOfStrings. This behavior is verified by the TestFilterIndexInput.testOverrides unit test. This approach seems error prone to me and go against the implementation of the methods like readMapOfStrings that could have been made final if they should not have been overridden.

Happy to provide a patch if that problem is recognize as such. :-)

Version and environment details

No response

blerer avatar Nov 25 '25 13:11 blerer

I agree this is a problem with wrapper/delegation pattern like Filter*. I can't remember the reason why this method was left non-final but it should be either this or the filters should delegate properly.

dweiss avatar Nov 25 '25 18:11 dweiss

Coincidentally, I noticed a similar thing with FilterWeight just yesterday: https://github.com/apache/lucene/issues/15450

It looks like FilterIndexInput was discussed in https://github.com/apache/lucene/pull/11958#discussion_r1029188569.

I appreciate the idea that non-abstract methods shouldn't be delegated, because the base implementation should be functionally correct (since they just rely on behavior of the abstract methods). On the other hand, we miss out on potential optimizations in subclasses (unless we specifically delegate them in our Filter* subclasses).

I can see reasonable arguments pointing both ways, but right now the Filter* classes are inconsistent. For example, if we do decide that only abstract methods should be delegated, then we should stop delegating matches in FilterWeight.

I think that we can reasonably create one static test implementation for all Filter* classes to ensure that they either delegate all non-final methods or they only delegate abstract methods. (I suppose we could just generate Filter* classes that do one or the other too.)

msfroh avatar Nov 25 '25 23:11 msfroh

The problem isn't new - it's the same old issue that you can come across in FilterInputStream and other delegation-pattern classes in the JDK. Not all methods use delegate-forwarding there; some are inherited from the superclass. For example readAllBytes isn't handed over to the delegate. Most of the time this should not matter... but when it does, you'll spend two days debugging this stuff. I've been there.

dweiss avatar Nov 26 '25 09:11 dweiss

Fair enough.

Do you think there's any value in enforcing more consistent behavior in Lucene? (Whether it's "delegate all non-final methods" or "only delegate abstract methods".) Or should we just acknowledge that it's going to be trappy and folks should pay close attention? (Of course, even if it's consistent one way or the other, it will probably still bite someone at some point.)

msfroh avatar Nov 26 '25 19:11 msfroh

Can we have our cake and eat it too? Define two patterns and use different naming conventions. One can be a FilterFooBar (that delegates nothing) and the other an i don't know DelegatingFooBar (that delegates everything)?

msokolov avatar Nov 26 '25 23:11 msokolov

My experience tells me that delegating everything in decorator-style classes leads to fewer headaches.

dweiss avatar Nov 27 '25 08:11 dweiss

My fear is that having two patterns like FilterFooBar and DelegatingFooBar will introduce another type of confusion. At this stage where methods have been public and overridable for years, making them final could break some people code without some good reasons. Is there a drawback to delegating everything? It is the behavior that I would expect from a decorating class and it is relatively easy to create some unit tests that detect that when new method are added they are not overriden by the decorator class.

blerer avatar Nov 27 '25 08:11 blerer

I am for making those classes delegate everything. And for adding a unit test. I've been using ArchUnit to statically verify this kind of thing in the past but it may be an overkill in Lucene. A simple test that extends a Filter* class and then verifies if all methods are properly delegated will be simpler and faster.

dweiss avatar Nov 27 '25 09:11 dweiss

I will create a PR.

blerer avatar Nov 27 '25 09:11 blerer