timefold-solver
timefold-solver copied to clipboard
NPE when using entitySelector with sorting and STEP cache type
Describe the bug
Using a sorted entitySelector with cacheType STEP causes a null pointer exception.
Exception in thread "main" java.lang.NullPointerException: Cannot invoke "java.util.List.iterator()" because "this.cachedEntityList" is null
at ai.timefold.solver.core.impl.heuristic.selector.entity.decorator.SortingEntitySelector.iterator(SortingEntitySelector.java:41)
at ai.timefold.solver.core.impl.heuristic.selector.common.iterator.AbstractOriginalChangeIterator.<init>(AbstractOriginalChangeIterator.java:23)
at ai.timefold.solver.core.impl.heuristic.selector.move.generic.ChangeMoveSelector$2.<init>(ChangeMoveSelector.java:101)
at ai.timefold.solver.core.impl.heuristic.selector.move.generic.ChangeMoveSelector.iterator(ChangeMoveSelector.java:101)
at ai.timefold.solver.core.impl.constructionheuristic.placer.PooledEntityPlacer$PooledEntityPlacingIterator.createUpcomingSelection(PooledEntityPlacer.java:30)
at ai.timefold.solver.core.impl.constructionheuristic.placer.PooledEntityPlacer$PooledEntityPlacingIterator.createUpcomingSelection(PooledEntityPlacer.java:23)
at ai.timefold.solver.core.impl.heuristic.selector.common.iterator.UpcomingSelectionIterator.hasNext(UpcomingSelectionIterator.java:27)
at ai.timefold.solver.core.impl.constructionheuristic.DefaultConstructionHeuristicPhase.solve(DefaultConstructionHeuristicPhase.java:45)
at ai.timefold.solver.core.impl.solver.AbstractSolver.runPhases(AbstractSolver.java:83)
at ai.timefold.solver.core.impl.solver.DefaultSolver.solve(DefaultSolver.java:193)
at testschedules.Main.main(Main.java:15)
To Reproduce
-
Clone https://github.com/aioobe/timefold-repro-step-sorted-entities
-
Run with
./gradlew -Dorg.gradle.java.home=/path/to/java17 run
Environment
Timefold Solver Version or Git ref: 1.0.0
Output of java -version:
java version "17.0.7" 2023-04-18 LTS
Java(TM) SE Runtime Environment (build 17.0.7+8-LTS-224)
Java HotSpot(TM) 64-Bit Server VM (build 17.0.7+8-LTS-224, mixed mode, sharing)
Output of uname -a or ver:
Linux xyz 5.15.0-76-generic #83~20.04.1-Ubuntu SMP Wed Jun 21 20:23:31 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux
Additional information
I think the issue is the following:
DefaultConstructionHeuristicPhase.java:45tries to start iterating over the entities- Since cacheType
STEPis used, thecachedEntityListinSortingEntitySelectorhas not been initialized. - The
cachedEntityListwould have been initialized inDefaultConstructionHeuristicPhase.java:47, but that has obviously not been executed yet.
So there's a little chicken and egg situation.
Thanks for reporting!
There are various combinations of cacheType and selector configurations, some of which we know do not work. If we know they do not work, we typically add a fail-fast describing that combination as not possible.
Obviously there are some combinations that do not work, of which we do not yet know. :-) I'll have to look into this and figure out if this is a combination that should work but doesn't, or if this is a combination that will just be "fixed" by adding another fail-fast.
I'm adding this on my backlog.
Gotcha. If it matters, my use case is that I would very much like to be able to reevaluate how difficult an entity is to place "continuously" (in between steps). It can be expensive, but hopefully not something that's contradictory or unsound.
I think injecting yourself at that level exposes you to APIs which are not stable. But even if that's OK for you, it certainly puts you at the mercy of behaviors which are undocumented and can change without warning.
In the process of working on https://github.com/TimefoldAI/timefold-solver/issues/123, which may already solve your use case in an "official" way, I have recently brought a significant improvement to explanation performance (https://github.com/TimefoldAI/timefold-solver/pull/171).
Another pair of eyes never hurts, so if you want to pull that and tell me what kind of results you're getting with that, I'd appreciate that. Data from my experiments show that this is a significant improvement, but nothing beats real-world data.
Wow, those numbers look amazing. Really appreciate the work you and Geoffrey are doing in this area. If I have the time the coming days, I'll pull it and take a look. Otherwise I will most likely be the first person to use it when it's out!
I think injecting yourself at that level exposes you to APIs which are not stable.
I'm a little bit surprised that you say so since the docs says "Almost every Selector supports setting a cacheType", and isn't STEP intended precisely for cases when we want to reevaluate before each step? (Noting that it says "almost", and that it may very well be the case that STEP ends up not being supported here.)
I looked briefly at the lines involving the stack trace... I jotted down some lines that may or may not be on the right track.
So DefaultConstructionHeuristicPhase.java:45 which looks as follows
for (Placement<Solution_> placement : entityPlacer) { // <--- placement fetched here (which is not safe if using STEP)...
ConstructionHeuristicStepScope<Solution_> stepScope = new ConstructionHeuristicStepScope<>(phaseScope);
stepStarted(stepScope);
decider.decideNextStep(stepScope, placement); // <----- ...although not used until here
...
}
should perhaps be expressed something like this...
Iterator iter = null;
while (true) {
ConstructionHeuristicStepScope<Solution_> stepScope = new ConstructionHeuristicStepScope<>(phaseScope);
stepStarted(stepScope);
// Deferred initialization of iterator
if (iter == null || usingStepType) {
iter = entityPlacer.iterator(); // <--- Safe to initialize iterator
}
/////////// Correspond to to the original for-each loop //////////
if (!iter.hasNext()) {
break;
}
Placement<Solution_> placement = iter.next();
//////////////////////////////////////////////////////////////////
decider.decideNextStep(stepScope, placement);
...
}
Somewhat messy. Could probably be cleaned up.
I'm a little bit surprised that you say so since the docs says "Almost every Selector supports setting a cacheType", and isn't
STEPintended precisely for cases when we want to reevaluate before each step? (Noting that it says "almost", and that it may very well be the case thatSTEPends up not being supported here.)
Technically, you are correct. But it also has some side-effects - for example, if you want to cache a move selector, you may find that the hundreds (thousands? millions?) of possible moves either don't fit in memory, or they take ages to generate. JUST_IN_TIME is great, because it will only generate what you'll actually use.
This is indeed a chicken-and-egg problem, and I currently can not think of a solution.
The code you suggested has one big problem - it has to trigger the step start, even though the placer iterator may later prove to be empty. In effect, step listeners are triggered when they should not be. It's starting to look like one of those fail-fast situations, but I'm not giving up just yet.
Looking at the documentation, we never specifically say that step cache isn't supported with sorted selection.
However, every example we give includes phase cache, which works. In fact, the entire documentation seems to mention step cache only once. One more piece in the puzzle.
Another user reporting a similar problem: https://stackoverflow.com/questions/77088034/optaplanner-issue-not-able-to-replicate-a-java-working-method-in-benchmarking
This time, it is in probabilistic selector, which does not extend caching selector, but does caching nonetheless. There are obviously places besides the caching selector hierarchy where caching happens; useful to know.