
Estimation mode test for atwork_subtour_scheduling is failing

Open jpn-- opened this issue 1 year ago • 1 comments

Describe the bug

Something has changed in our dependencies that is causing the atwork_subtour_scheduling estimation mode test to fail.

=========================== short test summary info ============================
FAILED activitysim/estimation/test/test_larch_estimation.py::test_scheduling_model[atwork_subtour_scheduling-SLSQP] - AssertionError: Values are not sufficiently close.
To update values, use --force-regen option.

value:
          obtained_value        expected_value                 diff
11   1.91103112514781048  -0.37405450634311860  2.28508563149092891
25  -8.60763368983575816 -10.33360335093555449  1.72596966109979633
30 -17.74046430158113807 -11.51656463081027404  6.22389967077086403
31  -4.01031675842188307  -5.14292502713100941  1.13260826870912634

best:
           obtained_best         expected_best                 diff
11   1.91103112514781048  -0.37405450634311860  2.28508563149092891
25  -8.60763368983575816 -10.33360335093555449  1.72596966109979633
30 -17.74046430158113807 -11.51656463081027404  6.22389967077086403
31  -4.01031675842188307  -5.14292502713100941  1.13260826870912634
=========== 1 failed, 26 passed, 1482 warnings in 882.06s (0:14:42) ============
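To put the four differences in proportion, here is a small sketch that computes each one relative to its expected value. The numbers are copied from the test output above; the dictionary layout and variable names are only illustrative and are not part of the test suite.

```python
# (obtained, expected) pairs copied from the failing test output, keyed by row.
rows = {
    11: (1.91103112514781048, -0.37405450634311860),
    25: (-8.60763368983575816, -10.33360335093555449),
    30: (-17.74046430158113807, -11.51656463081027404),
    31: (-4.01031675842188307, -5.14292502713100941),
}

# Absolute and relative difference for each parameter.
abs_diff = {k: abs(obt - exp) for k, (obt, exp) in rows.items()}
rel_diff = {k: abs_diff[k] / abs(exp) for k, (obt, exp) in rows.items()}

for k in rows:
    print(f"row {k}: abs diff = {abs_diff[k]:.4f}, rel diff = {rel_diff[k]:.1%}")
```

Row 11 is the only one whose expected value is small in magnitude, so its relative difference is by far the largest; the other three are off by a sizeable fraction of a large value.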

Only 4 estimated parameters are significantly different from their expected values, and 3 of the 4 have relatively large-magnitude expected values. This is likely due to this test being nearly (or actually) over-specified, which results in a singular Hessian matrix and unstable optimization outcomes.
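As a rough illustration of why an over-specified model gives unstable estimates, the sketch below uses a toy least-squares problem, not the actual larch scheduling model or its log-likelihood. The design matrix, the seed, and all names are hypothetical. One column is an exact linear combination of the others, so the Hessian is singular and very different parameter vectors fit equally well.

```python
import numpy as np

# Hypothetical toy design matrix: column 2 equals column 0 + column 1,
# so the three coefficients are not separately identified.
rng = np.random.default_rng(0)
x0 = rng.normal(size=200)
x1 = rng.normal(size=200)
X = np.column_stack([x0, x1, x0 + x1])

# For the objective 0.5 * ||y - X @ beta||^2 the Hessian is X'X.
hessian = X.T @ X
eigvals = np.linalg.eigvalsh(hessian)  # ascending order
cond = np.linalg.cond(hessian)


def objective(beta, y):
    """Half the sum of squared residuals."""
    resid = y - X @ beta
    return 0.5 * resid @ resid


# Moving along this direction leaves X @ beta unchanged.
flat_direction = np.array([1.0, 1.0, -1.0])

beta_a = np.array([0.5, -0.3, 0.0])
y = X @ beta_a
beta_b = beta_a + 2.0 * flat_direction

print("smallest / largest eigenvalue:", abs(eigvals[0]) / eigvals[-1])
print("condition number:", cond)
print("objective at beta_a:", objective(beta_a, y))
print("objective at beta_b:", objective(beta_b, y))
```

Because the objective is flat along `flat_direction`, an optimizer can stop anywhere on that line, and small changes in dependencies or floating-point behavior can move the reported solution a long way. That is one plausible mechanism for the drift seen here, not a confirmed diagnosis.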

I am temporarily removing this one estimation test from our stable of tests so that work can proceed on open Phase 8 tasks that are just about complete. The consortium is contemplating work to revisit estimation mode in Phase 9, and fixing this test would most easily be undertaken as part of a comprehensive estimation-mode update, if that is going to happen. If it does not happen as part of Phase 9, we will address just this one test failure separately as a generalized maintenance and support activity.

jpn-- Dec 06 '23 14:12

@jpn-- Just want to inform you that we are running into estimation issues with atwork_subtour_scheduling. We are observing non-intuitive estimation results for atwork_subtour_scheduling using our region's household travel survey data. We have had to stop work on estimating this submodel due to a potential bug in the model. There may be something more going on than just poor test data.

We worked with @dhensle on this, so please reach out if you need more artifacts or information.

bwentl Jan 30 '24 19:01