Utility terms that compare pandas categorical variable to strings are not evaluated correctly with Sharrow
Describe the bug
After implementing the string-to-pandas-categorical conversion, some of our current CI tests failed. They all had Sharrow turned on and set to test mode. The utilities calculated with and without Sharrow are different.
[48:39.10] INFO: completed flow_LQLDEWSFEGQ5W2NJNONCWPNFSCB7O5RD.load in 0:00:18.710844 stop_frequency.work.simple_simulate.eval_mnl
[48:39.10] INFO: completed apply_flow in 0:00:21.651810
[48:39.10] INFO: elapsed time sharrow flow 0:00:21.659376 stop_frequency.work.simple_simulate.eval_mnl
[48:39.31] INFO: elapsed time simple flow 0:00:00.207622 stop_frequency.work.simple_simulate.eval_mnl.eval_utils
Not equal to tolerance rtol=0.01, atol=0
utility not aligned
Mismatched elements: 132 / 144 (91.7%)
Max absolute difference: 1998.00011762
Max relative difference: 1729.2712081
x: array([[ 0. , -1000.9582 , -1002.2882 , -1002.6522 , -1001.3462 ,
-2000.7913 , -2002.1212 , -2002.4852 , -1003.1262 , -2002.5713 ,
-2003.9012 , -2003.5703 , -1004.4472 , -2003.8922 , -2004.5272 ,...
y: array([[ 0.000000e+00, -1.958200e+00, -3.288200e+00, -3.652200e+00,
-2.346200e+00, -2.791200e+00, -4.121200e+00, -4.485200e+00,
-4.126200e+00, -4.571200e+00, -5.901200e+00, -5.570200e+00,...
big problem: 132 missed close values out of 144 (91.67%)
sh_util.shape=(9, 16)
(array([0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1,
1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2,
2, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 4, 4, 4, 4, 4, 4, 4, 4, 4,
4, 4, 4, 4, 4, 4, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 6,
6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 7, 7, 7, 7, 7, 7, 7, 7,
7, 7, 7, 7, 7, 7, 7, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8]), array([ 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 1, 2,
3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 1, 2, 3, 4,
5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 1, 2, 3, 5, 6, 7,
9, 10, 11, 13, 14, 15, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11,
12, 13, 14, 15, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13,
14, 15, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15,
1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 1, 2,
3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15]))
possible problematic expressions:
11.1% [043] (school_esc_outbound.isin(['ride_share', 'pure_escort']))
00.0% [044] (school_esc_inbound.isin(['ride_share', 'pure_escort']))
[48:43.24] ERROR: ===== ERROR IN stop_frequency =====
[48:43.24] ERROR:
Not equal to tolerance rtol=0.01, atol=0
utility not aligned
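For context, a minimal pandas-only sketch of the comparison involved; the column name school_esc_outbound comes from the flagged expression above, but the values and categories here are made up for illustration. Evaluated directly with pandas, isin() on a categorical column matches against the string labels and returns booleans; the mismatch in the log above only shows up on the Sharrow-compiled path.

```python
import pandas as pd

# Illustrative chooser column; in the failing model this comes from the
# school escorting step (the values below are invented for demonstration).
school_esc_outbound = pd.Series(
    ["ride_share", "no_escort", "pure_escort", "no_escort"],
    dtype="category",
)

# One of the flagged utility expressions, evaluated with plain pandas:
# isin() compares against the category labels and returns booleans.
mask = school_esc_outbound.isin(["ride_share", "pure_escort"])
print(mask.tolist())  # [True, False, True, False]
```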
To Reproduce
Steps to reproduce the behavior:
- Check out ...
- Run test_mtc_extended.py::test_prototype_mtc_extended_sharrow()
Expected behavior
The utilities computed with and without Sharrow should be the same.
Screenshots
Result of tracing the failed tour in stop_frequency.work:
Chooser (screenshot)
Non-Sharrow evaluation (screenshot)
Sharrow evaluation (screenshot)
Additional context
Temporary solution: I moved the pandas categorical vs. string comparisons to the preprocessors.
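A rough sketch of that workaround, assuming a preprocessor (annotation) step that runs before the utility spec is evaluated; the helper column name and the expression below are illustrative, not copied from the actual spec files. The categorical-vs-string comparison is computed once on the choosers table, and the utility spec then references the resulting plain boolean column instead of calling isin() itself.

```python
import pandas as pd

# Hypothetical choosers table with a categorical escort-type column.
choosers = pd.DataFrame(
    {"school_esc_outbound": pd.Categorical(
        ["ride_share", "no_escort", "pure_escort"])}
)

# Preprocessor-style step: do the categorical vs. string comparison up front,
# storing the result as a plain boolean column on the choosers table.
# (The column name school_esc_outbound_is_escorted is hypothetical.)
choosers["school_esc_outbound_is_escorted"] = choosers[
    "school_esc_outbound"
].isin(["ride_share", "pure_escort"])

# The utility spec can then reference school_esc_outbound_is_escorted
# directly, so Sharrow never has to compile the categorical-to-string
# comparison inside the utility expression.
```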