MoveSim
Question about the current implementation of mobility regularity-aware loss
https://github.com/FIBLAB/MoveSim/blob/93e6837fa318a9e5a3966f5267accfb303ed8bb6/code/main.py#L165
Is the current implementation of the mobility regularity-aware loss correct? Currently both $L_d$ and $L_p$ are added directly to the loss used to compute the policy gradient, but they cannot produce any gradient with respect to the input (the generated sequences), since those are discrete values.
I guess the correct way is to add them to the reward instead. Is that right?
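To illustrate the suggested fix, here is a minimal NumPy sketch of a one-step REINFORCE update. The names (`regularity_penalty`, `n_locs`, the scalar `task_reward`) are hypothetical stand-ins, not MoveSim's actual code: the point is only that a penalty computed on a discrete sample is a constant w.r.t. the policy logits, so adding it to the loss contributes zero gradient, whereas folding it into the reward scales $\nabla \log \pi$ as intended.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

# Hypothetical one-step policy over n_locs discrete locations.
n_locs = 5
logits = rng.normal(size=n_locs)

def regularity_penalty(loc):
    # Stand-in for L_d + L_p: a scalar computed from the discrete
    # sample; on its own it carries no gradient w.r.t. the logits.
    return 0.1 * loc

probs = softmax(logits)
loc = rng.choice(n_locs, p=probs)

# Wrong: loss = pg_loss + (L_d + L_p). The penalty term is a constant
# in the logits, so its contribution to d(loss)/d(logits) is zero.
# Suggested fix: fold the penalty into the reward instead.
task_reward = 1.0  # e.g. a discriminator score (illustrative value)
reward = task_reward - regularity_penalty(loc)

# Gradient of log pi(loc) w.r.t. the logits of a categorical policy:
# one-hot(loc) - probs.
grad_log_pi = -probs
grad_log_pi[loc] += 1.0

# REINFORCE estimator of d E[reward] / d logits: now the penalty
# actually shapes the update direction.
policy_gradient = reward * grad_log_pi
```

In an autograd framework the same idea reads as `loss = -(reward * log_prob).mean()` with `reward` treated as a non-differentiable constant (detached), which is why the penalties must enter through `reward` rather than as separate loss terms.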