DL2
Question about state changing
Can someone explain how to handle the last state of the i-th scheduling and the first state of the (i+1)-th scheduling? How should they be combined into an RL sample of the form (s, a, r, s')? Thank you.