Xufang Luo
@KellySit Hi Kelly! Thanks for your interest! Since we already have some people and effort dedicated to supporting Claude Code, we would like to explore how we might collaborate. Could...
@XianglongTan What kind of off-policy training would you like to have? The trajectories collected in one iteration can be split into multiple mini-batches, and this is a kind of off-policy training. This...
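To make the mini-batch point concrete, here is a minimal sketch in plain Python. It is not Agent Lightning code: `split_into_minibatches`, the `rollouts` list, and the batch size are all hypothetical names chosen for illustration. The idea it shows is only that once the first mini-batch has been used for an update, every later mini-batch was collected by a policy that is now slightly stale.

```python
import random
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def split_into_minibatches(
    trajectories: Sequence[T], batch_size: int, seed: int = 0
) -> List[List[T]]:
    """Shuffle trajectories and cut them into mini-batches (hypothetical helper)."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    indices = list(range(len(trajectories)))
    random.Random(seed).shuffle(indices)  # seeded so the split is reproducible
    return [
        [trajectories[i] for i in indices[start:start + batch_size]]
        for start in range(0, len(indices), batch_size)
    ]


# Placeholder rollouts standing in for trajectories collected in one iteration.
rollouts = [f"traj_{i}" for i in range(10)]
minibatches = split_into_minibatches(rollouts, batch_size=4)

for step, batch in enumerate(minibatches):
    # At step 0 the data matches the current policy. At step k > 0 the policy
    # has already been updated k times since the data was collected, so the
    # update on this mini-batch is (mildly) off-policy.
    updates_since_collection = step
    print(f"step {step}: {len(batch)} trajectories, "
          f"{updates_since_collection} updates since collection")
```

The last mini-batch may be smaller than `batch_size`; whether to keep or drop it is a design choice this sketch leaves open.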
Hi @boardman0 Thanks for your interest in Agent Lightning! I am not sure what you mean by transition classification. We do not classify transitions. What we do in...