Cal-QL
Question about the configuration dictionary for the default high/low reward values of each env
While reading through your code, I found a function, cal_return_to_go,
which requires a config dictionary of the high/low reward values for each env.
What is its purpose, and what should we do in real-world problems where the high/low rewards of the environment are not known in advance?
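
To make the question concrete, here is a minimal sketch of a generic discounted return-to-go computation, along with one possible fallback when reward bounds are not known ahead of time. This is not the repository's cal_return_to_go. The names `REWARD_BOUNDS`, `discounted_return_to_go`, and `bounds_from_dataset`, as well as the example environment key and its values, are hypothetical and only illustrate the kind of setup being asked about.

```python
# Hypothetical per-environment reward bounds (illustrative names and values only).
REWARD_BOUNDS = {"example-sparse-env": {"low": 0.0, "high": 1.0}}


def discounted_return_to_go(rewards, gamma):
    """Return G_t = r_t + gamma * G_{t+1} for every step of one trajectory."""
    returns = [0.0] * len(rewards)
    running = 0.0
    # Walk backwards so each step can reuse the return of the step after it.
    for t in reversed(range(len(rewards))):
        running = rewards[t] + gamma * running
        returns[t] = running
    return returns


def bounds_from_dataset(rewards):
    """Assumed fallback: estimate bounds from the rewards observed in the data."""
    return {"low": float(min(rewards)), "high": float(max(rewards))}


if __name__ == "__main__":
    trajectory_rewards = [1.0, 1.0, 1.0]
    print(discounted_return_to_go(trajectory_rewards, gamma=0.5))
    print(bounds_from_dataset(trajectory_rewards))
```

Whether empirical bounds like these would be an acceptable substitute for the configured values is exactly what I am unsure about.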