
Results 16 RL-book issues

ERROR: Cannot install -r requirements.txt (line 66) and urllib3==1.26.5 because these package versions have conflicting dependencies. The conflict is caused by: The user requested urllib3==1.26.5 requests 2.24.0 depends on urllib3!=1.25.0,...

The code contains `return Process1.State(price=state.price + up_move * 2 - 1)`, whereas the text describes a logistic function of (L - X_t). Why is up_move multiplied by 2 and then reduced by 1?
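A minimal sketch of the arithmetic being asked about, assuming `up_move` is a 0/1 Bernoulli sample and that the logistic function of (L - X_t) only sets the probability of an up move. Under that assumption, `up_move * 2 - 1` maps 0 to -1 and 1 to +1, so the price moves by one unit either way. The names `logistic_up_prob`, `next_price`, `level`, and `alpha` are hypothetical and are not taken from the repository.

```python
import math
import random


def logistic_up_prob(price: float, level: float, alpha: float = 0.25) -> float:
    """Probability of an up move: a logistic function of (level - price).

    `alpha` is an assumed steepness parameter, chosen only for illustration.
    """
    return 1.0 / (1.0 + math.exp(-alpha * (level - price)))


def next_price(price: int, level: int, rng: random.Random) -> int:
    """Sample one price step of +1 or -1."""
    # up_move is 1 with the logistic probability and 0 otherwise.
    up_move = int(rng.random() < logistic_up_prob(price, level))
    # Map {0, 1} to {-1, +1}: 0 * 2 - 1 = -1 and 1 * 2 - 1 = +1.
    return price + up_move * 2 - 1
```

With `price == level` the up-probability is exactly 0.5, and the further the price sits below `level`, the more likely the next move is +1.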

Bumps [ipython](https://github.com/ipython/ipython) from 7.14.0 to 8.10.0. Release notes (sourced from ipython's releases): See https://pypi.org/project/ipython/. We do not use GitHub release anymore. Please see PyPI https://pypi.org/project/ipython/. Commits: 15ea1ed release 8.10.0; 560ad10...

dependencies

Bumps [jupyter-core](https://github.com/jupyter/jupyter_core) from 4.6.3 to 4.11.2. Release notes (sourced from jupyter-core's releases): 4.11.1. What's Changed: Fix inclusion of jupyter file and check in CI by @blink1073 in jupyter/jupyter_core#276. Full Changelog:...

dependencies

Bumps [certifi](https://github.com/certifi/python-certifi) from 2020.6.20 to 2022.12.7. Commits: 9e9e840 2022.12.07; b81bdb2 2022.09.24; 939a28f 2022.09.14; aca828a 2022.06.15.2; de0eae1 Only use importlib.resources's new files() / Traversable API on Python ≥3.11; ... b8eb5e9 2022.06.15.1...

dependencies

I am confused about line 40: why is it state.state.inventory_position() instead of state.inventory, and on line 45, why state.state.on_hand instead of state.on_hand?

It would be good for the dynamic programming algorithms to accept a tolerance parameter (e.g., value_iteration_result takes an extra input tolerance: float).
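A rough sketch of what such a parameter could look like, written as a self-contained value iteration on a plain dictionary rather than against the library's own types. The function `value_iteration`, the `Transitions` alias, and the dictionary layout are assumptions made for illustration; they are not the signature of `value_iteration_result`.

```python
from typing import Dict, List, Tuple

# Hypothetical layout: state -> action -> list of (probability, next_state, reward).
# Any next_state that is not a key of the dictionary is treated as terminal (value 0).
Transitions = Dict[str, Dict[str, List[Tuple[float, str, float]]]]


def value_iteration(
    mdp: Transitions, gamma: float, tolerance: float = 1e-6
) -> Dict[str, float]:
    """Iterate the Bellman optimality update until the largest change
    in any state's value falls below `tolerance`."""
    values = {s: 0.0 for s in mdp}
    while True:
        new_values = {
            s: max(
                sum(p * (r + gamma * values.get(s1, 0.0)) for p, s1, r in outcomes)
                for outcomes in actions.values()
            )
            for s, actions in mdp.items()
        }
        # The caller-supplied tolerance decides when to stop.
        if max(abs(new_values[s] - values[s]) for s in mdp) < tolerance:
            return new_values
        values = new_values
```

For example, `value_iteration({"a": {"stay": [(1.0, "a", 1.0)]}}, gamma=0.5)` approaches the true value 2, while a loose setting such as `tolerance=0.6` stops after two sweeps.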

The summation variable for a Stationary Distribution (p74 of pdf) in Chapter2.md should be s, not s', i.e. the sum should run over s in N. Wrong: ![image](https://user-images.githubusercontent.com/16808143/152866812-a23c2957-6cbb-4f69-876e-3f06df5f3d46.png)
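For reference, the standard stationary-distribution condition with the sum running over s, as the issue proposes. The notation is assumed here: $\pi$ is the distribution, $\mathcal{P}$ the transition probability function, and $\mathcal{N}$ the set of non-terminal states.

```latex
\pi(s') = \sum_{s \in \mathcal{N}} \pi(s) \cdot \mathcal{P}(s, s')
\quad \text{for all } s' \in \mathcal{N}
```

The free variable on the left is $s'$, so the bound summation variable on the right has to be $s$.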

The two code examples for the iterative square root algorithm raise an `UnboundLocalError` and are missing an assignment; even with the local variables initialized, the loop will never terminate as x...
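Since the issue text is cut off, the book's exact code is not reproduced here. The following is only a sketch of one way to write an iterative square root that initializes its variables before the loop and updates `x` on every pass, using the Newton (Babylonian) update. The name `sqrt_newton` and the `tolerance` default are assumptions.

```python
def sqrt_newton(a: float, tolerance: float = 1e-8) -> float:
    """Approximate sqrt(a) with the update x <- (x + a / x) / 2."""
    if a < 0:
        raise ValueError("a must be non-negative")
    if a == 0:
        return 0.0
    # Initialize before the loop so no variable is read before assignment.
    x = a if a >= 1 else 1.0
    while True:
        x_next = (x + a / x) / 2
        if abs(x_next - x) < tolerance:
            return x_next
        # Without this assignment x would never change and the loop would not end.
        x = x_next
```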

pg. 119 has the following code snippet:

```python
from typing import Iterator

X = TypeVar('X')

def iterate(step: Callable[[X], X], start: X) -> Iterator[X]:
    ...
```

It seems odd to explicitly...
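To make the snippet's signature concrete, here is a runnable sketch with all three `typing` names imported. The generator body is an assumption: it is one plausible implementation consistent with the signature, not a quotation from the book.

```python
from itertools import islice
from typing import Callable, Iterator, TypeVar

X = TypeVar('X')


def iterate(step: Callable[[X], X], start: X) -> Iterator[X]:
    """Yield start, step(start), step(step(start)), ... without end."""
    state = start
    while True:
        yield state
        state = step(state)


# The stream is infinite, so take a finite prefix with islice.
first_five = list(islice(iterate(lambda x: x * 2, 1), 5))
```

Here `first_five` holds the first five values of the doubling sequence starting at 1.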