FinRL
ValueError: Expected parameter loc (Tensor of shape (64, 1)) of distribution Normal(loc: torch.Size([64, 1]), scale: torch.Size([64, 1])) to satisfy the constraint Real(), but found invalid values:
I am using the FinRL stock trading environment for single-stock model training. When I try to train a model, the error above shows up.
To Reproduce
Steps to reproduce the behavior:
- Go to finrl/meta/env_stock_trading/env_stocktrading.py and use this environment
- Train a Stable-Baselines3 model on a training DataFrame that contains only one stock (a minimal sketch follows these steps)
- See error
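For reference, a minimal reproduction sketch; train_df (a preprocessed single-stock DataFrame) and env_kwargs (the usual FinRL environment parameters) are placeholders I added, not values from the original report:

from stable_baselines3 import PPO
from stable_baselines3.common.vec_env import DummyVecEnv
from finrl.meta.env_stock_trading.env_stocktrading import StockTradingEnv

env = DummyVecEnv([lambda: StockTradingEnv(df=train_df, **env_kwargs)])  # placeholders, see above
model = PPO("MlpPolicy", env, verbose=1)
model.learn(total_timesteps=50_000)  # the ValueError appears at a random point during training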
Expected behavior
Models train without this error appearing.
Screenshots: a screenshot of the error message was attached in the original issue.
Previous FinRL GitHub issue of this type: there is already an issue on FinRL for this error; you can find it here. According to that issue post: "This error is usually caused by the nan and inf values in the states. You may try to replace nan and inf values with 0s and see if there is still this error."
What I did / my findings:
- I ensured there are no NaN or infinity values in the DataFrame we pass to the environment
- On GitHub, people said the issue could be caused by some specific feature, so I changed features and tried different combinations
- Before returning the state, I printed how many NaN values it contains; for every state the printed count was zero
- I added this line before returning a state (a vectorized alternative is sketched after this list):
state = [i if math.isfinite(i) else 0 for i in state]  # math.isfinite is False for NaN and ±inf, so those are replaced with 0
(so the issue is not in the state). What I also noticed is that this error is related to randomness (probably of the actions/policy): it does not appear regularly; sometimes it appears right away, sometimes later, and sometimes not at all.
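As a side note on the sanitizer in the bullet above, the same replacement can be done in one vectorized call with NumPy (my suggestion, not code from the environment):

import numpy as np

# np.nan_to_num maps NaN -> 0 and ±inf -> 0 in a single pass
state = np.nan_to_num(np.asarray(state, dtype=np.float64),
                      nan=0.0, posinf=0.0, neginf=0.0).tolist()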
Desktop:
- OS: MacOS (M1)
- Python = 3.9
- Gym = 0.21.0
- stable_baselines3
- tensorflow = 2.7.0
Leave a comment here if you have any idea what the source of this issue may be. Thanks!
UPDATE:
I found out that the problem is caused by the reward function, which I had rewritten to use the Calmar ratio. When I run the script with the reward defined as the change in portfolio value, the error does not occur.
The problem is that the function that calculates the Calmar ratio outputs NaN values. So I tried to replace every NaN reward with 0 using this piece of code:
import numpy as np  # for the type check below

ratio = RatiosClass().calmar_ratio(e, r, f)
if math.isnan(ratio):
    print("NAN")
    ratio = 0
elif not isinstance(ratio, np.float64):  # same check as comparing str(type(ratio)), just idiomatic
    print("NOT NUMERIC")
    ratio = 0
self.reward = ratio
Although NaN values and values of a type other than numpy.float64 are replaced with 0, the error still occurs. One possible explanation (a guess on my part, checked below): a division by zero inside the ratio can also return ±inf, which math.isnan does not catch and which is a perfectly valid numpy.float64, so it slips through both branches.
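A quick standalone check of that guess (not code from the trading environment):

import math
import numpy as np

ratio = np.float64(1.0) / np.float64(0.0)  # e.g. positive return / zero max drawdown
print(ratio)                                # inf (NumPy also emits a RuntimeWarning)
print(math.isnan(ratio))                    # False -> the NaN branch does not trigger
print(isinstance(ratio, np.float64))        # True  -> the type branch does not trigger either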
The error was finally fixed by changing the reward calculation itself: if there was a zero in the denominator, divide by one instead.
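My RatiosClass is not shown above, so here is a minimal sketch of a Calmar ratio with that guard, under the assumption that the ratio is computed as annualized return over maximum drawdown (function name and signature are mine, not FinRL's):

import numpy as np

def calmar_ratio(returns, periods_per_year=252):
    # Calmar ratio = annualized return / maximum drawdown
    equity = np.cumprod(1.0 + np.asarray(returns, dtype=np.float64))
    annual_return = equity[-1] ** (periods_per_year / len(returns)) - 1.0
    running_max = np.maximum.accumulate(equity)
    max_drawdown = np.max((running_max - equity) / running_max)
    # The fix: a flat or monotonically rising equity curve gives a zero
    # max drawdown, so divide by one instead of zero.
    return annual_return / (max_drawdown if max_drawdown != 0 else 1.0)

With the denominator guarded, the reward is always finite, so no NaN/inf ever reaches the policy update.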