FinRL_Podracer
delta_stock comparing scaled action to unscaled available_amount
This didn't look right to me when I stepped through it. My code adds Interactive Brokers commissions but is otherwise the same:
    if action > 0:  # buy_stock
        available_amount = self.account // adj
        delta_stock = min(available_amount, action)
        self.stocks[index] += delta_stock
        if USE_IB_COST:
            comm = max(delta_stock * TRANSACTION_FEE_PER_SHARE, 1.0)
        else:
            comm = (adj * delta_stock) * self.transaction_fee_percent
        self.account -= adj * delta_stock + comm
    elif self.stocks[index] > 0:  # sell_stock
        delta_stock = min(-action, self.stocks[index])
        if USE_IB_COST:
            comm = max(delta_stock * TRANSACTION_FEE_PER_SHARE, 1.0)
        else:
            comm = (adj * delta_stock) * self.transaction_fee_percent
        self.account += adj * delta_stock - comm
A typical action value on the 3rd line above is 0.295, while available_amount is unscaled at, say, 47778. So min() will always return action, and delta_stock ends up as a fraction of a share.
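To make the mismatch concrete, here is a minimal sketch using the two numbers quoted above. The helper `shares_to_buy` and the `max_stock` factor are hypothetical names I made up for illustration, not part of the repo; converting the action to whole shares before the comparison is just one possible fix, shown under that assumption.

```python
def shares_to_buy(action, available_amount, max_stock=None):
    """Return the share count a buy would trade.

    max_stock is a hypothetical scaling factor (an assumption, not from the
    original code). When given, the scaled action is converted to whole
    shares before it is compared with available_amount.
    """
    if max_stock is not None:
        action = int(action * max_stock)  # scaled action -> whole shares
    return min(available_amount, action)


available_amount = 47778  # unscaled share count from the report
action = 0.295            # typical scaled action from the report

# As written in the issue: min() just returns the fractional action.
unscaled_result = shares_to_buy(action, available_amount)

# With the assumed max_stock factor, both operands are in shares.
scaled_result = shares_to_buy(action, available_amount, max_stock=100)

print(unscaled_result, scaled_result)
```

With both operands in the same unit, available_amount can actually cap the order when the account is low, which the current comparison never does.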
I assume this is your scaling:
    state = np.hstack((
        self.account * 2 ** -16,
        self.day_npy * 2 ** -8,
        self.stocks * 2 ** -12,
    ), ).astype(np.float32)
Looks like a bug to me. Why not use MinMaxScaler?
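For comparison, a rough sketch of what I mean. It uses plain NumPy to apply the same per-feature formula that scikit-learn's MinMaxScaler uses, (x - min) / (max - min), next to the fixed power-of-two factor from the snippet above. The account value and the bounds are made-up numbers for illustration; with min-max scaling the bounds have to be chosen or estimated ahead of time.

```python
import numpy as np


def min_max_scale(x, lo, hi):
    """Map x from [lo, hi] to [0, 1] (the min-max formula, per feature)."""
    x = np.asarray(x, dtype=np.float32)
    return (x - lo) / (hi - lo)


account = 1_000_000.0  # illustrative account balance (assumption)

# Fixed power-of-two scaling, as in the state construction above.
pow2_scaled = account * 2 ** -16

# Min-max scaling with assumed bounds of 0 and 2,000,000.
minmax_scaled = min_max_scale(account, 0.0, 2_000_000.0)

print(pow2_scaled, float(minmax_scaled))
```

Either way, the action passed to min() still has to be in the same unit as available_amount, so this only addresses the state scaling.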