delta_stock comparing scaled action to unscaled available_amount

Open davidwynter opened this issue 4 years ago • 0 comments

This doesn't look right to me when I stepped through it. My code adds Interactive Brokers commissions but is otherwise the same:

            if action > 0:  # buy_stock
                available_amount = self.account // adj
                delta_stock = min(available_amount, action)
                self.stocks[index] += delta_stock
                if USE_IB_COST:
                    comm = max(delta_stock * TRANSACTION_FEE_PER_SHARE, 1.0)

                else:
                    comm = (adj * delta_stock) * self.transaction_fee_percent

                self.account -= adj * delta_stock + comm

            elif self.stocks[index] > 0:  # sell_stock
                delta_stock = min(-action, self.stocks[index])
                if USE_IB_COST:
                    comm = max(delta_stock * TRANSACTION_FEE_PER_SHARE, 1.0)

                else:
                    comm = (adj * delta_stock) * self.transaction_fee_percent

                self.account += adj * delta_stock - comm


A typical action value on the 3rd line above (the `min()` call) is 0.295, while available_amount is unscaled at, say, 47778. So action will always be the minimum, and delta_stock ends up as a fraction of a share.
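
The mismatch can be reproduced in isolation. This is a minimal sketch using the two values quoted above. `MAX_STOCK` is a hypothetical scale factor introduced only for illustration, not a name taken from the repository, and the rescaling shown is one possible fix rather than the intended behaviour.

```python
# Values quoted in this issue.
action = 0.295            # raw policy output, a small float
available_amount = 47778  # self.account // adj, in whole shares

# What the buy branch computes: the small action always wins the min().
delta_stock = min(available_amount, action)
print(delta_stock)  # 0.295, i.e. less than one share

# One possible fix (assumption): convert the action to shares first.
MAX_STOCK = 100  # hypothetical: shares traded when action == 1.0
scaled_action = int(action * MAX_STOCK)
delta_scaled = min(available_amount, scaled_action)
print(delta_scaled)  # 29
```

With the action expressed in shares, the `min()` compares two quantities in the same unit, and available_amount can act as the cap it appears to be meant as.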

I assume this is your scaling:

        state = np.hstack((
            self.account * 2 ** -16,
            self.day_npy * 2 ** -8,
            self.stocks * 2 ** -12,
        ), ).astype(np.float32)
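
To make those factors concrete: `2 ** -16` divides by 65536 and `2 ** -12` divides by 4096. The sketch below applies them to hypothetical raw values (the numbers are mine, chosen only for illustration). It shows that this scaling shrinks what goes into the state, while nothing in the snippets quoted here maps the action back into share units.

```python
# Hypothetical raw values, chosen only for illustration.
account = 1_000_000.0  # cash balance
stocks = 4096.0        # shares held in one ticker

scaled_account = account * 2 ** -16  # same as dividing by 65536
scaled_stocks = stocks * 2 ** -12    # same as dividing by 4096

print(scaled_account)  # 15.2587890625
print(scaled_stocks)   # 1.0

# The scaling is exactly reversible, since powers of two are exact in
# floating point, but it is applied to the state only, not to the action.
restored_account = scaled_account * 2 ** 16
print(restored_account == account)  # True
```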

Looks like a bug to me. Why not use MinMaxScaler?

davidwynter, Apr 02 '21 16:04