Chi Zhang comments

Results: 15 comments of Chi Zhang

Why is MOPO given access to the terminal function in rollout generation?

Actually, assuming access to the terminal function is reasonable in practice when performing model-based RL. We can even assume access to a known reward function when doing model-based learning. The reward...
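
To make the point about the terminal function concrete, here is a minimal sketch of a model rollout that is cut short by a known termination rule. It is not MOPO's actual code: `terminal_fn`, `reward_fn`, `learned_dynamics`, and the one-dimensional state are all hypothetical stand-ins chosen for illustration.

```python
import random


def terminal_fn(state):
    # Hypothetical known termination rule: stop once the state leaves [-1, 1].
    return abs(state) > 1.0


def reward_fn(state, action, next_state):
    # Hypothetical known reward: stay close to the origin.
    return -abs(next_state)


def learned_dynamics(state, action, rng):
    # Stand-in for a learned dynamics model: a noisy linear step.
    return state + 0.1 * action + rng.gauss(0.0, 0.01)


def rollout(start_state, policy, horizon, seed=0):
    """Generate one model rollout, truncated by the known terminal function."""
    rng = random.Random(seed)
    state = start_state
    transitions = []
    for _ in range(horizon):
        action = policy(state)
        next_state = learned_dynamics(state, action, rng)
        reward = reward_fn(state, action, next_state)
        done = terminal_fn(next_state)
        transitions.append((state, action, reward, next_state, done))
        if done:
            break  # do not keep rolling out past a terminal state
        state = next_state
    return transitions


steps = rollout(0.0, policy=lambda s: 1.0, horizon=50)
print(len(steps), steps[-1][4])
```

Only the dynamics are treated as learned here; the termination rule and the reward are supplied by hand, which is the assumption discussed above.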

Question regarding Model State

The trading model is this: at timestamp t, allocate the portfolio, denoted w1, buy at the open price, and sell the entire portfolio at the close price at timestamp t...

Question regarding Model State

Yes.

Question regarding Model State

1. There may be something wrong with the commission calculation. We were in a hurry and didn't focus too much on the trading model. But it doesn't affect the algorithm....

TypeError

Use the Theano backend of Keras or pull the newest commit. For the TensorFlow backend, the error and solution are described at https://github.com/keras-team/keras/issues/2397.

Results can not be replicated

This is a well-known issue in many fields that use deep learning models, especially RL. I suggest you try different random seeds and train for a longer period. Here...
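
As a rough illustration of the random-seed advice, the sketch below repeats a run over several seeds and reports the mean and spread instead of a single number. `train_and_evaluate` is a hypothetical placeholder for a real training run, not part of any library.

```python
import random
import statistics


def train_and_evaluate(seed):
    # Hypothetical placeholder for a full training run: returns a noisy
    # score that depends only on the seed, so runs are repeatable.
    rng = random.Random(seed)
    return 1.0 + rng.gauss(0.0, 0.2)


def evaluate_over_seeds(seeds):
    """Run one training job per seed and summarize the scores."""
    scores = [train_and_evaluate(seed) for seed in seeds]
    return statistics.mean(scores), statistics.stdev(scores), scores


mean_score, std_score, all_scores = evaluate_over_seeds(range(5))
print(f"mean={mean_score:.3f} std={std_score:.3f}")
```

Reporting the spread across seeds makes it easier to tell whether a result that fails to replicate is within normal run-to-run variation.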

Question

The market value refers to the return obtained by distributing your current investment volume equally. I think it's better to incorporate news data into the model as it is super...
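
A minimal sketch of that market-value baseline, assuming it means re-splitting the current value equally across all assets every period. The function name and the use of per-period price relatives (end price divided by start price) are assumptions made for illustration.

```python
def equal_weight_market_value(price_relatives, initial_value=1.0):
    """Track the value of a portfolio that is re-split equally every period.

    price_relatives: one list per period, holding end_price / start_price
    for each asset (an assumed input format).
    """
    value = initial_value
    history = []
    for relatives in price_relatives:
        # With equal weights, the period's growth factor is the plain average.
        growth = sum(relatives) / len(relatives)
        value *= growth
        history.append(value)
    return history


history = equal_weight_market_value([[1.1, 0.9], [1.2, 1.0]])
print(history)
```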

Question

We didn't include the news in this course project due to time constraints. The general idea is to predict the sentiment label (positive, negative, neutral) of each stock at each...

Question

Yes, the result of imitation learning is to just buy one stock. We tried to optimize the Sharpe ratio directly, but it turned out to be a very difficult problem since...
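
For reference, the Sharpe ratio mentioned here can be computed from a series of per-period returns as below. This is the standard textbook definition, not the project's own code; the zero risk-free rate and the 252-periods-per-year annualization are assumed defaults.

```python
import math
import statistics


def sharpe_ratio(returns, risk_free_rate=0.0, periods_per_year=252):
    """Annualized Sharpe ratio of a list of per-period returns."""
    excess = [r - risk_free_rate for r in returns]
    mean_excess = statistics.mean(excess)
    std_excess = statistics.stdev(excess)  # sample standard deviation
    if std_excess == 0:
        raise ValueError("Sharpe ratio is undefined for constant returns")
    return mean_excess / std_excess * math.sqrt(periods_per_year)


daily_returns = [0.01, -0.005, 0.02, 0.0, 0.003]
print(sharpe_ratio(daily_returns))
```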

Question

The trading rule rests on an assumption: buy at the open price, sell all holdings at the close price on the same day, and repeat.
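
The rule can be sketched as follows, assuming portfolio weights that sum to 1 and ignoring commission. The function names and input layout are hypothetical, not taken from the project.

```python
def daily_growth(weights, open_prices, close_prices):
    """Growth factor for one day: buy at the open, sell everything at the close."""
    return sum(
        w * (close / open_)
        for w, open_, close in zip(weights, open_prices, close_prices)
    )


def simulate(weights_per_day, opens_per_day, closes_per_day, initial_value=1.0):
    """Repeat the open-to-close rule every day and return the final value."""
    value = initial_value
    for weights, opens, closes in zip(weights_per_day, opens_per_day, closes_per_day):
        # Commission is ignored in this sketch.
        value *= daily_growth(weights, opens, closes)
    return value


final_value = simulate(
    weights_per_day=[[0.5, 0.5], [0.5, 0.5]],
    opens_per_day=[[10.0, 20.0], [10.0, 20.0]],
    closes_per_day=[[11.0, 20.0], [11.0, 20.0]],
)
print(final_value)
```

Because every position is closed at the end of the day, each day's growth depends only on that day's weights and open and close prices.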