
Questions about Model and Training

Open mvpcom opened this issue 6 years ago • 17 comments

Thanks for sharing this. It seems to me this is a trained model for testing purposes only, using your benchmark system. According to your paper, you trained the network on a custom dataset (around 14 hours of manual and automatic driving in Town01) with the Adam optimizer. However, I need more detail about the perturbation technique. How can I implement the same method for data collection in CARLA using the Python interface? Since you're not going to release your dataset and training code anytime soon, I'm going to build my own dataset, at least until you release the training code. Another question: what about the results in the second table? Does the benchmarking system produce the same results as the first table? (By tables I mean the tables in "CARLA: An Open Urban Driving Simulator".) And the last question: how long does the benchmark take? How many hours? I ran it recently and it is still running after more than two and a half hours on a 1080 GPU and a good computer :D

mvpcom avatar Dec 27 '17 17:12 mvpcom

Hey @mvpcom. I will release the dataset! Here is the link: https://drive.google.com/file/d/1hloAeyamYn-H6MfV1dRtY1gJPhkR55sY/view I will put it in the README and explain the fields. The training code will probably be released next year, maybe after the ICRA decision.

About the benchmark: it is super slow. It takes about one day to run. We are currently trying to speed up CARLA in order to do faster benchmarks.

The results are quite similar. There are a few differences because of changes in CARLA, but the dataset is still valid.

felipecode avatar Dec 28 '17 14:12 felipecode

Thanks @felipecode. I believe your work is great, and I will be happy to help in any part of the development process.

Please also check this commit; I left a comment there.

mvpcom avatar Dec 28 '17 15:12 mvpcom

I wrote training code for the same model using the same dataset. First of all, I believe that during training you generated specific data for each branch according to the control input. Am I right? I'm also curious about the speed branch. These are all outputs of the network, one for each branch separately. The question is: do you use a sequence as input or just a single frame? I ask because determining speed requires at least two frames; otherwise there is no useful information in the last branch. Why is the speed branch useful for training the whole network?

branch_config = [["Steer", "Gas", "Brake"], ["Steer", "Gas", "Brake"],
                 ["Steer", "Gas", "Brake"], ["Steer", "Gas", "Brake"], ["Speed"]]

And another question is about the augmentation parameters. Would you please let me know which parameters were used for each of the different augmentation techniques? I need all the details for Gaussian blur, additive Gaussian noise, pixel dropout, additive and multiplicative brightness variation, contrast variation, and saturation variation.

mvpcom avatar Jan 05 '18 18:01 mvpcom

I use just one frame. The speed is also sent as input; this helps the agent learn how to stop and how to reduce speed on curves. The speed prediction, as an auxiliary target, is something we tried that seems to help the car stop better at intersections. It was not in the paper, since we don't know how much it helps (it depends on the training data; in some cases it helps, in others it doesn't).
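Roughly, the setup looks like this (just an illustrative sketch, not our actual code; the layer sizes and names here are made up, and one conv layer stands in for the full perception stack):

    import tensorflow as tf

    image = tf.placeholder(tf.float32, [None, 88, 200, 3])
    speed = tf.placeholder(tf.float32, [None, 1])          # measured speed, fed as input

    conv = tf.layers.conv2d(image, 32, 5, strides=2, activation=tf.nn.relu)
    features = tf.layers.dense(tf.layers.flatten(conv), 512, activation=tf.nn.relu)

    # The measured speed joins the image features for the control branches.
    joint = tf.concat([features, tf.layers.dense(speed, 128, activation=tf.nn.relu)], axis=1)

    # One control branch per high-level command, plus the auxiliary branch that
    # predicts the current speed from image features alone.
    control_branches = [tf.layers.dense(joint, 3, name='branch_%d' % i)  # steer, gas, brake
                        for i in range(4)]
    speed_branch = tf.layers.dense(features, 1, name='speed_branch')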

For augmentation, I used the imgaug library. Here are the parameters:

    from imgaug import augmenters as iaa

    st = lambda aug: iaa.Sometimes(0.4, aug)
    oc = lambda aug: iaa.Sometimes(0.3, aug)
    rl = lambda aug: iaa.Sometimes(0.09, aug)
    self.augment = iaa.Sequential([
        rl(iaa.GaussianBlur((0, 1.5))),  # blur images with a sigma between 0 and 1.5
        rl(iaa.AdditiveGaussianNoise(loc=0, scale=(0.0, 0.05), per_channel=0.5)),  # add gaussian noise to images
        oc(iaa.Dropout((0.0, 0.10), per_channel=0.5)),  # randomly drop up to 10% of the pixels
        oc(iaa.CoarseDropout((0.0, 0.10), size_percent=(0.08, 0.2), per_channel=0.5)),  # drop up to 10% of the pixels in coarse rectangles
        oc(iaa.Add((-40, 40), per_channel=0.5)),  # shift brightness additively (-40 to +40)
        st(iaa.Multiply((0.10, 2.5), per_channel=0.2)),  # scale brightness (10% to 250% of original)
        rl(iaa.ContrastNormalization((0.5, 1.5), per_channel=0.5)),  # improve or worsen the contrast
        rl(iaa.Grayscale((0.0, 1))),  # partially or fully convert to grayscale
    ], random_order=True)  # apply the augmenters in random order
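To apply it to a batch, something like this should work (assuming images is a uint8 NumPy array of shape (N, H, W, C)):

    images_aug = self.augment.augment_images(images)  # imgaug applies the pipeline per image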

That is the dirty part of deep learning. The model is quite sensitive to the augmentation, which it needs in order to generalize to the other town. This had to be carefully tuned. It is interesting that both towns are very similar; yet generalization is a problem, and you need some psychedelic augmentations to make it work.

Looking forward to seeing your results! Cheers.

felipecode avatar Jan 06 '18 13:01 felipecode

I have one more question, dear @felipecode :D It is about the total number of training steps. As you said in the paper, the model was trained for 294,000 steps. For how many days, and with what kind of configuration (GPUs?), did you train? Is there any diagram that shows training and validation loss? I couldn't find one in either paper. At the very least, I'd like to know the final training and validation loss after all 294,000 steps, to have an idea of when my model is comparable to your final one. For your information, I used the MSE of the control outputs/speed for each branch as the loss function. I want to make sure I didn't forget anything. With my current implementation and configuration (1080 GPU), approximately every 230 steps take an hour.

A sample from my training procedure, to give a better sense:

Train::: Epoch: 0, Step: 580, TotalSteps: 581, Loss: 0.0489383 (Follow Lane)
Train::: Epoch: 0, Step: 580, TotalSteps: 581, Loss: 0.0419313 (Go Left)
Train::: Epoch: 0, Step: 580, TotalSteps: 581, Loss: 0.0437215 (Go Right)
Train::: Epoch: 0, Step: 580, TotalSteps: 581, Loss: 0.0526295 (Go Straight)
Train::: Epoch: 0, Step: 580, TotalSteps: 581, Loss: 165.592 (Speed Prediction Branch)
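In code, my loss looks roughly like this (a sketch; the prediction tensors follow the branch outputs sketched earlier in this thread, and the placeholder names are my own):

    import tensorflow as tf

    target_controls = tf.placeholder(tf.float32, [None, 3])  # steer, gas, brake
    target_speed = tf.placeholder(tf.float32, [None, 1])

    # One MSE term per command branch, plus one for the speed branch.
    control_losses = [tf.reduce_mean(tf.square(pred - target_controls))
                      for pred in control_branches]
    speed_loss = tf.reduce_mean(tf.square(speed_branch - target_speed))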

mvpcom avatar Jan 06 '18 16:01 mvpcom

Hey @mvpcom. Sorry for taking so long to answer. The models I tried would take 24-36 hours to train on a Titan Xp. I don't have easy access to these diagrams now. I will probably retrain this model and give more detailed information. Thanks for the contribution.

felipecode avatar Jan 29 '18 16:01 felipecode

Thanks, @felipecode. I am eager to see more detailed information. Although a Titan X Pascal is much, much better than a Titan X, I'm not sure that alone explains why my implementation takes so much longer for 294,000 steps (approximately 12 days on a Titan X and 52 days on a 1080). I have to recheck my data loading process.

mvpcom avatar Jan 29 '18 16:01 mvpcom

Some things that may help.

  • Run all branches, but only backpropagate through one (use a mask for that; see the sketch below). Conditionals kill parallelism.
  • I use the TensorFlow queues for data loading.

For the rest, I would say you have a specific bug somewhere. 12 days is too much.
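A minimal sketch of the masking idea (not our actual training code; control_branches is a list of four (batch, 3) branch outputs as sketched above, and the placeholders are made up):

    import tensorflow as tf

    command = tf.placeholder(tf.int32, [None])               # high-level command per example
    target_controls = tf.placeholder(tf.float32, [None, 3])

    # Compute every branch, but zero the error of all branches except the one
    # selected by the command, so a single graph run handles a mixed batch
    # without per-example conditionals.
    all_branches = tf.stack(control_branches, axis=1)        # (batch, 4, 3)
    mask = tf.expand_dims(tf.one_hot(command, depth=4), -1)  # (batch, 4, 1)
    squared_error = tf.square(all_branches - tf.expand_dims(target_controls, 1))
    control_loss = tf.reduce_mean(tf.reduce_sum(mask * squared_error, axis=[1, 2]))

For the queues, the TF 1.x API would be along these lines (image_example and label_example would come from your reader/decoder, and the numbers are made up):

    image_batch, label_batch = tf.train.shuffle_batch(
        [image_example, label_example], batch_size=120,
        capacity=2000, min_after_dequeue=1000, num_threads=4)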

felipecode avatar Jan 29 '18 16:01 felipecode

@mvpcom can your training code be made public?

zdx3578 avatar Feb 01 '18 01:02 zdx3578

@zdx3578 Yes, if the CARLA team doesn't mind, because as far as I know they are waiting to publish their paper, which is why I wrote the training code myself from scratch.

mvpcom avatar Feb 01 '18 06:02 mvpcom

@mvpcom Yes, sure! Please do it! The paper is already done, but we want to release good code with good tutorials, so we are working on improving what we have, and then we will release it!

Cheers

felipecode avatar Feb 01 '18 14:02 felipecode

@zdx3578 @felipecode Here you can find the first draft of the code. It was part of a larger codebase, so I'm not sure it is bug-free. Besides that, the code needs to be revised to speed up the training process, as we discussed above. If you find any bugs, please let me know so I can fix them.

mvpcom avatar Feb 01 '18 14:02 mvpcom

https://github.com/pathak22/noreward-rl is good work on training; it may be a useful reference.

zdx3578 avatar Feb 07 '18 07:02 zdx3578

@felipecode Two questions for you, Felipe. First, I'm not sure how to do the masking to backpropagate through one branch only. It would be beneficial for reducing training time. Do you know of any link that may help? Also, unfortunately I can't load your saved checkpoint into my model; it seems the two models differ in some places. Is there anything I have to take into account? It would be awesome if you shared your input pipeline too.

mvpcom avatar Feb 15 '18 12:02 mvpcom

I would like to know what exactly is meant by manual driving here. Is it a human driving on the roads of Town01, or can it be done manually in the simulator itself (if I'm not wrong, the standalone mode is just a video-game mode)? I'd really appreciate it if someone could help me out on this!

Thanks in advance!

soham2017 avatar Jun 24 '18 14:06 soham2017

@soham2017 As I read the paper, 80% of the data was collected in CARLA and 20% is real manual driving data collected with a small truck.

zhanyifei avatar Jul 16 '18 02:07 zhanyifei

I built a PyTorch version to train the policy, in case you're interested.

onlytailei avatar Dec 10 '18 12:12 onlytailei