Andrej
I recently implemented beam search for an RNN Language Model in the context of image captioning in the NeuralTalk2 repo (https://github.com/karpathy/neuraltalk2/blob/master/misc/LanguageModel.lua#L166). It does make things work quite a bit better there. I...
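The core of beam search is language-agnostic. Below is a minimal Python sketch (not the NeuralTalk2 Lua implementation); `step_fn`, the start/end tokens, and the toy interface are hypothetical stand-ins for an RNN's next-token softmax:

```python
import math

def beam_search(step_fn, start_token, end_token, beam_size, max_len):
    """Keep the beam_size highest-scoring partial sequences at each step.

    step_fn(prefix) -> {token: probability} is an assumed interface
    standing in for the RNN's distribution over the next token.
    Returns the best complete sequence and its log-probability.
    """
    beams = [([start_token], 0.0)]   # (sequence, cumulative log-prob)
    completed = []
    for _ in range(max_len):
        candidates = []
        for seq, score in beams:
            for tok, p in step_fn(seq).items():
                candidates.append((seq + [tok], score + math.log(p)))
        # Prune to the top beam_size expansions by total log-prob.
        candidates.sort(key=lambda c: c[1], reverse=True)
        beams = []
        for seq, score in candidates[:beam_size]:
            if seq[-1] == end_token:
                completed.append((seq, score))
            else:
                beams.append((seq, score))
        if not beams:
            break
    completed.extend(beams)   # fall back to unfinished beams if needed
    return max(completed, key=lambda c: c[1])
```

With a toy bigram model, `beam_search` with beam size 2 picks the globally best path even when greedy decoding would too; the win shows up on models where the best sequence starts with a locally suboptimal token.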
Thanks! This looks great and I'd be happy to merge something like this when the kinks are ironed out.
Is it easy to also include, e.g., validation loss? Looking at it alongside the training loss is usually very useful. We can expand on this in the future I...
Ok I had a closer look at the code and while I am on board with the general idea of including web-based visualization of the training progress, I am hesitant to...
I was just thinking about this as well. @soumith is the preferred solution to always save CPU models and explicitly convert to GPU in the sampling script if the user...
@soumith I'm not fully comfortable with some of these APIs and best practices. I'm planning to iterate over all entries in proto, convert them with `:float()`, save to file, and...
@soumith ahhh! Glad I asked, that's precisely the kind of gotcha I was afraid of. I'll keep this in mind.
Yes I think I was going to do this but then decided it would be tricky due to parameter tying issues. The problem is that when you cast the model...
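The tying problem can be shown without Torch at all. In this sketch, numpy arrays stand in for Torch tensors and the module names are purely illustrative: casting each parameter independently copies it, so two modules that shared one weight end up with two separate weights, while memoizing on the original storage preserves the tie.

```python
import numpy as np

# Two modules share one weight array (parameter tying), as tied
# weights might in an RNN LM. Names here are illustrative only.
shared = np.ones((2, 2), dtype=np.float64)
model = {'encoder_W': shared, 'decoder_W': shared}

# Naive cast: convert each entry independently. astype() returns a
# copy, so the two entries no longer point at the same storage and
# gradient updates to one would no longer reach the other.
naive = {k: v.astype(np.float32) for k, v in model.items()}

# Sharing-preserving cast: memoize on the identity of the original
# array, so tied parameters map to one converted array.
def make_caster():
    seen = {}
    def cast(arr):
        key = id(arr)
        if key not in seen:
            seen[key] = arr.astype(np.float32)
        return seen[key]
    return cast

cast = make_caster()
tied = {k: cast(v) for k, v in model.items()}
```

Here `naive['encoder_W'] is naive['decoder_W']` is False while `tied['encoder_W'] is tied['decoder_W']` is True, which is the gotcha being discussed.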
I created a quick script to convert char-rnn GPU models to CPU models as a temporary solution to this issue. In the long run we'll want to always save a...
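The shape of such a conversion script is simple: walk the checkpoint and cast every parameter tensor to a CPU float type. This is a hedged Python sketch of that idea only, with numpy arrays standing in for CudaTensors and a plain nested dict standing in for the saved Torch checkpoint; the actual char-rnn script operates on Torch objects via `:float()`:

```python
import numpy as np

def checkpoint_to_cpu(obj):
    # Recursively walk a checkpoint-like structure, converting every
    # array (stand-in for a GPU tensor) to float32 on the CPU side,
    # analogous to calling :float() on each module, and leaving
    # non-tensor metadata (iteration count, options, etc.) untouched.
    if isinstance(obj, np.ndarray):
        return obj.astype(np.float32)
    if isinstance(obj, dict):
        return {k: checkpoint_to_cpu(v) for k, v in obj.items()}
    if isinstance(obj, list):
        return [checkpoint_to_cpu(v) for v in obj]
    return obj
```

Note this naive walk has exactly the parameter-tying caveat raised above: tied tensors come back as independent copies unless conversions are memoized per source tensor.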
@soumith Hey Soumith RE: this issue with char-rnn, I think there is support now in Torch that doesn't destroy parameter sharing when a model is shipped between CPU and GPU. Though I'm...