Scott LeGrand
Even without the sparse kernels, DSSTNE's management of sparse data beats TensorFlow's because I believe the latter relies on cuSPARSE and performs a significant amount of uploading of...
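An illustrative sketch of the idea behind that claim (this is my own stand-in, not DSSTNE's actual code): keep sparse input as (index, value) pairs so only the non-zero entries need to be stored and moved, rather than uploading full dense vectors.

```cpp
// Convert a dense input vector to a sparse (index, value) list.
// Hypothetical helper for illustration only; DSSTNE's real sparse
// representation lives in its engine code, not here.
#include <cassert>
#include <cstdint>
#include <utility>
#include <vector>

std::vector<std::pair<uint32_t, float>>
toSparse(const std::vector<float>& dense) {
    std::vector<std::pair<uint32_t, float>> sparse;
    for (uint32_t i = 0; i < dense.size(); ++i)
        if (dense[i] != 0.0f)           // keep non-zero entries only
            sparse.emplace_back(i, dense[i]);
    return sparse;
}
```

For recommendation data, where each user has interacted with a tiny fraction of the catalog, this kind of representation is orders of magnitude smaller than the dense equivalent.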
I would turn on verbose mode and watch per-minibatch training error to see when it blows up.
How did you create the datasets here?
Hey Pierce, stick to a single GPU for now; the multi-GPU edition of this is a work in progress, but I suspect you'll be happy with the results: a reimplementation of Krizhevsky's one...
So I'd have to see your data to be sure, but how are your items distributed in your training set? If they follow a power law (a Zipf distribution), your...
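A quick way to eyeball that distribution is to count interactions per item and sort the counts in descending order; if the head items dwarf the tail, you're looking at a Zipf-like power law. A minimal sketch (my own helper, not part of DSSTNE):

```cpp
// Rank item popularity: count occurrences of each item in the
// interaction log and return the counts sorted most-popular-first.
#include <algorithm>
#include <cassert>
#include <map>
#include <string>
#include <vector>

std::vector<int> rankedCounts(const std::vector<std::string>& interactions) {
    std::map<std::string, int> freq;
    for (const auto& item : interactions)
        ++freq[item];                        // per-item occurrence count
    std::vector<int> counts;
    counts.reserve(freq.size());
    for (const auto& kv : freq)
        counts.push_back(kv.second);
    std::sort(counts.rbegin(), counts.rend()); // descending popularity
    return counts;
}
```

Plotting rank vs. count on log-log axes should come out roughly linear for a power-law catalog.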
Also, try this network out; it gives the best performance on MovieLens I've seen: { "Version" : 0.8, "Name" : "AIV NNC", "Kind" : "FeedForward", "ShuffleIndices" : false, "ScaledMarginalCrossEntropy" :...
Lower your learning rate. On Nov 20, 2017 4:23 AM, "lightsailpro" wrote: > The new model is bad. The average error becomes larger and larger with > each epoch. >...
This is the "alpha" parameter to the Train command; looking at the source code (Train.cpp):

// Hyper parameters
float alpha = stof(getOptionalArgValue(argc, argv, "-alpha", "0.025f"));
float lambda = stof(getOptionalArgValue(argc, argv,...
PS: looking at my own code, try 0.01 to start...
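To illustrate how an argument lookup like the one quoted above behaves, here is a hedged sketch; getOptionalArgValue below is my own stand-in, not the actual DSSTNE implementation. Note that std::stof("0.025f") happily parses 0.025 and stops at the trailing 'f', which is why the default string in Train.cpp works.

```cpp
// Minimal stand-in for an optional command-line argument lookup:
// scan argv for a flag and return the token after it, else a default.
#include <cassert>
#include <cmath>
#include <cstring>
#include <string>

std::string getOptionalArgValue(int argc, char** argv,
                                const char* flag, const char* defaultValue) {
    for (int i = 1; i < argc - 1; ++i)
        if (std::strcmp(argv[i], flag) == 0)
            return argv[i + 1];              // value following the flag
    return defaultValue;                     // flag absent: use default
}
```

With this in place, passing "-alpha 0.01" on the command line would override the 0.025 default, matching the advice above to lower the learning rate.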
So are you filtering out items the viewer has already purchased or viewed? Second, rather than running this as an autoencoder over the viewing history, you could instead try to...