
Running tffm on a single core

Open martincousi opened this issue 6 years ago • 1 comment

I need to run multiple TFFMRegressor objects in joblib Parallel. To do so, I passed the following parameter:

session_config=tf.ConfigProto(intra_op_parallelism_threads=1,
                              inter_op_parallelism_threads=1,
                              allow_soft_placement=True,
                              device_count={'CPU': 1, 'GPU': 0})
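
For completeness, here is roughly how I wire this together with joblib (a minimal sketch, TensorFlow 1.x API; it assumes TFFMRegressor accepts the session_config keyword as above, and fit_one and the hyperparameter values are just illustrative):

    # Sketch: fit several single-threaded TFFMRegressor models in parallel
    # with joblib. fit_one is a hypothetical helper, not part of tffm.
    import numpy as np
    import tensorflow as tf
    from joblib import Parallel, delayed
    from tffm import TFFMRegressor

    def fit_one(X, y, rank):
        model = TFFMRegressor(
            order=2,
            rank=rank,
            optimizer=tf.train.AdamOptimizer(learning_rate=0.01),
            n_epochs=50,
            init_std=np.float32(0.001),
            input_type='dense',
            session_config=tf.ConfigProto(
                intra_op_parallelism_threads=1,
                inter_op_parallelism_threads=1,
                allow_soft_placement=True,
                device_count={'CPU': 1, 'GPU': 0}))
        model.fit(X, y)
        return model.predict(X)

    # e.g. results = Parallel(n_jobs=2)(delayed(fit_one)(X, y, r) for r in [2, 4, 8])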

However, whenever I use n_jobs=2 or higher in Parallel, nothing runs: my Python notebook cell just hangs, never completes, and my processors sit idle. With n_jobs=1, everything runs fine. What am I missing? Would I be better off using polylearn for this kind of task?

martincousi avatar Apr 12 '18 20:04 martincousi

OK, it appears that passing this session_config parameter does not do anything, at least on my machine, so it is not needed.

I was able to use Parallel in the end by:

  • passing a numpy.float32 value for init_std (otherwise I sometimes got an error)
  • standardizing the values in X so that $|x_{ij}| \le 1$ (otherwise I got an error when using a high order); see the scaling sketch after this list
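
The scaling step looks roughly like this (a sketch of my own preprocessing, not part of the tffm API):

    # Sketch: scale each column of X so that |x_ij| <= 1, and cast to
    # float32 to match the init_std dtype mentioned above.
    import numpy as np

    def scale_to_unit(X):
        col_max = np.abs(X).max(axis=0)
        col_max[col_max == 0] = 1.0   # leave all-zero columns untouched
        return (X / col_max).astype(np.float32)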

It appears that passing a numpy.float32 value is not necessary for the parameter reg.

Now, I know that my question is highly problem dependent, but what are good ranges of values when doing a randomized search over the different parameters? My problem contains about 400 observations, and I see that there are several parameters that could be tuned, such as order, rank, optimizer (and its parameters), reg, etc. It even appears that batch_size and n_epochs are somewhat linked.
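
For reference, this is the kind of randomized search loop I have in mind (a sketch using scikit-learn's ParameterSampler; the ranges below are placeholders rather than recommendations, and X, y stand for my 400-observation dataset):

    # Sketch: randomized search over a few tffm hyperparameters.
    import numpy as np
    import tensorflow as tf
    from sklearn.metrics import mean_squared_error
    from sklearn.model_selection import ParameterSampler, train_test_split
    from tffm import TFFMRegressor

    param_grid = {            # placeholder ranges, not recommendations
        'order': [2, 3],
        'rank': [2, 4, 8],
        'reg': [0.0, 0.01, 0.1],
        'n_epochs': [50, 100],
    }

    X_tr, X_val, y_tr, y_val = train_test_split(X, y, test_size=0.2,
                                                random_state=0)
    best_err, best_params = np.inf, None
    for params in ParameterSampler(param_grid, n_iter=10, random_state=0):
        model = TFFMRegressor(input_type='dense',
                              init_std=np.float32(0.001),
                              optimizer=tf.train.AdamOptimizer(0.01),
                              **params)
        model.fit(X_tr, y_tr)
        err = mean_squared_error(y_val, model.predict(X_val))
        if err < best_err:
            best_err, best_params = err, params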

martincousi avatar Apr 25 '18 17:04 martincousi