
memory errors with INeuRec, UNeuRec

Open duncanmcelfresh opened this issue 3 years ago • 1 comment

Below are the traces. These occurred on a GCP n1-highmem-2 node with 1x Tesla T4 GPU.

This one occurred on the following datasets:

  • AmazonMoviesTVReader
  • BookCrossingReader
  • DatingReader
  • EpinionsReader
  • GoogleLocalReviewsReader
  • GowallaReader
  • Movielens10MReader
INeuRec_RecommenderWrapper: Init model...
INeuRec: init..
Traceback (most recent call last):
  File "/home/shared/reczilla/RecSys2019_DeepLearning_Evaluation/ParameterTuning/SearchAbstractClass.py", line 402, in _objective_function
    result_dict, result_string, recommender_instance, train_time, evaluation_time = self._evaluate_on_validation(current_fit_parameters_dict)
  File "/home/shared/reczilla/RecSys2019_DeepLearning_Evaluation/ParameterTuning/RandomSearch.py", line 50, in _evaluate_on_validation
    current_fit_parameters
  File "/home/shared/reczilla/RecSys2019_DeepLearning_Evaluation/ParameterTuning/SearchAbstractClass.py", line 293, in _evaluate_on_validation
    recommender_instance, train_time = self._fit_model(current_fit_parameters)
  File "/home/shared/reczilla/RecSys2019_DeepLearning_Evaluation/ParameterTuning/SearchAbstractClass.py", line 283, in _fit_model
    **current_fit_parameters)
  File "/home/shared/reczilla/RecSys2019_DeepLearning_Evaluation/Conferences/IJCAI/NeuRec_our_interface/INeuRecWrapper.py", line 78, in fit
    self.model.fit(self.URM_train)
  File "/home/shared/reczilla/RecSys2019_DeepLearning_Evaluation/Conferences/IJCAI/NeuRec_our_interface/INeuRec.py", line 41, in fit
    train = urm.T.todense()
  File "/home/shared/miniconda3/envs/reczilla/lib/python3.6/site-packages/scipy/sparse/base.py", line 849, in todense
    return np.asmatrix(self.toarray(order=order, out=out))
  File "/home/shared/miniconda3/envs/reczilla/lib/python3.6/site-packages/scipy/sparse/compressed.py", line 962, in toarray
    out = self._process_toarray_args(order, out)
  File "/home/shared/miniconda3/envs/reczilla/lib/python3.6/site-packages/scipy/sparse/base.py", line 1187, in _process_toarray_args
    return np.zeros(self.shape, dtype=self.dtype, order=order)
MemoryError
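
For context on the trace above: `urm.T.todense()` ends in `np.zeros(self.shape, dtype=self.dtype)`, so it has to allocate one entry per user-item pair no matter how sparse the URM is. A minimal sketch of that arithmetic, assuming a made-up 100,000 x 50,000 float32 matrix (the shape and the `dense_bytes` / `sparse_bytes` helper names are illustrative, not taken from reczilla or from the datasets listed above):

```python
import numpy as np
import scipy.sparse as sp


def dense_bytes(matrix):
    """Bytes that matrix.toarray() would need, computed without allocating."""
    n_rows, n_cols = matrix.shape
    return n_rows * n_cols * matrix.dtype.itemsize


def sparse_bytes(matrix):
    """Bytes actually held by the CSR representation of the matrix."""
    csr = matrix.tocsr()
    return csr.data.nbytes + csr.indices.nbytes + csr.indptr.nbytes


# Hypothetical URM: only three interactions in a 100,000 x 50,000 matrix.
rows = np.array([0, 1, 2])
cols = np.array([0, 1, 2])
vals = np.ones(3, dtype=np.float32)
urm = sp.csr_matrix((vals, (rows, cols)), shape=(100_000, 50_000))

print(dense_bytes(urm))   # 20_000_000_000 bytes, about 18.6 GiB
print(sparse_bytes(urm))  # well under 1 MB
```

Checking `dense_bytes` against available RAM before calling `todense()` would at least turn the bare `MemoryError` into a clearer message. Whether the densification can be avoided altogether depends on how INeuRec consumes `train`, which this sketch does not address.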

And this one occurred on:

  • AnimeReader
  • MovieTweetingsReader
INeuRec_RecommenderWrapper: Init model...
INeuRec: init..
Traceback (most recent call last):
  File "/home/shared/reczilla/RecSys2019_DeepLearning_Evaluation/ParameterTuning/SearchAbstractClass.py", line 402, in _objective_function
    result_dict, result_string, recommender_instance, train_time, evaluation_time = self._evaluate_on_validation(current_fit_parameters_dict)
  File "/home/shared/reczilla/RecSys2019_DeepLearning_Evaluation/ParameterTuning/RandomSearch.py", line 50, in _evaluate_on_validation
    current_fit_parameters
  File "/home/shared/reczilla/RecSys2019_DeepLearning_Evaluation/ParameterTuning/SearchAbstractClass.py", line 293, in _evaluate_on_validation
    recommender_instance, train_time = self._fit_model(current_fit_parameters)
  File "/home/shared/reczilla/RecSys2019_DeepLearning_Evaluation/ParameterTuning/SearchAbstractClass.py", line 283, in _fit_model
    **current_fit_parameters)
  File "/home/shared/reczilla/RecSys2019_DeepLearning_Evaluation/Conferences/IJCAI/NeuRec_our_interface/INeuRecWrapper.py", line 78, in fit
    self.model.fit(self.URM_train)
  File "/home/shared/reczilla/RecSys2019_DeepLearning_Evaluation/Conferences/IJCAI/NeuRec_our_interface/INeuRec.py", line 63, in fit
    R = tf.constant(train, dtype=tf.float32)
  File "/home/shared/miniconda3/envs/reczilla/lib/python3.6/site-packages/tensorflow/python/framework/constant_op.py", line 179, in constant_v1
    allow_broadcast=False)
  File "/home/shared/miniconda3/envs/reczilla/lib/python3.6/site-packages/tensorflow/python/framework/constant_op.py", line 283, in _constant_impl
    allow_broadcast=allow_broadcast))
  File "/home/shared/miniconda3/envs/reczilla/lib/python3.6/site-packages/tensorflow/python/framework/tensor_util.py", line 537, in make_tensor_proto
    "Cannot create a tensor proto whose content is larger than 2GB.")
ValueError: Cannot create a tensor proto whose content is larger than 2GB.
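
Note that this second trace gets past the `todense()` call and only fails at `tf.constant(train, dtype=tf.float32)`, which suggests the dense matrix fits in RAM for these datasets but is too large to embed in the graph as a constant. A rough sketch of the size check, assuming the limit is 2^31 bytes as the error message implies (the shapes and the `exceeds_proto_limit` helper are hypothetical, not the real dataset sizes or a TensorFlow API):

```python
# Assumed threshold, inferred from the "larger than 2GB" error message.
PROTO_LIMIT_BYTES = 1 << 31


def exceeds_proto_limit(shape, itemsize=4):
    """True if an array of this shape would hit the 2GB limit.

    itemsize defaults to 4 bytes because the trace casts to tf.float32.
    """
    n_bytes = itemsize
    for dim in shape:
        n_bytes *= dim
    return n_bytes >= PROTO_LIMIT_BYTES


# Hypothetical item x user shapes, for illustration only.
print(exceeds_proto_limit((10_000, 10_000)))  # False: 400 MB
print(exceeds_proto_limit((30_000, 20_000)))  # True: 2.4 GB
```

A common TF1 workaround for large arrays is to feed them through a `tf.placeholder` at session run time instead of baking them into the graph with `tf.constant`. I have not tested whether that fits the way INeuRec builds its graph.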

duncanmcelfresh • Jul 29 '22 02:07

This also occurs with UNeuRec.

duncanmcelfresh • Jul 29 '22 16:07