
Cannot merge two SFrames that easily fit in memory

Open darknoon opened this issue 5 years ago • 5 comments

I followed the directions here for sound classification, starting with ESC-10 to verify everything is working. I wasn't able to get Turi Create working with the GPU because I have CUDA 10.0 installed rather than CUDA 9.0, so I'm training on the CPU.

I am running out of RAM on the full dataset, though. I have 16 GB of RAM. Is there anything I can do to fix this, or should I rent an AWS instance to complete training?

(venv) andrew@ml-dev-box:~/Developer/PotionAudio$ python train.py 
Finished parsing file /home/andrew/Developer/PotionAudio/ESC-50/meta/esc50.csv
Parsing completed. Parsed 100 lines in 0.0181 secs.
------------------------------------------------------
Inferred types from first 100 line(s) of file as 
column_type_hints=[str,int,int,str,str,int,str]
If parsing fails due to incorrect types, you can correct
the inferred type list above and pass it to read_csv in
the column_type_hints argument
------------------------------------------------------
Finished parsing file /home/andrew/Developer/PotionAudio/ESC-50/meta/esc50.csv
Parsing completed. Parsed 2000 lines in 0.016908 secs.
Making test and train sets
Creating model
Creating a validation set from 5 percent of training data. This may take a while.
        You can set ``validation_set=None`` to disable validation tracking.

Preprocessing audio data -
Preprocessed 59 of 1520 examples
Preprocessed 206 of 1520 examples
Preprocessed 352 of 1520 examples
Preprocessed 500 of 1520 examples
Preprocessed 649 of 1520 examples
Preprocessed 797 of 1520 examples
Preprocessed 945 of 1520 examples
Preprocessed 1094 of 1520 examples
Preprocessed 1242 of 1520 examples
Preprocessed 1390 of 1520 examples
Preprocessed 1520 of 1520 examples

terminate called after throwing an instance of 'std::bad_alloc'
  what():  std::bad_alloc
Aborted (core dumped)

darknoon avatar May 10 '19 16:05 darknoon

I'm not able to reproduce this issue. I can train a model on that dataset with 16GB of RAM.

@darknoon - what operating system are you using?

TobyRoseman avatar May 10 '19 18:05 TobyRoseman

@TobyRoseman this is on Ubuntu 18.04.2 with Python 3.6 in a virtualenv. At steady state I have about 13910076 free.

darknoon avatar May 10 '19 19:05 darknoon

I can reproduce this out-of-memory issue on Ubuntu 18 with Python 3.6. It looks like this is not related to the sound classifier: it runs out of memory before the sound classifier is even called.

Using the code here, it runs out of memory on this line: data = data.join(meta_data)

I have 6 GB of RAM. When written to disk, data is 1.4G and meta_data is 92K.

I'm seeing this behavior with both the 5.5 and 5.4 releases.
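
One way to confirm which step drives memory up is to log the process's peak resident set size around each call. The following is only a sketch, not something used in this thread: the helper names are hypothetical, it relies on the standard-library resource module (Unix only), and on Linux ru_maxrss is reported in kilobytes. It does not count memory used by separate worker processes, and it prints nothing for a step that aborts with std::bad_alloc, so the last line printed marks the last step that completed.

```python
import resource


def peak_rss_kb():
    """Peak resident set size of this process so far (kilobytes on Linux)."""
    return resource.getrusage(resource.RUSAGE_SELF).ru_maxrss


def run_and_report(label, fn):
    """Call fn(), print how much the peak RSS grew, and return fn's result."""
    before = peak_rss_kb()
    result = fn()
    after = peak_rss_kb()
    print(f"{label}: peak RSS grew by {after - before} kB (now {after} kB)")
    return result
```

For example, the failing line could be wrapped as run_and_report("join", lambda: data.join(meta_data)), with similar wrappers around the earlier loading steps.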

TobyRoseman avatar May 20 '19 22:05 TobyRoseman

I can also reproduce this issue on Ubuntu 16 and on Python 2.7.

The merge doesn't run out of memory if I do the following steps:
1 - Save the two SFrames to disk.
2 - Restart my Python session.
3 - Load the two SFrames from disk.

This works even if I start up the lambda workers (in the new session) prior to doing the merge.
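
The save, restart, and reload steps above can be sketched roughly as follows. This is an illustration under assumptions, not the exact code used in this thread: the directory names and helper functions are hypothetical, while SFrame.save, turicreate.load_sframe, and SFrame.join are existing Turi Create calls.

```python
import os

# Hypothetical directory names; the thread does not say where the SFrames were saved.
DATA_DIR = "data.sframe"
META_DIR = "meta_data.sframe"


def sframe_paths(base_dir):
    """Return the (data, meta_data) save locations under base_dir."""
    return os.path.join(base_dir, DATA_DIR), os.path.join(base_dir, META_DIR)


def save_phase(data, meta_data, base_dir):
    """Step 1: write both SFrames to disk in the original session."""
    data_path, meta_path = sframe_paths(base_dir)
    data.save(data_path)
    meta_data.save(meta_path)


def join_phase(base_dir):
    """Step 3: in a fresh Python session, reload both SFrames and join them."""
    import turicreate as tc  # imported lazily so the helpers above work without it

    data_path, meta_path = sframe_paths(base_dir)
    data = tc.load_sframe(data_path)
    meta_data = tc.load_sframe(meta_path)
    return data.join(meta_data)
```

Step 2 is the restart itself: call save_phase in the session that built the SFrames, exit Python, then call join_phase with the same base_dir from a new interpreter.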

TobyRoseman avatar May 21 '19 01:05 TobyRoseman

On Ubuntu 20 with TuriCreate 6.4, running the Sound Classifier example code (on the full ESC-50 dataset) often still results in an out-of-memory error.

TobyRoseman avatar Sep 10 '20 22:09 TobyRoseman