pykliep
Memory Error
Thanks for your implementation. I ran the code on a training set of ~150k rows and a test set of ~80k rows with ~24 features and got a MemoryError. The stack trace is:
File "modeling.py", line 552, in score
kliep.fit(X_train, X_test) # keyword arguments are X_train and X_test
File "pykliep.py", line 83, in fit
sigma=sigma)
File "pykliep.py", line 124, in _fit
sigma=sigma)
File "pykliep.py", line 162, in _find_alpha
b = self._phi(X_train, sigma).sum(axis=0) / X_train.shape[0]
File "pykliep.py", line 154, in _phi
return np.exp(-np.sum((X-self._test_vectors)**2, axis=-1)/(2*sigma**2))
MemoryError```
What is the possible cause of this error?
It appears you ran out of memory. How much RAM do you have? Try it on less data.
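The blow-up most likely comes from `_phi` (pykliep.py line 154): `X - self._test_vectors` broadcasts the training matrix against the kernel centers into a dense float64 array of shape (n_train, n_kernels, n_features). With ~150k training rows, ~24 features, and a few thousand kernel centers (say a hypothetical 5,000), that intermediate alone would be roughly 150,000 × 5,000 × 24 × 8 bytes ≈ 144 GB, far beyond typical RAM. Below is a rough, untested sketch of how the kernel matrix could be computed in row chunks to cap peak memory; `phi_chunked` and `chunk_size` are illustrative names, not part of pykliep's API.

```python
import numpy as np

def phi_chunked(X, centers, sigma, chunk_size=2000):
    """Gaussian kernel values between each row of X and each kernel center,
    computed chunk by chunk so peak memory scales with chunk_size, not n_train."""
    n = X.shape[0]
    out = np.empty((n, centers.shape[0]), dtype=np.float64)
    for start in range(0, n, chunk_size):
        stop = min(start + chunk_size, n)
        # build a (chunk, n_kernels, n_features) difference block instead of
        # materializing the full (n_train, n_kernels, n_features) array at once
        diff = X[start:stop, None, :] - centers[None, :, :]
        out[start:stop] = np.exp(-np.sum(diff ** 2, axis=-1) / (2.0 * sigma ** 2))
    return out
```

Chunking trades a little Python-loop overhead for a peak footprint of roughly chunk_size × n_kernels × n_features floats, which stays manageable even for the 400-feature case reported further down the thread.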
This happened to me with data of 350,000 rows and 18 columns. My RAM is 32 GB, with about 25 GB actually free. It normally handles this much data with absolute ease.
Also consider the parameters the algorithm has to store; I get this error more often when I have larger parameter sets.
I'm facing the exact same problem! My training dataset consists of 24k rows and 400 features.
I'm using a cloud VM with 56 GB of RAM! I tracked the RAM usage while running the algorithm and it exceeded 53,000 MB, roughly 52 GB!
Is there any way out, @srome?