get_dataset(), "The kernel appears to have died. It will restart automatically"

Open learsi1911 opened this issue 3 years ago • 9 comments

Description

Steps/Code to Reproduce

Expected Results

Actual Results

learsi1911 avatar Jun 14 '21 11:06 learsi1911

Hi, I'll move this to the openml-python issue tracker

I'm guessing you tried to download a large dataset? This is a known issue. The ARFF parser uses too much memory.

We have implemented parquet support, but this is not yet in the current release.

joaquinvanschoren avatar Jun 14 '21 11:06 joaquinvanschoren

We have implemented parquet support, but this is not yet in the current release.

Small correction, it should be available in the current release as soon as the production server sends valid information on where the parquet file is located.

PGijsbers avatar Jun 14 '21 11:06 PGijsbers

Thank you very much for your answer. Do you know approximately when this new version will be available?

learsi1911 avatar Jun 14 '21 11:06 learsi1911

@learsi1911 Could you please provide the ID of the dataset you were trying to download? And could you share how much memory was available to the kernel? That information would allow us to test whether the issue is resolved when the parquet support is fully operational.

@prabhant Do you have an estimate of when the parquet files will be available from the production server?

PGijsbers avatar Jun 14 '21 11:06 PGijsbers

Of course, the ID is 547. As I said, the problem is that the first time I call "get_dataset()" everything works, but if I call it again I get the error.

learsi1911 avatar Jun 14 '21 11:06 learsi1911

The production server with parquet support will be ready in a week or two.

prabhant avatar Jun 14 '21 11:06 prabhant

Dataset 547 is not really large and shouldn't result in any issues. Could you please run the failing snippet from within ipython and paste the output?

mfeurer avatar Jun 15 '21 06:06 mfeurer

Yes, I tried Python directly in the Windows console and it works there, so maybe it is something related to Jupyter.

learsi1911 avatar Jun 15 '21 09:06 learsi1911

Jupyter notebook kernels typically work with much less memory than a regular Python process. But as mfeurer said, the dataset isn't large and should not cause a kernel to die. It would be helpful if you could post the code that led to the error and the full error output.

PGijsbers avatar Jun 21 '21 19:06 PGijsbers
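
Editorial sketch (not part of the original thread): one rough way to gather the memory information asked about above is to wrap the call in the standard-library `tracemalloc` module. The helper name `measure_peak_memory` is hypothetical, and `tracemalloc` only sees allocations made through Python's allocator, so the figure is a lower bound rather than the kernel's true footprint.

```python
import tracemalloc


def measure_peak_memory(func, *args, **kwargs):
    """Run func(*args, **kwargs) and return (result, peak_bytes).

    peak_bytes is the peak size of Python-level allocations traced
    while func was running; native allocations are not counted.
    """
    tracemalloc.start()
    try:
        result = func(*args, **kwargs)
        # get_traced_memory() returns (current, peak) in bytes.
        _, peak = tracemalloc.get_traced_memory()
    finally:
        # Always stop tracing, even if func raised.
        tracemalloc.stop()
    return result, peak
```

Assuming openml-python is installed, this could be used as `dataset, peak = measure_peak_memory(openml.datasets.get_dataset, 547)`. Depending on the library version, the actual parsing may only happen in a later `dataset.get_data()` call, in which case that call is the one worth measuring.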

If the problem still occurs, please re-open this issue but provide a code example that reproduces the error.

PGijsbers avatar Nov 29 '22 09:11 PGijsbers
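
Editorial sketch (not part of the original thread): for anyone re-opening this issue, a reproduction along the lines requested above might look like the following. The helper names `run_repeatedly`, `format_report`, and `load_with_openml` are hypothetical; only `openml.datasets.get_dataset` comes from the library. A kernel that dies outright does not raise a Python exception, so this can only capture ordinary errors, and it is best run from a plain console or IPython session where any crash output stays visible.

```python
import platform
import sys
import traceback


def run_repeatedly(loader, dataset_id, attempts=2):
    """Call loader(dataset_id) several times and record each outcome.

    Returns a list of (attempt_number, status, traceback_text) tuples.
    """
    outcomes = []
    for attempt in range(1, attempts + 1):
        try:
            loader(dataset_id)
            outcomes.append((attempt, "ok", ""))
        except Exception:
            # Keep the full traceback so it can be pasted into the issue.
            outcomes.append((attempt, "error", traceback.format_exc()))
    return outcomes


def format_report(outcomes):
    """Build a plain-text report that includes basic environment details."""
    lines = [f"Python {sys.version.split()[0]} on {platform.platform()}"]
    for attempt, status, details in outcomes:
        lines.append(f"attempt {attempt}: {status}")
        if details:
            lines.append(details.rstrip())
    return "\n".join(lines)


def load_with_openml(dataset_id):
    """Loader that assumes openml-python is installed (imported lazily)."""
    import openml

    return openml.datasets.get_dataset(dataset_id)
```

Running `print(format_report(run_repeatedly(load_with_openml, 547)))` calls `get_dataset` twice, matching the "works the first time, fails the second" description, and prints output that can be pasted into the issue.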