
Notebook issue: The kernel appears to have died. It will restart automatically.

Open OliverZijia opened this issue 2 years ago • 1 comments

Hi!

I got this notification while running this line in cifar_single_model.ipynb:

fastshap.train( fastshap_train, fastshap_val, batch_size=128, num_samples=2, max_epochs=200, eff_lambda=1e-2, validation_samples=1, lookback=10, bar=True, verbose=True)

I reinstalled fastshap, but the issue remains.

Does anyone know what I should do?

Thanks!

OliverZijia • Mar 09 '22 11:03

Hi Oliver - that's weird, let's see if we can figure out what's going on. It may take some poking around to see exactly what the problem is given that there's no error message, but I have some ideas.

First, just to be sure, can you confirm:

  • The dataset downloaded properly
  • You were able to train the model with missingness using imputer.train(...)
  • You were able to initialize the explainer model with UNet(...) and the FastSHAP wrapper with FastSHAP(...)
  • You have a GPU with ~10 GB of memory
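For the last item, a quick way to check what PyTorch can see is sketched below. It uses only standard `torch.cuda` calls; the helper name `gpu_memory_gb` is made up for illustration and is not part of fastshap.

```python
import torch


def gpu_memory_gb(index: int = 0):
    """Return total memory of a CUDA device in GiB, or None if no GPU is visible.

    Hypothetical helper for illustration; not part of fastshap.
    """
    if not torch.cuda.is_available():
        return None
    props = torch.cuda.get_device_properties(index)
    return props.total_memory / 1024**3


mem = gpu_memory_gb()
if mem is None:
    print("No CUDA device is visible to PyTorch")
else:
    print(f"GPU 0 total memory: {mem:.1f} GiB")
```

Note that this reports total memory, not free memory, so other processes on the same GPU may still leave less than ~10 GB available.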

If all of the above is fine, I'll get to my first guess. In my experience, problems that lead to the kernel dying without an error message typically involve some incorrect operation on the GPU. Would you be able to try moving the two DNNs (the model and explainer objects) to CPU and see what happens when you train? It will be quite slow, but maybe we'll see an error message within the first couple training steps.
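A minimal sketch of the CPU test follows. The two networks here are small stand-ins, since the exact objects in the notebook are not shown in this thread; the assumption is that the model and explainer are ordinary `torch.nn.Module` instances, in which case `.to(device)` should work the same way on them.

```python
import torch
import torch.nn as nn

# Stand-ins for the two DNNs (assumption: the notebook's model and explainer
# are regular nn.Module objects; replace these with the real ones).
model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 10))
explainer = nn.Conv2d(3, 10, kernel_size=3, padding=1)

# Move both networks to CPU so that failures surface as Python exceptions
# rather than killing the kernel silently.
cpu = torch.device("cpu")
model = model.to(cpu)
explainer = explainer.to(cpu)

# Run one CIFAR-sized batch through both networks as a sanity check.
x = torch.randn(4, 3, 32, 32, device=cpu)
with torch.no_grad():
    preds = model(x)
    attributions = explainer(x)

# Confirm that every parameter now lives on the CPU.
devices = {p.device.type for net in (model, explainer) for p in net.parameters()}
print(devices, tuple(preds.shape), tuple(attributions.shape))
```

After moving the real objects, the same `fastshap.train(...)` call can be rerun. Any tensors created during training would also need to end up on the CPU, so a device mismatch error at this stage would itself be a useful clue.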

iancovert • Mar 09 '22 20:03