
Illegal instruction (core dumped) soon after starting to fit

Open pedro-f-nogueira opened this issue 4 years ago • 2 comments

This error is somewhat weird and seems to violate Docker's portability principles. Here is how it comes about:

  • I installed lightfm in a Docker container on machine 1 via pip (I have tried both versions 1.15 and 1.16).
  • The fit works fine on machine 1 when run inside the Docker container.
  • I move the same image to machine 2 and run fit there, and the error "Illegal instruction (core dumped)" shows up as soon as epoch 0 starts.
  • If I enter the Docker container on machine 2 and reinstall lightfm in the same image, fit starts working fine.

This is similar to, or exactly the same as, https://github.com/lyst/lightfm/issues/559#issuecomment-719418835. I can't seem to find a pip-based solution that makes this package work in a Docker scenario where the image is built on one machine and then moved to another. Any idea how to fix it?

pedro-f-nogueira avatar Jan 18 '21 15:01 pedro-f-nogueira

Hi @pedro-f-nogueira ,

as far as I know, the "Illegal instruction" error often occurs when a binary executes CPU instructions that the CPU it is running on does not support. This is something that is NOT abstracted away by Docker, since containers run directly on the host CPU.
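One way to check whether this explanation fits is to compare the CPU feature flags of the two machines. The following is a minimal sketch, assuming both hosts are Linux and expose `/proc/cpuinfo`; the helper names `parse_cpu_flags`, `missing_on_target`, and `read_local_flags` are made up for illustration and are not part of lightfm.

```python
def parse_cpu_flags(cpuinfo_text):
    """Return the set of CPU feature flags found in /proc/cpuinfo-style text."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        # x86 kernels label the line "flags"; ARM kernels use "Features".
        if sep and key.strip() in ("flags", "Features"):
            return set(value.split())
    return set()


def missing_on_target(build_flags, target_flags):
    """Flags the build machine has that the target machine lacks."""
    return sorted(set(build_flags) - set(target_flags))


def read_local_flags(path="/proc/cpuinfo"):
    """Read the flags of the machine this runs on (Linux only, by assumption)."""
    with open(path) as handle:
        return parse_cpu_flags(handle.read())
```

Running `read_local_flags()` on each machine and passing the two results to `missing_on_target` lists the features present only on the build machine. A non-empty result does not prove that the compiled extension uses those instructions, but it makes this explanation plausible.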

I'm not an expert on this, but you could try setting the LIGHTFM_NO_CFLAGS environment variable, cloning the repo, and installing locally. As far as I can tell, that makes the build skip the machine-specific compiler flags, so you should be fine on most CPUs at the cost of some performance. You could also use conda within your Docker container and install lightfm 1.16.
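The source install could be scripted roughly as follows. This is only a sketch: it assumes the repo has already been cloned to `source_dir`, that any non-empty value of LIGHTFM_NO_CFLAGS is enough to disable the extra flags, and that pip is available for the current interpreter. The function names `no_cflags_env` and `install_from_source` are hypothetical.

```python
import os
import subprocess
import sys


def no_cflags_env(base_env=None):
    """Copy the environment and set LIGHTFM_NO_CFLAGS (value "1" is an assumption)."""
    env = dict(os.environ if base_env is None else base_env)
    env["LIGHTFM_NO_CFLAGS"] = "1"
    return env


def install_from_source(source_dir, run=subprocess.run):
    """Run `pip install <source_dir>` with LIGHTFM_NO_CFLAGS set.

    `run` is injectable so the command can be inspected without executing it.
    """
    command = [sys.executable, "-m", "pip", "install", source_dir]
    return run(command, env=no_cflags_env(), check=True)
```

Calling `install_from_source("/path/to/lightfm")` during the image build should then produce an extension that is not tied to the build machine's CPU, if the assumptions above hold.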

Hope this helps.

SimonCW avatar Jan 19 '21 16:01 SimonCW

+1 for conda. For reference: I was encountering the core dumps on a GPU notebook w/docker.

taylorsmithgg avatar Aug 03 '22 23:08 taylorsmithgg