
TensorFlow is broken on M1 Mac

Open bfirsh opened this issue 2 years ago • 5 comments

Seeing this when building https://github.com/replicate/cog-examples/tree/main/hello-world:

Adding labels to image...
$ docker run --rm --shm-size 8G r8.im/bfirsh/hello-world-2 python -m cog.command.type_signature
WARNING: The requested image's platform (linux/amd64) does not match the detected host platform (linux/arm64/v8) and no specific platform was requested
2021-11-19 03:27:11.826541: F tensorflow/core/lib/monitoring/sampler.cc:42] Check failed: bucket_limits_[i] > bucket_limits_[i - 1] (0 vs. 10)
qemu: uncaught target signal 6 (Aborted) - core dumped

It's this issue: https://github.com/docker/for-mac/issues/5342 And: https://github.com/tensorflow/tensorflow/issues/52845

bfirsh avatar Nov 19 '21 03:11 bfirsh

See also https://github.com/replicate/cog/issues/273

zeke avatar Nov 19 '21 05:11 zeke

See also https://github.com/tensorflow/tensorflow/issues/52973

zeke avatar Feb 09 '22 06:02 zeke

We should have a getting started example working. Perhaps we should switch to PyTorch.

bfirsh avatar Mar 17 '22 17:03 bfirsh

Any solutions?

yc-cui avatar Sep 04 '22 15:09 yc-cui

Just checking if anyone was able to get around this?

sanschaise avatar Sep 16 '22 00:09 sanschaise

Same issue, any solutions?

okanji avatar Nov 06 '22 05:11 okanji

same issue on Mac M1

xcreateadmin avatar Feb 15 '23 15:02 xcreateadmin

same issue on Mac M2

sentient-22 avatar Jun 10 '23 19:06 sentient-22

A workaround is to set this environment variable:

export DOCKER_DEFAULT_PLATFORM=linux/amd64

This forces docker run to use the linux/amd64 platform by default on macOS.

Reference: https://stackoverflow.com/questions/65612411/forcing-docker-to-use-linux-amd64-platform-by-default-on-macos

hongchaodeng avatar Jun 11 '23 00:06 hongchaodeng

I think the best fix we can make in cog is to detect macOS as the host OS and specify linux/amd64 as a fallback.
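
A minimal sketch of that detection, written in Python purely for illustration. The function name `fallback_docker_platform` is hypothetical and this is not how cog actually implements anything; it only shows the decision logic, assuming an Apple Silicon host reports `Darwin` and `arm64`:

```python
import platform
from typing import Optional


def fallback_docker_platform(system: str, machine: str) -> Optional[str]:
    """Return a Docker platform to request, or None to keep Docker's default.

    Hypothetical helper: the name and the policy are assumptions for
    illustration, not part of cog.
    """
    # Apple Silicon Macs report system "Darwin" and machine "arm64".
    if system == "Darwin" and machine == "arm64":
        return "linux/amd64"
    # Any other host keeps whatever platform Docker picks on its own.
    return None


if __name__ == "__main__":
    chosen = fallback_docker_platform(platform.system(), platform.machine())
    print(chosen or "no override")
```

The returned value could be passed along as the `DOCKER_DEFAULT_PLATFORM` environment variable or as the `--platform` flag of `docker run`.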

hongchaodeng avatar Jun 11 '23 00:06 hongchaodeng

@hongchaodeng Unfortunately, setting DOCKER_DEFAULT_PLATFORM won't help. The underlying issue is that TensorFlow is compiled to use AVX instructions, but they aren't available. I agree with @bfirsh (https://github.com/replicate/cog/issues/336#issuecomment-1071099038) that we need to rewrite the Getting Started guide to use PyTorch instead.

I can start working on that this week.
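
One rough way to see whether AVX is advertised inside a container is to look at the CPU flags in `/proc/cpuinfo`. The sketch below is an assumption-laden illustration, not a definitive test: the helper name `cpuinfo_has_flag` is hypothetical, and under emulation the contents of `/proc/cpuinfo` may not reflect what the emulated CPU actually supports.

```python
def cpuinfo_has_flag(cpuinfo_text: str, flag: str = "avx") -> bool:
    """Return True if `flag` appears as a whole word in a CPU flags line.

    Hypothetical helper for illustration. x86 kernels label the line
    "flags"; arm64 kernels label it "Features".
    """
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        if key.strip() in ("flags", "Features"):
            # Split on whitespace so "avx2" does not count as "avx".
            if flag in value.split():
                return True
    return False


if __name__ == "__main__":
    try:
        with open("/proc/cpuinfo") as f:
            text = f.read()
    except OSError:
        # Not a Linux environment, or /proc is unavailable.
        text = ""
    print("avx advertised:", cpuinfo_has_flag(text))
```

Running this inside the container (for example via `docker run ... python script.py`) gives a quick hint, but a missing flag here is only suggestive.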

mattt avatar Jun 21 '23 13:06 mattt

hello! it is in fact true that things break on an M1 mac. are there other ways we can go about testing these things locally until the tutorial is re-written? thank you!

peterpme avatar Jun 23 '23 00:06 peterpme