keepsake
When hitting Ctrl-C, exit message isn't printed
This seems to happen when there is too much work in the queue. It doesn't eventually quit, though. A hypothesis: perhaps when Python is blocked making a call to the daemon, the wrapped pipe thread isn't running. (Maybe sys.stdout is blocked by the global interpreter lock, or something?!)
Here is the traceback:
═══╡ Fetching new data from "s3://replicate-bfirsh-test-9"...
═══╡ Creating experiment a276c73, copying '.' to 's3://replicate-bfirsh-test-9'...
Downloading data set...
Epoch 0, train loss: 1.184, validation accuracy: 0.333
═══╡ Creating checkpoint 499d6a7, copying 'model.pth' to 's3://replicate-bfirsh-test-9'...
Epoch 1, train loss: 1.117, validation accuracy: 0.333
═══╡ Creating checkpoint d9858ad, copying 'model.pth' to 's3://replicate-bfirsh-test-9'...
Epoch 2, train loss: 1.061, validation accuracy: 0.467
═══╡ Creating checkpoint f37b61c, copying 'model.pth' to 's3://replicate-bfirsh-test-9'...
Epoch 3, train loss: 1.014, validation accuracy: 0.633
═══╡ Creating checkpoint 2183611, copying 'model.pth' to 's3://replicate-bfirsh-test-9'...
Epoch 4, train loss: 0.977, validation accuracy: 0.700
═══╡ Creating checkpoint 7f77873, copying 'model.pth' to 's3://replicate-bfirsh-test-9'...
Epoch 5, train loss: 0.950, validation accuracy: 0.900
═══╡ Creating checkpoint ee9f5ea, copying 'model.pth' to 's3://replicate-bfirsh-test-9'...
^CTraceback (most recent call last):
  File "train.py", line 79, in <module>
    train(args.learning_rate, args.num_epochs)
  File "train.py", line 65, in train
    experiment.checkpoint(
  File "/Users/ben/p/replicate/python/replicate/console.py", line 58, in wrapper
    return f(*args, **kwargs)
  File "/Users/ben/p/replicate/python/replicate/experiment.py", line 134, in checkpoint
    checkpoint = self._project._daemon().create_checkpoint(
  File "/Users/ben/p/replicate/python/replicate/daemon.py", line 32, in wrapped
    return f(*args, **kwargs)
  File "/Users/ben/p/replicate/python/replicate/daemon.py", line 180, in create_checkpoint
    ret = self.stub.CreateCheckpoint(
  File "/usr/local/lib/python3.8/site-packages/grpc/_channel.py", line 824, in __call__
    state, call, = self._blocking(request, timeout, metadata, credentials,
  File "/usr/local/lib/python3.8/site-packages/grpc/_channel.py", line 813, in _blocking
    event = call.next_event()
  File "src/python/grpcio/grpc/_cython/_cygrpc/channel.pyx.pxi", line 338, in grpc._cython.cygrpc.SegregatedCall.next_event
  File "src/python/grpcio/grpc/_cython/_cygrpc/channel.pyx.pxi", line 169, in grpc._cython.cygrpc._next_call_event
  File "src/python/grpcio/grpc/_cython/_cygrpc/channel.pyx.pxi", line 163, in grpc._cython.cygrpc._next_call_event
  File "src/python/grpcio/grpc/_cython/_cygrpc/completion_queue.pyx.pxi", line 63, in grpc._cython.cygrpc._latent_event
  File "src/python/grpcio/grpc/_cython/_cygrpc/completion_queue.pyx.pxi", line 42, in grpc._cython.cygrpc._next
KeyboardInterrupt
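One possible direction, sketched under assumptions: since the traceback shows the KeyboardInterrupt surfacing inside the blocking daemon call, the exit message could be written synchronously from the interrupted thread and flushed before the exception propagates, rather than relying on a background pipe thread. The helper name `run_with_exit_message` and the message text are hypothetical and are not part of the existing code; this does not confirm or rule out the GIL hypothesis above.

```python
import sys


def run_with_exit_message(blocking_call, out=None):
    """Run blocking_call and, if Ctrl-C interrupts it, write an exit
    message directly to `out` before re-raising.

    Hypothetical helper for illustration only.
    """
    if out is None:
        out = sys.stderr
    try:
        return blocking_call()
    except KeyboardInterrupt:
        # Write and flush from the interrupted thread itself, so the
        # message does not depend on any other thread being scheduled.
        out.write("═══╡ Stopping experiment...\n")  # hypothetical message
        out.flush()
        raise
```

In this sketch the blocking daemon call (for example, the checkpoint creation seen in the traceback) would be passed in as `blocking_call`. The KeyboardInterrupt is still re-raised, so the process exits as before.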