
Session Crashes in Colab & Jupyter Lab

Open SumanSudhir opened this issue 5 years ago • 2 comments

I have noticed that whenever the input to a function doesn't satisfy its precondition, the session crashes in both Google Colab and Swift-Jupyter Lab. Here is the reference notebook for the same

Screenshot from 2020-01-15 17-17-54

SumanSudhir, Jan 15 '20 11:01

This is a fundamental problem with how swift-jupyter is implemented, and we don't have any short-term solutions to it. I'll keep this open and add an "open-design-questions" label because it would be nice to solve this problem eventually.

What is happening is that these precondition failures crash the program, putting it in an undefined state. LLDB (the thing that actually powers swift-jupyter's Swift compilation and execution) tries to recover by resetting the program to a reasonable state, but it doesn't always succeed.

We brainstormed some ideas for fixing this, but none of the good ones will be easy to do any time soon:

  • Make the Tensor APIs return garbage (e.g. Tensor(0)) when there is an error, instead of crashing the program. This would be easy to implement, but it's not good to return results that look successful when there was an error.
  • Turn the Tensor type into an enum with an error case, and return that on errors. This might be a good solution but it's a pretty significant API change and will have lots of consequences to deal with.
  • Change all the Tensor APIs to throwing functions, so that Swift's native error handling mechanism can deal with errors. This is probably bad, because we don't want to force users to write try in front of all their tensor operations.
  • Add an exception mechanism to Swift that can properly unwind the stack on errors. This is a nice solution, but it's lots of work and it's unlikely that Swift would accept this feature, because Swift already has its own non-exception-based error handling mechanism.
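
To make the trade-off between the throwing option and the error-carrying-value option concrete, here is a minimal sketch. It does not use the real Tensor API: `Vec`, `ShapeError`, `addThrowing`, and `addChecked` are hypothetical names invented for illustration, and `Vec` is a toy stand-in that only checks element counts.

```swift
// Hypothetical stand-in for a tensor type; NOT the real swift-apis `Tensor`.
struct Vec: Equatable {
    var scalars: [Float]
}

// Hypothetical error type describing a violated shape precondition.
enum ShapeError: Error, Equatable {
    case mismatch(lhs: Int, rhs: Int)
}

// Throwing style: the caller must write `try`, but a bad input becomes a
// recoverable Swift error instead of a trap that kills the process.
func addThrowing(_ a: Vec, _ b: Vec) throws -> Vec {
    guard a.scalars.count == b.scalars.count else {
        throw ShapeError.mismatch(lhs: a.scalars.count, rhs: b.scalars.count)
    }
    return Vec(scalars: zip(a.scalars, b.scalars).map { $0 + $1 })
}

// Error-carrying value style: no `try` at the call site, but every result
// has to be unwrapped before it can be used.
func addChecked(_ a: Vec, _ b: Vec) -> Result<Vec, ShapeError> {
    guard a.scalars.count == b.scalars.count else {
        return .failure(.mismatch(lhs: a.scalars.count, rhs: b.scalars.count))
    }
    return .success(Vec(scalars: zip(a.scalars, b.scalars).map { $0 + $1 }))
}

let lhs = Vec(scalars: [1, 2, 3])
let rhs = Vec(scalars: [1, 2])

// Throwing style at the call site.
do {
    let sum = try addThrowing(lhs, rhs)
    print(sum.scalars)
} catch {
    print("recoverable error: \(error)")
}

// Error-carrying value style at the call site.
switch addChecked(lhs, rhs) {
case .success(let sum):
    print(sum.scalars)
case .failure(let error):
    print("recoverable error: \(error)")
}
```

In both variants the mismatched input is reported without trapping, so a notebook session would survive it. The cost shows up at every call site, which is the ergonomic concern raised in the list above.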

Related discussion: https://forums.swift.org/t/force-unwrapping-try-and-fatalerror-in-the-lldb-repl-cause-memory-leaks/20823

marcrasi, Jan 22 '20 22:01

I made a post on the SIG a couple of weeks ago: https://groups.google.com/a/tensorflow.org/forum/m/#!topic/swift/uACgFQJqzEI. These issues might be related.

rickwierenga, Feb 04 '20 18:02