cog
cog copied to clipboard
Print errors from type signature
On M1 Mac, the build process was hanging forever because an error was eaten and never displayed. This pull request outputs stdout and stderr from the container when generating the type signature. It makes the output a bit messier if everything works OK, but messy is better than broken.
To make that work, I had to amend the change in #332 to redirect output to stderr instead of suppressing it completely.
@andreasjansson Look good to you? Will wait for your approval, because you did previous PR :)
The test from #332 remains. Do you mean some kind of unit test?
@bfirsh Some kind of test that fails at import time and outputs some log messages, which are then displayed by cog. Basically just some way of reproducing the tensorflow issue you ran into, in isolation.
Gotchya. Testing the specific issue I'm running to will be rather hard because it's only on M1 Macs under QEMU. The integration test in #332 covers messages at import time, I think, so that seems like a reasonable test for this case.
I've also added a broken unit test. Either stdout_redictor is broken, my test is broken, or the integration test from #332 is broken.
As an aside: we could probably make a better integration test for this. As far as I can see the test in #332 is included in the default project for most of the integration tests, so will break all of the integration tests if the capturing isn't working correctly. It might be better as an isolated test that just tests messages at import time.
Yeah another isolated test would be nice. I put it in the common test project to make sure stdout suppression works everywhere. It's a bit of a blunt tool, but it potentially catches edge cases that a single isolated test would miss.
I was thinking we could emulate import-time failures by just raising an exception or importing a non-existing package, to make sure we handle and display errors during the type signature parsing step.
Discussed IRL with @zeke: we realized that if the type signature (now "schema") was fetched over HTTP (with some mechanism to stop it from running setup()) then this would fix this entire class of problem. It would also neatly mean the mechanism for fetching the schema was in one place instead of two.
Is this still relevant? 🤔
Yep. Needs rebasing and fixing/writing the test. It's not urgent/important, I think.