enso
enso copied to clipboard
Disappearing stack trace frames
Under not-very-clear circumstances (but I've found a reliable repro, I just don't know WHY it's happening), when we get a panic, it is lacking stack frames that should be there. In at least one case this led to a bug that was hard to find because the place where it was being triggered was buried in the missing stack frames.
Repro
This repro is a bit invovled but as for now I could not find a simpler one. Most likely it will only work on Windows, because it relies on the error deleting a file open in another program.
- First apply the following patch:
Index: test/Table_Tests/src/IO/Excel_Spec.enso
===================================================================
diff --git a/test/Table_Tests/src/IO/Excel_Spec.enso b/test/Table_Tests/src/IO/Excel_Spec.enso
--- a/test/Table_Tests/src/IO/Excel_Spec.enso (revision c5bf2384e42058e7a60ffaef77f9284487030ddb)
+++ b/test/Table_Tests/src/IO/Excel_Spec.enso (date 1714145234362)
@@ -120,12 +120,11 @@
clothes = enso_project.data/'clothes.csv' . read
sub_clothes = clothes.select_columns [0, 1]
counter = Ref.new 0
- prefix = "out_" + (Random.integer 1 10000).to_text + "_"
+ prefix = "out_"
Spec_Write_Data.Value [table, clothes, sub_clothes] counter suffix prefix
teardown self =
- enso_project.data/"transient" . list (self.prefix+"*") . each f->
- f.delete . catch Any error-> IO.println "[CLEANUP] Failed to delete "+f.to_text+": "+error.to_display_text
+ enso_project.data/"transient" . list (self.prefix+"*") . each f-> f.delete
create_out self suffix=self.suffix =
i = self.counter.get + 1
- Now create and run the following Enso script:
from Standard.Base import all
from Standard.Table import all
main =
f = File.new "test/Table_Tests/data/transient/out_123.xls"
IO.println <| (Table.new [["X", [1,2,3]]]).write f
opened_workbook = f.read
IO.println opened_workbook.sheet_names
# Now pause execution by opening the REPL to hold the file locked:
Standard.Base.Runtime.Debug.breakpoint
- The script should print a path and
[EnsoSheet]
and open a REPL. Keep that REPL open - we need to keep a file locked. - Now, run
.\test\Table_Tests\src\IO\Excel_Spec.enso
The last command should fail with the following stack trace:
Execution finished with an error: An IO error has occurred: java.nio.file.FileSystemException: X:\NBO\enso\test\Table_Tests\data\transient\out_123.xls: Proces nie może uzyskać dostępu do pliku, ponieważ jest on używany przez inny proces (X:\NBO\enso\test\Table_Tests\data\transient\out_123.xls).
at <enso> Panic.throw(Internal)
at <enso> case_branch(X:\NBO\enso\built-distribution\enso-engine-0.0.0-dev-windows-amd64\enso-0.0.0-dev\lib\Standard\Base\0.0.0-dev\src\Error.enso:138:25-43)
at <enso> <anonymous>(X:\NBO\enso\built-distribution\enso-engine-0.0.0-dev-windows-amd64\enso-0.0.0-dev\lib\Standard\Base\0.0.0-dev\src\Error.enso:137-142)
at <enso> Error.catch_primitive(Internal)
at <enso> Error.catch(X:\NBO\enso\built-distribution\enso-engine-0.0.0-dev-windows-amd64\enso-0.0.0-dev\lib\Standard\Base\0.0.0-dev\src\Error.enso:136-142)
+ at <enso> Panic.type.rethrow(X:\NBO\enso\built-distribution\enso-engine-0.0.0-dev-windows-amd64\enso-0.0.0-dev\lib\Standard\Base\0.0.0-dev\src\Panic.enso:88:21-47)
at <enso> go<arg-2>(X:\NBO\enso\built-distribution\enso-engine-0.0.0-dev-windows-amd64\enso-0.0.0-dev\lib\Standard\Base\0.0.0-dev\src\Data\Range.enso:259:21-52)
- at <enso> go(X:\NBO\enso\built-distribution\enso-engine-0.0.0-dev-windows-amd64\enso-0.0.0-dev\lib\Standard\Base\0.0.0-dev\src\Data\Range.enso:258-260)
- at <enso> null(Internal)
at <enso> <anonymous>(X:\NBO\enso\built-distribution\enso-engine-0.0.0-dev-windows-amd64\enso-0.0.0-dev\lib\Standard\Test\0.0.0-dev\src\Group.enso:39:37-40)
at <enso> case_branch(X:\NBO\enso\built-distribution\enso-engine-0.0.0-dev-windows-amd64\enso-0.0.0-dev\lib\Standard\Test\0.0.0-dev\src\Helpers.enso:36:13-34)
at <enso> Helpers.run_specs_from_group(X:\NBO\enso\built-distribution\enso-engine-0.0.0-dev-windows-amd64\enso-0.0.0-dev\lib\Standard\Test\0.0.0-dev\src\Helpers.enso:26-37)
(...) truncated
I have highlighted red two lines between which we are missing some stack frames. We know that the Panic is encountered in our Spec_Write_Data.teardown
method, and the panic is caused by rethrowing an error returned by a callback within Vector.each
. We can see the Panic.rethrow
(highlighted green), but we don't see Spec_Write_Data.teardown
nor Vector.each
frames.
Now, to prove that these frames should be there, I will modify the code again. Apply the following patch now:
Index: test/Table_Tests/src/IO/Excel_Spec.enso
===================================================================
diff --git a/test/Table_Tests/src/IO/Excel_Spec.enso b/test/Table_Tests/src/IO/Excel_Spec.enso
--- a/test/Table_Tests/src/IO/Excel_Spec.enso (revision 36be3ec7b0f4bd45b4f20e6584a9d6bf46636351)
+++ b/test/Table_Tests/src/IO/Excel_Spec.enso (date 1714146286800)
@@ -124,7 +124,8 @@
Spec_Write_Data.Value [table, clothes, sub_clothes] counter suffix prefix
teardown self =
- enso_project.data/"transient" . list (self.prefix+"*") . each f-> f.delete
+ enso_project.data/"transient" . list (self.prefix+"*") . each f->
+ Error.throw (Illegal_Argument.Error "hi there!")
create_out self suffix=self.suffix =
i = self.counter.get + 1
We essentially replace the File
error with our own thrown dataflow error. Now run the test/Table_Tests/src/IO/Excel_Spec.enso
again. This time the stack trace is following:
Execution finished with an error: Illegal Argument: hi there!
at <enso> Panic.throw(Internal)
at <enso> case_branch(X:\NBO\enso\built-distribution\enso-engine-0.0.0-dev-windows-amd64\enso-0.0.0-dev\lib\Standard\Base\0.0.0-dev\src\Error.enso:138:25-43)
at <enso> <anonymous>(X:\NBO\enso\built-distribution\enso-engine-0.0.0-dev-windows-amd64\enso-0.0.0-dev\lib\Standard\Base\0.0.0-dev\src\Error.enso:137-142)
at <enso> Error.catch_primitive(Internal)
at <enso> Error.catch(X:\NBO\enso\built-distribution\enso-engine-0.0.0-dev-windows-amd64\enso-0.0.0-dev\lib\Standard\Base\0.0.0-dev\src\Error.enso:136-142)
at <enso> Panic.type.rethrow(X:\NBO\enso\built-distribution\enso-engine-0.0.0-dev-windows-amd64\enso-0.0.0-dev\lib\Standard\Base\0.0.0-dev\src\Panic.enso:88:21-47)
at <enso> go<arg-2>(X:\NBO\enso\built-distribution\enso-engine-0.0.0-dev-windows-amd64\enso-0.0.0-dev\lib\Standard\Base\0.0.0-dev\src\Data\Range.enso:259:21-52)
- at <enso> go(X:\NBO\enso\built-distribution\enso-engine-0.0.0-dev-windows-amd64\enso-0.0.0-dev\lib\Standard\Base\0.0.0-dev\src\Data\Range.enso:258-260)
+ at <enso> Range.each<arg-2>(X:\NBO\enso\built-distribution\enso-engine-0.0.0-dev-windows-amd64\enso-0.0.0-dev\lib\Standard\Base\0.0.0-dev\src\Data\Range.enso:261:13-25)
+ at <enso> Range.each(X:\NBO\enso\built-distribution\enso-engine-0.0.0-dev-windows-amd64\enso-0.0.0-dev\lib\Standard\Base\0.0.0-dev\src\Data\Range.enso:255-261)
+ at <enso> Array_Like_Helpers.each(X:\NBO\enso\built-distribution\enso-engine-0.0.0-dev-windows-amd64\enso-0.0.0-dev\lib\Standard\Base\0.0.0-dev\src\Internal\Array_Like_Helpers.enso:282-283)
+ at <enso> Vector.each(X:\NBO\enso\built-distribution\enso-engine-0.0.0-dev-windows-amd64\enso-0.0.0-dev\lib\Standard\Base\0.0.0-dev\src\Data\Vector.enso:785:9-38)
+ at <enso> Spec_Write_Data.teardown(Excel_Spec.enso:127-128)
+ at <enso> <anonymous><arg-1>(Excel_Spec.enso:149:13-25)
- at <enso> null(Internal)
at <enso> <anonymous>(X:\NBO\enso\built-distribution\enso-engine-0.0.0-dev-windows-amd64\enso-0.0.0-dev\lib\Standard\Test\0.0.0-dev\src\Group.enso:39:37-40)
at <enso> case_branch(X:\NBO\enso\built-distribution\enso-engine-0.0.0-dev-windows-amd64\enso-0.0.0-dev\lib\Standard\Test\0.0.0-dev\src\Helpers.enso:36:13-34)
at <enso> Helpers.run_specs_from_group(X:\NBO\enso\built-distribution\enso-engine-0.0.0-dev-windows-amd64\enso-0.0.0-dev\lib\Standard\Test\0.0.0-dev\src\Helpers.enso:26-37)
(...) truncated
I have highlighted green frames that are now appearing in the stack trace (and red the same frames as in the previous trace). In both scenarios the panic is caused by Panic.rethrow
rethrowing a dataflow error returned from a callback in Vector.each
. So knowing how that works, we know that this part of the stack trace should be the same in both scenarios. Yet, in this scenario we see these frames, but in the one above - we do not.
What makes this worst is that the <enso> Spec_Write_Data.teardown(Excel_Spec.enso:127-128)
frame got dropped making the bug above pretty hard to debug.
The null(Internal)
frame also looks a little bit suspicious.
Seems like this may be related to tail call optimizations throwing away some stack frames, tracked in https://github.com/enso-org/enso/issues/9251.
Another related issue might be #8153