ChakraCore
ChakraCore copied to clipboard
Potential Clang 12 miscompilation
OS: Ubuntu 22.04 x86-64 Compiler: Clang ~14.0.0~12.0.1 CMake build type: RelWithDebInfo
Bytecode generation for verification segfaults:
$ ./ch -GenerateLibraryByteCodeHeader -LdChakraLib -JsBuiltIn /path/to/ChakraCore/tools/../lib/Runtime/Library/InJavascript/Array_prototype.js
Segmentation fault (core dumped)
The crash comes from code in ByteCodeSerializer.cpp, though it might also be related to ImmutableList, or something else.
$ gdb --args ./ch -GenerateLibraryByteCodeHeader -LdChakraLib -JsBuiltIn /local/petr/ChakraCore/tools/../lib/Runtime/Library/InJavascript/Array_prototype.js
GNU gdb (Ubuntu 12.0.90-0ubuntu1) 12.0.90
Copyright (C) 2022 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "x86_64-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<https://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
<http://www.gnu.org/software/gdb/documentation/>.
For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from ./ch...
(gdb) r
Starting program: /local/petr/ChakraCore/build-test/ch -GenerateLibraryByteCodeHeader -LdChakraLib -JsBuiltIn /local/petr/ChakraCore/tools/../lib/Runtime/Library/InJavascript/Array_prototype.js
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
[New Thread 0x7ff7f43cf640 (LWP 5873)]
[New Thread 0x7ff7f3b9f640 (LWP 5874)]
[New Thread 0x7ff7f339e640 (LWP 5875)]
Thread 1 "ch" received signal SIGSEGV, Segmentation fault.
regex::ImmutableList<Js::BufferBuilder*>::ReverseCurrentList (this=<optimized out>) at /local/petr/ChakraCore/lib/Common/DataStructures/ImmutableList.h:387
387 auto next = current->next;
(gdb) bt
#0 regex::ImmutableList<Js::BufferBuilder*>::ReverseCurrentList (this=<optimized out>)
at /local/petr/ChakraCore/lib/Common/DataStructures/ImmutableList.h:387
#1 Js::ByteCodeBufferBuilder::AddFunction (this=0x7fffffffd9f0, this@entry=0x7fffffffdd48,
builder=..., function=0x7ff7f2b65380, srcInfo=srcInfo@entry=0x7ff7f3ba1e40, cache=cache@entry=0x0)
at /local/petr/ChakraCore/lib/Runtime/ByteCode/ByteCodeSerializer.cpp:2440
Failure doesn't exist in Debug mode. I've tried to build some files with Debug, but that leads to undefined symbols (as more inlining removes some of the calls), maybe I'll have some more success with Release mode. ~Worth looking into earlier Clang versions, we don't have this failure on Ubuntu 20, where package manager doesn't have Clang 14.~ We didn't have this on Ubuntu 20, which has Clang 10 in the package repo. I can narrow down it to Clang 12 vs Clang 11, latter works.
There's a change to the behaviour of auto in an update to C++ standards - C++ proposal doc here: https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2014/n3922.html I think this may have been implemented in Clang 12 and could be our issue.
In a lot of places in ImmutableList.h it looks like auto is used instead of ImmutableList<T>* simply to look neater - I wonder if replacing some of these could fix this?
Good point. After looking at the change it seems like it should only apply to auto initializers with {} init list. Might still be worth checking out, also anything else in Clang 12 that might look suspect.
Update: getting rid of auto in ImmutableList doesn't help. The segfault happens in ReverseCurrentList because current is null. It looks like the execution of the function goes for some time before it hits that.
I'd missed the update here hence tried the same - I was hoping it was something simple...
Though it is strange if current == null and we're hitting line 387 (the current->next) access where the crash hits as the loop control that guards that should be equivalent to !current == nullptr could null and nullptr be comparing as different? That seems unlikely - and I figure that would cause problems before here.
I need to look a bit closer at the spot where it is used, maybe there are some clues there. We also build with some extra flags, though I am not sure if those should lead to such an issue.
Try compiling with the newer clang on OSX as well, you might be able to reproduce it.
Changing the loop control to while (current != nullptr && current != NULL) did not help, so my unlikely idea was indeed not the issue.
I'm having a go at install a newer clang on my mac - never anything but the system version before. EDIT: using non-apple clang on macOS (to get a newer version) seems like a non-starter, way too many problems introduced.
I expected to hit whatever this issue is when building on Apple Silicon with newer macOS and newer Clang - but I've not hit it.
Unfortunately trying to build for macOS with non-apple Clang just does not work - Apple makes certain changes to the versions of Clang they release and working without them is no good.
I will endeavour to try with an Ubuntu VM shortly.
Unfortunately trying to build for macOS with non-apple Clang just does not work - Apple makes certain changes to the versions of Clang they release and working without them is no good.
Is it just incompatibility of the standard library or non-Apple Clang is that far behind in supporting the processor?
@rhuanjl
Changing the loop control to
while (current != nullptr && current != NULL)did not help, so my unlikely idea was indeed not the issue.
From playing with Compiler-Explorer and gdb: Might the IsEmpty function just be optimized away?
Compiler-Explorer
gdb output
Line 385 is only a single jmp?!
(gdb) step
385 while(!current->IsEmpty())
=> 0x00007ffff4d1cfbe <_ZN2Js21ByteCodeBufferBuilder11AddFunctionERNS_17BufferBuilderListEPNS_21ParseableFunctionInfoEPK7SRCINFOPNS_13ByteCodeCacheE+3550>: eb f0 jmp 0x7ffff4d1cfb0 <_ZN2Js21ByteCodeBufferBuilder11AddFunctionERNS_17BufferBuilderListEPNS_21ParseableFunctionInfoEPK7SRCINFOPNS_13ByteCodeCacheE+3536>
(gdb) continue
Continuing.
Thread 1 "ch" received signal SIGSEGV, Segmentation fault.
regex::ImmutableList<Js::BufferBuilder*>::ReverseCurrentList (this=<optimized out>) at /home/lukas/ChakraCore/lib/Common/DataStructures/ImmutableList.h:387
387 auto next = current->next;
=> 0x00007ffff4d1cfb3 <_ZN2Js21ByteCodeBufferBuilder11AddFunctionERNS_17BufferBuilderListEPNS_21ParseableFunctionInfoEPK7SRCINFOPNS_13ByteCodeCacheE+3539>: 48 8b 40 08 mov 0x8(%rax),%rax
(gdb) print current
$1 = (regex::ImmutableList<Js::BufferBuilder*> *) 0x0
A potential fix (worked for me):
diff --git a/lib/Common/DataStructures/ImmutableList.h b/lib/Common/DataStructures/ImmutableList.h
index 6b53d5e..0a2ad10 100644
--- a/lib/Common/DataStructures/ImmutableList.h
+++ b/lib/Common/DataStructures/ImmutableList.h
@@ -748,10 +748,12 @@ namespace regex
return result;
}
+ bool __attribute__((noinline)) CheckNull( void* obj ) { return obj == nullptr; }
+
// Info: Return true if the list is empty.
bool IsEmpty()
{
- return this==Empty();
+ return CheckNull(this);
}
Lovely, I think you are absolutely right @ShortDevelopment, it is removing undefined behavior just like the reddit post describes. We probably need to re-think this code, as I am sure future versions of Clang/GCC will find new creative ways to slap us on the wrist here.
After experimenting a bit, this would fail to compile in MSVC, as it won't take noinline attribute (also MSVC doesn't remove that code at the moment anyway), but overall the workaround looks OK.
In the long run, there would be a small penalty for using a non-inlineable function, plus it is still technically undefined behavior. I can try to think what a refactoring of this class would look like to avoid the situation.
I think the easiest solution is to remove the IsEmpty function and replace it with explicit nullptr-checks.
ImmutableList* ptr = nullptr;
- if(ptr->IsEmpty())
+ if(nullptr == ptr)
To future proof the code and avoid any submarine bugs we ought to delete IsEmpty, and re-factor throughout.
I note however that in a couple of places this may require changes at function call sites, currently several Methods in that class are designed to be able to be called on a nullptr OR a valid item; in light of modern C++ optimising away this==nullptr checks I think we should refactor and check we're not calling any methods on nullptrs.
@ppenzin Maybe rename this to
Refactor
ImmutableList
After thinking about this for a bit, I think we can make IsEmpty a static function - it would still be able to do nullptr checks, though not from within a class on instance pointer. Instead of x.isEmpty it would be isEmpty(x). And it would still be inlineable.
I'd be keen to delete half the file if we could; I had a look through the codebase, Immutable List is used in only 2 places:
- ByteCodeSerialiser.cpp which uses the methods: Prepend and ReverseCurrentList AND
- BufferBuilder.h which uses the methods: Accumulate and Iterate
Other than those 4 methods (and parts used internally) much of the file is unused.
It looks to me like quite a re-factor to remove IsEmpty or make it static, though hopefully it's easier than it looks. If you have a go, please cherry-pick the below 2 commits to sort the CI: https://github.com/chakra-core/ChakraCore/pull/6980/commits/4bae197849d8f597c3781ffec25d18aaf931ef29 and https://github.com/chakra-core/ChakraCore/pull/6980/commits/61c011a568d1659cae158ed4ab57a7f4a67ce0a7
When refactoring ImmutableList we could
- use a balanced binary-tree instead of the current linked-list approach to gain performance and
- make
ImmutableLista wrapper around an internalNodepointer that could be checked againstnullptr.
See: .NET Framework Immutable Collections and ImmutableList.Node in .NET BCL
When refactoring
ImmutableListwe could
- use a balanced binary-tree instead of the current linked-list approach to gain performance and
- make
ImmutableLista wrapper around an internalNodepointer that could be checked againstnullptr.See: .NET Framework Immutable Collections and
ImmutableList.Nodein .NET BCL
Considering the limited ways Immutable Lists are used in CC I'm not sure if a balanced binary tree would actually optimise anything; it's filled, reversed and iterated and that's basically it; main standout to me about the current implantation is that it's overly complex and lengthy for what it's used for - if we're cleaning up the code would like to see half of the file deleted...
If the list actually was "immutable" it would have to make a copy on every append.
https://github.com/chakra-core/ChakraCore/blob/2af598f04ab508f9231d6e26f0f82f5a57561413/lib/Common/DataStructures/ImmutableList.h#L42
Deleting complex and unused code sounds like a way better approach than implementing a binary-tree though 😄.
Yeah, immutable list modifying on append is pretty hilarious. This is basically a C linked list written with C++ classes.