Hi,
Any suggestions on how to debug this? A few hundred of these calls succeeded before this one.
Thanks,
Marcus
Breakpoint 1, 0x00002100030f0668 in query_template ()
1: x/i $pc
=> 0x2100030f0668 <query_template+104>: ld r20,16(r3)
#0 0x00002100030f0668 in query_template ()
#1 0x00002100030f0298 in multifrag_query_hoisted_literals ()
#2 0x00000000126aca20 in QueryExecutionContext::launchCpuCode(RelAlgExecutionUnit const&, CpuCompilationContext const*, bool, std::vector<signed char, std::allocator<signed char> > const&, std::vector<std::vector<signed char const*, std::allocator<signed char const*> >, std::allocator<std::vector<signed char const*, std::allocator<signed char const*> > > >, std::vector<std::vector<long, std::allocator<long> >, std::allocator<std::vector<long, std::allocator<long> > > > const&, std::vector<std::vector<unsigned long, std::allocator<unsigned long> >, std::allocator<std::vector<unsigned long, std::allocator<unsigned long> > > > const&, int, int*, unsigned int, std::vector<long, std::allocator<long> > const&) (this=0x2103282a0200, ra_exe_unit=..., native_code=0x2100183772e0,
hoist_literals=true, literal_buff=..., col_buffers=..., num_rows=..., frag_offsets=..., scan_limit=0, error_code=0x2104e5d7a74c, num_tables=1, join_hash_tables=...)
at /home/mgd/src/omniscidb/QueryEngine/QueryExecutionContext.cpp:684
#3 0x0000000012326cb4 in Executor::executePlanWithoutGroupBy(RelAlgExecutionUnit const&, CompilationResult const&, bool, std::shared_ptr<ResultSet>&, std::vector<Analyzer::Expr*, std::allocator<Analyzer::Expr*> > const&, ExecutorDeviceType, std::vector<std::vector<signed char const*, std::allocator<signed char const*> >, std::allocator<std::vector<signed char const*, std::allocator<signed char const*> > > >&, QueryExecutionContext*, std::vector<std::vector<long, std::allocator<long> >, std::allocator<std::vector<long, std::allocator<long> > > > const&, std::vector<std::vector<unsigned long, std::allocator<unsigned long> >, std::allocator<std::vector<unsigned long, std::allocator<unsigned long> > > > const&, Data_Namespace::DataMgr*, int, unsigned int, unsigned int, bool, RenderInfo*)
(this=0x210018087d90, ra_exe_unit=..., compilation_result=..., hoist_literals=true, results=..., target_exprs=..., device_type=CPU, col_buffers=..., query_exe_context=0x2103282a0200,
num_rows=..., frag_offsets=..., data_mgr=0x1cf40600, device_id=0, start_rowid=0, num_tables=1, allow_runtime_interrupt=true, render_info=0x0)
at /home/mgd/src/omniscidb/QueryEngine/Execute.cpp:2864
Hi @MarcusGDaniels,
There are a few things you can try. To help us understand the query, try running it again in verbose mode (--verbose on the command line, or verbose = true in the config file). The snippets to look for in the log are the RelAlgExecutionUnit serialization (which tells us the shape of the query step that failed) and the QueryMemoryDescriptor output (which shows memory layout, buffer sizes, etc.). We can look for red flags there, such as buffers that are too small or strange indices.
You can also run the SQL prefixed with the explain command to grab the LLVM IR and look for abnormalities there, but that's pretty difficult without the context above.
Finally, if it's a multi-step query, explain plan shows all the steps, so we can see whether the failure is in the first step or in a later one (which is more likely).
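For reference, the two explain forms mentioned above would look roughly like this. The table and column names (`flights`, `arrdelay`) are placeholders and not taken from your query, so substitute the statement that actually crashes.

```sql
-- Hypothetical query: replace it with the statement that triggers the crash.
-- Plain EXPLAIN returns the generated LLVM IR for the query.
EXPLAIN SELECT COUNT(*) FROM flights WHERE arrdelay > 15;

-- EXPLAIN PLAN lists the execution steps, which helps for multi-step queries.
EXPLAIN PLAN SELECT COUNT(*) FROM flights WHERE arrdelay > 15;
```

Posting both outputs together with the verbose log makes it easier to match the failing step to its memory layout.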
I found the error went away when I went back to the 5.5.2 tarball release instead of building the master branch.