citus
Conversion to columnar fails with "dmax exceeds max"
This table cannot be converted to columnar storage. It fails on row 8620. However, inserting row 8620 by itself does not cause a failure, so I suspect this is due to some kind of total length problem (the total length of the table is around 0x10000000).
Steps to reproduce:
curl https://trends.tf/static/test.dump.zst | zstdcat | psql -d your_database
psql -d your_database -c "SELECT alter_table_set_access_method('test', 'columnar');"
You should see output from the second command similar to
NOTICE: creating a new table for public.test
NOTICE: moving the data of public.test
ERROR: Memory constraint error: memcpy_s: dmax exceeds max (errno 403)
CONTEXT: SQL statement "INSERT INTO public.test_1771415073 (id,data) SELECT id,data FROM public.test"
This also happens when inserting this data directly into a table created with USING columnar. Reducing chunk_group_row_limit to 8219 causes the conversion to succeed.
I am using PostgreSQL 13 with Citus 10.2.5-1.
For reference, here are the important findings from the other issue. Another workaround is to lower the (default) value of stripe_row_limit.
Stacktrace of the error is this:
#0 ereport_constraint_handler (message=0x7f50ad4e0b95 "memcpy_s: dmax exceeds max", pointer=0x0, error=403) at utils/citus_safe_lib.c:44
#1 0x00007f50ad4873ab in memcpy_s (dest=<optimized out>, dmax=<optimized out>, src=src@entry=0x2e237d8, smax=<optimized out>) at safeclib/memcpy_s.c:114
#2 0x00007f50ad4b2fcd in SerializeSingleDatum (datumTypeAlign=<optimized out>, datumTypeLength=-1, datumTypeByValue=false, datum=48379864,
datumBuffer=0x2d6af28) at ../columnar/columnar_writer.c:562
#3 ColumnarWriteRow (writeState=writeState@entry=0x2b6ed48, columnValues=0x2d75118, columnNulls=0x2bfb7d0) at ../columnar/columnar_writer.c:221
#4 0x00007f50ad4b0ab7 in columnar_tuple_insert (relation=0x7f50bc432a40, slot=0x2bfb728, cid=<optimized out>, options=<optimized out>,
bistate=<optimized out>) at ../columnar/columnar_tableam.c:756
#5 0x000000000067074c in ExecInsert ()
#6 0x0000000000671beb in ExecModifyTable ()
#7 0x00000000006455c2 in standard_ExecutorRun ()
#8 0x00007f50ad440dd2 in CitusExecutorRun (queryDesc=0x2c26b70, direction=ForwardScanDirection, count=0, execute_once=<optimized out>)
at executor/multi_executor.c:233
#9 0x0000000000680ff7 in _SPI_execute_plan ()
#10 0x000000000068152b in SPI_execute ()
#11 0x00007f50ad3fbd85 in ExecuteQueryViaSPI () at commands/alter_table.c:1997
#12 0x00007f50ad3fca6f in ReplaceTable (suppressNoticeMessages=false, justBeforeDropCommands=0x2b56480, targetId=<optimized out>, sourceId=<optimized out>)
at commands/alter_table.c:1562
#13 ConvertTable () at commands/alter_table.c:767
#14 0x00007f50ad3fd829 in AlterTableSetAccessMethod () at commands/alter_table.c:502
#15 0x00007f50ad3fd93b in alter_table_set_access_method (fcinfo=0x2bbc400) at commands/alter_table.c:333
#16 0x000000000064134f in ExecInterpExpr ()
#17 0x00000000006743bf in ExecResult ()
#18 0x00000000006455c2 in standard_ExecutorRun ()
#19 0x00007f50ad440dd2 in CitusExecutorRun (queryDesc=0x2bbfd38, direction=ForwardScanDirection, count=0, execute_once=<optimized out>)
at executor/multi_executor.c:233
#20 0x0000000000680ff7 in _SPI_execute_plan ()
#21 0x0000000000681849 in SPI_execute_plan_with_paramlist ()
#22 0x00007f50a20a27d3 in exec_run_select () from /usr/pgsql-14/lib/plpgsql.so
#23 0x00007f50a20a7e79 in exec_stmts () from /usr/pgsql-14/lib/plpgsql.so
#24 0x00007f50a20ac217 in exec_for_query () from /usr/pgsql-14/lib/plpgsql.so
#25 0x00007f50a20a71ea in exec_stmts () from /usr/pgsql-14/lib/plpgsql.so
#26 0x00007f50a20aa0d2 in exec_stmt_block () from /usr/pgsql-14/lib/plpgsql.so
#27 0x00007f50a20aa12b in exec_toplevel_block.constprop.34 () from /usr/pgsql-14/lib/plpgsql.so
#28 0x00007f50a20aa9e1 in plpgsql_exec_function () from /usr/pgsql-14/lib/plpgsql.so
#29 0x00007f50a20b5c34 in plpgsql_call_handler () from /usr/pgsql-14/lib/plpgsql.so
#30 0x00000000005e75e8 in ExecuteCallStmt ()
#31 0x00000000007beb4e in standard_ProcessUtility ()
#32 0x00007f50ad41a0b0 in multi_ProcessUtility (pstmt=0x29cbcf0,
queryString=0x29ca878 "CALL alter_old_partitions_set_access_method(\n 'ors_result_tracking',\n '2021-12-01 00:00:00' /* older_than */,\n 'columnar'\n);", readOnlyTree=<optimized out>, context=PROCESS_UTILITY_TOPLEVEL, params=0x0, queryEnv=0x0, dest=0x29cbdd0, completionTag=0x7fff66c25810)
at commands/utility_hook.c:238
#33 0x00000000007bd10a in PortalRunUtility ()
#34 0x00000000007bd229 in PortalRunMulti ()
#35 0x00000000007bd689 in PortalRun ()
#36 0x00000000007b95a7 in exec_simple_query ()
#37 0x00000000007baa1b in PostgresMain ()
#38 0x000000000048e663 in ServerLoop ()
#39 0x00000000007378e2 in PostmasterMain ()
#40 0x000000000048f782 in main ()
IMO the obvious way to fix this is to just replace the memcpy_s with memcpy. The "safety" guard against too large of a size is getting in the way of an actual copy of that size.
Yes, I agree the memcpy_s should not be used here, but I think it would be better to replace it with appendBinaryStringInfo/appendBinaryStringInfoNT in this case.
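To illustrate the guard the two comments above are discussing, here is a minimal sketch of a bounded copy in the style of memcpy_s. It is not safeclib's or Citus's code: the function name, the error codes, and the 0x10000000 cap are assumptions chosen to mirror the size mentioned in the report. The point it tries to show is that the check is on the declared destination size (dmax), so a copy of only a few bytes can still be rejected once the destination buffer itself has grown past the cap.

```c
#include <stddef.h>
#include <string.h>

/* Assumed cap on the declared destination size. 0x10000000 (256 MB)
 * mirrors the figure in the report; the real limit depends on the build. */
#define SKETCH_MAX_MEM ((size_t) 0x10000000)

/* Hypothetical error codes for this sketch only. */
enum
{
    SKETCH_OK = 0,
    SKETCH_E_NULL = 1,
    SKETCH_E_DMAX_TOO_BIG = 2,
    SKETCH_E_SRC_TOO_BIG = 3
};

/*
 * Bounded copy in the style of memcpy_s(dest, dmax, src, n).
 * dmax is the capacity the caller claims for dest, n is the number of
 * bytes to copy. dmax is validated before n is even considered.
 */
static int
sketch_bounded_copy(void *dest, size_t dmax, const void *src, size_t n)
{
    if (dest == NULL || src == NULL)
        return SKETCH_E_NULL;

    /* Rejected on the size of the destination alone, even if n is tiny. */
    if (dmax > SKETCH_MAX_MEM)
        return SKETCH_E_DMAX_TOO_BIG;

    /* The usual overflow check: do not write past the destination. */
    if (n > dmax)
        return SKETCH_E_SRC_TOO_BIG;

    memcpy(dest, src, n);
    return SKETCH_OK;
}
```

Under these assumptions, calling sketch_bounded_copy with a dmax of SKETCH_MAX_MEM + 1 returns SKETCH_E_DMAX_TOO_BIG regardless of n, which would be consistent with a failure that depends on how much data has accumulated rather than on the row being inserted. A plain memcpy, or an append helper that manages its own buffer growth, has no such cap on the destination size.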