Conversion to columnar fails with "dmax exceeds max"
This table cannot be converted to columnar storage. The conversion fails on row 8620. However, inserting row 8620 by itself does not cause a failure, so I suspect the problem depends on the cumulative size of the data rather than on that row alone (the total length of the table is around 0x10000000, i.e. 256 MiB).
Steps to reproduce:
curl https://trends.tf/static/test.dump.zst | zstdcat | psql -d your_database
psql -d your_database -c "SELECT alter_table_set_access_method('test', 'columnar');"
You should see output from the second command similar to:
NOTICE: creating a new table for public.test
NOTICE: moving the data of public.test
ERROR: Memory constraint error: memcpy_s: dmax exceeds max (errno 403)
CONTEXT: SQL statement "INSERT INTO public.test_1771415073 (id,data) SELECT id,data FROM public.test"
This also happens when inserting this data directly into a table created with USING columnar. Reducing chunk_group_row_limit to 8219 causes the conversion to succeed.
I am using PostgreSQL 13 with Citus 10.2.5-1.
For reference, here are the important findings from the other issue. Another workaround is to lower the (default) value of stripe_row_limit.
The stack trace of the error is:
#0 ereport_constraint_handler (message=0x7f50ad4e0b95 "memcpy_s: dmax exceeds max", pointer=0x0, error=403) at utils/citus_safe_lib.c:44
#1 0x00007f50ad4873ab in memcpy_s (dest=<optimized out>, dmax=<optimized out>, src=src@entry=0x2e237d8, smax=<optimized out>) at safeclib/memcpy_s.c:114
#2 0x00007f50ad4b2fcd in SerializeSingleDatum (datumTypeAlign=<optimized out>, datumTypeLength=-1, datumTypeByValue=false, datum=48379864,
datumBuffer=0x2d6af28) at ../columnar/columnar_writer.c:562
#3 ColumnarWriteRow (writeState=writeState@entry=0x2b6ed48, columnValues=0x2d75118, columnNulls=0x2bfb7d0) at ../columnar/columnar_writer.c:221
#4 0x00007f50ad4b0ab7 in columnar_tuple_insert (relation=0x7f50bc432a40, slot=0x2bfb728, cid=<optimized out>, options=<optimized out>,
bistate=<optimized out>) at ../columnar/columnar_tableam.c:756
#5 0x000000000067074c in ExecInsert ()
#6 0x0000000000671beb in ExecModifyTable ()
#7 0x00000000006455c2 in standard_ExecutorRun ()
#8 0x00007f50ad440dd2 in CitusExecutorRun (queryDesc=0x2c26b70, direction=ForwardScanDirection, count=0, execute_once=<optimized out>)
at executor/multi_executor.c:233
#9 0x0000000000680ff7 in _SPI_execute_plan ()
#10 0x000000000068152b in SPI_execute ()
#11 0x00007f50ad3fbd85 in ExecuteQueryViaSPI () at commands/alter_table.c:1997
#12 0x00007f50ad3fca6f in ReplaceTable (suppressNoticeMessages=false, justBeforeDropCommands=0x2b56480, targetId=<optimized out>, sourceId=<optimized out>)
at commands/alter_table.c:1562
#13 ConvertTable () at commands/alter_table.c:767
#14 0x00007f50ad3fd829 in AlterTableSetAccessMethod () at commands/alter_table.c:502
#15 0x00007f50ad3fd93b in alter_table_set_access_method (fcinfo=0x2bbc400) at commands/alter_table.c:333
#16 0x000000000064134f in ExecInterpExpr ()
#17 0x00000000006743bf in ExecResult ()
#18 0x00000000006455c2 in standard_ExecutorRun ()
#19 0x00007f50ad440dd2 in CitusExecutorRun (queryDesc=0x2bbfd38, direction=ForwardScanDirection, count=0, execute_once=<optimized out>)
at executor/multi_executor.c:233
#20 0x0000000000680ff7 in _SPI_execute_plan ()
#21 0x0000000000681849 in SPI_execute_plan_with_paramlist ()
#22 0x00007f50a20a27d3 in exec_run_select () from /usr/pgsql-14/lib/plpgsql.so
#23 0x00007f50a20a7e79 in exec_stmts () from /usr/pgsql-14/lib/plpgsql.so
#24 0x00007f50a20ac217 in exec_for_query () from /usr/pgsql-14/lib/plpgsql.so
#25 0x00007f50a20a71ea in exec_stmts () from /usr/pgsql-14/lib/plpgsql.so
#26 0x00007f50a20aa0d2 in exec_stmt_block () from /usr/pgsql-14/lib/plpgsql.so
#27 0x00007f50a20aa12b in exec_toplevel_block.constprop.34 () from /usr/pgsql-14/lib/plpgsql.so
#28 0x00007f50a20aa9e1 in plpgsql_exec_function () from /usr/pgsql-14/lib/plpgsql.so
#29 0x00007f50a20b5c34 in plpgsql_call_handler () from /usr/pgsql-14/lib/plpgsql.so
#30 0x00000000005e75e8 in ExecuteCallStmt ()
#31 0x00000000007beb4e in standard_ProcessUtility ()
#32 0x00007f50ad41a0b0 in multi_ProcessUtility (pstmt=0x29cbcf0,
queryString=0x29ca878 "CALL alter_old_partitions_set_access_method(\n 'ors_result_tracking',\n '2021-12-01 00:00:00' /* older_than */,\n 'columnar'\n);", readOnlyTree=<optimized out>, context=PROCESS_UTILITY_TOPLEVEL, params=0x0, queryEnv=0x0, dest=0x29cbdd0, completionTag=0x7fff66c25810)
at commands/utility_hook.c:238
#33 0x00000000007bd10a in PortalRunUtility ()
#34 0x00000000007bd229 in PortalRunMulti ()
#35 0x00000000007bd689 in PortalRun ()
#36 0x00000000007b95a7 in exec_simple_query ()
#37 0x00000000007baa1b in PostgresMain ()
#38 0x000000000048e663 in ServerLoop ()
#39 0x00000000007378e2 in PostmasterMain ()
#40 0x000000000048f782 in main ()
IMO the obvious way to fix this is to just replace the memcpy_s with memcpy. The "safety" guard against too large of a size is getting in the way of an actual copy of that size.
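To illustrate how a size guard can reject a copy that is itself small, here is a minimal, self-contained sketch. It does not use safeclib or Citus code: `toy_memcpy_s`, `TOY_RSIZE_MAX_MEM`, and the return codes are hypothetical stand-ins. The 256 MiB limit is an assumption based on safeclib's default `RSIZE_MAX_MEM`, which happens to equal the 0x10000000 figure from the report. The sketch also assumes that `dmax` is the stated capacity of the destination buffer rather than the number of bytes being copied, which would explain why row 8620 succeeds on its own but fails once the buffer has grown.

```c
#include <stddef.h>
#include <string.h>

/* Assumed limit: 256 MiB (0x10000000), mirroring safeclib's default. */
#define TOY_RSIZE_MAX_MEM ((size_t) 256 << 20)

/* Hypothetical result codes, used only in this sketch. */
enum
{
    TOY_OK = 0,
    TOY_DMAX_EXCEEDS_MAX = 1,
    TOY_SMAX_EXCEEDS_DMAX = 2
};

/*
 * Toy stand-in for a bounds-checked copy. The first check looks only at
 * dmax (the claimed capacity of dest), so a large destination buffer is
 * rejected no matter how few bytes (smax) are actually copied.
 */
static int
toy_memcpy_s(void *dest, size_t dmax, const void *src, size_t smax)
{
    if (dmax > TOY_RSIZE_MAX_MEM)
        return TOY_DMAX_EXCEEDS_MAX;

    if (smax > dmax)
        return TOY_SMAX_EXCEEDS_DMAX;

    memcpy(dest, src, smax);
    return TOY_OK;
}
```

With this model, copying 4 bytes with `dmax` set to `TOY_RSIZE_MAX_MEM` succeeds, while the same 4-byte copy with `dmax` set to `TOY_RSIZE_MAX_MEM + 1` is rejected before any bytes are moved.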
Yes, I agree that memcpy_s should not be used here, but I think it would be better to replace it with appendBinaryStringInfo/appendBinaryStringInfoNT in this case.
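As a rough sketch of the pattern behind that suggestion, the code below appends bytes to a growable buffer by first ensuring capacity and then doing a plain `memcpy`. It is not PostgreSQL's implementation: `ToyBuf`, `toy_enlarge`, and `toy_append_binary` are hypothetical names, and it uses `realloc` where PostgreSQL would use its own memory contexts and error reporting. The trailing NUL byte mirrors what `appendBinaryStringInfo` does; the `NT` variant skips it.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical growable buffer, loosely modelled on a StringInfo. */
typedef struct ToyBuf
{
    char   *data;    /* allocated storage, or NULL when empty */
    size_t  len;     /* bytes currently in use */
    size_t  maxlen;  /* bytes allocated */
} ToyBuf;

/* Grow the buffer (by doubling) so that `needed` more bytes plus a NUL fit. */
static int
toy_enlarge(ToyBuf *buf, size_t needed)
{
    size_t  required = buf->len + needed + 1;
    size_t  newlen = buf->maxlen ? buf->maxlen : 64;
    char   *p;

    if (required <= buf->maxlen)
        return 0;

    /* Overflow of newlen is ignored here for brevity. */
    while (newlen < required)
        newlen *= 2;

    p = realloc(buf->data, newlen);
    if (p == NULL)
        return -1;

    buf->data = p;
    buf->maxlen = newlen;
    return 0;
}

/* Append n bytes from src; the capacity check happens before the copy. */
static int
toy_append_binary(ToyBuf *buf, const void *src, size_t n)
{
    if (toy_enlarge(buf, n) != 0)
        return -1;

    memcpy(buf->data + buf->len, src, n);
    buf->len += n;
    buf->data[buf->len] = '\0';
    return 0;
}
```

Because the append routine owns both the growth and the copy, the caller never has to pass a separate capacity argument that a guard could reject.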