
Conversion to columnar fails with "dmax exceeds max"

Open Forty-Bot opened this issue 2 years ago • 3 comments

This table cannot be converted to columnar storage. It fails on row 8620. However, inserting row 8620 by itself does not cause a failure, so I suspect this is due to some kind of total length problem (the total length of the table is around 0x10000000).

Steps to reproduce:

curl https://trends.tf/static/test.dump.zst | zstdcat | psql -d your_database
psql -d your_database -c "SELECT alter_table_set_access_method('test', 'columnar');"

The second command should produce output similar to:

NOTICE:  creating a new table for public.test
NOTICE:  moving the data of public.test
ERROR:  Memory constraint error: memcpy_s: dmax exceeds max (errno 403)
CONTEXT:  SQL statement "INSERT INTO public.test_1771415073 (id,data) SELECT id,data FROM public.test"

This also happens when inserting this data directly into a table created with USING columnar. Reducing chunk_group_row_limit to 8219 causes the conversion to succeed.

I am using PostgreSQL 13 with Citus 10.2.5-1.

Forty-Bot avatar May 01 '22 02:05 Forty-Bot

For reference, here are the important findings from the other issue. Another workaround is to lower the (default) value of stripe_row_limit.

Stacktrace of the error is this:

#0  ereport_constraint_handler (message=0x7f50ad4e0b95 "memcpy_s: dmax exceeds max", pointer=0x0, error=403) at utils/citus_safe_lib.c:44
#1  0x00007f50ad4873ab in memcpy_s (dest=<optimized out>, dmax=<optimized out>, src=src@entry=0x2e237d8, smax=<optimized out>) at safeclib/memcpy_s.c:114
#2  0x00007f50ad4b2fcd in SerializeSingleDatum (datumTypeAlign=<optimized out>, datumTypeLength=-1, datumTypeByValue=false, datum=48379864,
    datumBuffer=0x2d6af28) at ../columnar/columnar_writer.c:562
#3  ColumnarWriteRow (writeState=writeState@entry=0x2b6ed48, columnValues=0x2d75118, columnNulls=0x2bfb7d0) at ../columnar/columnar_writer.c:221
#4  0x00007f50ad4b0ab7 in columnar_tuple_insert (relation=0x7f50bc432a40, slot=0x2bfb728, cid=<optimized out>, options=<optimized out>,
    bistate=<optimized out>) at ../columnar/columnar_tableam.c:756
#5  0x000000000067074c in ExecInsert ()
#6  0x0000000000671beb in ExecModifyTable ()
#7  0x00000000006455c2 in standard_ExecutorRun ()
#8  0x00007f50ad440dd2 in CitusExecutorRun (queryDesc=0x2c26b70, direction=ForwardScanDirection, count=0, execute_once=<optimized out>)
    at executor/multi_executor.c:233
#9  0x0000000000680ff7 in _SPI_execute_plan ()
#10 0x000000000068152b in SPI_execute ()
#11 0x00007f50ad3fbd85 in ExecuteQueryViaSPI () at commands/alter_table.c:1997
#12 0x00007f50ad3fca6f in ReplaceTable (suppressNoticeMessages=false, justBeforeDropCommands=0x2b56480, targetId=<optimized out>, sourceId=<optimized out>)
    at commands/alter_table.c:1562
#13 ConvertTable () at commands/alter_table.c:767
#14 0x00007f50ad3fd829 in AlterTableSetAccessMethod () at commands/alter_table.c:502
#15 0x00007f50ad3fd93b in alter_table_set_access_method (fcinfo=0x2bbc400) at commands/alter_table.c:333
#16 0x000000000064134f in ExecInterpExpr ()
#17 0x00000000006743bf in ExecResult ()
#18 0x00000000006455c2 in standard_ExecutorRun ()
#19 0x00007f50ad440dd2 in CitusExecutorRun (queryDesc=0x2bbfd38, direction=ForwardScanDirection, count=0, execute_once=<optimized out>)
    at executor/multi_executor.c:233
#20 0x0000000000680ff7 in _SPI_execute_plan ()
#21 0x0000000000681849 in SPI_execute_plan_with_paramlist ()
#22 0x00007f50a20a27d3 in exec_run_select () from /usr/pgsql-14/lib/plpgsql.so
#23 0x00007f50a20a7e79 in exec_stmts () from /usr/pgsql-14/lib/plpgsql.so
#24 0x00007f50a20ac217 in exec_for_query () from /usr/pgsql-14/lib/plpgsql.so
#25 0x00007f50a20a71ea in exec_stmts () from /usr/pgsql-14/lib/plpgsql.so
#26 0x00007f50a20aa0d2 in exec_stmt_block () from /usr/pgsql-14/lib/plpgsql.so
#27 0x00007f50a20aa12b in exec_toplevel_block.constprop.34 () from /usr/pgsql-14/lib/plpgsql.so
#28 0x00007f50a20aa9e1 in plpgsql_exec_function () from /usr/pgsql-14/lib/plpgsql.so
#29 0x00007f50a20b5c34 in plpgsql_call_handler () from /usr/pgsql-14/lib/plpgsql.so
#30 0x00000000005e75e8 in ExecuteCallStmt ()
#31 0x00000000007beb4e in standard_ProcessUtility ()
#32 0x00007f50ad41a0b0 in multi_ProcessUtility (pstmt=0x29cbcf0,
    queryString=0x29ca878 "CALL alter_old_partitions_set_access_method(\n  'ors_result_tracking',\n  '2021-12-01 00:00:00' /* older_than */,\n  'columnar'\n);", readOnlyTree=<optimized out>, context=PROCESS_UTILITY_TOPLEVEL, params=0x0, queryEnv=0x0, dest=0x29cbdd0, completionTag=0x7fff66c25810)
    at commands/utility_hook.c:238
#33 0x00000000007bd10a in PortalRunUtility ()
#34 0x00000000007bd229 in PortalRunMulti ()
#35 0x00000000007bd689 in PortalRun ()
#36 0x00000000007b95a7 in exec_simple_query ()
#37 0x00000000007baa1b in PostgresMain ()
#38 0x000000000048e663 in ServerLoop ()
#39 0x00000000007378e2 in PostmasterMain ()
#40 0x000000000048f782 in main ()

JelteF avatar Aug 31 '22 12:08 JelteF

IMO the obvious way to fix this is to just replace the memcpy_s with memcpy. The "safety" guard against overly large sizes is getting in the way of a legitimate copy of that size.
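
To illustrate the failure mode, here is a minimal sketch of a memcpy_s-style checked copy. It is not the library's code: `sketch_memcpy_s`, `SKETCH_MAX_MEM`, and the error names are hypothetical, the 0x10000000 ceiling is an assumption chosen to match the size mentioned in the report, and only the 403 code is taken from the error message above. The point it tries to show is that the check is on the destination capacity (dmax), so a tiny copy into a very large buffer can still be rejected.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical ceiling on the destination capacity. 0x10000000 (256 MiB)
 * is an assumption picked to match the figure in the report. */
#define SKETCH_MAX_MEM ((size_t) 0x10000000)

/* Hypothetical error codes; only 403 appears in the reported error. */
enum { SKETCH_OK = 0, SKETCH_ENULL = 400, SKETCH_ELEMAX = 403, SKETCH_ENOSPC = 406 };

/* Simplified stand-in for a memcpy_s-style copy (not the real function).
 * dmax is the capacity of dest, smax is the number of bytes to copy. */
int
sketch_memcpy_s(void *dest, size_t dmax, const void *src, size_t smax)
{
    if (dest == NULL || src == NULL)
        return SKETCH_ENULL;
    if (dmax > SKETCH_MAX_MEM)
        return SKETCH_ELEMAX;       /* "dmax exceeds max" */
    if (smax > dmax)
        return SKETCH_ENOSPC;       /* source does not fit */
    memcpy(dest, src, smax);
    return SKETCH_OK;
}

/* Shows that the same small copy succeeds or fails depending only on
 * the capacity the caller reports for the destination. */
void
sketch_demo(void)
{
    static char buffer[64];
    const char datum[] = "small datum";

    /* Honest capacity below the ceiling: the copy goes through. */
    assert(sketch_memcpy_s(buffer, sizeof(buffer), datum, sizeof(datum)) == SKETCH_OK);

    /* Pretend the destination has more than the ceiling available. The
     * call is rejected before any byte is written, so this is safe. */
    assert(sketch_memcpy_s(buffer, SKETCH_MAX_MEM + 1, datum, sizeof(datum)) == SKETCH_ELEMAX);
}
```

Under these assumptions, the size of the datum being copied never matters to the failing check; only the reported capacity of the destination does.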

Forty-Bot avatar Aug 31 '22 15:08 Forty-Bot

Yes, I agree that memcpy_s should not be used here, but I think it would be better to replace it with appendBinaryStringInfo/appendBinaryStringInfoNT in this case.
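
As a rough sketch of why an append-style routine fits here: the buffer grows itself based on the size of the data being added, so the caller never passes a destination capacity that a guard could reject. `SketchBuf` and `sketch_append_binary` below are hypothetical stand-ins written for illustration; they are not PostgreSQL's StringInfo implementation and do not reproduce its limits or error handling.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical growable byte buffer (a stand-in, not StringInfo). */
typedef struct SketchBuf
{
    char   *data;    /* heap storage, NULL while empty */
    size_t  len;     /* bytes currently used */
    size_t  maxlen;  /* bytes currently allocated */
} SketchBuf;

/* Append datalen bytes to buf, doubling the allocation as needed.
 * Returns 0 on success, -1 if the buffer cannot grow. */
int
sketch_append_binary(SketchBuf *buf, const void *data, size_t datalen)
{
    assert(buf != NULL && data != NULL);

    if (datalen > buf->maxlen - buf->len)
    {
        size_t newmax = buf->maxlen ? buf->maxlen : 64;
        char  *grown;

        /* Double until the free space covers the incoming data. */
        while (newmax - buf->len < datalen)
        {
            if (newmax > SIZE_MAX / 2)
                return -1;          /* doubling would overflow */
            newmax *= 2;
        }

        grown = realloc(buf->data, newmax);
        if (grown == NULL)
            return -1;              /* allocation failed, buf unchanged */
        buf->data = grown;
        buf->maxlen = newmax;
    }

    /* The only size involved in the copy is the size of the data. */
    memcpy(buf->data + buf->len, data, datalen);
    buf->len += datalen;
    return 0;
}
```

A caller would start from a zero-initialized `SketchBuf`, append each serialized value in turn, and `free` the `data` pointer when done.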

JelteF avatar Aug 31 '22 15:08 JelteF