
cbdb primary segment failure: writer proc reference shared with reader is invalid

Open liyxbeijing opened this issue 2 years ago • 0 comments

Cloudberry Database version

cbdb version:

version                                                         
                                                   
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------
---------------------------------------------------
 PostgreSQL 14.4 (Cloudberry Database 1.4.0 build commit:e83e3ffc22d538deb2dbceeeae0138ca2de064e6) on x86_64-pc-linux-gnu, compiled by gcc (GCC) 10.2.1 20210130 (Red Hat 1
0.2.1-11), 64-bit compiled on Aug  3 2023 10:15:47
(1 row)

The cbdb primary segment instance failed. Primary log:

2023-08-29 02:37:02.864347 CST,"",2023-08-29 02:29:37 CST,0,con231463,cmd7,seg25,slice3,dx1822982,x1522739,sx1,"ERROR","XX000","writer proc reference shared with reader is invalid (xact.c:1091)",,,,,,"insert into 
......
sum(case ",0,,"xact.c",1091,"Stack trace:
1    0xd54987 postgres errstart (elog.c:589)
2    0x6c7dee postgres <symbol not found> (xact.c:1091)
3    0x7efeba postgres HeapTupleSatisfiesVisibility (heapam_visibility.c:1103)
4    0x7e28b9 postgres heapgetpage (heapam.c:515)
5    0x7e3112 postgres <symbol not found> (heapam.c:1180)
6    0x7e437f postgres heap_getnextslot (heapam.c:1458)
7    0x9f50e3 postgres <symbol not found> (nodeSeqscan.c:154)
8    0x9c4a31 postgres ExecScan (execScan.c:137)
9    0x9c1e6c postgres <symbol not found> (execProcnode.c:630)
10   0xa084a2 postgres <symbol not found> (executor.h:281)
11   0x9c1e6c postgres <symbol not found> (execProcnode.c:630)
12   0x9b91bd postgres <symbol not found> (execMain.c:2530)
13   0x9b9c15 postgres standard_ExecutorRun (execMain.c:875)
14   0xbe0544 postgres <symbol not found> (pquery.c:243)
15   0xbe097e postgres <symbol not found> (pquery.c:1476)
16   0xbe1e62 postgres PortalRun (pquery.c:970)
17   0xbdbc7a postgres <symbol not found> (postgres.c:1424)
18   0xbdf9be postgres PostgresMain (postgres.c:5736)
19   0xb3a4bd postgres <symbol not found> (postmaster.c:4682)
20   0xb3b450 postgres PostmasterMain (postmaster.c:1658)
21   0x785a55 postgres main (main.c:198)
22   0x7f9a76a92555 libc.so.6 __libc_start_main + 0xf5
23   0x79153f postgres <symbol not found> + 0x79153f
"
2023-08-29 02:37:02.864473 CST,","","28558",2023-08-29 02:29:37 CST,0,con231463,cmd7,seg25,slice3,dx1822982,x1522739,sx1,"LOG","00000","An exception was encountered during the execution of statement: insert into
2023-08-29 02:37:02.864813 CST",2023-08-29 02:29:37 CST,0,con231463,,seg25,,,,,"LOG","08006","could not send data to client: Broken pipe",,,,,,,0,,"pqcomm.c",1516,
2023-08-29 02:37:02.864845 CST,"2023-08-29 02:29:37 CST,0,con231463,,seg25,,,,,"FATAL","08006","connection to client lost",,,,,,,0,,"postgres.c",4081,
2023-08-29 08:52:45.009295 CST,,,p335484,th2047301760,,,,0,,,seg25,,,,,"LOG","00000","received fast shutdown request",,,,,,,0,,"postmaster.c",3314,
2023-08-29 08:52:45.011483 CST,,,p335484,th2047301760,,,,0,,,seg25,,,,,"LOG","00000","aborting any active transactions",,,,,,,0,,"postmaster.c",3332,
2023-08-29 08:52:45.011650 CST,,,p335514,th2047301760,,,,0,,,seg25,,,,,"FATAL","57P01","terminating background worker ""sweeper process"" due to administrator command",,,,,,,0,,"bgworker.c",777,
2023-08-29 08:52:45.011830 CST,,,p335513,th2047301760,,,,0,,,seg25,,,,,"WARNING","01000","ic-proxy-server: received signal 15",,,,,,,0,,"ic_proxy_main.c",474,
2023-08-29 08:52:45.013963 CST,,,p335484,th2047301760,,,,0,,,seg25,,,,,"LOG","00000","background worker ""sweeper process"" (PID 335514) exited with exit code 1",,,,,,,0,,"postmaster.c",4208,
2023-08-29 08:52:45.015975 CST,,,p335484,th2047301760,,,,0,,,seg25,,,,,"LOG","00000","background worker ""logical replication launcher"" (PID 335512) exited with exit code 1",,,,,,,0,,"postmaster.c",4208,
2023-08-29 08:52:45.016376 CST,,,p335484,th2047301760,,,,0,,,seg25,,,,,"LOG","00000","background worker ""ic proxy process"" (PID 335513) exited with exit code 1",,,,,,,0,,"postmaster.c",4208,
2023-08-29 08:52:45.022509 CST,,,p335508,th2047301760,,,,0,,,seg25,,,,,"LOG","00000","shutting down",,,,,,,0,,"xlog.c",9142,
2023-08-29 08:52:45.234669 CST,,,p335484,th2047301760,,,,0,,,seg25,,,,,"LOG","00000","database system is shut down",,,,,,,0,,"miscinit.c",1008,

mirror log:

2023-08-29 02:37:00.088155 CST,""","12282",2023-08-29 02:37:00 CST,0,,,seg25,,,,,"LOG","00000","promoting mirror to primary due to FTS request",,,,,,,0,,"ftsmessagehandler.c",392,
2023-08-29 02:37:00.088200 CST,"2023-08-29 02:37:00 CST,0,,,seg25,,,,,"LOG","00000","creating replication slot internal_wal_replication_slot",,,,,,,0,,"ftsmessagehandler.c",355,
2023-08-29 02:37:00.678014 CST,,,p277912,th-1003132800,,,,0,,,seg25,,,,,"LOG","00000","received promote request",,,,,,,0,,"xlog.c",13735,
2023-08-29 02:37:00.680322 CST,,,p277917,th-1003132800,,,,0,,,seg25,,,,,"FATAL","57P01","terminating walreceiver process due to administrator command",,,,,,,0,,"walreceiver.c",161,
2023-08-29 02:37:00.681026 CST,,,p277912,th-1003132800,,,,0,,,seg25,,,,,"LOG","00000","invalid record length at 11A/2EA29170: wanted 24, got 0",,,,,,,0,,"xlog.c",4482,
2023-08-29 02:37:00.681045 CST,,,p277912,th-1003132800,,,,0,,,seg25,,,,,"LOG","00000","redo done at 11A/2EA29138 system usage: CPU: user: 1158.92 s, system: 1226.51 s, elapsed: 970519.63 s",,,,,,,0,,"xlog.c",7962,
2023-08-29 02:37:00.684554 CST,,,p277912,th-1003132800,,,,0,,,seg25,,,,,"LOG","00000","last completed transaction was at log time 2023-08-29 02:32:30.511697+08",,,,,,,0,,"xlog.c",7968,
2023-08-29 02:37:00.705551 CST,"",,p217642,th-1003132800,"","12336",2023-08-29 02:37:00 CST,0,,,seg25,,,,,"LOG","00000","received probe message while acting as mirror",,,,,,,0,,"ftsmessagehandler.c",271,
2023-08-29 02:37:00.705590 CST,"",,p217642,th-1003132800,"","12336",2023-08-29 02:37:00 CST,0,,,seg25,,,,,"LOG","00000","FTS: ""fts_probe_file.bak"" file doesn't exist, creating it once.",,,,,,,0,,"ftsmessagehandler.c",64,
2023-08-29 02:37:00.724803 CST,"",,p217647,th-1003132800,"","12414",2023-08-29 02:37:00 CST,0,,,seg25,,,,,"LOG","00000","promoting mirror to primary due to FTS request",,,,,,,0,,"ftsmessagehandler.c",392,
2023-08-29 02:37:00.724818 CST,"",,p217647,th-1003132800,","12414",2023-08-29 02:37:00 CST,0,,,seg25,,,,,"LOG","00000","replication slot internal_wal_replication_slot exists",,,,,,,0,,"ftsmessagehandler.c",359,
2023-08-29 02:37:02.091756 CST,","","12596",2023-08-29 02:37:02 CST,0,con231465,,seg25,,,,,"FATAL","57M02","the database system is in recovery mode","last replayed record at 11A/2EA29170

What happened

The primary segment (seg25) failed.
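
To line up the primary and mirror logs above on one timeline, a small script can pull the timestamp, severity, SQLSTATE and message out of each line. This is only a rough sketch: `parse_log_line` and `severe_events` are hypothetical helper names, and the regular expression is an assumption based on the lines pasted in this issue, not on the documented log format. It uses a regex rather than a CSV parser because several of the pasted lines have unbalanced quotes.

```python
import re

# Assumed layout (inferred from the lines in this issue): the line starts with
# "YYYY-MM-DD HH:MM:SS.ffffff TZ," and later contains the quoted triple
# "SEVERITY","SQLSTATE","message". Embedded quotes are doubled ("").
LINE_RE = re.compile(
    r'^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d+) (?P<tz>\w+),'
    r'.*?"(?P<severity>LOG|WARNING|ERROR|FATAL|PANIC)","(?P<sqlstate>[0-9A-Z]{5})",'
    r'"(?P<message>(?:[^"]|"")*)"'
)


def parse_log_line(line):
    """Return a dict for a log line, or None for continuation lines
    (stack frames, wrapped SQL text) that do not start a new entry."""
    m = LINE_RE.match(line)
    if m is None:
        return None
    return {
        "timestamp": m.group("ts"),
        "severity": m.group("severity"),
        "sqlstate": m.group("sqlstate"),
        "message": m.group("message").replace('""', '"'),
    }


def severe_events(lines, levels=("ERROR", "FATAL", "PANIC")):
    """Keep only entries at the given severities, ordered by timestamp.
    The fixed-width timestamp sorts correctly as a plain string."""
    events = [e for e in map(parse_log_line, lines) if e and e["severity"] in levels]
    return sorted(events, key=lambda e: e["timestamp"])


# A few lines copied from the primary and mirror logs in this issue.
SAMPLE = [
    r'2023-08-29 08:52:45.011650 CST,,,p335514,th2047301760,,,,0,,,seg25,,,,,"FATAL","57P01","terminating background worker ""sweeper process"" due to administrator command",,,,,,,0,,"bgworker.c",777,',
    r'2023-08-29 02:37:02.864347 CST,"",2023-08-29 02:29:37 CST,0,con231463,cmd7,seg25,slice3,dx1822982,x1522739,sx1,"ERROR","XX000","writer proc reference shared with reader is invalid (xact.c:1091)",,,,,,"insert into ',
    r'2023-08-29 08:52:45.009295 CST,,,p335484,th2047301760,,,,0,,,seg25,,,,,"LOG","00000","received fast shutdown request",,,,,,,0,,"postmaster.c",3314,',
    r'2023-08-29 02:37:00.680322 CST,,,p277917,th-1003132800,,,,0,,,seg25,,,,,"FATAL","57P01","terminating walreceiver process due to administrator command",,,,,,,0,,"walreceiver.c",161,',
    '1    0xd54987 postgres errstart (elog.c:589)',
]

if __name__ == "__main__":
    for event in severe_events(SAMPLE):
        print(event["timestamp"], event["severity"], event["sqlstate"], event["message"])
```

Feeding both log files through `severe_events` puts the mirror promotion messages and the primary's error on one ordered list, which makes it easier to see which happened first.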

What you think should happen instead

No response

How to reproduce

N/A

Operating System

CentOS 7.9

Anything else

No response

Are you willing to submit PR?

  • [ ] Yes, I am willing to submit a PR!

liyxbeijing, Aug 31 '23 09:08