Bulk request stuck on queue state
Hello,
FYI, there were stage requests stuck in the QUEUED state, as shown below:
---------- REQUESTS ----------
STATUS | COUNT
CANCELLED | 2
COMPLETED | 350
QUEUED | 123854
STARTED | 860
---------- TARGETS -----------
STATE | COUNT
CANCELLED | 202
COMPLETED | 193471
CREATED | 273326
FAILED | 1
There was no evidence of these staging requests in the PoolManager:
[dccore01] (PoolManager@dccore01Domain) admin > rc ls
00003AAB7819ACB34083A74B4050CF4619E5@internal-net-external-net-world-net-*/* m=0 r=0 [dc280_12] [Waiting for stage: dc280_12 02.08 05:45:52] {0,}
The logs show the following exception:
04 Feb 2025 10:30:19 [pool-6-thread-634] [Frontend-dcfrontend02 BulkRequestStatus] Uncaught exception in thread pool-6-thread-634
com.google.common.util.concurrent.UncheckedExecutionException: org.springframework.dao.DataAccessResourceFailureException: PreparedStatementCallback; SQL [SELECT bulk_request.*, request_arguments.arguments as arguments FROM bulk_request LEFT OUTER JOIN request_arguments ON bulk_request.id = request_arguments.rid WHERE uid = ? ORDER BY arrived_at ASC LIMIT 1]; FATAL: terminating connection due to administrator command; nested exception is org.postgresql.util.PSQLException: FATAL: terminating connection due to administrator command
at com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2085)
at com.google.common.cache.LocalCache.get(LocalCache.java:4011)
at com.google.common.cache.LocalCache.getOrLoad(LocalCache.java:4034)
at com.google.common.cache.LocalCache$LocalLoadingCache.get(LocalCache.java:5010)
at org.dcache.services.bulk.store.jdbc.request.JdbcBulkRequestStore.get(JdbcBulkRequestStore.java:826)
at org.dcache.services.bulk.store.jdbc.request.JdbcBulkRequestStore.valid(JdbcBulkRequestStore.java:910)
at org.dcache.services.bulk.store.jdbc.request.JdbcBulkRequestStore.getKey(JdbcBulkRequestStore.java:353)
at org.dcache.services.bulk.BulkService.lambda$messageArrived$4(BulkService.java:243)
at org.dcache.util.CDCExecutorServiceDecorator$WrappedRunnable.run(CDCExecutorServiceDecorator.java:130)
at org.dcache.util.BoundedExecutor$Worker.run(BoundedExecutor.java:247)
at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
at java.base/java.lang.Thread.run(Thread.java:829)
Caused by: org.springframework.dao.DataAccessResourceFailureException: PreparedStatementCallback; SQL [SELECT bulk_request.*, request_arguments.arguments as arguments FROM bulk_request LEFT OUTER JOIN request_arguments ON bulk_request.id = request_arguments.rid WHERE uid = ? ORDER BY arrived_at ASC LIMIT 1]; FATAL: terminating connection due to administrator command; nested exception is org.postgresql.util.PSQLException: FATAL: terminating connection due to administrator command
at org.springframework.jdbc.support.SQLStateSQLExceptionTranslator.doTranslate(SQLStateSQLExceptionTranslator.java:107)
at org.springframework.jdbc.support.AbstractFallbackSQLExceptionTranslator.translate(AbstractFallbackSQLExceptionTranslator.java:70)
at org.springframework.jdbc.support.AbstractFallbackSQLExceptionTranslator.translate(AbstractFallbackSQLExceptionTranslator.java:79)
at org.springframework.jdbc.support.AbstractFallbackSQLExceptionTranslator.translate(AbstractFallbackSQLExceptionTranslator.java:79)
at org.springframework.jdbc.core.JdbcTemplate.translateException(JdbcTemplate.java:1541)
at org.springframework.jdbc.core.JdbcTemplate.execute(JdbcTemplate.java:667)
at org.springframework.jdbc.core.JdbcTemplate.query(JdbcTemplate.java:713)
at org.springframework.jdbc.core.JdbcTemplate.query(JdbcTemplate.java:744)
at org.springframework.jdbc.core.JdbcTemplate.query(JdbcTemplate.java:757)
at org.springframework.jdbc.core.JdbcTemplate.query(JdbcTemplate.java:810)
at org.dcache.services.bulk.store.jdbc.JdbcBulkDaoUtils.get(JdbcBulkDaoUtils.java:171)
at org.dcache.services.bulk.store.jdbc.request.JdbcBulkRequestDao.get(JdbcBulkRequestDao.java:156)
at org.dcache.services.bulk.store.jdbc.request.JdbcBulkRequestStore$RequestLoader.load(JdbcBulkRequestStore.java:146)
at org.dcache.services.bulk.store.jdbc.request.JdbcBulkRequestStore$RequestLoader.load(JdbcBulkRequestStore.java:142)
at com.google.common.cache.LocalCache$LoadingValueReference.loadFuture(LocalCache.java:3570)
at com.google.common.cache.LocalCache$Segment.loadSync(LocalCache.java:2312)
at com.google.common.cache.LocalCache$Segment.lockedGetOrLoad(LocalCache.java:2189)
at com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2079)
... 12 common frames omitted
Caused by: org.postgresql.util.PSQLException: FATAL: terminating connection due to administrator command
at org.postgresql.core.v3.QueryExecutorImpl.receiveErrorResponse(QueryExecutorImpl.java:2675)
at org.postgresql.core.v3.QueryExecutorImpl.processResults(QueryExecutorImpl.java:2365)
at org.postgresql.core.v3.QueryExecutorImpl.execute(QueryExecutorImpl.java:355)
at org.postgresql.jdbc.PgStatement.executeInternal(PgStatement.java:490)
at org.postgresql.jdbc.PgStatement.execute(PgStatement.java:408)
at org.postgresql.jdbc.PgPreparedStatement.executeWithFlags(PgPreparedStatement.java:167)
at org.postgresql.jdbc.PgPreparedStatement.executeQuery(PgPreparedStatement.java:119)
at com.zaxxer.hikari.pool.ProxyPreparedStatement.executeQuery(ProxyPreparedStatement.java:52)
at com.zaxxer.hikari.pool.HikariProxyPreparedStatement.executeQuery(HikariProxyPreparedStatement.java)
at org.springframework.jdbc.core.JdbcTemplate$1.doInPreparedStatement(JdbcTemplate.java:722)
at org.springframework.jdbc.core.JdbcTemplate.execute(JdbcTemplate.java:651)
... 24 common frames omitted
Restarting the bulk service allowed the queued requests to resume:
---------- REQUESTS ----------
STATUS | COUNT
CANCELLED | 2
COMPLETED | 1647
STARTED | 1337
---------- TARGETS -----------
STATE | COUNT
CANCELLED | 240
COMPLETED | 198071
FAILED | 861
RUNNING | 150441
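
In case it is useful to anyone watching for the same condition, here is a rough sketch of a check over the summary output shown above. It is only a heuristic based on what I observed here (requests in QUEUED while no target is in RUNNING), not an official dCache tool. The function names `parse_summary` and `looks_stalled` are made up for this example, and it assumes the summary text has already been captured as a string.

```python
import re

def parse_summary(text):
    """Parse the REQUESTS / TARGETS summary into {section: {state: count}}."""
    sections = {}
    current = None
    for line in text.splitlines():
        line = line.strip()
        # Section banner, e.g. "---------- REQUESTS ----------"
        header = re.fullmatch(r"-+\s*(\w+)\s*-+", line)
        if header:
            current = sections.setdefault(header.group(1), {})
            continue
        # Data row, e.g. "QUEUED | 123854"; the "STATUS | COUNT" row is skipped
        row = re.fullmatch(r"(\w+)\s*\|\s*(\d+)", line)
        if row and current is not None:
            current[row.group(1)] = int(row.group(2))
    return sections

def looks_stalled(summary):
    """Heuristic (assumption): queued requests exist but no target is running."""
    queued = summary.get("REQUESTS", {}).get("QUEUED", 0)
    running = summary.get("TARGETS", {}).get("RUNNING", 0)
    return queued > 0 and running == 0

# Counts taken from the summary before the restart
before = """\
---------- REQUESTS ----------
STATUS | COUNT
CANCELLED | 2
COMPLETED | 350
QUEUED | 123854
STARTED | 860
---------- TARGETS -----------
STATE | COUNT
CANCELLED | 202
COMPLETED | 193471
CREATED | 273326
FAILED | 1
"""

print(looks_stalled(parse_summary(before)))  # prints True
```

Applied to the summary taken after the restart, the same check returns False, since there is no QUEUED row and RUNNING is non-zero.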
Carlos