Etherpad still crashs when when querying stats API with timeout
Describe the bug
This bug report relates to #5005.
If we try to fetch the getStats API after Etherpad starts, the whole process get killed (see the additional context). In #5005 we saw Error: Query inactivity timeout. Now it is Error: Request aborted. Nevertheless the process should not be terminated when this happens.
Our Etherpad instance runs around 20k pads with a MySQL storage backend.
To Reproduce Steps to reproduce the behavior:
- Start Etherpad
- Trigger the API Endpoint getStats with a timeout (60 seconds)
curl -v --connect-timeout 60 --max-time 60 -s "localhost:9001/api/1.2.14/getStats?apikey=xxx"
Expected behavior
Etherpad should not crash if the HTTP Client closes the connection to the API Endpoint.
Server (please complete the following information):
- Etherpad version: 1.8.16
- OS: Debian Bullseye
- Node.js version (
node --version): v14.18.2 - npm version (
npm --version): 6.14.15 - Is the server free of plugins: no
Additional context
Dec 30 21:26:40 etherpad1 node[19662]: [2021-12-30 21:26:40.836] [ERROR] console - Error: Request aborted
Dec 30 21:26:40 etherpad1 node[19662]: at onaborted (/home/etherpad/etherpad-lite/src/node_modules/express/lib/response.js:1025:15)
Dec 30 21:26:40 etherpad1 node[19662]: at Immediate.<anonymous> (/home/etherpad/etherpad-lite/src/node_modules/express/lib/response.js:1067:9)
Dec 30 21:26:40 etherpad1 node[19662]: at processImmediate (internal/timers.js:464:21)
Dec 30 21:26:56 etherpad1 node[19662]: [2021-12-30 21:26:56.353] [ERROR] console - Fatal MySQL error: Error: Query inactivity timeout
Dec 30 21:26:56 etherpad1 node[19662]: at Query.<anonymous> (/home/etherpad/etherpad-lite/src/node_modules/mysql/lib/protocol/Protocol.js:160:17)
Dec 30 21:26:56 etherpad1 node[19662]: at Query.emit (events.js:400:28)
Dec 30 21:26:56 etherpad1 node[19662]: at Query._onTimeout (/home/etherpad/etherpad-lite/src/node_modules/mysql/lib/protocol/sequences/Sequence.js:124:8)
Dec 30 21:26:56 etherpad1 node[19662]: at Timer._onTimeout (/home/etherpad/etherpad-lite/src/node_modules/mysql/lib/protocol/Timer.js:32:23)
Dec 30 21:26:56 etherpad1 node[19662]: at listOnTimeout (internal/timers.js:557:17)
Dec 30 21:26:56 etherpad1 node[19662]: at processTimers (internal/timers.js:500:7)
Dec 30 21:26:56 etherpad1 node[19662]: --------------------
Dec 30 21:26:56 etherpad1 node[19662]: at Pool.query (/home/etherpad/etherpad-lite/src/node_modules/mysql/lib/Pool.js:199:23)
Dec 30 21:26:56 etherpad1 node[19662]: at /home/etherpad/etherpad-lite/src/node_modules/ueberdb2/databases/mysql_db.js:46:20
Dec 30 21:26:56 etherpad1 node[19662]: at new Promise (<anonymous>)
Dec 30 21:26:56 etherpad1 node[19662]: at exports.Database._query (/home/etherpad/etherpad-lite/src/node_modules/ueberdb2/databases/mysql_db.js:44:20)
Dec 30 21:26:56 etherpad1 node[19662]: at exports.Database.set (/home/etherpad/etherpad-lite/src/node_modules/ueberdb2/databases/mysql_db.js:143:16)
Dec 30 21:26:56 etherpad1 node[19662]: at writeOneOp (/home/etherpad/etherpad-lite/src/node_modules/ueberdb2/lib/CacheAndBufferLayer.js:544:34)
Dec 30 21:26:56 etherpad1 node[19662]: at exports.Database._write (/home/etherpad/etherpad-lite/src/node_modules/ueberdb2/lib/CacheAndBufferLayer.js:555:13)
Dec 30 21:26:56 etherpad1 node[19662]: at /home/etherpad/etherpad-lite/src/node_modules/ueberdb2/lib/CacheAndBufferLayer.js:500:22
Dec 30 21:26:56 etherpad1 node[19662]: at exports.Database.flush (/home/etherpad/etherpad-lite/src/node_modules/ueberdb2/lib/CacheAndBufferLayer.js:502:9)
Dec 30 21:26:56 etherpad1 node[19662]: at Timeout._onTimeout (/home/etherpad/etherpad-lite/src/node_modules/ueberdb2/lib/CacheAndBufferLayer.js:228:32)
Dec 30 21:26:56 etherpad1 node[19662]: [2021-12-30 21:26:56.355] [ERROR] socket.io - Error while handling message from jiMZz_e-ZzKz-t4aAAAC: Error: Query inactivity timeout
Dec 30 21:26:56 etherpad1 node[19662]: at Query.<anonymous> (/home/etherpad/etherpad-lite/src/node_modules/mysql/lib/protocol/Protocol.js:160:17)
Dec 30 21:26:56 etherpad1 node[19662]: at Query.emit (events.js:400:28)
Dec 30 21:26:56 etherpad1 node[19662]: at Query._onTimeout (/home/etherpad/etherpad-lite/src/node_modules/mysql/lib/protocol/sequences/Sequence.js:124:8)
Dec 30 21:26:56 etherpad1 node[19662]: at Timer._onTimeout (/home/etherpad/etherpad-lite/src/node_modules/mysql/lib/protocol/Timer.js:32:23)
Dec 30 21:26:56 etherpad1 node[19662]: at listOnTimeout (internal/timers.js:557:17)
Dec 30 21:26:56 etherpad1 node[19662]: at processTimers (internal/timers.js:500:7)
Dec 30 21:26:56 etherpad1 node[19662]: --------------------
Dec 30 21:26:56 etherpad1 node[19662]: at Pool.query (/home/etherpad/etherpad-lite/src/node_modules/mysql/lib/Pool.js:199:23)
Dec 30 21:26:56 etherpad1 node[19662]: at /home/etherpad/etherpad-lite/src/node_modules/ueberdb2/databases/mysql_db.js:46:20
Dec 30 21:26:56 etherpad1 node[19662]: at new Promise (<anonymous>)
Dec 30 21:26:56 etherpad1 node[19662]: at exports.Database._query (/home/etherpad/etherpad-lite/src/node_modules/ueberdb2/databases/mysql_db.js:44:20)
Dec 30 21:26:56 etherpad1 node[19662]: at exports.Database.set (/home/etherpad/etherpad-lite/src/node_modules/ueberdb2/databases/mysql_db.js:143:16)
Dec 30 21:26:56 etherpad1 node[19662]: at writeOneOp (/home/etherpad/etherpad-lite/src/node_modules/ueberdb2/lib/CacheAndBufferLayer.js:544:34)
Dec 30 21:26:56 etherpad1 node[19662]: at exports.Database._write (/home/etherpad/etherpad-lite/src/node_modules/ueberdb2/lib/CacheAndBufferLayer.js:555:13)
Dec 30 21:26:56 etherpad1 node[19662]: at /home/etherpad/etherpad-lite/src/node_modules/ueberdb2/lib/CacheAndBufferLayer.js:500:22
Dec 30 21:26:56 etherpad1 node[19662]: at exports.Database.flush (/home/etherpad/etherpad-lite/src/node_modules/ueberdb2/lib/CacheAndBufferLayer.js:502:9)
Dec 30 21:26:56 etherpad1 node[19662]: at Timeout._onTimeout (/home/etherpad/etherpad-lite/src/node_modules/ueberdb2/lib/CacheAndBufferLayer.js:228:32)
Dec 30 21:26:56 etherpad1 node[19662]: [2021-12-30 21:26:56.813] [ERROR] console - Fatal MySQL error: Error: Query inactivity timeout
Dec 30 21:26:56 etherpad1 node[19662]: at Query.<anonymous> (/home/etherpad/etherpad-lite/src/node_modules/mysql/lib/protocol/Protocol.js:160:17)
Dec 30 21:26:56 etherpad1 node[19662]: at Query.emit (events.js:400:28)
Dec 30 21:26:56 etherpad1 node[19662]: at Query._onTimeout (/home/etherpad/etherpad-lite/src/node_modules/mysql/lib/protocol/sequences/Sequence.js:124:8)
Dec 30 21:26:56 etherpad1 node[19662]: at Timer._onTimeout (/home/etherpad/etherpad-lite/src/node_modules/mysql/lib/protocol/Timer.js:32:23)
Dec 30 21:26:56 etherpad1 node[19662]: at listOnTimeout (internal/timers.js:557:17)
Dec 30 21:26:56 etherpad1 node[19662]: at processTimers (internal/timers.js:500:7)
Dec 30 21:26:56 etherpad1 node[19662]: --------------------
Dec 30 21:26:56 etherpad1 node[19662]: at Pool.query (/home/etherpad/etherpad-lite/src/node_modules/mysql/lib/Pool.js:199:23)
Dec 30 21:26:56 etherpad1 node[19662]: at /home/etherpad/etherpad-lite/src/node_modules/ueberdb2/databases/mysql_db.js:46:20
Dec 30 21:26:56 etherpad1 node[19662]: at new Promise (<anonymous>)
Dec 30 21:26:56 etherpad1 node[19662]: at exports.Database._query (/home/etherpad/etherpad-lite/src/node_modules/ueberdb2/databases/mysql_db.js:44:20)
Dec 30 21:26:56 etherpad1 node[19662]: at exports.Database.get (/home/etherpad/etherpad-lite/src/node_modules/ueberdb2/databases/mysql_db.js:116:34)
Dec 30 21:26:56 etherpad1 node[19662]: at exports.Database._getLocked (/home/etherpad/etherpad-lite/src/node_modules/ueberdb2/lib/CacheAndBufferLayer.js:294:38)
Dec 30 21:26:56 etherpad1 node[19662]: at exports.Database.get (/home/etherpad/etherpad-lite/src/node_modules/ueberdb2/lib/CacheAndBufferLayer.js:271:25)
Dec 30 21:26:56 etherpad1 node[19662]: at runMicrotasks (<anonymous>)
Dec 30 21:26:56 etherpad1 node[19662]: at processTicksAndRejections (internal/process/task_queues.js:95:5)
There are two issues here that I'd like to address separately:
- Etherpad shouldn't crash if a DB query throws. This should be easy to fix.
- The query shouldn't time out. This is the same problem as: https://github.com/ether/ep_adminpads2/issues/25#issuecomment-986198640. Unfortunately, fixing this is delicate work because it involves changing how pads are saved to the database. It will likely break some plugins.
- Etherpad shouldn't crash if a DB query throws. This should be easy to fix.
Sigh, I jinxed myself by saying that. It's not easy to fix. IIUC, what's happening is the MySQL server is so busy trying to enumerate all pads that unrelated queries over other pool connections are timing out.
Thanks a lot for your constant work of improvement and support to help for our issues we have.
Will #5347 apply in the next release? We would like to test it and give a feedback.
Yes, it will be in the next release (1.9.0). Unfortunately, I think there's a low probability that PR will fix the issue for you. (It might, so it's still worth trying.)
Okay, we keep trying and still hope 🤞
Okay, we keep trying and still hope 🤞
hey i would like to know one thing....if we contribute here is etherpad going to pay us ...and if it then whats the criteria how many pull req
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
We tried another time to get deeper in this problem. As we are using MariaDB/MySQL for Storage with MyISAM engine we looked what ueberDB would trigger in the database and did this:
MariaDB [etherpad]> SELECT COUNT(`key`) FROM store WHERE `key` LIKE 'pad:%' AND `key` NOT LIKE 'pad:%:%';
+--------------+
| COUNT(`key`) |
+--------------+
| 18924 |
+--------------+
1 row in set (51.514 sec)
After some improvements for our MariaDB instance we reduced the query time from ~90 seconds to ~53 seconds. But we were still not satisfied. So we noticed that we have ~18k Pads but 100.000.000 rows in MariaDB. So we stumble upon an old problem, that the sessionstorage still rapidly growths:
MariaDB [etherpad]> SELECT COUNT(*) FROM store WHERE `key` LIKE 'sessionstorage%';
+----------+
| COUNT(*) |
+----------+
| 38923592 |
+----------+
1 row in set (5 min 44.863 sec)
We cleaned around one year ago all entries starting with sessionstorage from the database and after that year 40% of the database is used by this entries.
We will again clean the database from this entries and waiting for 1.9.0 release which hopefully fix the session creation. After we cleaned up the database I will bring new timings of our queries to compare.
- #5348
- #5010
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
Not stale.