robinhood
Accounting DB errors with Changelog Reader
I am getting a lot of errors in my log files when accounting is enabled for the changelog reader. I don't get these errors when only performing a scan, however. The error is always the same for every database operation:
Out of range value for column 'sz0' at row 1
I was going to work around this by running the scanner and the changelog reader with separate configs, but the ACCT table is dropped when "accounting=no" is in the config.
Same error (robinhood 3.1.5 on Lustre 2.10):
2019/07/25 11:50:54 [4596/8] EntryProc | Error 7 performing database operation: request error.
2019/07/25 11:50:55 [4596/4] ListMgr | Unhandled error 1264: default conversion to DB_REQUEST_FAILED
2019/07/25 11:50:55 [4596/4] ListMgr | Error 7 executing query 'UPDATE ENTRIES SET uid='amiratag',gid='oak_jamesz',size=0,blocks=1597448,last_access=1564080654,last_mod=1564080654,last_mdchange=1564080654,type='file',mode=432,nlink=1,md_update=1564080655,fileclass='+stanford+groups+nosnap+',class_update=1564080655 WHERE id='0x20002a76d:0x15364:0x0'': Out of range value for column 'sz0' at row 1
mysql> select * from ENTRIES WHERE id='0x20002a76d:0x15364:0x0';
+-------------------------+----------+------------+----------+---------+---------------+-------------+------------+---------------+------+------+-------+------------+---------+--------------------------+--------------+--------------+------------------+--------------+---------------+
| id | uid | gid | size | blocks | creation_time | last_access | last_mod | last_mdchange | type | mode | nlink | md_update | invalid | fileclass | class_update | alert_status | modeguard_status | alert_lstchk | alert_lstalrt |
+-------------------------+----------+------------+----------+---------+---------------+-------------+------------+---------------+------+------+-------+------------+---------+--------------------------+--------------+--------------+------------------+--------------+---------------+
| 0x20002a76d:0x15364:0x0 | amiratag | oak_jamesz | 50331648 | 1597448 | 1564009631 | 1564080712 | 1564080712 | 1564080712 | file | 432 | 1 | 1564080712 | NULL | +stanford+groups+nosnap+ | 1564080712 | | | 0 | 0 |
+-------------------------+----------+------------+----------+---------+---------------+-------------+------------+---------------+------+------+-------+------------+---------+--------------------------+--------------+--------------+------------------+--------------+---------------+
1 row in set (0.00 sec)
mysql> SELECT * from ACCT_STAT where uid='amiratag' and gid='oak_jamesz';
+----------+------------+------+--------------+------------------+----------------+-------------+---------+------+------+-------+---------+---------+--------+-------+------+-------+------+
| uid | gid | type | alert_status | modeguard_status | size | blocks | count | sz0 | sz1 | sz32 | sz1K | sz32K | sz1M | sz32M | sz1G | sz32G | sz1T |
+----------+------------+------+--------------+------------------+----------------+-------------+---------+------+------+-------+---------+---------+--------+-------+------+-------+------+
| amiratag | oak_jamesz | dir | | ok | 536137728 | 1056240 | 18009 | 0 | 0 | 0 | 16525 | 1482 | 1 | 1 | 0 | 0 | 0 |
| amiratag | oak_jamesz | dir | | invalid | 8192 | 16 | 2 | 0 | 0 | 0 | 2 | 0 | 0 | 0 | 0 | 0 | 0 |
| amiratag | oak_jamesz | file | | | 15023241869454 | 29372049608 | 5195492 | 0 | 174 | 28239 | 2987544 | 1361072 | 770225 | 47937 | 263 | 0 | 0 |
+----------+------------+------+--------------+------------------+----------------+-------------+---------+------+------+-------+---------+---------+--------+-------+------+-------+------+
3 rows in set (0.00 sec)
Hello, we're wondering if this error could come from the fact that the sz0 column is unsigned but the trigger uses a SIGNED cast:
https://github.com/cea-hpc/robinhood/blob/master/src/list_mgr/listmgr_init.c#L2493
mysql> describe ACCT_STAT;
+------------------+--------------------------------------------------------+------+-----+---------+-------+
| Field | Type | Null | Key | Default | Extra |
+------------------+--------------------------------------------------------+------+-----+---------+-------+
| uid | varbinary(127) | NO | PRI | unknown | |
| gid | varbinary(127) | NO | PRI | unknown | |
| type | enum('symlink','dir','file','chr','blk','fifo','sock') | NO | PRI | file | |
| alert_status | enum('','clear','alert') | NO | PRI | | |
| modeguard_status | enum('','ok','invalid') | NO | PRI | | |
| size | bigint(20) unsigned | YES | | 0 | |
| blocks | bigint(20) unsigned | YES | | 0 | |
| count | bigint(20) unsigned | YES | | 0 | |
| sz0 | bigint(20) unsigned | YES | | 0 | |
| sz1 | bigint(20) unsigned | YES | | 0 | |
| sz32 | bigint(20) unsigned | YES | | 0 | |
| sz1K | bigint(20) unsigned | YES | | 0 | |
| sz32K | bigint(20) unsigned | YES | | 0 | |
| sz1M | bigint(20) unsigned | YES | | 0 | |
| sz32M | bigint(20) unsigned | YES | | 0 | |
| sz1G | bigint(20) unsigned | YES | | 0 | |
| sz32G | bigint(20) unsigned | YES | | 0 | |
| sz1T | bigint(20) unsigned | YES | | 0 | |
+------------------+--------------------------------------------------------+------+-----+---------+-------+
Shouldn't the cast be UNSIGNED?
Hello Stephan, I think the cast to SIGNED is there to allow the subtraction operations. If this is the issue, perhaps changing sz0..sz1T to SIGNED as well may fix this?
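To make the signed/unsigned hypothesis concrete, here is a minimal Python sketch of how a strict-mode range check on a `bigint unsigned` column would reject a negative result, while a signed column would accept it. This is a toy model, not robinhood or MySQL code: `store`, `decrement_bucket` and `OutOfRange` are hypothetical names, and the assumption that the trigger ends up computing `sz0 - 1` while `sz0` is already 0 is only one possible way to reach error 1264.

```python
# Toy model of a MySQL BIGINT column range check in strict SQL mode.
# Illustration only: none of these names exist in robinhood or MySQL.

UNSIGNED_BIGINT_MAX = 2**64 - 1
SIGNED_BIGINT_MIN = -(2**63)
SIGNED_BIGINT_MAX = 2**63 - 1


class OutOfRange(Exception):
    """Stands in for MySQL error 1264 (out of range value for column)."""


def store(column, value, unsigned):
    """Return value if it fits the column type, otherwise raise OutOfRange."""
    if unsigned:
        low, high = 0, UNSIGNED_BIGINT_MAX
    else:
        low, high = SIGNED_BIGINT_MIN, SIGNED_BIGINT_MAX
    if not low <= value <= high:
        raise OutOfRange(f"Out of range value for column '{column}'")
    return value


def decrement_bucket(column, current, unsigned):
    """Mimic an assumed trigger step: column = CAST(column AS SIGNED) - 1.

    The signed arithmetic itself succeeds; the failure, if any, happens
    when the result is written back into the column.
    """
    return store(column, current - 1, unsigned)


if __name__ == "__main__":
    # Counter is positive: the decrement fits an unsigned column.
    print(decrement_bucket("sz0", 5, unsigned=True))

    # Counter is already 0: the signed result is negative and is rejected.
    try:
        decrement_bucket("sz0", 0, unsigned=True)
    except OutOfRange as err:
        print("rejected:", err)

    # With a signed column the same write is accepted.
    print(decrement_bucket("sz0", 0, unsigned=False))
```

Note that, in this model, switching the columns to SIGNED only makes the write succeed. A negative bucket count would still mean the accounting counters have drifted from the real state.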
I think this may fix the issue: https://review.gerrithub.io/c/cea-hpc/robinhood/+/521073/2