
Accounting DB errors with Changelog Reader

Open dnhodgson opened this issue 6 years ago • 4 comments

I am getting a lot of errors in my log files when accounting is enabled for the changelog reader. I don't get these errors when only performing a scan. The error is always the same, whatever the database operation:

Out of range value for column 'sz0' at row 1

I was going to work around this by running the scanner and the changelog reader with separate configs, but the ACCT table is dropped when "accounting=no" is in the config.

dnhodgson avatar May 10 '18 16:05 dnhodgson

Same error (robinhood 3.1.5 on Lustre 2.10):

2019/07/25 11:50:54 [4596/8] EntryProc | Error 7 performing database operation: request error.
2019/07/25 11:50:55 [4596/4] ListMgr | Unhandled error 1264: default conversion to DB_REQUEST_FAILED
2019/07/25 11:50:55 [4596/4] ListMgr | Error 7 executing query 'UPDATE ENTRIES SET uid='amiratag',gid='oak_jamesz',size=0,blocks=1597448,last_access=1564080654,last_mod=1564080654,last_mdchange=1564080654,type='file',mode=432,nlink=1,md_update=1564080655,fileclass='+stanford+groups+nosnap+',class_update=1564080655 WHERE id='0x20002a76d:0x15364:0x0'': Out of range value for column 'sz0' at row 1
mysql> select * from ENTRIES WHERE id='0x20002a76d:0x15364:0x0';
+-------------------------+----------+------------+----------+---------+---------------+-------------+------------+---------------+------+------+-------+------------+---------+--------------------------+--------------+--------------+------------------+--------------+---------------+
| id                      | uid      | gid        | size     | blocks  | creation_time | last_access | last_mod   | last_mdchange | type | mode | nlink | md_update  | invalid | fileclass                | class_update | alert_status | modeguard_status | alert_lstchk | alert_lstalrt |
+-------------------------+----------+------------+----------+---------+---------------+-------------+------------+---------------+------+------+-------+------------+---------+--------------------------+--------------+--------------+------------------+--------------+---------------+
| 0x20002a76d:0x15364:0x0 | amiratag | oak_jamesz | 50331648 | 1597448 |    1564009631 |  1564080712 | 1564080712 |    1564080712 | file |  432 |     1 | 1564080712 |    NULL | +stanford+groups+nosnap+ |   1564080712 |              |                  |            0 |             0 |
+-------------------------+----------+------------+----------+---------+---------------+-------------+------------+---------------+------+------+-------+------------+---------+--------------------------+--------------+--------------+------------------+--------------+---------------+
1 row in set (0.00 sec)

mysql> SELECT * from ACCT_STAT where uid='amiratag' and gid='oak_jamesz';
+----------+------------+------+--------------+------------------+----------------+-------------+---------+------+------+-------+---------+---------+--------+-------+------+-------+------+
| uid      | gid        | type | alert_status | modeguard_status | size           | blocks      | count   | sz0  | sz1  | sz32  | sz1K    | sz32K   | sz1M   | sz32M | sz1G | sz32G | sz1T |
+----------+------------+------+--------------+------------------+----------------+-------------+---------+------+------+-------+---------+---------+--------+-------+------+-------+------+
| amiratag | oak_jamesz | dir  |              | ok               |      536137728 |     1056240 |   18009 |    0 |    0 |     0 |   16525 |    1482 |      1 |     1 |    0 |     0 |    0 |
| amiratag | oak_jamesz | dir  |              | invalid          |           8192 |          16 |       2 |    0 |    0 |     0 |       2 |       0 |      0 |     0 |    0 |     0 |    0 |
| amiratag | oak_jamesz | file |              |                  | 15023241869454 | 29372049608 | 5195492 |    0 |  174 | 28239 | 2987544 | 1361072 | 770225 | 47937 |  263 |     0 |    0 |
+----------+------------+------+--------------+------------------+----------------+-------------+---------+------+------+-------+---------+---------+--------+-------+------+-------+------+
3 rows in set (0.00 sec)
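
As a quick consistency check on the ACCT_STAT rows above, here is a small Python sketch that compares `count` with the sum of the size buckets `sz0`..`sz1T`. It assumes that each entry is counted in exactly one size bucket, which is an assumption on my part and not something confirmed in this thread; the helper name `bucket_gap` is made up for illustration. The numbers are copied from the output above.

```python
# Rows copied from the ACCT_STAT output above:
# (type, modeguard_status, count, [sz0, sz1, sz32, sz1K, sz32K, sz1M, sz32M, sz1G, sz32G, sz1T])
rows = [
    ("dir", "ok", 18009, [0, 0, 0, 16525, 1482, 1, 1, 0, 0, 0]),
    ("dir", "invalid", 2, [0, 0, 0, 2, 0, 0, 0, 0, 0, 0]),
    ("file", "", 5195492, [0, 174, 28239, 2987544, 1361072, 770225, 47937, 263, 0, 0]),
]


def bucket_gap(count, buckets):
    """Entries counted in `count` but missing from the size buckets (hypothetical helper)."""
    return count - sum(buckets)


# Assumption: every entry lands in exactly one bucket, so a consistent row has a gap of 0.
gaps = {(etype, status): bucket_gap(count, buckets) for etype, status, count, buckets in rows}

for key, gap in gaps.items():
    print(key, gap)
```

Under that assumption, the two `dir` rows are consistent (gap of 0), while the `file` row has 38 entries that do not appear in any bucket. That would be compatible with bucket updates being rejected while other columns kept changing, but it does not prove it.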

thiell avatar Jul 25 '19 18:07 thiell

Hello, we're wondering if this error could come from the fact that the sz0 column is unsigned while the trigger uses a SIGNED cast:

https://github.com/cea-hpc/robinhood/blob/master/src/list_mgr/listmgr_init.c#L2493

mysql> describe ACCT_STAT;
+------------------+--------------------------------------------------------+------+-----+---------+-------+
| Field            | Type                                                   | Null | Key | Default | Extra |
+------------------+--------------------------------------------------------+------+-----+---------+-------+
| uid              | varbinary(127)                                         | NO   | PRI | unknown |       |
| gid              | varbinary(127)                                         | NO   | PRI | unknown |       |
| type             | enum('symlink','dir','file','chr','blk','fifo','sock') | NO   | PRI | file    |       |
| alert_status     | enum('','clear','alert')                               | NO   | PRI |         |       |
| modeguard_status | enum('','ok','invalid')                                | NO   | PRI |         |       |
| size             | bigint(20) unsigned                                    | YES  |     | 0       |       |
| blocks           | bigint(20) unsigned                                    | YES  |     | 0       |       |
| count            | bigint(20) unsigned                                    | YES  |     | 0       |       |
| sz0              | bigint(20) unsigned                                    | YES  |     | 0       |       |
| sz1              | bigint(20) unsigned                                    | YES  |     | 0       |       |
| sz32             | bigint(20) unsigned                                    | YES  |     | 0       |       |
| sz1K             | bigint(20) unsigned                                    | YES  |     | 0       |       |
| sz32K            | bigint(20) unsigned                                    | YES  |     | 0       |       |
| sz1M             | bigint(20) unsigned                                    | YES  |     | 0       |       |
| sz32M            | bigint(20) unsigned                                    | YES  |     | 0       |       |
| sz1G             | bigint(20) unsigned                                    | YES  |     | 0       |       |
| sz32G            | bigint(20) unsigned                                    | YES  |     | 0       |       |
| sz1T             | bigint(20) unsigned                                    | YES  |     | 0       |       |
+------------------+--------------------------------------------------------+------+-----+---------+-------+

Shouldn't the cast be UNSIGNED?

thiell avatar Feb 28 '20 23:02 thiell

Hello Stephan, I think the cast to SIGNED is there to allow the subtraction operations. If this is the issue, perhaps changing sz0..sz1T to SIGNED as well would fix it?
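
To illustrate the hypothesis being discussed, here is a minimal Python model of an unsigned counter column that is updated with signed arithmetic. It is not robinhood's trigger code: `store_unsigned`, `apply_delta` and `OutOfRange` are hypothetical names, and the model assumes the server rejects out-of-range values (as MySQL does in strict SQL mode) rather than clamping them.

```python
# Largest value a BIGINT UNSIGNED column can hold.
BIGINT_UNSIGNED_MAX = 2**64 - 1


class OutOfRange(ValueError):
    """Stands in for MySQL error 1264 in this sketch."""


def store_unsigned(column, value):
    # Assumption: strict mode, so a value outside the column range is an error.
    if not 0 <= value <= BIGINT_UNSIGNED_MAX:
        raise OutOfRange(f"Out of range value for column '{column}' at row 1")
    return value


def apply_delta(column, current, delta):
    # The arithmetic itself is signed, so it can go below zero without failing;
    # the failure only shows up when the result is written back to the unsigned column.
    return store_unsigned(column, current + delta)


print(apply_delta("sz0", 0, +1))  # incrementing an empty bucket is fine

try:
    apply_delta("sz0", 0, -1)     # decrementing an already-empty bucket is not
except OutOfRange as err:
    print(err)
```

In this model the signed subtraction succeeds and only the store fails, which is the scenario where making sz0..sz1T signed would let the update go through. Whether the counters really go negative in robinhood, and why, is not established here.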

tl-cea avatar Mar 02 '20 08:03 tl-cea

I think this may fix the issue: https://review.gerrithub.io/c/cea-hpc/robinhood/+/521073/2

tl-cea avatar Jul 21 '21 12:07 tl-cea