core icon indicating copy to clipboard operation
core copied to clipboard

Postgres: preview cleanup job failing

Open DeepDiver1975 opened this issue 1 year ago • 18 comments

1) Test\PreviewCleanupTest::test

Doctrine\DBAL\Exception\DriverException: An exception occurred while executing 'SELECT "fileid", "name", "storage"

FROM "oc_filecache" "thumb"

WHERE "parent" IN (

  SELECT "fileid"

  FROM "oc_filecache"

  WHERE "storage" IN (

    SELECT "numeric_id"

    FROM "oc_storages"

    WHERE "id" LIKE 'home::%' OR "id" LIKE 'object::user:%'

  )

  AND "path_hash" = '3b8779ba05b8f0aed49650f3ff8beb4b'

)

AND NOT EXISTS (

  SELECT 1

  FROM "oc_filecache"

  WHERE "fileid" = CAST("thumb"."name" AS BIGINT)

)

AND "fileid" > ?

ORDER BY "storage" limit 1000' with params [0]:


SQLSTATE[22P02]: Invalid text representation: 7 ERROR:  invalid input syntax for integer: "welcome.txt"


/drone/src/lib/composer/doctrine/dbal/lib/Doctrine/DBAL/Driver/AbstractPostgreSQLDriver.php:102

/drone/src/lib/composer/doctrine/dbal/lib/Doctrine/DBAL/DBALException.php:182

/drone/src/lib/composer/doctrine/dbal/lib/Doctrine/DBAL/DBALException.php:159

/drone/src/lib/composer/doctrine/dbal/lib/Doctrine/DBAL/Connection.php:2226

/drone/src/lib/composer/doctrine/dbal/lib/Doctrine/DBAL/Connection.php:1313

/drone/src/lib/private/DB/Connection.php:191

/drone/src/lib/private/PreviewCleanup.php:145

/drone/src/lib/private/PreviewCleanup.php:53

/drone/src/tests/lib/PreviewCleanupTest.php:106

phpvfscomposer:///drone/src/lib/composer/phpunit/phpunit/phpunit:106


Caused by

Doctrine\DBAL\Driver\PDO\Exception: SQLSTATE[22P02]: Invalid text representation: 7 ERROR:  invalid input syntax for integer: "welcome.txt"


/drone/src/lib/composer/doctrine/dbal/lib/Doctrine/DBAL/Driver/PDO/Exception.php:18

/drone/src/lib/composer/doctrine/dbal/lib/Doctrine/DBAL/Driver/PDOStatement.php:119

/drone/src/lib/composer/doctrine/dbal/lib/Doctrine/DBAL/Connection.php:1304

/drone/src/lib/private/DB/Connection.php:191

/drone/src/lib/private/PreviewCleanup.php:145

/drone/src/lib/private/PreviewCleanup.php:53

/drone/src/tests/lib/PreviewCleanupTest.php:106

phpvfscomposer:///drone/src/lib/composer/phpunit/phpunit/phpunit:106


Caused by

PDOException: SQLSTATE[22P02]: Invalid text representation: 7 ERROR:  invalid input syntax for integer: "welcome.txt"


/drone/src/lib/composer/doctrine/dbal/lib/Doctrine/DBAL/Driver/PDOStatement.php:117

/drone/src/lib/composer/doctrine/dbal/lib/Doctrine/DBAL/Connection.php:1304

/drone/src/lib/private/DB/Connection.php:191

/drone/src/lib/private/PreviewCleanup.php:145

/drone/src/lib/private/PreviewCleanup.php:53

/drone/src/tests/lib/PreviewCleanupTest.php:106

phpvfscomposer:///drone/src/lib/composer/phpunit/phpunit/phpunit:106

https://drone.owncloud.com/owncloud/core/39300/21/8

We had a similar issue before ......

DeepDiver1975 avatar Oct 11 '23 14:10 DeepDiver1975

I think that I have seen this sometimes in CI, but then the tests pass on rerun, which is really strange (specially for a unit test)

phil-davis avatar Oct 11 '23 20:10 phil-davis

I was able to reproduce the invalid input syntax for.. exception on PostgreSQL 15.4 + oC 10.13.2. However, the exception is not reproducible 100% of the time when following steps below (maybe this explains why the tests pass on rerun):

  • Scenario 1:
- user "admin" creates "file.txt" with some text in it
- user "admin" deletes "file.txt", also from trashbin
- user "admin" creates "files2.txt" with some text in it and shares it with user "user1"
- user "admin" removes user "user1"
- user "admin" deletes "file2.txt", also from trashbin 
- now running "occ previews:cleanup" will trigger an exception (at first run only)
  • Scenario 2:
- user "admin" creates "file.txt" with some text in it
- user "admin" deletes "file.txt", also from trashbin
- user "admin" creates "files2.txt" with some text in it and shares it with user "user1"
- user "admin" deletes "file2.txt", also from trashbin
- now running "occ previews:cleanup" will trigger an exception (at first run only)
  • Scenario 3:
- user "admin" creates "file.txt" with some text in it
- user "admin" deletes "file.txt", also from trashbin
- user "admin" creates "files2.txt" with some text in it and deletes it (NOTE: file is still in trashbin)
- run "occ previews:cleanup" —> no exception
- user "admin" now deletes "files2.txt" from trashbin
- now running "occ previews:cleanup2" will trigger an exception (at first run only)
In PDOStatement.php line 117:
                                                                                                                  
  [PDOException (22P02)]                                                                                          
  SQLSTATE[22P02]: Invalid text representation: 7 ERROR:  invalid input syntax for type bigint: "files_trashbin"  
                                                                                                                  

Exception trace:
  at /var/www/html/owncloud/lib/composer/doctrine/dbal/lib/Doctrine/DBAL/Driver/PDOStatement.php:117
 PDOStatement->execute() at /var/www/html/owncloud/lib/composer/doctrine/dbal/lib/Doctrine/DBAL/Driver/PDOStatement.php:117
 Doctrine\DBAL\Driver\PDOStatement->execute() at /var/www/html/owncloud/lib/composer/doctrine/dbal/lib/Doctrine/DBAL/Connection.php:1304
 Doctrine\DBAL\Connection->executeQuery() at /var/www/html/owncloud/lib/private/DB/Connection.php:191
 OC\DB\Connection->executeQuery() at /var/www/html/owncloud/lib/private/PreviewCleanup.php:145
 OC\PreviewCleanup->queryPreviewsToDelete() at /var/www/html/owncloud/lib/private/PreviewCleanup.php:53
 OC\PreviewCleanup->process() at /var/www/html/owncloud/core/Command/Previews/Cleanup.php:72
 OC\Core\Command\Previews\Cleanup->execute() at /var/www/html/owncloud/lib/composer/symfony/console/Command/Command.php:298
 Symfony\Component\Console\Command\Command->run() at /var/www/html/owncloud/core/Command/Base.php:159
 OC\Core\Command\Base->run() at /var/www/html/owncloud/lib/composer/symfony/console/Application.php:1040
 Symfony\Component\Console\Application->doRunCommand() at /var/www/html/owncloud/lib/composer/symfony/console/Application.php:301
 Symfony\Component\Console\Application->doRun() at /var/www/html/owncloud/lib/composer/symfony/console/Application.php:171
 Symfony\Component\Console\Application->run() at /var/www/html/owncloud/lib/private/Console/Application.php:165
 OC\Console\Application->run() at /var/www/html/owncloud/console.php:94
 require_once() at /var/www/html/owncloud/occ:11

previews:cleanup [--output [OUTPUT]] [--all] [--] [<chunk_size>]

pako81 avatar Oct 11 '23 21:10 pako81

Maybe related to the cast done around https://github.com/owncloud/core/blob/master/lib/private/PreviewCleanup.php#L105-L111

pako81 avatar Oct 12 '23 06:10 pako81

Yes, the problem is likely in the cast function.

The query has 2 filters: the "where parent in" and the "not exists". In theory, the "where parent in" filter should be applied first, so the "not exists" filter have casts over the previews, which are expected to have numbers as names. However, if the "not exists" filter is applied first for whatever reason, we could have casts over regular filenames such as the "welcome.txt" file.

jvillafanez avatar Oct 16 '23 07:10 jvillafanez

@jvillafanez following seems to solve the issue for me: https://gist.github.com/pako81/bbd7277130947becbe0bf0739ae3a31d#file-previewcleanup-php-L137-L159. But maybe there is a more efficient way to achieve this without the need to rewrite the query for PostgreSQL only.

pako81 avatar Oct 16 '23 07:10 pako81

There is not really an issue in having multiple dedicated queries for different dbms - at least in this highly specific case where the query has some complexity.

This is also why I originally did not use the query builder as this adds a hell of complexity to maintain it.

Having just two or three explicit strings holding the query is better for maintenance.

DeepDiver1975 avatar Oct 16 '23 08:10 DeepDiver1975

We could check the performance of the following query:

SELECT * FROM (
  SELECT `fileid`, `name`, `storage`
  FROM `oc_filecache` `thumb`
  WHERE `parent` IN (
    SELECT `fileid`
    FROM `oc_filecache`
    WHERE `storage` IN (
      SELECT `numeric_id`
      FROM `oc_storages`
      WHERE `id` LIKE 'home::%' OR `id` LIKE 'object::user:%'
    )
    AND `path_hash` = '3b8779ba05b8f0aed49650f3ff8beb4b'
  )
) `t`
WHERE NOT EXISTS (
  SELECT 1
  FROM `oc_filecache`
  WHERE `fileid` = CAST(`t`.`name` AS SIGNED)
)
AND `fileid` > 0
ORDER BY `storage`;

the performance should be fine (checked with mysql)

+----+-------------+--------------+------------+--------+--------------------------------------------------------------------------------------+----------------------+---------+-------------------------------+------+----------+-----------------------------------------------------------+
| id | select_type | table        | partitions | type   | possible_keys                                                                        | key                  | key_len | ref                           | rows | filtered | Extra                                                     |
+----+-------------+--------------+------------+--------+--------------------------------------------------------------------------------------+----------------------+---------+-------------------------------+------+----------+-----------------------------------------------------------+
|  1 | SIMPLE      | oc_filecache | NULL       | index  | PRIMARY,fs_storage_path_hash,fs_storage_mimetype,fs_storage_mimepart,fs_storage_size | fs_storage_path_hash | 134     | NULL                          |   10 |    10.00 | Using where; Using index; Using temporary; Using filesort |
|  1 | SIMPLE      | oc_storages  | NULL       | eq_ref | PRIMARY,storages_id_index                                                            | PRIMARY              | 4       | owncloud.oc_filecache.storage |    1 |   100.00 | Using where                                               |
|  1 | SIMPLE      | thumb        | NULL       | ref    | PRIMARY,fs_parent_name_hash,fs_parent_storage_size                                   | fs_parent_name_hash  | 8       | owncloud.oc_filecache.fileid  |    1 |   100.00 | Using index condition                                     |
|  1 | SIMPLE      | oc_filecache | NULL       | eq_ref | PRIMARY                                                                              | PRIMARY              | 8       | func                          |    1 |   100.00 | Using where; Not exists; Using index                      |
+----+-------------+--------------+------------+--------+--------------------------------------------------------------------------------------+----------------------+---------+-------------------------------+------+----------+-----------------------------------------------------------+
4 rows in set, 2 warnings (0.01 sec)

That should also work for postgresql I think

jvillafanez avatar Oct 16 '23 08:10 jvillafanez

https://github.com/owncloud/core/pull/41047 should fix the issue. I haven't tested with postgresql yet, but the CI passes

jvillafanez avatar Oct 16 '23 10:10 jvillafanez

Unfortunately, still reproducible with https://github.com/owncloud/core/pull/41047 in place.

pako81 avatar Oct 16 '23 11:10 pako81

Yes, it seems postgresql still changes the query plan even for the new query....

owncloud=# explain SELECT * FROM (
  SELECT "fileid", "name", "storage"
  FROM "oc_filecache" "thumb"
  WHERE "parent" IN (
    SELECT "fileid"
    FROM "oc_filecache"
    WHERE "storage" IN (
      SELECT "numeric_id"
      FROM "oc_storages"
      WHERE "id" LIKE 'home::%' OR "id" LIKE 'object::user:%'
    )
    AND "path_hash" = '3b8779ba05b8f0aed49650f3ff8beb4b'
  )
) "t"
WHERE NOT EXISTS (
  SELECT 1
  FROM "oc_filecache"
  WHERE "fileid" = CAST("t"."name" AS BIGINT)
)
AND "fileid" > 0
ORDER BY "storage";
                                                     QUERY PLAN                                                      
---------------------------------------------------------------------------------------------------------------------
 Sort  (cost=38.13..38.14 rows=1 width=528)
   Sort Key: thumb.storage
   ->  Nested Loop Anti Join  (cost=26.98..38.12 rows=1 width=528)
         ->  Nested Loop  (cost=26.84..34.88 rows=1 width=528)
               ->  HashAggregate  (cost=26.70..26.71 rows=1 width=8)
                     Group Key: oc_filecache_1.fileid
                     ->  Nested Loop Semi Join  (cost=0.15..26.69 rows=1 width=8)
                           ->  Seq Scan on oc_filecache oc_filecache_1  (cost=0.00..10.50 rows=1 width=12)
                                 Filter: ((path_hash)::text = '3b8779ba05b8f0aed49650f3ff8beb4b'::text)
                           ->  Index Scan using oc_storages_pkey on oc_storages  (cost=0.15..8.17 rows=1 width=4)
                                 Index Cond: (numeric_id = oc_filecache_1.storage)
                                 Filter: (((id)::text ~~ 'home::%'::text) OR ((id)::text ~~ 'object::user:%'::text))
               ->  Index Scan using fs_parent_storage_size on oc_filecache thumb  (cost=0.14..8.16 rows=1 width=536)
                     Index Cond: (parent = oc_filecache_1.fileid)
                     Filter: (fileid > 0)
         ->  Index Only Scan using oc_filecache_pkey on oc_filecache  (cost=0.14..3.24 rows=1 width=8)
               Index Cond: (fileid = (thumb.name)::bigint)
(17 rows)

owncloud=# explain SELECT * FROM (
  SELECT "fileid", "name", "storage"
  FROM "oc_filecache" "thumb"
  WHERE "parent" IN (
    SELECT "fileid"
    FROM "oc_filecache"
    WHERE "storage" IN (
      SELECT "numeric_id"
      FROM "oc_storages"
      WHERE "id" LIKE 'home::%' OR "id" LIKE 'object::user:%'
    )                                                 
    AND "path_hash" = '3b8779ba05b8f0aed49650f3ff8beb4b'
  )             
) "t"
WHERE NOT EXISTS (
  SELECT 1                                       
  FROM "oc_filecache"
  WHERE "fileid" = CAST("t"."name" AS BIGINT)
)                  
AND "fileid" > 0
ORDER BY "storage";
                                                  QUERY PLAN                                                   
---------------------------------------------------------------------------------------------------------------
 Sort  (cost=12.18..12.18 rows=1 width=20)
   Sort Key: thumb.storage
   ->  Nested Loop Semi Join  (cost=1.55..12.17 rows=1 width=20)
         Join Filter: (thumb.parent = oc_filecache_1.fileid)
         ->  Hash Anti Join  (cost=1.41..2.74 rows=1 width=28)
               Hash Cond: ((thumb.name)::bigint = oc_filecache.fileid)
               ->  Seq Scan on oc_filecache thumb  (cost=0.00..1.23 rows=18 width=28)
                     Filter: (fileid > 0)
               ->  Hash  (cost=1.18..1.18 rows=18 width=8)
                     ->  Seq Scan on oc_filecache  (cost=0.00..1.18 rows=18 width=8)
         ->  Materialize  (cost=0.15..9.41 rows=1 width=8)
               ->  Nested Loop Semi Join  (cost=0.15..9.41 rows=1 width=8)
                     ->  Seq Scan on oc_filecache oc_filecache_1  (cost=0.00..1.23 rows=1 width=12)
                           Filter: ((path_hash)::text = '3b8779ba05b8f0aed49650f3ff8beb4b'::text)
                     ->  Index Scan using oc_storages_pkey on oc_storages  (cost=0.15..8.17 rows=1 width=4)
                           Index Cond: (numeric_id = oc_filecache_1.storage)
                           Filter: (((id)::text ~~ 'home::%'::text) OR ((id)::text ~~ 'object::user:%'::text))
(17 rows)

I'll close the PR.

I'm checking "common table expresions" as an alternative

jvillafanez avatar Oct 16 '23 11:10 jvillafanez

Let's see if https://github.com/owncloud/core/pull/41048 is more explicit and works on all the DBs

jvillafanez avatar Oct 16 '23 11:10 jvillafanez

It seems even with https://github.com/owncloud/core/pull/41048 issue is still there. At least manual tests seem to indicate so.

pako81 avatar Oct 16 '23 11:10 pako81

It seems even with https://github.com/owncloud/core/pull/41048 issue is still there. At least manual tests seem to indicate so.

Works for me, and the tests pass (except for mysql 5, which doesn't have the with clause). Is it reproducible? I'm out of ideas. I guess we'll have to write a query specifically for postgresql.

The weird thing is that it could also happen with the rest of the DBs. I mean, they could optimize the query the same way as postgresql; whether it triggers an error or does another thing, I don't know.

Maybe the best option is https://github.com/owncloud/core/pull/41047 + custom query for postgresql. Or we could keep the current code + the custom query for postgresql, assuming the DB won't do anything weird with the query.

jvillafanez avatar Oct 16 '23 12:10 jvillafanez

Works for me, and the tests pass (except for mysql 5, which doesn't have the with clause). Is it reproducible?

I was able to reproduce it by following the steps from Scenario 1 at https://github.com/owncloud/core/issues/41040#issuecomment-1758594224. Note: it may not be reproducible at first shot.

pako81 avatar Oct 16 '23 13:10 pako81

Yes, confirmed. You might need to have multiple users / storages so that the DB chooses to use the "not exists" condition first because it might cost less. I'll close that PR too.

I'll check support for regexp in all DBs

jvillafanez avatar Oct 16 '23 13:10 jvillafanez

We can try with something like (we'll need to adjust the type of the cast for each DB)

SELECT "fileid", "name", "storage"
FROM "oc_filecache" "thumb" 
WHERE "parent" IN (
  SELECT "fileid"
  FROM "oc_filecache"
  WHERE "storage" IN (
    SELECT "numeric_id"
    FROM "oc_storages"
    WHERE "id" LIKE 'home::%' OR "id" LIKE 'object::user:%'
  )
  AND "path_hash" = '3b8779ba05b8f0aed49650f3ff8beb4b'
)
AND NOT EXISTS (
  SELECT 1
  FROM "oc_filecache"
  WHERE "fileid" = CAST(REGEXP_REPLACE("thumb"."name", '(^.*[^[:digit:]]+.*$|^$)', CAST("thumb"."fileid" as VARCHAR(250))) as BIGINT)
)
AND "fileid" > 0
ORDER BY "storage"

I think the regexp_replace function is available in mysql, mariadb, postgresql and oracle. For sqlite3 we'll need to load the regexp extension from https://github.com/nalgeon/sqlean . We'll have to add some basic support to load sqlite3 extensions (this will require additional documentation)

As far as I know, the execution order of the filters depend on the query planner of each DB, and I don't think we have any control over it. This means that what is happening with postgresql could happen with any other DB. A defensive approach like the above should prevent problems

jvillafanez avatar Oct 17 '23 09:10 jvillafanez

Let's see if the third time is the charm...

https://github.com/owncloud/core/pull/41051 works for me, and tests pass.

Taking into account that casting a random string to an integer has an undefined / undocumented behavior for most of the DBs (at least it's difficult to find the info), we'll go through the "regexp" route by default to ensure we can cast the value. For mysql 5 and sqlite, which don't have the regexp_replace function, we'll use a different approach taking advantage of the behavior they have when casting a random string.

Explanation for the tricks used is in the code, and it should be clear enough

jvillafanez avatar Oct 18 '23 10:10 jvillafanez

confirmed to work: exceptions are not triggered anymore 👍

pako81 avatar Oct 18 '23 13:10 pako81

This issue has been automatically marked as stale because it has not had recent activity. It will be closed in 10 days if no further activity occurs. Thank you for your contributions.

github-actions[bot] avatar Apr 16 '24 01:04 github-actions[bot]

This issue has been automatically closed.

github-actions[bot] avatar Apr 26 '24 01:04 github-actions[bot]