joomla-cms icon indicating copy to clipboard operation
joomla-cms copied to clipboard

[5.2] Fix Postgres Problem in our testing

Open rdeutz opened this issue 1 year ago • 9 comments

Summary of Changes

Clean up the data in a different way.

rdeutz avatar Oct 18 '24 08:10 rdeutz

From my point of view, it would be more beneficial to perform the cleanup in beforeEach rather than in afterEach. This prevents tests from failing after they are interrupted and avoids requiring the installation step. This change should be applied consistently across all tests, either by cleaning up all test-created database objects or using transactions. I like to discuss this further with @laoneo and @alikon.

Regarding the failing System Tests with the error duplicate key value violates unique constraint "_users_pkey" with PostgreSQL, from my point of view it is related to cleanupDB() is not completed before the next test run creates a new user and needs to be fixed on chaining and I'am working on resolving it.

muhme avatar Oct 18 '24 09:10 muhme

From my point of view, it would be more beneficial to perform the cleanup in beforeEach rather than in afterEach. This prevents tests from failing after they are interrupted and avoids requiring the installation step. This change should be applied consistently across all tests, either by cleaning up all test-created database objects or using transactions. I like to discuss this further with @laoneo and @alikon.

Makes sense

Regarding the failing System Tests with the error duplicate key value violates unique constraint "_users_pkey" with PostgreSQL, from my point of view it is related to cleanupDB() is not completed before the next test run creates a new user and needs to be fixed on chaining and I'am working on resolving it.

The cleanupDB is a good idea, haven't looked at the test for a longer time and didn't noticed this. But as far as I understand it it runs only after a block and not after a single test. So my guess is that this will not help.

rdeutz avatar Oct 18 '24 09:10 rdeutz

@rdeutz @richard67 @laoneo @alikon @LadySolveig

I am now happy 😄 to be able to consistently reproduce the PostgreSQL duplicate pkey — on three different systems, and three times in a row on each one. This is important, as despite significant effort, I’ve only observed single occurrences of the issue, even on the slowest 1 vCPU VPS. I'm demonstrating it using JBT with:

scripts/create 52 pgsql
cat > branch_52/tests/System/integration/install/duplicate.cy.js <<EOF
describe('Test duplicate key value error', () => {
  it('in creating 1,000 users', () => {
    for (let i = 1; i <= 1000; i++) {
        cy.db_createUser({username: \`test user \${i}\` });
    }
  });
});
EOF
scripts/test 52 system install/duplicate.cy.js 

Results in:

  1) Test duplicate key value error
       in creating 1,000 users:
     CypressError: `cy.task('queryDB')` failed with the following error:

> duplicate key value violates unique constraint "jos52_users_pkey"

Curiously, it only works on a fresh database. So for the next tests, you have to create a new one (it only takes about a minute):

scripts/database 52 pgsql
scripts/test 52 system install/duplicate.cy.js

Therefore, it seems the issue is not related to the cleanupDB task. At the moment, I assume that deleting tables will not resolve this PostgreSQL problem of generating non-unique primary keys with serial.

To remember we have seen it most times on administrator/components/com_privacy/Consent.cy.js, but also on administrator/components/com_users/Users.cy.js.

My question is: should this be solved in System Tests only with a table lock, using transactions, or maybe just a small wait? Or is it a bigger issue that requires a solution like using DEFERRABLE when creating the table? Are there other tools, such as importing mass data, that could have the problem too?

muhme avatar Oct 20 '24 08:10 muhme

@muhme But shouldn't that already be solved when this change is made here? Then the table is always emptied before the test runs and we don't have the problem if the test is cancelled? https://github.com/rdeutz/joomla-cms/pull/18

Could you perhaps rerun your test with this change to see if this could solve the problem very easy. Perhaps we should switch to the beforeEach event for resetting the data in all tests. This is also described as best practice in the Cypress documentation.

LadySolveig avatar Oct 20 '24 08:10 LadySolveig

Curiously, it only works on a fresh database.

That could mean we have some problem with the initial value of a sequence on a new installation, possibly not only for the system tests bit in general, and it just hasn’t been noticed yet. Hard to believe, but who knows?

@muhme What happens if you manually do on a new installation what the system test does?

richard67 avatar Oct 20 '24 08:10 richard67

@muhme But shouldn't that already be solved when this change is made here? Then the table is always emptied before the test runs and we don't have the problem if the test is cancelled? rdeutz#18

Could you perhaps rerun your test with this change to see if this could solve the problem very easy. Perhaps we should switch to the beforeEach event for resetting the data in all tests. This is also described as best practice in the Cypress documentation.

@LadySolveig From my point of view, this is not a duplicate in the unique username, it is a duplicate in the primary key id, which is auto-incremented by PostgreSQL. The main issue is reproducing the error. I've spent many hours trying, and I’ve only seen single, non-reproducible occurrences. Inserting waits in JavaScript and SELECT pg_sleep(0.1). Even after running the entire System Tests multiple times or Consent.cy.js 100 times, with additional CPU, database, and/or I/O load, the error was not reproducable. This includes tests on a VPS with only a single vCPU. But on all four systems I saw the error. And horrible even after changing JavaScript NPM module from postgres to pg with PR 44084. In my opinion, we need to fix and verify with duplicate.cy.js creating 1,000 users.

muhme avatar Oct 20 '24 09:10 muhme

@richard67

What happens if you manually do on a new installation what the system test does?

JBT's scripts/create and scripts/database are running tests/System/integration/install/Installation.cy.js. I'm not sure if that was your question, but for me, the key point is being able to reproduce the problem. Now that we can, we can fixing it and verifying that the fix works.

muhme avatar Oct 20 '24 09:10 muhme

@muhme yes, now i have understood what you mean. 👍🏼 Sorry that I didn't read more carefully. I think the approach to test if there is something fundamentally wrong on the testing system for postgrees as @richard67 mentioned would be advisable in any case.

But quite apart from that, I think we should consider the switch to onBefore for resetting the data. If you run the tests via Gui and change something in between and the live reloading takes effect or you would cancel a test during a headless run, most of them would not be able to run a second time. 🤷🏼‍♀️ I also don't see any real gain in doing it differently than Cypress describes in the best practices.

LadySolveig avatar Oct 20 '24 09:10 LadySolveig

I was wondering how user creation works from the command-line tool on a fresh database:

for i in {1..10000}; do                                                                  
  echo "INSERT INTO jos52_users (username, name, email, password, \"registerDate\", params) VALUES ('user $i', 'test', '[email protected]', '098f6bcd4621d373cade4e832627b4f6', NOW(), '' );" >> /tmp/a.sql
done
scripts/database 52 pgsql
docker cp /tmp/a.sql jbt_pg:/tmp/a.sql
docker exec jbt_pg bash -c "PGPASSWORD=root psql -U root -d test_joomla_52 -f /tmp/a.sql"

~The result is that, using psql, we can create 10,000 user entries without any issues.~

muhme avatar Oct 21 '24 04:10 muhme

if you run the install\installtion.cy.js it create a SuperUser with a id random(1,1000)

but the sequence value is still 1 SELECT currval(pg_get_serial_sequence('jos_users', 'id')) AS new_id and if after you run this you obvoisly got duplicate key

describe('Test duplicate key value error', () => {
  it('in creating 1,000 users', () => {
    for (let i = 1; i <= 1000; i++) {
        cy.db_createUser({username: \`test user \${i}\` });
    }
  });
});

for avoid this you need to change the script

  it('in creating 1,000 users', () => {
    cy.task('queryDB', `SELECT MAX(id) +1 as id FROM #__users`)
      .then((id) => {
        console.log('id:', id[0].id);
        cy.task('queryDB', `SELECT setval('#__users_id_seq', ${id[0].id}, false)`)
      })
    for (let i = 1; i <= 1000; i++) {
        cy.db_createUser({username: `test user ${i}` });
    }
  });

so the problem is that the sequence value is not updated correctly after the creation of the SUperUser

alikon avatar Oct 21 '24 06:10 alikon

please check #44324 should fix the issue with postgresql duplicate key

alikon avatar Oct 21 '24 08:10 alikon

Do we still need this one as #44324 fixes the problem?

laoneo avatar Oct 21 '24 13:10 laoneo

Thank you, Nicola for open our eyes 👍 With that insight, I now realize I was wrong whith:

The result is that, using psql, we can create 10,000 user entries without any issues.

Of course, this fails as well, but I didn't notice it yesterday in the 10,000 lines of output. It's better to run with:

docker exec jbt_pg bash -c "PGPASSWORD=root psql -U root -d test_joomla_52 --set ON_ERROR_STOP=on -f /tmp/a.sql

And then, of course, we see the result:

...
INSERT 0 1
INSERT 0 1
INSERT 0 1
psql:/tmp/a.sql:263: ERROR:  duplicate key value violates unique constraint "jos52_users_pkey"
DETAIL:  Key (id)=(263) already exists.

MariaDB and MySQL do not have the problem, they continue on their own with the following ID.

muhme avatar Oct 22 '24 03:10 muhme

Again, do we still need this PR?

Hackwar avatar Nov 27 '24 14:11 Hackwar

  • @alikon's #44324 found the root cause of the PostgreSQL duplicate key value violates unique constraint issue and fixed it. 👍😄
  • Cleanup using beforeEach instead of afterEach, would be more useful to apply this approach across all tests, or using transactions.
  • I recommend closing this PR as the original problem is fixed.

muhme avatar Nov 28 '24 05:11 muhme