django icon indicating copy to clipboard operation
django copied to clipboard

Add Github actions for MacOS.

Open orf opened this issue 3 years ago • 18 comments

See discussion here: https://forum.djangoproject.com/t/improving-the-contribution-experience-with-github-actions/5964/

orf avatar Jan 04 '21 20:01 orf

Hey @orf — thanks for this. As @jdufresne noted on #13805, the Jenkins node version needs updating. See e.g. https://djangoci.com/job/pull-requests-javascript/9953/

I wonder if we couldn't start using GHA for those JS tests right now? 🤔 — Enable the check, disable the one on Jenkins. — At least as a trial — I'm pretty sure @felixxm doesn't want to spend hours on Jenkins this week. (? 🙂)

carltongibson avatar Jan 06 '21 15:01 carltongibson

Absolutely! I just realized that I haven't actually added the JS tests to docker-box or my POC, I only added the frontend selium tests. I can work on a PR to add JS tests with actions tonight 👍

orf avatar Jan 06 '21 15:01 orf

Cool. Thanks. We'll see if @felixxm likes the easy out there? 😀

carltongibson avatar Jan 06 '21 15:01 carltongibson

Please don't rush. I still have doubts but I also don't have time in this week to prepare a detailed comment, e.g. isort, flake8 and Windows builds use Python 3.9 on Jenkins; output is not formatted so it's less readable we have almost 14k+ tests so it can be frustrating especially with more than one failure; we have much more skipped tests 1212 vs. 1135 on Jenkins probably SQLite is compiled without the JSON1 etc.; it can be hard to control SQLite version, etc.

felixxm avatar Jan 06 '21 16:01 felixxm

As @jdufresne noted on #13805, the Jenkins node version needs updating. See e.g. https://djangoci.com/job/pull-requests-javascript/9953/

It uses the old node because there is some dependency collision with the libraries for MySQL, I don't remember details. I can check this later.

felixxm avatar Jan 06 '21 16:01 felixxm

Please don't rush.

Ahh, ok sorry I missed this comment. I made this PR: https://github.com/django/django/pull/13851, it was pretty simple!

I look forward to your comment on the proposal!

orf avatar Jan 06 '21 17:01 orf

Sorry, wrong branch 😂. https://github.com/django/django/pull/13852

It uses the old node because there is some dependency collision with the libraries for MySQL, I don't remember details. I can check this later.

This is one of the big advantages of Github actions, you avoid a lot of this stateful-CI-runner mess. Everything is reproducible and flexible by design. There are some downsides to this, as discussed in the proposal, but the fact that our JS tests are impacted by a MySQL dependency is a sign things could be improved.

orf avatar Jan 06 '21 17:01 orf

Hi,

Please don't rush. I still have doubts but I also don't have time in this week to prepare a detailed comment

No worries, take your time!

e.g. isort, flake8 and Windows builds use Python 3.9 on Jenkins;

This is a one line change (literally) and the same holds true as soon as we need to switch the version.

output is not formatted so it's less readable we have almost 14k+ tests so it can be frustrating especially with more than one failure

Agreed, we certainly can find something there. On the upside flake8 & isort errors can be shown directly in the PR https://aws1.discourse-cdn.com/standard14/uploads/djangoproject/original/2X/f/f59e48cb2c85e16282f1de4e33461b03bcb2f6ec.png which imo is a massive gain. Granted as Tom suggested in the forum topic, we should step slowly and integrate what makes sense. FWIW I think even if everything works we might keep a scaled down jenkins instance for one or two things…

we have much more skipped tests 1212 vs. 1135 on Jenkins probably SQLite is compiled without the JSON1 etc.; it can be hard to control SQLite version, etc.

It is actually easier to control sqlite versions since we'd have our on docker containers. In Jenkins we have the problem with that in the worst case we need different slaves for different django projects we can just adjust our containers here. Also issues like these can be fixed by contributors and not just people with access to Jenkins. As for the skipped tests: Yes there is certainly work left to be done…

Please also consider the following: We currently do not test on Mac at all etc and we would get two different Windows & Mac versions on Github actions.

We obviously don't want to have a degraded experience. But I think in quite a few instances we can improve and in the others at least reach an equivalent experience.

apollo13 avatar Jan 06 '21 17:01 apollo13

I moved flake8 and isort to a separate PR, #13921.

felixxm avatar Jan 20 '21 08:01 felixxm

Sorry about the commit spam, I was experimenting with problem matchers. It didn't work 😭.

Is there anything else I should do to this PR? I've removed the linting, added timing information and made the test logging more verbose.

orf avatar Feb 02 '21 00:02 orf

Master appears to have several failures on MacOS that have crept in. I'm reproducing them locally now and I'll see what's up. However I'd like to ask again: what are the blockers to having this merged @felixxm? Please let me know so I can put some time aside to working through them. A non-trivial amount of our users develop on MacOS, and it's scary to see something as core as BasicExpressionTests failing in master.

Perhaps we could merge this with allow_failures set to True so we can assess the impact without blocking merges? I expect it would help with #12646

orf avatar Apr 04 '21 12:04 orf

Apparently the culprit is https://github.com/django/django/pull/13300 which fails on sqlite 3.35.3. I'm guessing this started failing in 3.35.0 (on homebrew from March 24th), which includes this potentially relevant changelog entry: Attempt to process EXISTS operators in the WHERE clause as if they were IN operators, in cases where this is a valid transformation and seems likely to improve performance..

I'll bisect this while adding support for different sqlite3 versions on MacOS to this branch, which seems useful.

orf avatar Apr 04 '21 13:04 orf

Looking at the thread https://github.com/django/django/pull/13840#issuecomment-755395988 isn't really addressed. I'll have a think about those comments whilst I'm looking at it.

carltongibson avatar Apr 06 '21 16:04 carltongibson

I've added support for multiple sqlite versions to the MacOS tests. You can do the same steps locally:

brew uninstall --ignore-dependencies sqlite
brew tap-new $USER/local-sqlite
brew extract sqlite --version=${{ matrix.sqlite }} $USER/local-sqlite
brew install sqlite@${{ matrix.sqlite }}
ln -s /usr/local/opt/sqlite@${{ matrix.sqlite }}/ /usr/local/opt/sqlite

With 3.34.0 then expressions.tests.BasicExpressionsTests passes, but with the latest (3.35.3) they fail.

If we can work out some steps to do something similar with Windows (using chocolatey?) then we can unify the two steps again.

Looking at the thread #13840 (comment) isn't really addressed. I'll have a think about those comments whilst I'm looking at it.

I was sort of waiting for the detailed comment 😅, but we've addressed isort/flake8 and the less-readable output. JSON1 is compiled in and it's actually far easier to control and change the sqlite3 version. I'll look at the differences in tests now.

orf avatar Apr 06 '21 22:04 orf

Not gonna lie, this looks like it's an issue with sqlite 😱. I've reduced it it down to this test setup:

CREATE TABLE "expressions_company"
(
    "id"                  integer          NOT NULL PRIMARY KEY AUTOINCREMENT,
    "name"                varchar(100)     NOT NULL,
    "num_employees"       integer unsigned NOT NULL CHECK ("num_employees" >= 0),
    "num_chairs"          integer unsigned NOT NULL CHECK ("num_chairs" >= 0),
    "ceo_id"              integer          NOT NULL REFERENCES "expressions_employee" ("id") DEFERRABLE INITIALLY DEFERRED,
    "point_of_contact_id" integer          NULL REFERENCES "expressions_employee" ("id") DEFERRABLE INITIALLY DEFERRED,
    "based_in_eu"         bool             NOT NULL
);
INSERT INTO "expressions_company"
VALUES (1, 'Example Inc.', 2300, 5, 1, NULL, 0);
INSERT INTO "expressions_company"
VALUES (2, 'Foobar Ltd.', 3, 4, 2, NULL, 1);
INSERT INTO "expressions_company"
VALUES (3, 'Test GmbH', 32, 1, 3, NULL, 0);
CREATE TABLE "expressions_employee"
(
    "id"        integer     NOT NULL PRIMARY KEY AUTOINCREMENT,
    "firstname" varchar(50) NOT NULL,
    "lastname"  varchar(50) NOT NULL,
    "salary"    integer     NULL
);
INSERT INTO "expressions_employee"
VALUES (1, 'Joe', 'Smith', 10);
INSERT INTO "expressions_employee"
VALUES (2, 'Frank', 'Meyer', 20);
INSERT INTO "expressions_employee"
VALUES (3, 'Max', 'Mustermann', 30);

UPDATE "expressions_company"
SET "name"                = 'Test GmbH',
    "num_employees"       = 32,
    "num_chairs"          = 1,
    "ceo_id"              = 3,
    "point_of_contact_id" = 3,
    "based_in_eu"         = 0
WHERE "expressions_company"."id" = 3;

The following query on sqlite 3.32.3 returns three results:

❯ sqlite3 test.db
SQLite version 3.32.3 2020-06-18 14:16:19
Enter ".help" for usage hints.
sqlite> SELECT "expressions_employee"."id",
   ...>        "expressions_employee"."firstname",
   ...>        "expressions_employee"."lastname",
   ...>        "expressions_employee"."salary"
   ...> FROM "expressions_employee"
   ...> WHERE (EXISTS(SELECT (1) AS "a"
   ...>               FROM "expressions_company" U0
   ...>               WHERE U0."ceo_id" = "expressions_employee"."id"
   ...>               LIMIT 1) OR EXISTS(SELECT (1) AS "a"
   ...>                                  FROM "expressions_company" U0
   ...>                                  WHERE U0."point_of_contact_id" = "expressions_employee"."id"
   ...>                                  LIMIT 1));
1|Joe|Smith|10
2|Frank|Meyer|20
3|Max|Mustermann|30

But on 3.35.3 it returns just one row:

❯ /usr/local/opt/sqlite/bin/sqlite3 test.db
SQLite version 3.35.3 2021-03-26 12:12:52
Enter ".help" for usage hints.
sqlite> SELECT "expressions_employee"."id",
       "expressions_employee"."firstname",
       "expressions_employee"."lastname",
       "expressions_employee"."salary"
FROM "expressions_employee"
WHERE (EXISTS(SELECT (1) AS "a"
              FROM "expressions_company" U0
              WHERE U0."ceo_id" = "expressions_employee"."id"
              LIMIT 1) OR EXISTS(SELECT (1) AS "a"
                                 FROM "expressions_company" U0
                                 WHERE U0."point_of_contact_id" = "expressions_employee"."id"
                                 LIMIT 1));
1|Joe|Smith|10
sqlite>

orf avatar Apr 06 '21 23:04 orf

I've posted a bug report to the sqlite forums. I think I might have done it twice. My god, that interface has got to be the worst I've seen in a long time.

https://sqlite.org/forum/forumpost/58006ebd08

Removing the redundant LIMIT 1 from the EXISTS subquery makes the query return the correct number of results.

This was added here: https://github.com/django/django/commit/c54b8ec2f5122ae13bcb4e1d00d8e8f6bcacf2cf, but I'm not sure it makes sense in a nested context. At least, that's what IntelliJ is telling me:

Screenshot 2021-04-07 at 00 44 42

Edit:

Seems like it's fixed in 3.35.4, which should be out in homebrew soon ™️ https://github.com/Homebrew/homebrew-core/pull/74485

orf avatar Apr 06 '21 23:04 orf

Seems like it's fixed in 3.35.4, which should be out in homebrew soon tm Homebrew/homebrew-core#74485

I can confirm that it's fixed in 3.35.4.

felixxm avatar Apr 07 '21 06:04 felixxm

I think we should pull out from this adding a single build of macOS, with libmemcached if you like

Agreed! I think we should keep the homebrew method for running a specific version in the script, but it would install the latest? Or we could make it use the system default.

I wonder if we could have the macOS job just run on main? Is there a Not on forks filter? 🤔

We can only trigger it on main or any merge request that has a ci-macos label applied? That might be more flexible. We can also exclude forks I believe.

I'm not really seeing the benefit of moving the Windows build just now.

The question I'd have is: why not move it? It's true that there is no gap in coverage like MacOS but the general act bringing our CI files in-repo has a bunch of other benefits (least of all being able to view build logs for more than 48 hours...). We can always run Windows tests on both for a time to see how it goes?

orf avatar Apr 08 '21 14:04 orf

I'm going to close due to inactivity. Happy if someone wants to pick this up again.

carltongibson avatar Dec 21 '22 10:12 carltongibson