ocaml icon indicating copy to clipboard operation
ocaml copied to clipboard

Simplifications and fixes to multicore systhreads implementation (3/3)

Open gadmm opened this issue 3 years ago • 3 comments

Follow-up to #11271, #11385 and #11393 (cc @Engil, @kayceesrk).

Targets 5.1 (does not seem to fix important bugs). Details in the individual commit logs. Can you please have a look at the commit that removes dead code, in case this was unintended and hides a bug? I also added a FIXME regarding a behaviour from old systhreads that is not yet implemented in multicore (see comments below).

This PR lives on top of #11393, so only the 5 topmost commits belong to it.

No change entry needed.

gadmm avatar Jul 01 '22 22:07 gadmm

I have rebased on top of #11393, so keep in mind that only the 5 topmost commits belong to this PR.

gadmm avatar Jul 03 '22 21:07 gadmm

I rebased on top of #11385 and added some new minor fixes and clarifications. It still only contains less important fixes targeting 5.1. For clarity I convert it as a draft, we should review it when earlier PRs have been merged.

I added some more FIXMEs that I think are worth looking into.

(of course a round of precheck is welcome)

gadmm avatar Jul 06 '22 17:07 gadmm

In case anyone wonders (I just did), the current "PR path" appears to be:

  • already merged:
    • #11271, which was then reverted
    • #11390
    • #11403
  • next to review:
    • #11385
  • further PRs:
    • #11386 (the present PR)

gasche avatar Jul 10 '22 14:07 gasche

This should be rebased and made ready for review. While adapting to the new way systhread is initialized on every thread, I stumbled into a rabbit hole regarding error handling inside Domain.spawn and found other bugs with signals. I am not yet sure how to go about it, I just wanted to post an update.

gadmm avatar Jan 27 '23 20:01 gadmm

I have rebased on top of trunk. This could do with a round of review. Instructions for reviewers:

  • This is best reviewed commit-per-commit, with the help of additional detail in each commit log.
  • If someone knows systhread well, please review that commit "Remove dead code" does not reveal a regression in functionality compared to OCaml 4 (code that became dead for wrong reasons).
  • A few FIXMEs are added, coming from my review of the OCaml 5 systhread implem. and are about possible regressions wrt. OCaml 4. I do not intend or expect to have to fix these FIXMEs in this PR.
    • Some FIXME concern possible changes/regressions wrt. the "best effort" to support fork on a single domain.
    • One FIXME is about the delay for joining the tick threads on shutdown (25ms on average). On OCaml 4 there was no wait. When reviewing the code, it appeared to me that a solution could be to interrupt the sleep system call in the tick thread using the self-pipe trick with pselect (edit: or simply using pthread_cond_timedwait).
  • One FIXME is added to avoid delaying this PR further: the FIXME in domain_thread_func is about proper error handling inside domain initialization. Domain initialization was moved from OCaml to C code at the end of 5.0 development, which has created new issues. Errors during domain initialization should be converted into exceptions; currently this probably crashes or deadlocks. As said in my previous comment, fixing this has revealed to be a bit of a rabbit hole, it is doable but is better done separately in order to make progress on this PR. However, reviewers are free to disagree and refuse the addition of a FIXME comment in the code (in which case we can discuss solutions here).

gadmm avatar Mar 06 '23 15:03 gadmm

Do we have a volunteer to review this PR? I have assigned myself as "person in charge of making sure we don't forget about it completely", but I would be happy to let someone else do the review.

gasche avatar Mar 06 '23 17:03 gasche

I'm starting to review this PR, but I have a couple of difficulties in the way this PR and others like #11307 are organised which is making it harder for me to review the PR(s).

My main difficulty is that the PRs suggest many improvements that seem independent but are organised as several commits, each of which has to be read one after the other. The details of the improvements are in the commit messages. This has several downsides that make it harder to review such PRs.

  • On Github, the discussions on several unrelated improvements get mixed together in the same PR thread which makes it harder to follow along.
  • If some of the early improvements are OK, but the follow-ups need changes, the whole PR gets blocked. As a reviewer, I am not able to checkpoint my approval on individual commits. I'm forced to read the entire commit chain in order to see how these interact with each other.
  • IINM, this is also different from how OCaml compiler development operates today, which is a PR for every improvement that's independent.

Personally, I really find it hard to review PRs commit by commit.

May I recommend @gadmm to consider having separate PRs for each improvement so that it makes reviewing easier? Small improvements will go much faster with a PR per commit. If they are inter-related, please mention that it builds on top of another PR. Also, possibly consider marking the PRs that are targets in a dependence chain as drafts so that reviewer time is not wasted on reviewing PRs which may change based on whether the source PRs require changes. Reviewer time is precious and limited, and it would help if the PRs are organised to be reviewer friendly.

kayceesrk avatar May 27 '23 03:05 kayceesrk

Except for this one comment on the second commit (https://github.com/ocaml/ocaml/pull/11386/commits/142beb885999e07d5273fb8958811e5c077b0e9b#r1207685036), the other changes look fine to me.

kayceesrk avatar May 27 '23 07:05 kayceesrk

@gadmm it looks like the present PR has been reviewed three weeks ago, and would be mergeable in trunk after you rebase and take review comments into account. Are you available to do this in the next few weeks? If not, would you prefer if a maintainer took care of the rebase?

gasche avatar Jun 15 '23 20:06 gasche

As I understand it, this is not scheduled for 5.1 so there is no rush.

gadmm avatar Jun 16 '23 16:06 gadmm

As I understand it, this is not scheduled for 5.1 so there is no rush.

It would be useful to get this towards a merge soon as the code has already gone out of sync with trunk and needs to be rebased. Merging it faster would mean that fixing conflicting changes would be the responsibility of folks who propose newer PRs ;-)

kayceesrk avatar Jun 27 '23 08:06 kayceesrk

Thanks for the gentle pressure, I am well aware and this has been at the top of my pile. As you can guess my own time is constrained, and I also received many solicitations from the OCaml github concurrently. I have been prioritizing 5.1. (Even though the release schedule is kept private for some odd reason, I have some sense that 5.1 is the priority at this time.)

Still on the topic of what we could do better to make sure that aging PRs end up getting merged, would monthly reminders be received positively? Would they help?

gadmm avatar Jun 27 '23 16:06 gadmm

The "fixup" comments are for reviewing my fixes. They will be squashed later.

gadmm avatar Jul 09 '23 15:07 gadmm

Except for this one comment on the second commit (https://github.com/ocaml/ocaml/commit/142beb885999e07d5273fb8958811e5c077b0e9b#r1207685036), the other changes look fine to me.

Force push lost my comment ?! :-( I don't know what my comment was, but I am assuming that it was fixed. If @gadmm can confirm, I think the PR is good to go.

kayceesrk avatar Jul 10 '23 11:07 kayceesrk

The comment is still there right above (https://github.com/ocaml/ocaml/pull/11386#discussion_r1207685036), along with my reply, but it looks like your pointer was invalidated. (I got the still-valid link by doing "copy link" directly from the PR discussion, which seems to give a different URL than yours.)

Let me know about the last point, and then I will clean-up the PR.

gadmm avatar Jul 17 '23 17:07 gadmm

Let me know about the last point, and then I will clean-up the PR.

LGTM. Please also add a changes entry. I'll aim to merge this right after that.

kayceesrk avatar Jul 18 '23 04:07 kayceesrk

Rebased & added a Changes entry (mostly to credit the work done by reviewers, as no change entry was needed for these minor changes & fixes IMO).

Edit: as already mentioned in a different discussion, the general advice on the organisation of this PR, which mentions various points that were already applied, and did not seem related to the delay in starting the review, ended up discussed in private.

gadmm avatar Jul 18 '23 16:07 gadmm