[Bug/Support]: elpaca<-create is potentially non-atomic causing some order to be processed twice

Open Alan-Chen99 opened this issue 10 months ago • 6 comments

Confirmation

  • [x] I have checked the documentation (README, Wiki, docstrings, etc)
  • [ ] I am checking these without reading them.
  • [x] I have searched previous issues to see if my question is a duplicate.

Elpaca Version

forked from 141b2f5

Operating System

ubuntu

Description

elpaca<-create calls elpaca-menu-functions which calls the async url-retrieve-synchronously. During that another package will potentially depend on the package and cause elpaca<-create to be called again for the same id.

This comes up after using the new lock file feature where some package does not search inside menus, so it is no longer the case that all the menus are fetched up front.

Alan-Chen99 avatar Feb 09 '25 18:02 Alan-Chen99

Thanks for the report.

> elpaca<-create calls elpaca-menu-functions which calls the async url-retrieve-synchronously.

Typo? url-retrieve-synchronously is synchronous.

> During that another package will potentially depend on the package and cause elpaca<-create to be called again for the same id.
>
> This comes up after using the new lock file feature where some package does not search inside menus, so it is no longer the case that all the menus are fetched up front.

Can you provide a test using the elpaca-test macro? It may help me understand the issue more clearly.

The way the lock file is intended to work is that the recipes for all init packages and their dependencies are written. Then they are used as the first menu. This should prevent other menus from being checked during init altogether.

progfolio avatar Feb 09 '25 18:02 progfolio

Unfortunately, the only logs I have are at https://github.com/Alan-Chen99/dotfiles3/actions/runs/13228588924/job/36922616492#logs. This doesn't reproduce locally; I'm not sure why. In the logs, compat failed with "Unable to find main elisp file for \"compat\"", which the stack trace shows occurred during the invocation of elpaca<-create(compat). I'm not sure whether the "Unable to find main elisp file" error is caused by this; it might be due to another problem. Eventually this leads to dependents of compat not being failed properly. I'm not sure why, though.

Alan-Chen99 avatar Feb 09 '25 18:02 Alan-Chen99

It's hard to tell exactly what's going on based off of those logs alone. It looks like you've introduced some advice in the system, which I have no way to evaluate. If you're able to find a reliable way to reproduce the issue, feel free to comment here and we can look into it more.

progfolio avatar Feb 09 '25 21:02 progfolio

I've found a reliable way to reproduce the issue. This will fail for Emacs 29 and below:

(elpaca-test
  :interactive t
  :init
  (elpaca transient)
  (elpaca magit))

Your initial analysis seems probable. Compat is queued twice.

> > elpaca<-create calls elpaca-menu-functions which calls the async url-retrieve-synchronously.
>
> Typo? url-retrieve-synchronously is synchronous.

After digging into the source of url-retrieve-synchronously, I see it calls accept-process-output, which will allow subprocesses to run. I think that may have something to do with it, but I still don't see how the race could occur, considering queuing the packages all takes place on the main elisp thread.

Adding an explicit declaration for compat works around the issue:

(elpaca-test
  :interactive t
  :init
  (elpaca compat)
  (elpaca transient)
  (elpaca magit))

And the issue doesn't occur when the menu caches are present. I'll have to dig into this more.

progfolio avatar Feb 23 '25 20:02 progfolio

> After digging into the source of url-retrieve-synchronously, I see it calls accept-process-output, which will allow subprocesses to run. I think that may have something to do with it, but I still don't see how the race could occur, considering queuing the packages all takes place on the main elisp thread.

I believe that during accept-process-output, other process filters get run, which can cause a package to advance to its next step and queue its dependencies.

Alan-Chen99 avatar Feb 26 '25 04:02 Alan-Chen99

Related: #428

progfolio avatar Mar 15 '25 15:03 progfolio