Remove lem-legit:*vcs* by making users take it as a function parameter
Emacs makes extensive use of dynamic bindings/scope: that's not a reason we should do the same. Common Lisp supports lexical scoping, unlike Elisp when it started out, and in my opinion we should really, really, try to avoid dynamic binds where a function parameter would do.
Yes, this adds the need to plumb an extra parameter around, but that's well worth the cost: we can always package/split function parameters as needed and as they make sense, whereas dynamic bindings may prove to be a huge pain point in the future if Lem starts making use of paralellism.
The Emacs documentation itself puts it succinctly [1]
Lexical binding opens up many more opportunities for optimization, so programs using it are likely to run faster in future Emacs versions. Lexical binding is also more compatible with concurrency,
Now, focusing on Legit itself: much of the in lem/porcelain uses the pattern
(defun some-legit-function (vcs)
(case vcs
(:git (git-fn))
(:hg (hg-fn))
(:t (fallthrough))))
There's no reason to have all of this dispatch logic done manually: this is a textbook case for using generic functions. Now, the thing is that generic functions need something to dispatch on. And it's pretty clear that the VCS parameter (representing, really, the repository [2]) is the perfect thing for that.
By switching all the legit-functions to be generic functions, the implementation for each VCS can also define their vcs objects to contain whatever data they might find useful: whether that's some sort of legit-side cache for commits, FFI handles (will be useful for libgit2 support for instance), etc.
[1] https://www.gnu.org/software/emacs/manual/html_node/elisp/Lexical-Binding.html [2] Might be worth a rename, but I don't want to clutter this particular change.
I do plan to actually do the subsequent refactoring that I mention in the changelog- but that might take a bit, and in the meantime I don't want to let this change become outdated- so requesting a merge now
Hopefully I'm not digging myself into a bigger and bigger hole here: pushed up some of that refactoring I was talking about.
The two other things I'm thinking of doing for this PR:
- Move all the implementations of the generic functions into separate files, ie make a
porcelain-git.lisp,porcelain-hg.lisp, etc - Remove the one-interactive-rebase-per-process limitation (since we can now shove whatever data we want into
vcs-git:)
EDIT: Both done
Fixed some linter errors
There's no reason to have all of this dispatch logic done manually: this is a textbook case for using generic functions. Now, the thing is that generic functions need something to dispatch on. And it's pretty clear that the VCS parameter (representing, really, the repository [2]) is the perfect thing for that.
I made choices to avoid this so far,
but there's much more in your PR so I'll review calmly in the next couple days, thanks.
Move all the implementations of the generic functions into separate files, ie make a porcelain-git.lisp, porcelain-hg.lisp, etc
I fear we now loose track of what's implemented for which system, but I'll read carefully.
I have more feedback for the review to add, mostly style and a couple things you are un-doing,
but let's speak about the important question: what is this fixing? (going with generic functions)
whereas dynamic bindings may prove to be a huge pain point in the future if Lem starts making use of paralellism.
given this https://github.com/lem-project/lem/pull/1443#issuecomment-2242669955 it looks like you started this development with a false premise?
from the SBCL manual:
bindings (e.g. using LET) are local to the thread
so the pattern I envision is:
(start-thread
(with-current-project ()
;; this does:
(let ((*vcs (get-current-vcs)))
(do-stuff)))
Does something need fixing right now?
Is legit's design really bad for multi-threading? (if so please ELI5 : ) )
still related, about the case… vcs pattern:
There's no reason to have all of this dispatch logic done manually: this is a textbook case for using generic functions. Now, the thing is that generic functions need something to dispatch on.
yes, but because we can doesn't imply we should, and I have been actively avoiding to go this way so far. Writing these "case" is no big deal. We could also write a simple macro to save a few lines, but that isn't necessary. The reasons I avoided the GF so far are:
- the current way with plain functions is easier and faster to work with on the REPL during development: you can quickly call
(git-commits)and friends.- if you call the "commits" general function that accepts
*vcs*as argument, you can still use "with-current-project" or bind the global*vcs*for development or use alet.
- if you call the "commits" general function that accepts
- the
porcelaincode base is lighter, has less ceremony and I like it. - each vcs function can have a different set of key arguments. Now each method that implements the generic function must have the exact same lambda list. We could use
&allow-other-keys, at the risk of loosing track of the specific arguments.
So I know that going with GF seems natural, and using the manual case looks like a poor-man's object, so maybe it's the way to go, but as we are experimenting with this library maybe not.
And this change is not fixing anything, AFAICT.
By switching all the legit-functions to be generic functions, the implementation for each VCS can also define their vcs objects
Using objects here is great, so we can fix the limitation of only 1 rebase PID per Lem process, but we don't need GF for that, we can as well carry on with a lexical *vcs* which is an object, can't we?
It was a lot of changes and some unrelated and some cluttered the others :S
given this #1443 (comment) it looks like you started this development with a false premise?
Yes, that's unfortunately true.. However, I still think that generic functions are appropriate: let me try and make the case.
There's no reason to have all of this dispatch logic done manually: this is a textbook case for using generic functions. Now, the thing is that generic functions need something to dispatch on.
yes, but because we can doesn't imply we should, and I have been actively avoiding to go this way so far. Writing these "case" is no big deal. We could also write a simple macro to save a few lines, but that isn't necessary. The reasons I avoided the GF so far are:
the current way with plain functions is easier and faster to work with on the REPL during development: you can quickly call
(git-commits)and friends.
- if you call the "commits" general function that accepts
*vcs*as argument, you can still use "with-current-project" or bind the global*vcs*for development or use alet.
Admittedly adding a parameter makes the porcelain functions slightly harder to call from the REPL: but I think you're overblowing this a bit. You can still totally bind whatever variables you want during development and use them as arguments: (commits *vcs*) is really not all that different from (git-commits).
Furthermore, the change has some advantages for development too. It's now easier to work with multiple different repositories in a single REPL session, for instance
(defparameter *git-vcs* ...)
(defparameter *hg-vcs* ...)
;; Current model - view commits in different repos
(let ((*vcs* *git-vcs*)) (commits))
(let ((*vcs* *hg-vcs*)) (commits))
;; New model
(commits *git-vcs*)
(commits *hg-vcs*)
These porcelain functions need a vcs argument, there's no getting around that. I'd rather pass the argument by, well, passing the argument, than by binding some special variable. There are contexts in which the latter strategy makes sense (user-customizable values for instance), but IMO this is not one of them.
Not to mention with explicit arguments, sblint and compilation functions will detect when a function is called with an incorrect number of arguments. No such guarantees with pass-by-dynamic-let.
- the
porcelaincode base is lighter, has less ceremony and I like it.
Disagree. I'm not sure what you mean by "lighter" (realistically, any performance differences here are a wash), but all this manual dispatch seems like an awful lot of ceremony to me.
- each vcs function can have a different set of key arguments. Now each method that implements the generic function must have the exact same lambda list. We could use
&allow-other-keys, at the risk of loosing track of the specific arguments.
False advantage IMO. We're defining an interface, whether we write it via generic functions or via the case *vcs* pattern, and the various porcelains should have the same signatures for these functions.
If the "generic" porcelain functions pass extra parameters to a given method then these parameters are generated from vcs-specific dynamic variables anyways, and the specific methods can just do that for themselves. If a given method takes less parameters than the equivalent "generic" then (a) that's probably a sign that the signature is wrong but (b) it's just as easy to (declare (ignore ...))
So I know that going with GF seems natural, and using the manual
caselooks like a poor-man's object, so maybe it's the way to go, but as we are experimenting with this library maybe not. And this change is not fixing anything, AFAICT.By switching all the legit-functions to be generic functions, the implementation for each VCS can also define their vcs objects
Using objects here is great, so we can fix the limitation of only 1 rebase PID per Lem process, but we don't need GF for that, we can as well carry on with a lexical
*vcs*which is an object, can't we?
The PR is a big refactor, yes. But I think it comes with advantages, and I meant the rebase-limit removal to illustrate them: I should have been explicit about that.
Let's ask the question, what would removing the rebase limit under the current model actually entail? At the moment *rebase-pid* is sitting around in porcelain.lisp for all to see, even the hg and fossil functions (a strike in and of itself imo :). But let's ignore that. Basically the problem is to map *vcs*s to PIDs, so... we could just use a hash-table, after all. Except hash-table isn't thread safe! So now we need to deal with locking. And now all of our threads serialize on one object. And when do we delete objects from the map, to prevent resource leaks? And so on and so forth.
A natural solution is to, as you point out, change what *vcs* is so that it's not just :git or :hg or some other symbol... But now all of the dispatch code needs to be updated. That's pretty annoying, especially for a change which only concerns the git internals. We could make vcs a plist I guess, and each "method" can pull whatever it wants out of it... but now we're really reinventing CLOS.
But with the approach I've taken, well, you can see how easy things become. That's a win, and it'll continue to pay dividends as VCS change in the future.
Plus, we get better encapsulation from my PR. The git/hg/fossil code live in their own files along with whatever specific state they care to define, and there's no possibility of hg code accidentally using a variable defined for git or whatever. When the time comes to add libgit2 support it'll basically come down to adding a single file instead of having to go through and update everything in porcelain. The approach is cleaner, and in my eyes that counts for a lot.
It was a lot of changes and some unrelated and some cluttered the others :S
Now this I completely agree with: thanks for wading through :)
anlsh wants to merge 34 commits from anlsh:legit-libgit2
your branch is badly named.
Also is it rebased onto main? It would help for reviewing it locally.
Admittedly adding a parameter makes the porcelain functions slightly harder to call from the REPL: but I think you're overblowing this a bit. You can still totally bind whatever variables you want during development and use them as arguments: (commits vcs) is really not all that different from (git-commits).
yeah I kinda don't want to agree but kinda agree. I call my functions with shortcuts (C-c C-y), I don't type at the REPL, but I would find quick ways.
Furthermore, the change has some advantages for development too. It's now easier to work with multiple different repositories in a single REPL session, for instance
mmh ok a bit. However, to work with multiple different repositories (projects) at the REPL, we currently need to ,cd into the project root. If we had a project object with its root saved in a slot, this may be easier. I'd prefer we study this change in another, isolated PR ; )
Basically the problem is to map vcss to PIDs, so... we could just use a hash-table, after all. Except hash-table isn't thread safe! So now we need to deal with locking. And now all of our threads serialize on one object. And when do we delete objects from the map, to prevent resource leaks? And so on and so forth.
hash-tables might be sufficient, but this is a good case and a natural choice for objects.
We can talk about my reviews before you start if you want.
I'm starting to not find real arguments to reject this but I'll think about it more ; )
@vindarel thanks for the review- I went through and addressed some of your comments (and rebased onto main, which is why I force-pushed).
Though it was a little unclear to me which things you considered "important" / had "real" thoughts on. Do let me know of the more substantive things you still want to discuss
anlsh wants to merge 34 commits from anlsh:legit-libgit2
your branch is badly named
bad branch name, I think it is problematic to merge it and have this false "libgit2" mention in the commits log.
hey @anlsh I believe we are not far from merging this, do you think you could give it another look soon-ish? We are holding back in touching legit to avoid conflicts.
Hey, sorry about that- I had tried earlier, but ran into some build issues with lem that made me just throw up my hands. Since you're waiting on this, I'll get to it in the next day or two
@vindarel Sorry for the delay. I've gone through and resolved all of the open items. Changing the branch name will basically involve deleting this PR and creating a new one though. Assuming we merge, maybe you can just manually edit the merge message to note/correct the inaccuracy?
A rebase interactive fails with:
12: (ERROR LEM/PORCELAIN:PORCELAIN-ERROR :MESSAGE "lem/porcelain:rebase-interactively not implemented for vcs #<VCS-GIT {100432DD73}>")
a quick look didn't give me the answer.
In addition, the error reporting macro needs this:
(defun call-with-porcelain-error (function)
(handler-bind ((lem/porcelain:porcelain-error
(lambda (c)
(lem:editor-error (slot-value c 'lem/porcelain::message))))) ;; <---- full name for 'message
(funcall function)))
Thanks for the correction on the error macro. The rebase-interactively thing turned out to be that I wasn't exporting the symbol from lem/porcelain: so the corresponding defmethods were just creating new generic functions.
I can't really verify that it's totally working, because rebase-interactively seems broken at main atm (f509ebd1). But with the fixes, legit-libgit2 now appears to be broken in the same way.
Anyways, defmethods behavior is really annoying. My vote is to shadow the symbol in a lem-cl package or something and have it defined as
(defmacro defmethod (...)
(if (fboundp ...)
(cl:defmethod ...)
(error "Wtf defgeneric it please")))
But that's a whole 'nother discussion. No reason to litigate that here and now :)
I can go further but now, when I want to confirm the rebase with C-c C-c:
The slot LEM/PORCELAIN-GIT::REBASE-PID is unbound in the object #<VCS-GIT>
so the PID wasn't set?
with-porcelain-error is now fixed in main (same as "legit: Properly package-qualify a symbol in legit-common) and the interactive rebase is confirmed to work in main.
@anlsh I am now pessimistic, I assume the PR was too big to start with. If we don't merge this then we shouldn't touch the legit code to avoid merge conflicts, which would only delay this PR. The situation is not great.
@anlsh The PID is still found correctly (we can see output on the terminal with log:error)
(if (uiop:process-alive-p process)
(let* ((output (read-line (uiop:process-info-output process)))
(pidtxt (str:trim (second (str:split ":" output)))))
(log:error "--------- pid? " pidtxt)
=> it's set
when hitting C-c C-c, we call rebase-continue:
(define-command rebase-continue () ()
(with-current-project (vcs)
(log:error "---- continuing rebase with vcs " vcs)
(run-function (lambda () (lem/porcelain:rebase-continue vcs)))
(when (get-buffer "git-rebase-todo")
(kill-buffer "git-rebase-todo"))))
and here, the PID is NIL.
Doesn't with-create-project create a new VCS object, with PID initialized to NIL?
it works with a root-> vcs object mapping as in
(defvar *git-projects-mapping* (make-hash-table :test #'equal)
"Map a VCS project root to its VCS object.
We need a single VCS object to retain data, such as the ongoing rebase PID.")
(defun git-project-p ()
"When we find a .git/ directory in the current directory (which should be the project root. Use `lem/porcelain:with-current-project`),
return a VCS object representing the repo: otherwise, return nil"
(when (uiop:directory-exists-p ".git")
(let* ((root (first (uiop:directory* "./")))
(existing (gethash root *git-projects-mapping*)))
(if existing
existing
(setf (gethash root *git-projects-mapping*)
(make-instance 'vcs-git))))))
(edit) I have more modifications to make but were we not trying to avoid global vars to start with?
Some global state (basically, what you have there) is unavoidable. The VCS object still provides an advantage in that it eliminates the need for multiple global state objects: like one map for git rebase PIDs, another for hg artifacts, etc.
But anyways @vindarel, if you feel that this PR is too big, or if you think it's a solution in search of a problem, then feel free to mothball it. Ultimately you're the maintainer here, and I probably won't end up contributing that much to lem. Do whatever you feel is best.
bug found: when changing the VCS preference order, rebase-in-progress signals an error when it shouldn't.
It's OK on main:
(case *vcs*
(:git
(when (uiop:directory-exists-p ".git/rebase-merge/")
(let ((head (str:trim (str:from-file ".git/rebase-merge/head-name")))
(onto (str:trim (str:from-file ".git/rebase-merge/onto"))))
(list :status t
:head-name head
:head-short-name (or (third (str:split "/" head))
head)
:onto onto
:onto-short-commit (str:shorten 8 onto :ellipsis "")))))
(t
(log:info "rebase not available for ~a" *vcs*)
(values)))
I'm on it.
Some global state (basically, what you have there) is unavoidable
yes of course, I'm repeating myself, my bad. I was just surprised that the interactive rebase was broken.
closing in favor of https://github.com/lem-project/lem/pull/1538
It was tedious because of the lack of testing, but the result is good. Thanks for doing it @anlsh and for answering my doubts.
It would be great if you worked on an optional and minimally impacting ;) libgit2 interface as you started.