pygit2
pygit2 copied to clipboard
Bugs in python backend
I am creating an issue rather than a PR because I have not been able to create a working Odb or Refdb backend, but I have found several bugs. Please let me know if I am completely misunderstanding this code.
I believe the issues I cannot get past come from threading in libgit2. I have found that, at times, calling threading.get_ident
will cause a segfault. I have tried wrapping all C->python code in PyGILState_Ensure
and PyGILState_Release
, but this did not work.
I have found the following possible errors:
- https://github.com/libgit2/pygit2/blob/e012bb5bc9e0bdb68ac79f9c12e3f59e3513328a/src/odb_backend.c#L244 Calls
exists_prefix
notrefresh
. - https://github.com/libgit2/pygit2/blob/e012bb5bc9e0bdb68ac79f9c12e3f59e3513328a/src/refdb_backend.c#L134
glob
may be null - https://github.com/libgit2/pygit2/blob/e012bb5bc9e0bdb68ac79f9c12e3f59e3513328a/src/refdb_backend.c#L134
git_reference* _ref
will double free - https://github.com/libgit2/pygit2/blob/e012bb5bc9e0bdb68ac79f9c12e3f59e3513328a/src/refdb_backend.c#L409
PyIter_Check
should be a check for__iter__
not if the object is currently an iterator.
In my experience, problems like this mean that I am misusing something. Let me know if these appear to be real bugs and I will see if I can at least create test cases to expose them. I don't seem to have a good grasp on the threading segfaults though.
I think you're right about at least some of the bugs, ping @ddevault since he's the author of this code. For the 3rd one I think the link is not correct, should not be https://github.com/libgit2/pygit2/blob/e012bb5bc9e0bdb68ac79f9c12e3f59e3513328a/src/refdb_backend.c#L205 ?
Will give a look at the threading issue.
I think you would be completely bonkers to introduce multithreading into a Python program, let alone a Python program with C modules.
Can you turn your linewise feedback into a patch, for easier commentary and a better path towards merge?
Threads are common enough in python, especially in web. Either way, my python program does not introduce threads (yet, it is web). I am not familiar with libgit2 internals or python c modules but for the reasons I mentioned earlier, my thought is that threading in libgit is not playing well with python.
Thanks for getting back to me. I will clean up my code create a branch and we can continue.
No, multiprocessing is common in Python, not multithreading. If you invite the Kraken into your program, I'm not going to help you make it comfortable :)
@ddevault threading is very common in Python web-frameworks. All of the synchronous frameworks (e.g. CherryPy, Django, Flask) run HTTP handlers (and other things really) in threads, only asynchronous frameworks tend to work with one main thread because of the nature of the async paradigm (IO loop runs in one thread) but they still invite using threads for CPU-bound tasks (because these would introduce long blocking delays blocking the IO loop and making the app freeze).
So no, I'd not say that it's unreasonable to believe that pygit2 would be used in a multithreaded environment. Especially if it's a web-service. In my case, for example, I'm making bots in a form of GitHub Apps, which basically means that I run a web-server where GitHub sends info about the events happening on the platform and since many would be Git-related, guess what, I need something to work with Git and that would be pygit2.
I maintain that multithreading is supremely stupid and if you want to use it, you'll have to write the patches yourself. I'm not interested in supporting your use-case.
I maintain that multithreading is supremely stupid and if you want to use it, you'll have to write the patches yourself. I'm not interested in supporting your use-case.
Relax, I wouldn't ask such a selfishly behaving person to do this anyway. I just wanted to clarify that the case is legit.