gitea icon indicating copy to clipboard operation
gitea copied to clipboard

Pure Go SQLite

Open jolheiser opened this issue 3 years ago • 3 comments

This PR is an attempt at reviving #15002

jolheiser avatar Aug 02 '22 03:08 jolheiser

How much is the binary size effected?

On my machine it looks like it increased ~2mb

EDIT: When I built the "old" version I forgot to use the sqlite tags, I've updated the change (only a ~1mb difference)

jolheiser avatar Aug 02 '22 04:08 jolheiser

This PR is an attempt at reviving #15002

Thanks @jolheiser, I closed mine.

lunny avatar Aug 02 '22 06:08 lunny

How about the performance for the new versions of sqlite compare with cgo sqlite.

lunny avatar Aug 02 '22 06:08 lunny

Looks like sqlite doesn't support lock mechanism of c sqlite. We may need to post issues on upstream.

lunny avatar Aug 09 '22 03:08 lunny

Looks like sqlite doesn't support lock mechanism of c sqlite. We may need to post issues on upstream.

Not sure, this is just how Go works. They are listening for when the provided ctx is cancelled, however in the meantime we are just changing the data of that context pointer. Solution is simple, avoid changing the context.

Gusted avatar Aug 09 '22 03:08 Gusted

I'll have to take a look at why we need to set those values/where they are being used. Unfortunately Go itself doesn't really allow setting context values without the whole ctx = context.WithValue(...) dance afaik

jolheiser avatar Aug 09 '22 04:08 jolheiser

I'll have to take a look at why we need to set those values/where they are being used.

Because context is the only "variable" being shared across the publicKeyHandler and sessionHandler. Maybe giving a copy of the context(context.WithCancel) to the database function in publicKeyHandler would avoid the data race? I'm not even sure why sqlite is still waiting for a interrupt when the database function already returned the result, seems a bug on it's own.

Gusted avatar Aug 09 '22 04:08 Gusted

Just to note this data race is not just in testing but also for actual workflow.

So yeah giving a duplicated context and then cancel it after the function does resolve the data race:

diff --git a/modules/ssh/ssh.go b/modules/ssh/ssh.go
index 8a280fb1b..31084870e 100644
--- a/modules/ssh/ssh.go
+++ b/modules/ssh/ssh.go
@@ -181,7 +181,11 @@ func publicKeyHandler(ctx ssh.Context, key ssh.PublicKey) bool {
                // look for the exact principal
        principalLoop:
                for _, principal := range cert.ValidPrincipals {
-                       pkey, err := asymkey_model.SearchPublicKeyByContentExact(ctx, principal)
+                       // Copy the context and give it a new done channel. This ensures that the read for context
+                       // in sqlite3 is closed after we got the results and avoids a data race when we change the context.
+                       dupCtx, cancel := context.WithCancel(ctx)
+                       pkey, err := asymkey_model.SearchPublicKeyByContentExact(dupCtx, principal)
+                       cancel()
                        if err != nil {
                                if asymkey_model.IsErrKeyNotExist(err) {
                                        log.Debug("Principal Rejected: %s Unknown Principal: %s", ctx.RemoteAddr(), principal)
@@ -242,7 +246,11 @@ func publicKeyHandler(ctx ssh.Context, key ssh.PublicKey) bool {
                log.Debug("Handle Public Key: %s Fingerprint: %s is not a certificate", ctx.RemoteAddr(), gossh.FingerprintSHA256(key))
        }
 
-       pkey, err := asymkey_model.SearchPublicKeyByContent(ctx, strings.TrimSpace(string(gossh.MarshalAuthorizedKey(key))))
+       // Copy the context and give it a new done channel. This ensures that the read for context
+       // in sqlite3 is closed after we got the results and avoids a data race when we change the context.
+       dupCtx, cancel := context.WithCancel(ctx)
+       pkey, err := asymkey_model.SearchPublicKeyByContent(dupCtx, strings.TrimSpace(string(gossh.MarshalAuthorizedKey(key))))
+       cancel()
        if err != nil {
                if asymkey_model.IsErrKeyNotExist(err) {
                        if log.IsWarn() {

Gusted avatar Aug 11 '22 04:08 Gusted

Okay we're getting somewhere, so relevant issues:

--- FAIL: TestDumpDatabase (2.08s)

    engine_test.go:33: 

        	Error Trace:	engine_test.go:33

        	Error:      	Received unexpected error:

        	            	unsupported database type sqlite

        	Test:       	TestDumpDatabase

FAIL

coverage: 7.4% of statements

Seems like xorm doesn't cover the sqlite database type?

    --- FAIL: TestResourceIndex/issue_19 (0.23s)

        issue_test.go:392: 

            	Error Trace:	issue_test.go:392

            	Error:      	Received unexpected error:

            	            	generate issue index failed: database table is locked (262)

            	Test:       	TestResourceIndex/issue_19

Seems like busy_timeout needs to be changed(maybe 3d10000?)

Gusted avatar Aug 18 '22 03:08 Gusted

Related: https://github.com/go-gitea/gitea/issues/24658

silverwind avatar May 11 '23 08:05 silverwind

And put my question here:

Actually, I have somewhat objection for using that CCGO based sqlite package in Gitea

Are we willing to take the risk of a complex essential core library getting unmaintained(some unfixable bugs)?

Although SQLite is popular, but the pure-go solution isn't (actually, most people do not need the pure-go solution).


I would still question whether it's worth to use CCGO in Gitea. What if there is an unfixable bug?

  • If we use a 3rd go package, even if there is a bug, we can fork and fix.
  • But if there is a bug in CCGO related package, how to fix it? I don't have the time to patch the CCGO compiler ....

If CCGO pakcage has been proven to be stable enough, then I have no objection.

wxiaoguang avatar May 11 '23 08:05 wxiaoguang

And put my question here:

Actually, I have somewhat objection for using that CCGO based sqlite package in Gitea

Are we willing to take the risk of a complex essential core library getting unmaintained(some unfixable bugs)?

Although SQLite is popular, but the pure-go solution isn't (actually, most people do not need the pure-go solution).

I would still question whether it's worth to use CCGO in Gitea. What if there is an unfixable bug?

  • If we use a 3rd go package, even if there is a bug, we can fork and fix.
  • But if there is a bug in CCGO related package, how to fix it? I don't have the time to patch the CCGO compiler ....

If CCGO pakcage has been proven to be stable enough, then I have no objection.

As far as I'm aware, there is a bun module to automatically select between the modernc.org/sqlite if CGo is disabled, and mattn/go-sqlite if CGo is enabled. I believe this would be the best option for everyone. This way, users may use whichever they prefer: the pure Go version, easier to compile and with better cross-platform support, or the native C version, harder to compile, but offering bindings to the original version.

As a side note: the modernc.org/sqlite package is used in production, and passes the SQLite3 testing suites. It is used by many applications, specifically using the above bun ORM module. Note that it is also sponsored by a German company as specified in their README; meaning, they have interests to keep the project maintained.

I would personally enjoy having the pure-Go version, as it would be a step towards being able to go build Gitea instead of using the Makefile (at least for the backend); which would make life easier for everyone; maintainers, contributors, packagers or users.

lnchan avatar Jul 26 '23 18:07 lnchan

I would personally enjoy having the pure-Go version, as it would be a step towards being able to go build Gitea instead of using the Makefile (at least for the backend); which would make life easier for everyone; maintainers, contributors, packagers or users.

IMO it's not related. With current CGO sqlite, go build also works well. The Makefile is never a must. The only benefit of the pure-go sqlite is it drops the C compiler dependency.

wxiaoguang avatar Jul 26 '23 18:07 wxiaoguang

I would personally enjoy having the pure-Go version, as it would be a step towards being able to go build Gitea instead of using the Makefile (at least for the backend); which would make life easier for everyone; maintainers, contributors, packagers or users.

IMO it's not related. With current CGO sqlite, go build also works well. The Makefile is never a must. The only benefit of the pure-go sqlite is it drops the C compiler dependency.

I thought there was still a dependency on an old-fashioned way of embedding files in Go binaries, but I may have read incorrectly or be wrong. Dropping the C compiler dependency would be a definite benefit for many users, but not everyone wants that (for diverse reasons), which is why letting the user decide would be, in my opinion, the best option.

lnchan avatar Jul 26 '23 18:07 lnchan

Yes, Dropping CGO would considerably simply our build because we can use the native go compiler, eliminating the xgo dependency. I would definitely prefer it.

silverwind avatar Jul 26 '23 18:07 silverwind