surrealdb.go icon indicating copy to clipboard operation
surrealdb.go copied to clipboard

Serialization/Performance

Open iDevelopThings opened this issue 3 years ago • 8 comments

So to start, I was digging through the code, and our issue mainly stems from the fact that we need to decode the main response, to get the RPC ID for handling the callback

I was playing with this jsonparser package earlier today with something else, then I realised we might be able to use it here.

I did a simple benchmark, and the performance just to get the ID is much faster CleanShot 2022-09-25 at 17 09 34

My idea:

We can use the jsonparser, to get the id, and then we use a struct like:

type RPCRawResponse struct {
	ID       string
	Data    []byte
	Decoded *RPCResponse
}

This "response" can be passed down the channels and such to the end function and only decoded once.

On top of this, we can build a top level "QueryResolver" which uses a generic, and is an opt-in function/struct, which handles context, decoding, checking for errors etc This will take the byte version and pull information out in small pieces, we can also decode once to the final desired output struct from this, rather than Unmarshal -> get id -> Marshal -> Unmarshal -> check for errors and potentially more...

How do we feel about this? I hacked together a crappy version of this "QueryResolver" setup in my project already, but i didn't optimise or care much for code quality as i was just playing, but it's semi elegant to use CleanShot 2022-09-25 at 17 17 46

iDevelopThings avatar Sep 25 '22 16:09 iDevelopThings

The only issue I have with this approach is generics. I personally like the generic approach and previously had my own fork of the driver that used generics, but I'm not sure it is really the best idea for backwards compatibility. The question of what Go version to use has was raised to Tobie a few hours ago in #17, but I would imagine he'll want the driver to be as compatible as possible to allow for easier adoption in existing projects. If he okays using newer Go versions, I think it generics should 100% be used.

I've also been working on a solution to this the past couple of days and have my own solution to the issue pretty much finished. I chose to switch everything over to []byte arrays and mimic Mongo since that was brought up in #18. I also just used the native json package but unmarshalled to json.RawMessage. I'm not too sure of the performance implications of using json.RawMessage, but from a few quick searches it seemed to be relatively okay. I'm not sure if the performance would be any better or worse than what you have it you converted it away from generics.

garrison-henkle avatar Sep 25 '22 17:09 garrison-henkle

Yeah i agree, I think there's a way to do it though, I see in some packages, something like this: CleanShot 2022-09-25 at 18 15 02

So maybe there's a way we can only expose this generic resolver, for go 18+

I hacked together my solution after posting this, I'm going to make a WIP PR, we'll see how people are feeling

iDevelopThings avatar Sep 25 '22 17:09 iDevelopThings

Oh nice, I had no idea build flags like that existed. We should 100% have generics then and just offer other functions for older versions of Go. If you get the generic part to a good state, I would be happy to adjust some of the code I have for the older Go functions.

garrison-henkle avatar Sep 25 '22 17:09 garrison-henkle

Yeah i was doing some digging into them now, I never thought much of it in the past, but seems simple anyway!

https://stackoverflow.com/a/38439941/15727015

So we could use your api for < go 18 and anyone >= 18 would have the option to also use the generic version :D

iDevelopThings avatar Sep 25 '22 17:09 iDevelopThings

json.RawMessage is quite a bit more performant than unmarshalling to a *any, but way less than using jsonparser for just getting the id. I think the Decoded field is not really necessary though, and we should probably hide the implementation details from the users

Keitio avatar Sep 25 '22 17:09 Keitio

So to start, I was digging through the code, and our issue mainly stems from the fact that we need to decode the main response, to get the RPC ID for handling the callback

I was playing with this jsonparser package earlier today with something else, then I realised we might be able to use it here.

I did a simple benchmark, and the performance just to get the ID is much faster CleanShot 2022-09-25 at 17 09 34

My idea:

We can use the jsonparser, to get the id, and then we use a struct like:

type RPCRawResponse struct {
	ID       string
	Data    []byte
	Decoded *RPCResponse
}

This "response" can be passed down the channels and such to the end function and only decoded once.

On top of this, we can build a top level "QueryResolver" which uses a generic, and is an opt-in function/struct, which handles context, decoding, checking for errors etc This will take the byte version and pull information out in small pieces, we can also decode once to the final desired output struct from this, rather than Unmarshal -> get id -> Marshal -> Unmarshal -> check for errors and potentially more...

How do we feel about this? I hacked together a crappy version of this "QueryResolver" setup in my project already, but i didn't optimise or care much for code quality as i was just playing, but it's semi elegant to use CleanShot 2022-09-25 at 17 17 46

Im interested in your solution where you use db.Query[T] to get the results, can you please share your solution? Thank

tomasweigenast avatar Dec 03 '22 23:12 tomasweigenast

So to start, I was digging through the code, and our issue mainly stems from the fact that we need to decode the main response, to get the RPC ID for handling the callback I was playing with this jsonparser package earlier today with something else, then I realised we might be able to use it here. I did a simple benchmark, and the performance just to get the ID is much faster CleanShot 2022-09-25 at 17 09 34 My idea: We can use the jsonparser, to get the id, and then we use a struct like:

type RPCRawResponse struct {
	ID       string
	Data    []byte
	Decoded *RPCResponse
}

This "response" can be passed down the channels and such to the end function and only decoded once. On top of this, we can build a top level "QueryResolver" which uses a generic, and is an opt-in function/struct, which handles context, decoding, checking for errors etc This will take the byte version and pull information out in small pieces, we can also decode once to the final desired output struct from this, rather than Unmarshal -> get id -> Marshal -> Unmarshal -> check for errors and potentially more... How do we feel about this? I hacked together a crappy version of this "QueryResolver" setup in my project already, but i didn't optimise or care much for code quality as i was just playing, but it's semi elegant to use CleanShot 2022-09-25 at 17 17 46

Im interested in your solution where you use db.Query[T] to get the results, can you please share your solution? Thank

In the end I made it into my own lib based on the original: https://github.com/iDevelopThings/surrealdb.go.unofficial

Been a while since i touched surreal stuff, but it should still be fine to use ^^

iDevelopThings avatar Dec 05 '22 23:12 iDevelopThings

@iDevelopThings Nice! Thank you. Have you tested performance?

tomasweigenast avatar Dec 06 '22 10:12 tomasweigenast