jscan
jscan copied to clipboard
Rename named return variable in walker closure
By default, walker closure has the following signature:
func(i *jscan.Iterator[S]) (err bool)
IDE proposes it every time you're trying to use auto-complete. And I find the named return's name quite inconvenient because of the following reasons:
-
err
variable is pretty common and is usually meant to carry a value of typeerror
. This sometimes leads to name collisions - It doesn't quite respond to its semantics: it is more like some
exit
orstop
The preferred solution would be to replace bool with the actual error interface, so jscan.Scan
would return it instead of error with code ErrorCodeCallback
. However, it breaks backward capability, and according to the fact that v2 was released just recently, it definitely won't be implemented just now.
So my current solution is to rename it to some of the already mentioned variants:
-
exit
-
stop
-
break_
(seems ugly tbh)
I'm going to open a PR after the resolution
To understand why I don't return error
, as is common in Go,
consider the following benchmark (https://go.dev/play/p/_iAcdhVU8m7):
=== RUN TestSizeOf
bench_test.go:10: size of Error[string]: 32
bench_test.go:11: size of Error[[]byte]: 40
--- PASS: TestSizeOf (0.00s)
goos: darwin
goarch: arm64
pkg: errbench
BenchmarkInterface
BenchmarkInterface-10 54839221 21.78 ns/op 32 B/op 1 allocs/op
BenchmarkStructString
BenchmarkStructString-10 1000000000 0.9313 ns/op 0 B/op 0 allocs/op
BenchmarkStructBytes
BenchmarkStructBytes-10 334166505 3.578 ns/op 0 B/op 0 allocs/op
PASS
ok errbench 4.804s
Returning the Error[T]
type carrying relevant information as error
will allocate it on the heap.
Using fmt.Errorf
or similar would also allocate memory.
Copying 32 bytes on modern 64-bit systems is essentially free and 40 bytes in case of []byte
is comparably cheap. Therefore, I leave the decision of whether to use Error[T]
as error
to the API user as it does incur a cost.
If the walker closure would return error
then other functions would need to return error
too which I would rather avoid for the reason described above.
The fact that the named return err
can be inconvenient is a fair point and renaming it would be backward compatible so that's possible. exit
and stop
are both good candidates, break_
is not.
P.S.
I also thought about func(i *jscan.Iterator[S]) (errCode int8)
when designing the API, but this could allow returning jscan error codes like ErrorCodeIllegalControlChar
, which isn't ideal as those codes are supposed to be used by the parser only.
When talking about performance, the user must take care of it by himself, too. So by that, I disagree that the returning error interface is that awful in terms of performance, as in my own usecase I use pre-defined static errors. However I agree that there are more usecases as I've seen by my own, and sometimes error text would be needed to be dynamic, but in this case, the user still needs to make an effort to avoid fmt.Sprintf
or anything like that. In other words, the task is difficult by itself, and the approach currently used doesn't really help as far as I can see.
What about renaming the err
- have we reached a consensus here, so I can open a PR?
Using jscan.Error[T]
as error
but in this case, the user still needs to make an effort to avoid
fmt.Sprintf
or anything like that.
Not much effort required since jscan.Error[T]
implements error
, just wrap it:
var err error
if e := jscan.Validate(json); e.IsErr() {
err = e
}
Or simply call the Error
method to get the message allocated:
jscan.Validate(`{"x":_}`).Error() // "error at index 5 ('_'): unexpected token"
Rationale
Consider the following example and benchmark: https://go.dev/play/p/SxuovkZG2D2
IndexOfErrorInJSON
needs to return either the index of where there's an error in the JSON or -1
if there was no error. Returning jscan.Error[T]
actually makes the code both simpler and faster:
if err := jscan.Validate(j); err.IsErr() {
return err.Index
}
return -1
If we were to return error
from jscan.Validate
instead, then we would need to either use a type assertion or errors.As
, which are both less convenient and wasteful:
if err, ok := jscan.Validate(j).(jscan.Error[string]); ok {
return err.Index
}
return -1
var jscanErr jscan.Error[string]
if errors.As(jscan.Validate(j), &jscanErr) {
return jscanErr.Index
}
return -1
goos: darwin
goarch: arm64
pkg: errbench
Benchmark
Benchmark/concrete
Benchmark/concrete-10 21867732 49.74 ns/op 0 B/op 0 allocs/op
Benchmark/errors_as
Benchmark/errors_as-10 7709853 152.6 ns/op 96 B/op 3 allocs/op
Benchmark/typeassert
Benchmark/typeassert-10 14732881 72.35 ns/op 32 B/op 1 allocs/op
PASS
ok errbench 3.712s
The fact that jscan.Validate
returns jscan.Error
gives you an option to avoid paying the extra cost when it's not justified.
A ~30ns difference and an extra allocation is a tiny cost. However, package jscan is deliberately designed to allow your code to remain allocation free.
I do agree that in Go it's typical to work with error
and having to check for errors with if err.IsErr()
may be a bit confusing, but jscan is not your typical Go package since its main focus is performance.
Renaming named return
What about renaming the err - have we reached a consensus here, so I can open a PR?
Yes, this is possible. I don't think we need a PR for that since it's a trivial change I can do myself but if you want - go for it 🙂
EDIT:
as in my own usecase I use pre-defined static errors
Sentinel values would only allow us to replace the error code, but if we also want to return the occurrence index then there's no way around allocation except returning an actual struct implementing the error
interface.