
Implement omitnil json tag - 100€ bounty

Open ivanjaros opened this issue 2 years ago • 10 comments

Example:

package main

import (
	"github.com/goccy/go-json"
	"os"
)

type Foo struct {
	Bar []string          `json:"bar,omitempty"`
	Baz map[string]string `json:"baz,omitempty"`
}

func main() {
	var a, b Foo
	b.Bar = []string{}          // <- empty, not nil
	b.Baz = map[string]string{} // <- empty, not nil

	e := json.NewEncoder(os.Stdout)
	_ = e.Encode(a)
	println("")
	_ = e.Encode(b)
}

The result is that both a and b print {} instead of {} and {"bar":[],"baz":{}}. This is blatantly wrong behavior because it discards information. An empty array, slice, or map is not the same as one that does not exist (which is what nil means).
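To make the lost information concrete, here is a minimal sketch using the standard library's encoding/json, whose omitempty behavior the report above describes. The type names Plain and Omit and the helper encode are made up for this illustration. Without omitempty, nil and empty slices encode differently; with it, both disappear.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Plain has no omitempty, so nil and empty slices stay distinguishable.
type Plain struct {
	Bar []string `json:"bar"`
}

// Omit uses omitempty, which drops nil and empty slices alike.
type Omit struct {
	Bar []string `json:"bar,omitempty"`
}

// encode is a hypothetical helper that marshals v and returns the JSON text.
func encode(v any) string {
	out, err := json.Marshal(v)
	if err != nil {
		return "error: " + err.Error()
	}
	return string(out)
}

func main() {
	fmt.Println(encode(Plain{}))                // {"bar":null}
	fmt.Println(encode(Plain{Bar: []string{}})) // {"bar":[]}
	fmt.Println(encode(Omit{}))                 // {}
	fmt.Println(encode(Omit{Bar: []string{}}))  // {}
}
```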

ivanjaros avatar Feb 26 '23 09:02 ivanjaros

If this won't be fixed, in order to stay in line with the completely wrong native json behavior that the Go team refuses to fix, can you point me to the code that controls this behavior so that I can fork it and fix it for myself?

ivanjaros avatar Feb 26 '23 11:02 ivanjaros

Or possibly introduce a new tag whose sole purpose is to act like omitempty, but only for nil values. In other words, if a field has the "omitnil" tag (or a configuration flag is set), the field is not printed when its value is nil. Otherwise the value is printed as is (empty map, slice, ...).

I could simply post-process JSON produced from data lacking the omitempty tag: it will contain "null" values that can, with a bit of work, be cut out of the resulting byte slice. BUT this won't work for streaming, hence the need for built-in functionality.

PS: I'd pay 100€ for that omitnil functionality, since I literally run into this on a daily basis.

ivanjaros avatar Feb 26 '23 14:02 ivanjaros

This code in vm.go looks like what skips an empty but non-nil slice: (image)

ivanjaros avatar Feb 26 '23 15:02 ivanjaros

This worked:

package main

import (
	// "github.com/goccy/go-json" // drop-in alternative; only one package named json can be imported at a time
	"encoding/json"
	"os"
)

type Bar []string
type Baz map[string]string
type Foo struct {
	Bar *Bar `json:"bar,omitempty"`
	Baz *Baz `json:"baz,omitempty"`
}

func main() {
	var a, b Foo
	bb := Bar{}
	bz := Baz{}
	b.Bar = &bb // <- empty, not nil
	b.Baz = &bz // <- empty, not nil

	e := json.NewEncoder(os.Stdout)
	_ = e.Encode(a)
	println("")
	_ = e.Encode(b)
}

Output

{}
{"bar":[],"baz":{}}

AbenezerKb avatar Feb 27 '23 11:02 AbenezerKb

@goccy any interest in that 100€ bounty for implementing omitnil?

ivanjaros avatar Feb 27 '23 14:02 ivanjaros

Max already made a merge request: https://github.com/goccy/go-json/pull/437. The code looks good, except that the test is not exactly in line with the rest of the tests (cosmetic).

ivanjaros avatar Mar 01 '23 15:03 ivanjaros

🤨

ivanjaros avatar May 13 '23 16:05 ivanjaros

@ivanjaros are you still interested in this and is the bounty still active?

ianling avatar Sep 25 '23 00:09 ianling

> @ivanjaros are you still interested in this and is the bounty still active?

Interested, yes; the bounty, no (it has been 7 months and it no longer makes sense).

ivanjaros avatar Sep 25 '23 11:09 ivanjaros

In the end, I wrote a cleaning function that removes null values from the marshalled output, rather than patching this or another JSON marshaller.

The benchmark shows that the goccy marshaller takes 329ns/op and the native json marshaller takes 642ns/op; when I run the goccy result through my function I get 514ns/op with no allocations, which is faster than native json by 25% but still slower than goccy by 55%. I have spent a lot of time on this, fixing it and tweaking performance to get it here, and I cannot find anything else to do. Profiling shows that the entire performance hit comes from bytes.Index, and I do not see any way to improve it. @goccy, do you have any performance recommendations to make this faster, if possible?

package foo

import (
	"bytes"
)

var nullPattern = []byte("null")

// modifies the source, allocates no new memory.
func denil(src []byte) []byte {
	var offset int
	var idx int

	for {
		closing := 4 // len("null")

		idx = bytes.Index(src[offset:], nullPattern)
		if idx >= 0 && len(src) > offset+idx+closing {
			// src[offset+idx+closing] is the byte immediately after "null";
			// it decides whether this match is a standalone JSON null.
			switch src[offset+idx+closing] {
			case ',', '\n':
				idx += offset
				// when we get trailing comma or new line, we remove it along with the preceding value
				closing++
				closing += idx
			case '}', ']':
				idx += offset
				closing += idx
			default:
				// this is not actual null
				offset += idx
				offset += closing
				continue
			}
		} else {
			// we're done
			break
		}

		idx = findColon(src, idx)
		if idx < 0 {
			offset += 4 // 4 bytes for matched nil pattern
			continue
		}

		idx = findQuote(src, idx)
		if idx < 0 {
			offset += 4 // 4 bytes for matched nil pattern
			continue
		}

		idx = findQuote(src, idx)
		if idx < 0 {
			offset += 4 // 4 bytes for matched nil pattern
			continue
		}

		idx, closing = findEdges(src, idx, closing)

		src = append(src[:idx], src[closing:]...)

		offset = idx
	}

	return src
}

func findColon(src []byte, idx int) int {
	idx--

	if len(src)-1 < idx {
		return -1
	}

	for idx >= 0 {
		switch src[idx] {
		case ' ':
			idx--
		case ':':
			return idx
		default:
			return -1
		}
	}

	return -1
}

func findQuote(src []byte, idx int) int {
	idx--

	if len(src)-1 < idx {
		return -1
	}

	for idx >= 0 {
		if src[idx] == '"' {
			if idx > 0 && src[idx-1] == '\\' {
				idx--
			} else {
				return idx
			}
		} else {
			idx--
		}
	}

	return -1
}

func findEdges(src []byte, start, finish int) (int, int) {
	// walk left over separators; start lands just after the first other byte
	for i := start - 1; i >= 0; i-- {
		if !isSeparator(src[i]) {
			start = i + 1
			break
		}
	}

	// walk right over separators; finish lands on the first other byte
	for i := finish; i < len(src); i++ {
		if !isSeparator(src[i]) {
			finish = i
			break
		}
	}

	// due to the way slicing works, the real finish is at -1.
	// we need to avoid removing commas from both sides.
	if src[start] == ',' && src[finish-1] == ',' {
		start++
	}

	return start, finish
}

// isSeparator reports whether c is one of the bytes findEdges skips over.
func isSeparator(c byte) bool {
	switch c {
	case ' ', ',', '\n', '\t':
		return true
	}
	return false
}
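
For anyone wanting to reproduce ns/op and allocation figures like the ones quoted above, here is a minimal sketch of a standalone measurement with testing.Benchmark from the standard library. The function being timed is encoding/json's Marshal as a stand-in; the type Foo and the function benchMarshal are illustrative names, and this is not the harness that produced the numbers in this comment.

```go
package main

import (
	"encoding/json"
	"fmt"
	"testing"
)

type Foo struct {
	Bar []string          `json:"bar"`
	Baz map[string]string `json:"baz"`
}

// benchMarshal times json.Marshal on a small value. Swap the call in the
// loop for whichever marshaller or post-processing step is being compared.
func benchMarshal(b *testing.B) {
	v := Foo{Bar: []string{}}
	b.ReportAllocs()
	for i := 0; i < b.N; i++ {
		if _, err := json.Marshal(v); err != nil {
			b.Fatal(err)
		}
	}
}

func main() {
	// testing.Benchmark runs the function outside of "go test".
	r := testing.Benchmark(benchMarshal)
	fmt.Printf("%d ns/op, %d allocs/op\n", r.NsPerOp(), r.AllocsPerOp())
}
```

A function that edits its input in place, as denil does, needs a fresh copy of the input on each iteration, and the cost of that copy should be kept out of the timed region or accounted for.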

ivanjaros avatar Feb 04 '24 17:02 ivanjaros