diskv

Suggestion: Improve performance by maintaining structs

Open WinstonPrivacy opened this issue 6 years ago • 7 comments

Currently diskv maintains in-memory values in []byte format. It would be excellent from a performance standpoint if arbitrary structs could be maintained instead, as this would eliminate the overhead of deserialization. I suspect that the performance improvement would be 1-2 orders of magnitude, depending on the application.

WinstonPrivacy, Mar 23 '18 14:03

I'm not sure how this makes sense. All structures eventually need to be serialized to []byte to make it to disk; there's no avoiding it. By using []byte we allow the user to use whatever serialization makes sense in their use case.

peterbourgon, Mar 23 '18 16:03

The key is that they eventually get serialized to disk. If you are reading and updating the key several hundred times a second or have decoupled the memory and disk (as we've done in our fork), this eliminates a lot of overhead.

WinstonPrivacy, Mar 23 '18 16:03
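To make the overhead being discussed concrete, here is a minimal sketch that contrasts the two update paths. It is not diskv code: `Counter`, `encodeCounter`, `decodeCounter`, `incrementBytes`, and `incrementStruct` are hypothetical names, and gob is assumed as the serialization format only because it is the one mentioned later in this thread.

```go
package main

import (
	"bytes"
	"encoding/gob"
	"fmt"
)

// Counter is a hypothetical value type used only for illustration.
type Counter struct {
	Hits int64
}

// encodeCounter serializes a Counter with gob.
func encodeCounter(c Counter) ([]byte, error) {
	var buf bytes.Buffer
	if err := gob.NewEncoder(&buf).Encode(c); err != nil {
		return nil, err
	}
	return buf.Bytes(), nil
}

// decodeCounter deserializes a gob-encoded Counter.
func decodeCounter(b []byte) (Counter, error) {
	var c Counter
	err := gob.NewDecoder(bytes.NewReader(b)).Decode(&c)
	return c, err
}

// incrementBytes models a cache that holds []byte: every update has to
// decode the cached bytes, change the struct, and encode it again.
func incrementBytes(cached []byte) ([]byte, error) {
	c, err := decodeCounter(cached)
	if err != nil {
		return nil, err
	}
	c.Hits++
	return encodeCounter(c)
}

// incrementStruct models a cache that holds the struct itself: nothing is
// serialized until the value has to be written to disk.
func incrementStruct(c *Counter) {
	c.Hits++
}

func main() {
	cached, err := encodeCounter(Counter{Hits: 1})
	if err != nil {
		panic(err)
	}
	for i := 0; i < 3; i++ {
		if cached, err = incrementBytes(cached); err != nil {
			panic(err)
		}
	}
	fromBytes, err := decodeCounter(cached)
	if err != nil {
		panic(err)
	}

	inMemory := &Counter{Hits: 1}
	for i := 0; i < 3; i++ {
		incrementStruct(inMemory)
	}

	// Both paths arrive at the same count; only the work per update differs.
	fmt.Println(fromBytes.Hits, inMemory.Hits)
}
```

How large the difference is in practice depends on the value type and the access pattern, so it is worth benchmarking before relying on any particular speedup figure.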

~~I understand. I guess what I'm saying is that your use-case may not be a good fit for the original design goals of diskv, which reduces cognitive overhead in users by not allowing for this decoupling. But if it's not too disruptive to the API, maybe it can be added. I'll have to make a judgment call :)~~

Oh, I thought I was replying to the other issue.

Yeah, this would definitely be a step too far. The only way to do this would be to store values as interface{}, or something custom like

type Serializable interface {
    Size() int64
    encoding.BinaryMarshaler
}

and I'm not interested in that.

peterbourgon, Mar 23 '18 16:03

On second thought that second interface might work. The degenerate case would be

type SerializableBytes []byte

func (b SerializableBytes) Size() int64 {
    return int64(len(b)) // len returns int, so convert to match the int64 result
}

func (b SerializableBytes) MarshalBinary() ([]byte, error) {
    return b, nil
}

peterbourgon, Mar 23 '18 17:03
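As a rough sketch of how a non-degenerate value could satisfy the proposed interface, the example below uses a struct with fixed-width fields, so its serialized size is known without encoding it. `Sample`, `memCache`, `newMemCache`, and `put` are hypothetical and are not part of diskv's API; the cache only illustrates how a size limit in the spirit of CacheSizeMax might be enforced through `Size()`.

```go
package main

import (
	"encoding"
	"encoding/binary"
	"fmt"
)

// Serializable is the interface proposed in the thread.
type Serializable interface {
	Size() int64
	encoding.BinaryMarshaler
}

// Sample is a hypothetical value with two fixed-width fields, so its
// serialized size is known up front.
type Sample struct {
	ID    uint64
	Value uint64
}

// Size reports the length of the encoding produced by MarshalBinary.
func (s Sample) Size() int64 { return 16 }

// MarshalBinary writes both fields as big-endian 64-bit integers.
func (s Sample) MarshalBinary() ([]byte, error) {
	buf := make([]byte, 16)
	binary.BigEndian.PutUint64(buf[0:8], s.ID)
	binary.BigEndian.PutUint64(buf[8:16], s.Value)
	return buf, nil
}

// memCache is a hypothetical size-bounded in-memory cache.
type memCache struct {
	sizeMax int64
	size    int64
	values  map[string]Serializable
}

func newMemCache(sizeMax int64) *memCache {
	return &memCache{sizeMax: sizeMax, values: make(map[string]Serializable)}
}

// put stores v under key if the cache would stay within sizeMax, and
// reports whether the value was stored. No serialization is needed.
func (c *memCache) put(key string, v Serializable) bool {
	var oldSize int64
	if old, ok := c.values[key]; ok {
		oldSize = old.Size()
	}
	newSize := c.size - oldSize + v.Size()
	if newSize > c.sizeMax {
		return false
	}
	c.values[key] = v
	c.size = newSize
	return true
}

func main() {
	c := newMemCache(32)
	fmt.Println(c.put("a", Sample{ID: 1, Value: 10}))
	fmt.Println(c.put("b", Sample{ID: 2, Value: 20}))
	fmt.Println(c.put("c", Sample{ID: 3, Value: 30})) // rejected: would exceed sizeMax
}
```

This only works cheaply when `Size()` can be computed without serializing, which is exactly the question raised next in the thread.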

Could you make your custom structs able to report their serialized size?

peterbourgon, Mar 23 '18 17:03

Hmmm... I'm not totally sure I'm following you (my fault, not yours). But I'm using GOB encoding, so I think the only way to figure out their size would be to serialize them, which defeats the point.

WinstonPrivacy, Mar 23 '18 17:03
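The point about gob can be illustrated with a small sketch: the only general way to learn the encoded size is to run the encoder, here into a writer that counts bytes and discards them. `Profile`, `countingWriter`, and `gobSize` are hypothetical names introduced for this example.

```go
package main

import (
	"encoding/gob"
	"fmt"
)

// Profile is a hypothetical value type with variable-length fields.
type Profile struct {
	Name string
	Tags []string
}

// countingWriter discards what it is given and records how many bytes
// were written.
type countingWriter struct {
	n int64
}

func (w *countingWriter) Write(p []byte) (int, error) {
	w.n += int64(len(p))
	return len(p), nil
}

// gobSize returns the number of bytes a fresh gob encoder emits for v.
// It does the full encoding work; only the output buffer is avoided.
// A fresh encoder also emits type information, which is included here.
func gobSize(v interface{}) (int64, error) {
	var w countingWriter
	if err := gob.NewEncoder(&w).Encode(v); err != nil {
		return 0, err
	}
	return w.n, nil
}

func main() {
	small, err := gobSize(Profile{Name: "a"})
	if err != nil {
		panic(err)
	}
	large, err := gobSize(Profile{Name: "a", Tags: []string{"x", "y", "z"}})
	if err != nil {
		panic(err)
	}
	fmt.Println(small, large)
}
```

Because the size depends on the field contents, calling something like `gobSize` on every cache update would reintroduce the serialization cost the proposal is trying to avoid.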

We need to know the size to be able to abide by the CacheSizeMax requirement. I'm not aware of a way to get the size of a struct without using unsafe.

peterbourgon, Mar 23 '18 17:03
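For what it's worth, even `unsafe.Sizeof` would not give a usable number for cache accounting, because it reports only the shallow size of a value: string and slice fields count as fixed-size headers regardless of the data they point to. The sketch below shows this with a hypothetical `Record` type and `shallowSize` helper.

```go
package main

import (
	"fmt"
	"unsafe"
)

// Record is a hypothetical value type with a string and a slice field.
type Record struct {
	Name string
	Tags []string
}

// shallowSize returns unsafe.Sizeof for a Record. The result covers the
// string and slice headers only, not the bytes they reference.
func shallowSize(r Record) uintptr {
	return unsafe.Sizeof(r)
}

func main() {
	empty := Record{}
	full := Record{
		Name: "a much longer name than the empty one",
		Tags: []string{"alpha", "beta", "gamma"},
	}
	// The two sizes are identical even though full holds far more data.
	fmt.Println(shallowSize(empty) == shallowSize(full))
}
```

A deep size would require walking the value, for example with reflection, which has its own cost.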