go-osc icon indicating copy to clipboard operation
go-osc copied to clipboard

Performance could be a lot better

Open chabad360 opened this issue 3 years ago • 6 comments

Hey all, I used this library in a project of mine, and in the process discovered that during the processing of each message, a rather large amount of memory is used, far more than necessary. I created a fork with the goal of bringing down the amount of heap allocations as much as possible (primarily for the server). I think I succeeded:

BenchmarkOriginalParsePacket-4        107958         10519 ns/op        4584 B/op         16 allocs/op
BenchmarkMyParsePacket-4             1000000          2298 ns/op         320 B/op         10 allocs/op

This is when running on repl.it, the performance on an 8th gen I7 is much better (around 310 ns/op).

I would like to contribute back some of these changes (some of them are not suited for concurrent workloads without introducing locking). As well as start a discussion about what else could be improved (there are some stuff that I didn't get to). Please feel free to explore my fork. I should note, I originally (in my master branch) divided up the single file into multiple to make it easier to modify without getting lost, perhaps this is a change that should be introduced as well.

https://github.com/hypebeast/go-osc/compare/master...chabad360:patch-1

chabad360 avatar Mar 18 '21 16:03 chabad360

Hey,

Thanks a lot for your input. I'm always happy to receive a PR for your performance improvements. Furthermore, I'm also open to discuss the restructuring. Maybe you can create two PRs: one for the perf. improvements and one for the restructuring. This makes it a little bit easier for me. And first I want to merge the PRs #39, #40 and #41 before I start with a restructuring.

hypebeast avatar Mar 18 '21 22:03 hypebeast

The compare link that I put in the OP is compatible with the current structure.

But I think I would rather do a restructuring first, as that would help make it more clear where the various optimizations are happening. Also it extremely simple to do, as the code itself is already quite organized into what would become various files.

chabad360 avatar Mar 19 '21 01:03 chabad360

Let's move the conversation about this over to #48, this way we can easily discuss actual code.

chabad360 avatar Mar 22 '21 18:03 chabad360

Thanks for the PR. I'll have a look in the next days.

hypebeast avatar Mar 22 '21 22:03 hypebeast

One major performance bottleneck that I ended up fixing when running on a rpi with high osc load was the dispatching logic. For some reason the current baseline implementation compiles the regexp patters for each message processing hook for each message. I ended up putting the patterns as compiled regexps into an array and bailing after the first match and optimizing the most used message handlers to the start of the array.

This really bites when you have large amount of messages and handlers for a complex system

depili avatar Jan 24 '22 20:01 depili

I haven't even touched the dispatch portion yet (and I think it could be much more efficient) mainly because I just use a contains in my logic.

But I do intend to make a significant rewrite or this library, so this will definitely be included.

chabad360 avatar Jan 25 '22 00:01 chabad360