commons-csv icon indicating copy to clipboard operation
commons-csv copied to clipboard

Replaced BufferedReader without synchronization blocks

Open mama-btc-es opened this issue 1 year ago • 11 comments

Replaced the BufferedReader from java.io package by a modified BufferedReader from the Apache Harmany project.

All synchronized blocks are removed from the BufferedReader in order to reach a better performance since in Java 15 the synchronisation optimization is deprecated and in Java 18 the optimization is removed.

I added a jmh file for performance test results.

jmh-result.json

mama-btc-es avatar Sep 10 '24 07:09 mama-btc-es

This PR should be put in the "draft" state for now IMO.

garydgregory avatar Sep 10 '24 17:09 garydgregory

@garydgregory @ppkarwasz

I checked the proposal without BufferedReader and I extended the ExtendedBufferedReader from Reader. Unfortunately this solution is still slower as before in my use case. And the UT failed because mark() and reset() are not implemented in ExtendedBufferedReader and Reader.

mama-btc-es avatar Sep 11 '24 10:09 mama-btc-es

@mmstan,

Regarding the tests don't worry: ExtendedBufferedReader is not a public class, so we can drop the mark() and reset() methods (and relative tests) without any problem.

ppkarwasz avatar Sep 11 '24 14:09 ppkarwasz

@ppkarwasz The ExtendedBufferedReader calls reset() and mark() methods, we cannot remove the calls. I think it is not possible to extend the ExtendedBufferedReader from Reader, the implementations for reset() and mark() are missing. Or I missunderstand your proposal.

mama-btc-es avatar Sep 12 '24 12:09 mama-btc-es

@ppkarwasz The ExtendedBufferedReader calls reset() and mark() methods, we cannot remove the calls. I think it is not possible to extend the ExtendedBufferedReader from Reader, the implementations for reset() and mark() are missing. Or I missunderstand your proposal.

No, you didn't misunderstand. Effectively, looking at the code, the task is larger than I expected, so you'll need to add some methods to ExtendedBufferedReader if you refactor it as direct derivative class of Reader. Please, provide a minimal implementation: if Lexer does not need a method, don't implement it, because it will be dead code.

In your current approach you copied and refactored the entire BufferedReader class. Lexer does not need all of that.

ppkarwasz avatar Sep 13 '24 06:09 ppkarwasz

Let's do this through Commons IO where I added UnsynchronizedBufferedReader and UnsynchronizedReader, such that ExtendedBufferedReader can extend UnsynchronizedBufferedReader.

On Java 8 on Windows 10, running PerformanceTest a few times and picking best runs, shows an improvement from:

Best time out of 10 is 1,290 milliseconds.

to:

Best time out of 10 is 1,023 milliseconds.

or ~20% improvement

I'll push an RC for Commons IO in the coming week or so, then we can pick it up here.

garydgregory avatar Sep 14 '24 14:09 garydgregory

@mama-btc-es CC: @ppkarwasz

Please test git master which now uses an unsynchronized reader.

garydgregory avatar Sep 18 '24 22:09 garydgregory

Ping @mama-btc-es

garydgregory avatar Sep 21 '24 01:09 garydgregory

@garydgregory I checked the changes on git master. Now my use case is so fast as before. Thank you.

mama-btc-es avatar Sep 23 '24 09:09 mama-btc-es

@mama-btc-es TY for your feedback. OK to close this ticket then or do see any more changes needed?

garydgregory avatar Sep 23 '24 10:09 garydgregory

@garydgregory No changes from my point of view anymore. You can close the ticket. I will delete my fork branch and the PR.

mama-btc-es avatar Sep 24 '24 06:09 mama-btc-es

Closing after review and agreement from @mama-btc-es TY!

garydgregory avatar Oct 31 '24 20:10 garydgregory