server
server copied to clipboard
MDEV-34680 Asynchronous and Buffered Logging in MySQL Audit Plugin
Description
- Buffered Logging:
Implement an in-memory buffer to reduce disk I/O by batching log writes.
- Asynchronous Logging:
Delegate logging to a dedicated logger thread for independent log handling, minimizing wait times for main audit threads
- Dynamic Buffer Management:
Add log_buffer_size and log_buffer_time system variables for user-customizable
flushing based on data volume or time, adapting to varying workloads.
- Confirm Performance Improvement:
An average of 15% speed improvements confirmed in benchmark tests, demonstrating the effectiveness of these logging enhancements in high-load environments.
Release Notes
N/A - Asynchronous and Buffered Logging.
How can this PR be tested?
The mtr tests server_audit.test, server_audit_buffer_size.test, and server_audit_buffer_time.test have been updated to account for asynchronous and buffered logging.
Basing the PR against the correct MariaDB version
- [x] This is a new feature and the PR is based against the latest MariaDB development branch.
PR quality check
- [x] I checked the CODING_STANDARDS.md file and my PR conforms to this where appropriate.
- [x] For any trivial modifications to the PR, I am ok with the reviewer making the changes themselves.
All new code of the whole pull request, including one or several files that are either new files or modified ones, are contributed under the BSD-new license. I am contributing on behalf of my employer Amazon Web Services, Inc.
Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you sign our Contributor License Agreement before we can accept your contribution.
You have signed the CLA already but the status is still pending? Let us recheck it.
I know this is a WIP, but just in case you missed it, there are compiler errors with this PR:
/home/buildbot/amd64-ubuntu-2004-debug/build/plugin/server_audit/server_audit.c: In function ‘void flush_buffer(const std::vector<std::__cxx11::basic_string<char> >&)’:
/home/buildbot/amd64-ubuntu-2004-debug/build/plugin/server_audit/server_audit.c:1096:20: error: comparison of integer expressions of different signedness: ‘int’ and ‘std::vector<std::__cxx11::basic_string<char> >::size_type’ {aka ‘long unsigned int’} [-Werror=sign-compare]
1096 | for (int i= 0; i < all_messages.size(); i++)
| ~~^~~~~~~~~~~~~~~~~~~~~
/home/buildbot/amd64-ubuntu-2004-debug/build/plugin/server_audit/server_audit.c: In function ‘void signal_log(const string&)’:
/home/buildbot/amd64-ubuntu-2004-debug/build/plugin/server_audit/server_audit.c:1189:22: error: comparison of integer expressions of different signedness: ‘int’ and ‘size_t’ {aka ‘long unsigned int’} [-Werror=sign-compare]
1189 | if (log_queue_size >= max_size_log_queue)
| ~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~
/home/buildbot/amd64-ubuntu-2004-debug/build/plugin/server_audit/server_audit.c: In function ‘void audit_plugin_so_init()’:
/home/buildbot/amd64-ubuntu-2004-debug/build/plugin/server_audit/server_audit.c:3267:84: error: cast between incompatible function types from ‘void (*)(THD*, mysql_event_general_v8*)’ to ‘void (*)(THD*, unsigned int, const void*)’ [-Werror=cast-function-type]
3267 | mysql_descriptor.event_notify= (void (*)(THD*, unsigned int, const void*)) auditing_v8;
| ^~~~~~~~~~~
/home/buildbot/amd64-ubuntu-2004-debug/build/plugin/server_audit/server_audit.c:3272:84: error: cast between incompatible function types from ‘void (*)(THD*, unsigned int*)’ to ‘void (*)(THD*, unsigned int, const void*)’ [-Werror=cast-function-type]
3272 | mysql_descriptor.event_notify= (void (*)(THD*, unsigned int, const void*)) auditing_v13;
| ^~~~~~~~~~~~
cc1plus: all warnings being treated as errors
make[2]: *** [plugin/server_audit/CMakeFiles/server_audit.dir/build.make:63: plugin/server_audit/CMakeFiles/server_audit.dir/server_audit.c.o] Error 1
make[1]: *** [CMakeFiles/Makefile2:9048: plugin/server_audit/CMakeFiles/server_audit.dir/all] Error 2
make[1]: *** Waiting for unfinished jobs....
I know this is a WIP, but just in case you missed it, there are compiler errors with this PR:
/home/buildbot/amd64-ubuntu-2004-debug/build/plugin/server_audit/server_audit.c: In function ‘void flush_buffer(const std::vector<std::__cxx11::basic_string<char> >&)’: /home/buildbot/amd64-ubuntu-2004-debug/build/plugin/server_audit/server_audit.c:1096:20: error: comparison of integer expressions of different signedness: ‘int’ and ‘std::vector<std::__cxx11::basic_string<char> >::size_type’ {aka ‘long unsigned int’} [-Werror=sign-compare] 1096 | for (int i= 0; i < all_messages.size(); i++) | ~~^~~~~~~~~~~~~~~~~~~~~ /home/buildbot/amd64-ubuntu-2004-debug/build/plugin/server_audit/server_audit.c: In function ‘void signal_log(const string&)’: /home/buildbot/amd64-ubuntu-2004-debug/build/plugin/server_audit/server_audit.c:1189:22: error: comparison of integer expressions of different signedness: ‘int’ and ‘size_t’ {aka ‘long unsigned int’} [-Werror=sign-compare] 1189 | if (log_queue_size >= max_size_log_queue) | ~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~ /home/buildbot/amd64-ubuntu-2004-debug/build/plugin/server_audit/server_audit.c: In function ‘void audit_plugin_so_init()’: /home/buildbot/amd64-ubuntu-2004-debug/build/plugin/server_audit/server_audit.c:3267:84: error: cast between incompatible function types from ‘void (*)(THD*, mysql_event_general_v8*)’ to ‘void (*)(THD*, unsigned int, const void*)’ [-Werror=cast-function-type] 3267 | mysql_descriptor.event_notify= (void (*)(THD*, unsigned int, const void*)) auditing_v8; | ^~~~~~~~~~~ /home/buildbot/amd64-ubuntu-2004-debug/build/plugin/server_audit/server_audit.c:3272:84: error: cast between incompatible function types from ‘void (*)(THD*, unsigned int*)’ to ‘void (*)(THD*, unsigned int, const void*)’ [-Werror=cast-function-type] 3272 | mysql_descriptor.event_notify= (void (*)(THD*, unsigned int, const void*)) auditing_v13; | ^~~~~~~~~~~~ cc1plus: all warnings being treated as errors make[2]: *** [plugin/server_audit/CMakeFiles/server_audit.dir/build.make:63: plugin/server_audit/CMakeFiles/server_audit.dir/server_audit.c.o] Error 1 make[1]: *** [CMakeFiles/Makefile2:9048: plugin/server_audit/CMakeFiles/server_audit.dir/all] Error 2 make[1]: *** Waiting for unfinished jobs....
This problem is now resolved.
Currently Investigating: The server_audit_buffer_size.test is failing inconsistently in the CI environment but passing on my local machine. The hypothesis is that there might be a race condition causing a very small number of logs to be displaced. Note that the logs are not missing but rather appear in a later position in the file.
Update: This race condition also shows up in the server_audit.test on CI when testing with the default original synchronous direct logging implementation. Race Condition is likely not due to this implementation.
Would love any feedback on the implementation.
Currently Investigating: The server_audit_buffer_size.test is failing inconsistently in the CI environment but passing on my local machine. The hypothesis is that there might be a race condition causing a very small number of logs to be displaced. Note that the logs are not missing but rather appear in a later position in the file.
That is likely to be expected with something that is asynchronous.
Update: This race condition also shows up in the server_audit.test on CI when testing with the default original synchronous direct logging implementation. Race Condition is likely not due to this implementation.
Are you sure? I have not seen evidence of this, and it isn't on own know failures list.
I'll get someone to look over the implementation of this. But some versioning changes will likely need to be made. I'm specifically thinking about VERSION_logger in include/service_versions.h and the version plugin_audit_v4.h.
I think "dedicated thread" that is doing nothing, just waiting most of the time is a wrong idea. MySQL has multiplied threads for anything possible, "a thread that is waiting for another thread and signals third thread", I would not be like them and multiply threads without necessity. In fact, MariaDB already has timers in mysys, and I think it makes sense to explore them instead. It is just 3 functions you'd need - thr_timer_init, thr_timer_settime, thr_timer_end. Whenever you write to buffer at position 0, and not immediately flush, you can do thr_timer_setttime that flushes buffer for you, in 1 sec from now, or whatever the timeout is. The job will still be performed by another thread (yes, there is a dedicated timer thread, that handles all timers), but this one already exists,it has started, and you do not need to repeat its job.
I think "dedicated thread" that is doing nothing, just waiting most of the time is a wrong idea. MySQL has multiplied threads for anything possible, "a thread that is waiting for another thread and signals third thread", I would not be like them and multiply threads without necessity. In fact, MariaDB already has timers in mysys, and I think it makes sense to explore them instead. It is just 3 functions you'd need -
thr_timer_init,thr_timer_settime,thr_timer_end. Whenever you write to buffer at position 0, and not immediately flush, you can dothr_timer_setttimethat flushes buffer for you, in 1 sec from now, or whatever the timeout is. The job will still be performed by another thread (yes, there is a dedicated timer thread, that handles all timers), but this one already exists,it has started, and you do not need to repeat its job.
Thank you so much for the feedback. I will be looking into this ASAP.
@LinuxJedi @vaintroub I have now updated the branch with your suggestions. The details about the implementation and its benchmarking scores can be found on this JIRA. Please let me know if you any other suggestions. Thank you !
This is much better, thanks! you got rid of extra thread that I was complaining about.
But, as I commented, there is
- No reason for periodic timer, you know when to start it.
- You only need to call thr_timer_settime in one case, when you push first message to queue, without flushing it yourself.
I think I got the logic right, or is there something I'd miss?
Hey @vaintroub,
Thank you for your suggestions. I have updated the implementation accordingly. Now, the audit plugin only calls thr_timer_init(), thr_timer_end, and thr_timer_settime (once). I've tested this on a local server, and everything is working well. Please let me know if you have any further comments.
MDEV-34680 is already done, pushed and released in 12.1.1.
Main functionality is in file_logger.cc as @janlindstrom suggested above. And it uses IO_CACHE instead of implementing yet another caching logic, that made the implementation much simpler and smaller.