envoy
[ip_tagging] Support dynamic file based ip tags
Commit Message: Support async file-based reloading of ip tags in the ip tagging filter (done via DataSource). The supported file formats for ip tags are YAML and JSON. One instance of a loaded ip tags file exists per process; the bookkeeping logic is handled by a custom singleton that hashes and caches loaded file instances based on the ip tags file path.
Additional Description:
Risk Level: Medium (can be Low if guarded by a runtime flag)
Testing: Unit + integration tests
Docs Changes: Done
Release Notes: TBD
Platform Specific Features: N/A
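For illustration, here is a minimal sketch of the per-process bookkeeping described above: a singleton registry that caches one loaded ip tags instance per file path so that multiple filter configs referencing the same file share a single parsed copy. All names (IpTagsRegistry, ParsedIpTags, parse_file) are hypothetical, not the PR's actual classes.

```cpp
// Hypothetical sketch only; not the PR's actual implementation.
#include <functional>
#include <memory>
#include <string>

#include "absl/container/flat_hash_map.h"
#include "absl/synchronization/mutex.h"

struct ParsedIpTags {
  // Placeholder for the trie-like structure built from the yaml/json tag file.
};

class IpTagsRegistry {
public:
  // Returns the instance cached for this file path, creating it on first use so
  // that every filter config referencing the same file shares one loaded copy.
  std::shared_ptr<const ParsedIpTags>
  getOrCreate(const std::string& ip_tags_path,
              const std::function<std::shared_ptr<const ParsedIpTags>()>& parse_file) {
    absl::MutexLock lock(&mutex_);
    std::weak_ptr<const ParsedIpTags>& entry = entries_[ip_tags_path];
    if (std::shared_ptr<const ParsedIpTags> existing = entry.lock()) {
      return existing;
    }
    std::shared_ptr<const ParsedIpTags> created = parse_file();
    entry = created;
    return created;
  }

private:
  absl::Mutex mutex_;
  // Weak entries so a file's parsed data can be released once nothing uses it.
  absl::flat_hash_map<std::string, std::weak_ptr<const ParsedIpTags>> entries_;
};
```

Weak entries are one possible lifetime choice here; the actual PR may track ownership differently.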
As a reminder, PRs marked as draft will not be automatically assigned reviewers, or be handled by maintainer-oncall triage.
Please mark your PR as ready when you want it to be reviewed!
CC @envoyproxy/api-shepherds: Your approval is needed for changes made to (api/envoy/|docs/root/api-docs/).
envoyproxy/api-shepherds assignee is @mattklein123
CC @envoyproxy/api-watchers: FYI only for changes made to (api/envoy/|docs/root/api-docs/).
CC @envoyproxy/runtime-guard-changes: FYI only for changes made to (source/common/runtime/runtime_features.cc).
CI failure looks legit. /wait
cc @ravenblackx @adisuissa
I can look once CI is green?
/wait
@mattklein123 @adisuissa the CI has been fixed; this is now ready for review, PTAL.
Just wanted to clarify one point about the proposed implementation: while loading of the file happens on the main thread via DataSource, a custom thread performs the periodic re-parsing of that file content into a trie-like structure. One may wonder why it is running on a custom thread. I am writing this feature for a Docker use case where the ip tags file contains ~66k entries to parse, and I believe the main thread should be doing housekeeping work rather than extension-heavy computational tasks. cc @wbpcode @adisuissa @yanavlasov
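To make that threading model concrete, here is a minimal generic sketch (plain C++, not Envoy code; IpTagsReloader, IpTagTrie, and reparse are hypothetical) of a dedicated thread that periodically re-parses the file contents into an immutable lookup structure and swaps it in, so neither the main thread nor the request path pays the ~66k-entry parsing cost. As the later comments show, this design was subsequently revised.

```cpp
#include <atomic>
#include <chrono>
#include <functional>
#include <memory>
#include <mutex>
#include <thread>

struct IpTagTrie {
  // Placeholder for the trie-like structure built from the file contents.
};

class IpTagsReloader {
public:
  // `reparse` does the expensive work (reading the file contents and building
  // the trie); it runs on the dedicated thread instead of the main thread.
  IpTagsReloader(std::function<IpTagTrie()> reparse, std::chrono::milliseconds interval)
      : reparse_(std::move(reparse)) {
    reload_thread_ = std::thread([this, interval] {
      while (running_) {
        auto fresh = std::make_shared<const IpTagTrie>(reparse_());
        {
          std::lock_guard<std::mutex> lock(mutex_);
          current_ = std::move(fresh);
        }
        std::this_thread::sleep_for(interval);
      }
    });
  }

  ~IpTagsReloader() {
    running_ = false;
    if (reload_thread_.joinable()) {
      reload_thread_.join();
    }
  }

  // Request path: cheap shared_ptr copy of the latest immutable snapshot.
  std::shared_ptr<const IpTagTrie> snapshot() const {
    std::lock_guard<std::mutex> lock(mutex_);
    return current_;
  }

private:
  const std::function<IpTagTrie()> reparse_;
  mutable std::mutex mutex_;
  std::shared_ptr<const IpTagTrie> current_;
  std::atomic<bool> running_{true};
  std::thread reload_thread_;
};
```

The request path only copies a shared_ptr to the latest snapshot, keeping per-request overhead constant regardless of file size.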
Thanks, it makes sense.
/docs
Docs for this Pull Request will be rendered here:
https://storage.googleapis.com/envoy-pr/38574/docs/index.html
The docs are (re-)rendered each time the CI envoy-presubmit (precheck docs) job completes.
Managed to repro the segfault in an integration test (faster feedback than building and deploying custom images). Unfortunately I will not be able to run parsing on a custom thread: the custom thread needs to access data source and stats data via TLS, and the TLS implementation has built-in assertions for which thread registers for and accesses the thread-local data (it does not allow custom threads).
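As a generic illustration of the kind of thread-affinity assertion mentioned above (this is not Envoy's actual ThreadLocal code; ThreadAffineSlot is hypothetical), state registered on one thread can record its owner and assert on access from anywhere else, which is why an arbitrary custom thread trips it:

```cpp
#include <cassert>
#include <thread>
#include <utility>

template <class T> class ThreadAffineSlot {
public:
  explicit ThreadAffineSlot(T value)
      : owner_(std::this_thread::get_id()), value_(std::move(value)) {}

  // Reads and writes assert that the caller is the registering thread.
  T& get() {
    assert(std::this_thread::get_id() == owner_ && "accessed from an unregistered thread");
    return value_;
  }

private:
  const std::thread::id owner_;
  T value_;
};
```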
You can post a message from your custom thread to the main thread to update TLS
@yanavlasov with a custom thread we will get chained dispatcher invocations:
```cpp
ip_tags_reload_timer_ = main_dispatcher.createTimer([this]() -> void {
  // Reading data from the DataSource can only be done on the main or a worker thread.
  const auto new_data = tags_loader_.getDataSourceData();
  // Capture by value: the posted lambda may run after this callback has returned.
  ip_tags_reload_dispatcher.post([this, new_data] {
    auto new_tags_or_error = tags_loader_.refreshTags(new_data);
    if (new_tags_or_error.status().ok()) {
      updateIpTags(new_tags_or_error.value());
      // Stats updates can only run on the main or a worker thread.
      main_dispatcher.post([this] {
        // Updates reload stats.
        reload_success_cb_();
      });
    } else {
      ENVOY_LOG(debug, "Failed to reload ip tags, using old data: {}",
                new_tags_or_error.status().message());
      // Stats updates can only run on the main or a worker thread.
      main_dispatcher.post([this] {
        // Updates reload stats.
        reload_error_cb_();
      });
    }
  });
  ip_tags_reload_timer_->enableTimer(ip_tags_refresh_interval_ms_);
});
```
@yanavlasov this is ready for another round. Lmk if you would still prefer to run on a custom thread. Performance-wise it makes sense, it just makes the code much more complex.
The change looks good to me. Needs CI to be stabilized and API review.
/wait
Just a quick note: I have been on a two-week summer vacation and will come back to this next week.
This pull request has been automatically marked as stale because it has not had activity in the last 30 days. It will be closed in 7 days if no further activity occurs. Please feel free to give a status update now, ping for review, or re-open when it's ready. Thank you for your contributions!
This pull request has been automatically closed because it has not had activity in the last 37 days. Please feel free to give a status update now, ping for review, or re-open when it's ready. Thank you for your contributions!
Bunch of fixes in latest commit:
- Disarming of timer on geoip provider destruction
- Decoupling the provider and filterConfig classes, since instances of these classes have different lifetimes:
  - Got rid of provider -> filterConfig callbacks for incrementing reload stats; on an LDS update the filterConfig will be destroyed and the callback references will become invalid (unless the callbacks are properly posted to the filterConfig dispatcher). Moved tracking of such stats to the provider instead.
  - The provider no longer holds a reference to the loader (owned by the filterConfig), but instead creates its own instance of the loader to avoid segfaults on LDS updates.
- Creation and tracking of hit metrics is moved to the filter onHeader method. There, before incrementing each hit stat, we check whether such a counter exists. This accommodates the case where tags have been reloaded in the background and some new counters may need to be registered. It is a simpler approach that may introduce overhead in the request path, but in reality it is just one extra map lookup for counter existence (a minimal sketch follows below). Another approach would have been to create and pass filterConfig callbacks to the provider on each filterConfig creation/LDS update and then invoke the callbacks via the thread dispatcher to register new counters. For now I leaned towards the simpler approach.
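A minimal sketch of the per-hit existence check described in the last bullet (hypothetical names; a plain struct stands in for Envoy's stats counters, which the real filter registers via its stats scope):

```cpp
// Hypothetical sketch; not the PR's actual code.
#include <cstdint>
#include <memory>
#include <string>

#include "absl/container/flat_hash_map.h"

struct Counter {
  void inc() { ++value_; }
  uint64_t value_{0};
};

class TagHitStats {
public:
  // Request path (onHeader): one extra existence check per hit, so tags introduced
  // by a background reload get their counter registered lazily.
  void incHit(const std::string& tag) {
    auto it = counters_.find(tag);
    if (it == counters_.end()) {
      it = counters_.emplace(tag, std::make_unique<Counter>()).first;
    }
    it->second->inc();
  }

private:
  absl::flat_hash_map<std::string, std::unique_ptr<Counter>> counters_;
};
```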
@wbpcode would you do another pass on the API?
@wbpcode @yanavlasov addressed review comments.
I think we have agreement on the refactor to the DataSourceProvider because I got a 👍. It will be a big improvement because almost all DataSource-based configuration could benefit from it.
/wait
This pull request has been automatically marked as stale because it has not had activity in the last 30 days. It will be closed in 7 days if no further activity occurs. Please feel free to give a status update now, ping for review, or re-open when it's ready. Thank you for your contributions!
This pull request has been automatically closed because it has not had activity in the last 37 days. Please feel free to give a status update now, ping for review, or re-open when it's ready. Thank you for your contributions!
https://github.com/envoyproxy/envoy/pull/41961