aws-otel-collector
aws-otel-collector copied to clipboard
Add new ECS error reporting feature in AOC
Description: This is the new error reporting feature for ECS. Except sending all error logs to the original destinations, we need to also send error or higher level log messages to another single error log file (the file path configurable later) and keep it less than 1 KB, so that ECS agent could mount this path and read the error log file. When the error log file exceed the max size (1KB), we want to delete the oldest log message to release the space. Also we want to rotate the logs by repeatedly delete the expired logs (the expired time is also configurable later).
Testing:
make build
and run the executable file with a config file with some errors, and succeed with 2 config errors write into the error log file.
Previous Pull Request: https://github.com/aws-observability/aws-otel-collector/pull/503
Codecov Report
Merging #538 (d6c0433) into main (e50d672) will decrease coverage by
7.65%
. The diff coverage is42.74%
.
@@ Coverage Diff @@
## main #538 +/- ##
==========================================
- Coverage 61.45% 53.79% -7.66%
==========================================
Files 8 9 +1
Lines 179 303 +124
==========================================
+ Hits 110 163 +53
- Misses 54 111 +57
- Partials 15 29 +14
Impacted Files | Coverage Δ | |
---|---|---|
pkg/logger/ecs_error_log.go | 42.74% <42.74%> (ø) |
Continue to review full report at Codecov.
Legend - Click here to learn more
Δ = absolute <relative> (impact)
,ø = not affected
,? = missing data
Powered by Codecov. Last update e50d672...d6c0433. Read the comment docs.
ping @mxiamxia
Hi @skyduo , since we use main branch for prod release. These changes are on critical path of running ADOT collector. Do you think we can move this changes in a feature branch and we can merge it after enough testing later?
hey @mxiamxia ! sure, we can do that
@skyduo I think you have another PR in upstream so this one can be closed?
hey @pingleig! that is a new health check feature in the upstream repo which is different from this error reporting feature, also @mxiamxia can we merge this pr?
@skyduo I think you need to sync w/ @jhnlsn or @alolita on merging this. We (me and Min) no longer actively work on ADOT.
btw: you need to rebase and fix the conflict before merging
This PR is stale because it has been open 30 days with no activity.
This PR is stale because it has been open 30 days with no activity.
This PR is stale because it has been open 30 days with no activity.