
in_tail: Fix data loss when exiting with buffered data from an uncompressed file

Open jinyongchoi opened this pull request 1 month ago • 9 comments

Unprocessed data in the internal buffer is discarded when Fluent Bit stops, causing data loss because the DB offset is already advanced.

This patch fixes the issue for uncompressed files by rewinding the file offset by the remaining buffer length on exit, ensuring data is re-read on restart.

For compressed files, rewinding is not performed due to the complexity of mapping decompressed buffer size back to compressed file offset; a warning is logged instead.

Additionally, this patch prevents resurrecting deleted file entries in the DB by resetting db_id to 0 upon deletion and checking it before updating the offset.
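As an illustrative sketch of the exit-time rewind and the deleted-entry guard described above (field and constant names here are hypothetical stand-ins, not the actual Fluent Bit symbols; the real logic lives in plugins/in_tail/tail_file.c):

```c
#include <stdint.h>
#include <stddef.h>

#define FLB_TAIL_DB_ID_NONE 0

/* Hypothetical minimal model of the exit-time rewind. */
struct tail_file {
    int64_t  offset;        /* raw file offset already persisted to the DB */
    size_t   buf_len;       /* bytes read but not yet processed */
    int      decompression; /* nonzero for gzip input */
    uint64_t db_id;         /* FLB_TAIL_DB_ID_NONE once the DB row was deleted */
};

/* Returns the offset to persist on shutdown, or -1 when no DB update
 * must happen (deleted entry, or compressed stream where decompressed
 * buffer length cannot be mapped back to a compressed offset). */
static int64_t rewind_offset_on_exit(const struct tail_file *f)
{
    if (f->db_id == FLB_TAIL_DB_ID_NONE) {
        return -1;                 /* never resurrect a deleted DB entry */
    }
    if (f->decompression) {
        return -1;                 /* only warn; rewind is not feasible */
    }
    if (f->buf_len == 0) {
        return f->offset;          /* nothing buffered, keep offset */
    }
    if ((int64_t) f->buf_len > f->offset) {
        return 0;                  /* floor at zero */
    }
    return f->offset - (int64_t) f->buf_len;
}
```

The floor at zero and the early return for compressed streams mirror the scoping decision in the patch description: a small amount of duplication on restart is preferred over silent data loss.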

Closes #11265


Enter [N/A] in the box if an item is not applicable to your change.

Testing

Before we can approve your change, please submit the following in a comment:

  • [x] Example configuration file for the change
[SERVICE]
    flush 2
    grace 60
    log_level debug
    log_file /tmp/testing/logs/testing.log
    parsers_file /tmp/testing/parsers.conf
    plugins_file /tmp/testing/plugins.conf
    http_server on
    http_listen 0.0.0.0
    http_port 22002

    storage.path /tmp/testing/storage
    storage.metrics on
    storage.max_chunks_up 512
    storage.sync full
    storage.checksum off
    storage.backlog.mem_limit 100M

[INPUT]
    Name tail
    Path /tmp/testing.input
    Exclude_Path *.gz,*.zip
    Tag testing
    Key message
    Offset_Key   log_offset

    Read_from_Head true
    Refresh_Interval 3
    Rotate_Wait 31557600

    Buffer_Chunk_Size 1MB
    Buffer_Max_Size 16MB
    Inotify_Watcher false

    storage.type filesystem
    storage.pause_on_chunks_overlimit true

    DB /tmp/testing/storage/testing.db
    DB.sync normal
    DB.locking false

    Alias input_log

[OUTPUT]
    Name file
    Match *
    File /tmp/testing.out
  • [x] Debug log output from testing the change
[2025/12/09 10:33:57.6842243] [debug] [input:tail:input_log] inode=50630975 rewind offset for /tmp/testing.input: old=1348719585 new=1348719294 (buf_len=291)
  • [x] Attached Valgrind output that shows no leaks or memory corruption was found
valgrind --leak-check=full ./bin/fluent-bit -v -c ./fluentbit.conf
...
==545013== 
==545013== HEAP SUMMARY:
==545013==     in use at exit: 0 bytes in 0 blocks
==545013==   total heap usage: 249,337 allocs, 249,337 frees, 267,950,893 bytes allocated
==545013== 
==545013== All heap blocks were freed -- no leaks are possible
==545013== 
==545013== For lists of detected and suppressed errors, rerun with: -s
==545013== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)

If this is a change to packaging of containers or native binaries then please confirm it works for all targets.

  • [x] Run local packaging test showing all targets (including any new ones) build.
  • [x] Set ok-package-test label to test for all targets (requires maintainer to do).

Documentation

  • [N/A] Documentation required for this feature

Backporting

  • [x] Backport to latest stable release.

Fluent Bit is licensed under Apache 2.0, by submitting this pull request I understand that this code will be released under the terms of that license.

Summary by CodeRabbit

  • Bug Fixes

    • Rewind file read offsets when buffered, unprocessed data exists so it can be re-read after restart; persisted offsets updated.
    • Emit a warning when rewind isn’t possible for compressed streams to note potential loss of buffered decompressed data.
    • Clear stale in-memory file DB references after a file entry is deleted.
  • New Features

    • Added tunable settings for rotation monitoring and batch size limits.
  • Tests

    • New runtime test verifying DB-backed offset rewind on shutdown and restart.


jinyongchoi avatar Dec 09 '25 02:12 jinyongchoi

[!NOTE]

Other AI code review bot(s) detected

CodeRabbit has detected other AI code review bot(s) in this pull request and will avoid duplicating their findings in the review comments. This may lead to a less comprehensive review.

Walkthrough

Adds gzip-aware resume state and DB-backed bookkeeping: new per-file fields (anchor_offset, skip_bytes, exclude_bytes, skipping_mode), DB schema migration and SQL updates for skip/anchor, seek/processing changes to respect gzip decompression and member boundaries, rewind DB offsets when buffered data remains, and new gzip/DB tests.

Changes

  • Core file & decompression logic: plugins/in_tail/tail_file.c
    Add gzip-aware resume logic and includes; choose seek position from DB/anchor/read_from_head; initialize anchor/exclude/skipping after seek; apply skip/exclude logic during chunk processing; increment/reset skip_bytes across gzip members; rewind DB offset on removal when buffered data exists (unless decompression active).
  • DB layer & SQL: plugins/in_tail/tail_db.c, plugins/in_tail/tail_sql.h
    Add pragma-based migration to add skip/anchor columns; replace in-file query_status with cb_column_exists; extend db_file_exists to return skip and anchor; bind/read skip/anchor in insert/update/offset/rotate/delete paths; reset db_id to FLB_TAIL_DB_ID_NONE after delete.
  • File struct & internal headers: plugins/in_tail/tail_file_internal.h, plugins/in_tail/tail.h
    Add per-file fields anchor_offset (int64_t), skip_bytes (uint64_t), exclude_bytes (uint64_t), skipping_mode (int); add macro FLB_TAIL_DB_ID_NONE.
  • FS event / stat handling: plugins/in_tail/tail_fs_inotify.c, plugins/in_tail/tail_fs_stat.c
    On truncation (size_delta < 0) initialize anchor_offset, skip_bytes, exclude_bytes, and skipping_mode alongside offset and buf_len.
  • File removal & counters: plugins/in_tail/tail_file.c (remove/adjust_counters)
    On removal, if buffered data exists and no decompression, rewind file->offset by buf_len (floor 0) and persist to DB; on truncation reset anchor/skip/exclude/skipping and update DB when enabled.
  • SQL definitions: plugins/in_tail/tail_sql.h
    Add skip and anchor columns (DEFAULT 0) to in_tail_files; update SQL_INSERT_FILE and SQL_UPDATE_OFFSET to include/handle skip and anchor.
  • Tests & helpers: tests/runtime/in_tail.c
    Add raw write helper, gzip create/append utilities, gzip-resume inspection and wait utility; add DB/gzip resume, append and rotation tests and register them in TEST_LIST.
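The pragma-based migration in tail_db.c can be thought of as: read the column names returned by `SELECT name FROM pragma_table_info('in_tail_files');` and issue an `ALTER TABLE ... ADD COLUMN` only for columns that are missing. A minimal sketch of that decision, with illustrative column and statement strings (not the exact ones in tail_sql.h):

```c
#include <string.h>
#include <stddef.h>

/* Hypothetical sketch of the migration decision: given the column
 * names reported by pragma_table_info, return the ALTER TABLE
 * statement for a missing column, or NULL when it already exists. */
static const char *migration_for(const char **cols, size_t n,
                                 const char *col, const char *alter_sql)
{
    size_t i;

    for (i = 0; i < n; i++) {
        if (strcmp(cols[i], col) == 0) {
            return NULL;   /* column present, no migration needed */
        }
    }
    return alter_sql;      /* e.g. "ALTER TABLE in_tail_files
                            *       ADD COLUMN skip INTEGER DEFAULT 0" */
}
```

This distinguishes a missing column (migrate) from an existing one (no-op); a failed pragma query would be handled separately as an error, as the walkthrough notes.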

Sequence Diagram(s)

sequenceDiagram
    participant Disk as Disk File
    participant Reader as in_tail Reader
    participant Decompress as Gzip Decompressor
    participant Buffer as In-memory Buffer
    participant DB as SQLite DB

    Disk->>Reader: read compressed/raw bytes (advance raw offset)
    Reader->>Buffer: append raw bytes
    alt decompression_context (gzip)
        Buffer->>Decompress: feed compressed bytes
        Decompress-->>Buffer: decompressed data (may span members)
        Buffer->>Buffer: apply exclude_bytes / skipping_mode (drop initial decompressed bytes)
        alt gzip member boundary reached
            Decompress->>Reader: notify member end
            Reader->>Reader: set anchor_offset (member start), reset skip_bytes
        end
        Reader->>DB: persist raw offset, skip, anchor
    else no decompression
        Buffer->>DB: persist raw offset
    end
    alt Shutdown with buffered unprocessed data and no decompression
        Reader->>DB: rewind offset (offset -= buf_len) and persist
    end

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

  • Pay attention to:
    • DB migration and SQL binding/column ordering (plugins/in_tail/tail_db.c, plugins/in_tail/tail_sql.h).
    • Seek selection and post-seek initialization for compressed streams (plugins/in_tail/tail_file.c).
    • Lifecycle and transitions of skip_bytes / exclude_bytes / skipping_mode across reads and gzip-member boundaries.
    • Correctness of offset rewind logic in flb_tail_file_remove and DB synchronization on shutdown.
    • New test helpers and gzip test correctness in tests/runtime/in_tail.c.

Possibly related PRs

  • fluent/fluent-bit#10785 — touches plugins/in_tail/tail_file.c; potential overlap in file removal/process paths.
  • fluent/fluent-bit#11059 — modifies in_tail chunk/process logic; likely to intersect with gzip resume and offset handling.

Suggested labels

backport to v4.0.x, backport to v4.1.x

Suggested reviewers

  • edsiper
  • cosmo0920
  • leonardo-albertovich
  • koleini
  • fujimotos

Poem

πŸ‡
I nibble compressed bytes at dawn,
I mark anchors where members yawn.
When partial lines hide from sight,
I hop back offsets through the night.
Hops, skips, and crumbs β€” I make logs right.

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)
  • Docstring Coverage ⚠️ Warning: docstring coverage is 41.38%, below the required threshold of 80.00%. Resolution: run @coderabbitai generate docstrings to improve docstring coverage.
✅ Passed checks (4 passed)
  • Description Check ✅ Passed: check skipped; CodeRabbit's high-level summary is enabled.
  • Title check ✅ Passed: the PR title accurately summarizes the main objectives: fixing data loss with buffered data on shutdown and adding gzip file handling.
  • Linked Issues check ✅ Passed: the PR comprehensively addresses all coding requirements from issue #11265: implements offset rewinding for uncompressed files, handles gzip with schema changes and resume logic, resets db_id to prevent resurrection, and adds DB migration for new columns.
  • Out of Scope Changes check ✅ Passed: all changes are scoped to addressing the data loss issue: buffer handling, gzip resume logic, DB schema migration, and test coverage. No unrelated modifications were detected.

📜 Recent review details

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 8640940a39a9d34307978d45cfe5116873dae2c9 and fdd817452d12aa6a738cde060c3a0b292c0ef040.

📒 Files selected for processing (6)
  • plugins/in_tail/tail_db.c (11 hunks)
  • plugins/in_tail/tail_file.c (11 hunks)
  • plugins/in_tail/tail_file_internal.h (1 hunks)
  • plugins/in_tail/tail_fs_inotify.c (1 hunks)
  • plugins/in_tail/tail_fs_stat.c (1 hunks)
  • plugins/in_tail/tail_sql.h (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (3)
  • plugins/in_tail/tail_fs_inotify.c
  • plugins/in_tail/tail_file_internal.h
  • plugins/in_tail/tail_sql.h
🧰 Additional context used
🧠 Learnings (12)
📓 Common learnings
Learnt from: cosmo0920
Repo: fluent/fluent-bit PR: 11059
File: plugins/in_tail/tail_file.c:1618-1640
Timestamp: 2025-10-23T07:43:16.216Z
Learning: In plugins/in_tail/tail_file.c, when truncate_long_lines is enabled and the buffer is full, the early truncation path uses `lines > 0` as the validation pattern to confirm whether process_content successfully processed content. This is intentional to track occurrences of line processing rather than byte consumption, and consuming bytes based on `processed_bytes > 0` would be overkill for this validation purpose.
📚 Learning: 2025-10-23T07:43:16.216Z
Learnt from: cosmo0920
Repo: fluent/fluent-bit PR: 11059
File: plugins/in_tail/tail_file.c:1618-1640
Timestamp: 2025-10-23T07:43:16.216Z
Learning: In plugins/in_tail/tail_file.c, when truncate_long_lines is enabled and the buffer is full, the early truncation path uses `lines > 0` as the validation pattern to confirm whether process_content successfully processed content. This is intentional to track occurrences of line processing rather than byte consumption, and consuming bytes based on `processed_bytes > 0` would be overkill for this validation purpose.

Applied to files:

  • plugins/in_tail/tail_fs_stat.c
  • plugins/in_tail/tail_file.c
  • plugins/in_tail/tail_db.c
📚 Learning: 2025-09-22T15:59:55.794Z
Learnt from: nicknezis
Repo: fluent/fluent-bit PR: 10882
File: plugins/out_http/http.c:112-116
Timestamp: 2025-09-22T15:59:55.794Z
Learning: When users consider bug fixes out of scope for their focused PRs, it's appropriate to create separate GitHub issues to track those concerns rather than expanding the current PR scope.

Applied to files:

  • plugins/in_tail/tail_file.c
📚 Learning: 2025-08-29T06:25:27.250Z
Learnt from: shadowshot-x
Repo: fluent/fluent-bit PR: 10794
File: tests/internal/aws_compress.c:93-107
Timestamp: 2025-08-29T06:25:27.250Z
Learning: In Fluent Bit, ZSTD compression is enabled by default and is treated as a core dependency, not requiring conditional compilation guards like `#ifdef FLB_HAVE_ZSTD`. Unlike some other optional components such as ARROW/PARQUET (which use `#ifdef FLB_HAVE_ARROW` guards), ZSTD support is always available and doesn't need build-time conditionals. ZSTD headers are included directly without guards across multiple plugins and core components.

Applied to files:

  • plugins/in_tail/tail_file.c
📚 Learning: 2025-08-29T06:24:26.170Z
Learnt from: shadowshot-x
Repo: fluent/fluent-bit PR: 10794
File: tests/internal/aws_compress.c:39-42
Timestamp: 2025-08-29T06:24:26.170Z
Learning: In Fluent Bit, ZSTD compression support is enabled by default and does not require conditional compilation guards (like #ifdef FLB_HAVE_ZSTD) around ZSTD-related code declarations and implementations.

Applied to files:

  • plugins/in_tail/tail_file.c
📚 Learning: 2025-08-29T06:25:27.250Z
Learnt from: shadowshot-x
Repo: fluent/fluent-bit PR: 10794
File: tests/internal/aws_compress.c:93-107
Timestamp: 2025-08-29T06:25:27.250Z
Learning: In Fluent Bit, ZSTD compression is enabled by default and is treated as a core dependency, not requiring conditional compilation guards like `#ifdef FLB_HAVE_ZSTD`. Unlike some other optional components, ZSTD support is always available and doesn't need build-time conditionals.

Applied to files:

  • plugins/in_tail/tail_file.c
📚 Learning: 2025-08-29T06:25:02.561Z
Learnt from: shadowshot-x
Repo: fluent/fluent-bit PR: 10794
File: tests/internal/aws_compress.c:7-7
Timestamp: 2025-08-29T06:25:02.561Z
Learning: In Fluent Bit, ZSTD (zstandard) compression library is bundled directly in the source tree at `lib/zstd-1.5.7` and is built unconditionally as a static library. Unlike optional external dependencies, ZSTD does not use conditional compilation guards like `FLB_HAVE_ZSTD` and is always available. Headers like `<fluent-bit/flb_zstd.h>` can be included directly without guards.

Applied to files:

  • plugins/in_tail/tail_file.c
📚 Learning: 2025-08-29T06:24:55.855Z
Learnt from: shadowshot-x
Repo: fluent/fluent-bit PR: 10794
File: src/aws/flb_aws_compress.c:52-56
Timestamp: 2025-08-29T06:24:55.855Z
Learning: ZSTD compression is always available in Fluent Bit and does not require conditional compilation guards. Unlike Arrow/Parquet which use #ifdef FLB_HAVE_ARROW guards, ZSTD is built unconditionally with flb_zstd.c included directly in src/CMakeLists.txt and a bundled ZSTD library at lib/zstd-1.5.7/.

Applied to files:

  • plugins/in_tail/tail_file.c
📚 Learning: 2025-08-29T06:24:44.797Z
Learnt from: shadowshot-x
Repo: fluent/fluent-bit PR: 10794
File: src/aws/flb_aws_compress.c:26-26
Timestamp: 2025-08-29T06:24:44.797Z
Learning: In Fluent Bit, ZSTD support is always available and enabled by default. The build system automatically detects and uses either the system libzstd library or builds the bundled ZSTD version. Unlike other optional dependencies like Arrow which use conditional compilation guards (e.g., FLB_HAVE_ARROW), ZSTD does not require conditional includes or build flags.

Applied to files:

  • plugins/in_tail/tail_file.c
📚 Learning: 2025-09-08T11:21:33.975Z
Learnt from: cosmo0920
Repo: fluent/fluent-bit PR: 10851
File: include/fluent-bit/flb_simd.h:60-66
Timestamp: 2025-09-08T11:21:33.975Z
Learning: Fluent Bit currently only supports MSVC compiler on Windows, so additional compiler compatibility guards may be unnecessary for Windows-specific code paths.

Applied to files:

  • plugins/in_tail/tail_file.c
📚 Learning: 2025-08-31T12:46:11.940Z
Learnt from: ThomasDevoogdt
Repo: fluent/fluent-bit PR: 9277
File: .github/workflows/pr-compile-check.yaml:147-151
Timestamp: 2025-08-31T12:46:11.940Z
Learning: In fluent-bit CMakeLists.txt, the system library preference flags are defined as FLB_PREFER_SYSTEM_LIB_ZSTD and FLB_PREFER_SYSTEM_LIB_KAFKA with the FLB_ prefix.

Applied to files:

  • plugins/in_tail/tail_file.c
📚 Learning: 2025-11-21T06:23:29.770Z
Learnt from: cosmo0920
Repo: fluent/fluent-bit PR: 11171
File: include/fluent-bit/flb_lib.h:52-53
Timestamp: 2025-11-21T06:23:29.770Z
Learning: In Fluent Bit core (fluent/fluent-bit repository), function descriptions/documentation are not required for newly added functions in header files.

Applied to files:

  • plugins/in_tail/tail_file.c
🧬 Code graph analysis (2)
plugins/in_tail/tail_file.c (1)
plugins/in_tail/tail_db.c (1)
  • flb_tail_db_file_offset (366-399)
plugins/in_tail/tail_db.c (2)
src/flb_sqldb.c (2)
  • flb_sqldb_query (151-166)
  • flb_sqldb_close (129-149)
plugins/in_tail/tail_file.h (1)
  • flb_tail_target_file_name_cmp (64-116)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (31)
  • GitHub Check: pr-windows-build / call-build-windows-package (Windows 32bit, x86, x86-windows-static, 3.31.6)
  • GitHub Check: pr-windows-build / call-build-windows-package (Windows 64bit (Arm64), amd64_arm64, -DCMAKE_SYSTEM_NAME=Windows -DCMA...
  • GitHub Check: pr-windows-build / call-build-windows-package (Windows 64bit, x64, x64-windows-static, 3.31.6)
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_SANITIZE_MEMORY=On, 3.31.6, clang, clang++)
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_SANITIZE_MEMORY=On, 3.31.6, gcc, g++)
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_SIMD=On, 3.31.6, clang, clang++)
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_SIMD=Off, 3.31.6, clang, clang++)
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_SANITIZE_THREAD=On, 3.31.6, clang, clang++)
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_SANITIZE_THREAD=On, 3.31.6, gcc, g++)
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_SIMD=On, 3.31.6, gcc, g++)
  • GitHub Check: run-ubuntu-unit-tests (-DSANITIZE_ADDRESS=On, 3.31.6, clang, clang++)
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_JEMALLOC=Off, 3.31.6, gcc, g++)
  • GitHub Check: run-ubuntu-unit-tests (-DSANITIZE_ADDRESS=On, 3.31.6, gcc, g++)
  • GitHub Check: run-ubuntu-unit-tests (-DSANITIZE_UNDEFINED=On, 3.31.6, clang, clang++)
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_COVERAGE=On, 3.31.6, gcc, g++)
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_ARROW=On, 3.31.6, gcc, g++)
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_COMPILER_STRICT_POINTER_TYPES=On, 3.31.6, gcc, g++)
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_SIMD=Off, 3.31.6, gcc, g++)
  • GitHub Check: run-ubuntu-unit-tests (-DSANITIZE_UNDEFINED=On, 3.31.6, gcc, g++)
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_JEMALLOC=Off, 3.31.6, clang, clang++)
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_JEMALLOC=On, 3.31.6, gcc, g++)
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_SMALL=On, 3.31.6, clang, clang++)
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_JEMALLOC=On, 3.31.6, clang, clang++)
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_SMALL=On, 3.31.6, gcc, g++)
  • GitHub Check: pr-compile-system-libs (-DFLB_PREFER_SYSTEM_LIBS=On, 3.31.6, gcc, g++, ubuntu-24.04, clang-14)
  • GitHub Check: pr-compile-system-libs (-DFLB_PREFER_SYSTEM_LIBS=On, 3.31.6, gcc, g++, ubuntu-22.04, clang-12)
  • GitHub Check: pr-compile-system-libs (-DFLB_PREFER_SYSTEM_LIBS=On, 3.31.6, clang, clang++, ubuntu-24.04, clang-14)
  • GitHub Check: pr-compile-system-libs (-DFLB_PREFER_SYSTEM_LIBS=On, 3.31.6, clang, clang++, ubuntu-22.04, clang-12)
  • GitHub Check: pr-compile-without-cxx (3.31.6)
  • GitHub Check: PR - fuzzing test
  • GitHub Check: pr-compile-centos-7
🔇 Additional comments (16)
plugins/in_tail/tail_fs_stat.c (1)

139-142: LGTM! Consistent gzip state reset on truncation.

The initialization of anchor_offset, skip_bytes, exclude_bytes, and skipping_mode on file truncation is correct and consistent with the corresponding logic in tail_fs_inotify.c and adjust_counters() in tail_file.c.

plugins/in_tail/tail_file.c (8)

47-47: LGTM!

Required include for the decompression state constants used in the gzip member boundary detection logic.


1031-1096: Well-structured gzip-aware DB position handling.

The logic correctly handles three scenarios:

  1. Gzip files with persisted anchor/skip: seeks to anchor_offset and enables skipping mode
  2. Normal files or legacy gzip entries: uses the stored offset
  3. No DB position with read_from_head=off: seeks to EOF

The comment at lines 1089-1094 appropriately documents the multi-member gzip limitation where skip_bytes only tracks bytes within the current member. This is a known limitation acknowledged in the PR discussion.
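The three scenarios above can be modeled as a small decision helper (names are illustrative, not the actual symbols in tail_file.c):

```c
#include <stdint.h>

/* Hypothetical model of the resume-seek decision for a restored
 * file entry. */
enum seek_target {
    SEEK_TO_ANCHOR,  /* gzip with persisted skip: seek to member start */
    SEEK_TO_OFFSET,  /* normal file or legacy gzip row: stored offset */
    SEEK_TO_HEAD,    /* no DB row, read_from_head enabled */
    SEEK_TO_EOF      /* no DB row, read_from_head disabled */
};

static enum seek_target choose_seek(int is_gzip, int has_db_row,
                                    uint64_t skip_bytes, int read_from_head)
{
    if (has_db_row) {
        if (is_gzip && skip_bytes > 0) {
            return SEEK_TO_ANCHOR;  /* re-enter member, skip decompressed bytes */
        }
        return SEEK_TO_OFFSET;
    }
    return read_from_head ? SEEK_TO_HEAD : SEEK_TO_EOF;
}
```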


1130-1138: LGTM!

Correctly initializes gzip resume fields for compressed files without DB persistence. Setting stream_offset = 0 is intentional as there's no prior decompressed byte count to track.


1329-1338: LGTM!

Good initialization of the gzip resume fields. Using FLB_TAIL_DB_ID_NONE instead of a magic 0 improves code clarity and consistency with the deletion path.


1509-1548: Correct offset rewind implementation for uncompressed files.

The logic properly:

  1. Rewinds offset by buf_len for uncompressed files to prevent data loss on restart
  2. Logs a debug message documenting the acknowledged gzip limitation
  3. Uses db_id > FLB_TAIL_DB_ID_NONE check to avoid resurrecting deleted DB entries (addressing the past review concern)

The gzip limitation (lines 1537-1541) is intentional and documented in the PR objectivesβ€”mapping decompressed buffer positions back to compressed offsets is infeasible with streaming decompression.


1668-1671: LGTM!

Correctly resets all gzip resume state fields on file truncation, consistent with the truncation handling in tail_fs_stat.c and tail_fs_inotify.c.


1872-1892: Correct implementation of decompressed data skipping.

The skip logic properly handles:

  1. Full skip: when exclude_bytes >= decompressed_data_length, decrement and discard all data
  2. Partial skip: calculate remaining bytes, use memmove for the overlapping buffer shift, clear skipping_mode

Using memmove is correct here since source and destination overlap.
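A minimal model of this skip step (illustrative names; the real code operates on the decompression buffer in tail_file.c):

```c
#include <stdint.h>
#include <string.h>
#include <stddef.h>

/* Hypothetical sketch: apply exclude_bytes to a freshly decompressed
 * chunk, shifting any remainder to the front of the buffer. Returns
 * the number of bytes kept. */
static size_t apply_exclude(char *data, size_t len,
                            uint64_t *exclude_bytes, int *skipping_mode)
{
    size_t keep;

    if (*exclude_bytes >= len) {
        /* full skip: discard the whole chunk, keep skipping */
        *exclude_bytes -= len;
        return 0;
    }

    /* partial skip: keep the tail; memmove because regions overlap */
    keep = len - (size_t) *exclude_bytes;
    memmove(data, data + (size_t) *exclude_bytes, keep);
    *exclude_bytes = 0;
    *skipping_mode = 0;
    return keep;
}
```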


1933-1954: Solid gzip member boundary detection and anchor update.

The logic correctly:

  1. Tracks decompressed bytes within the current member via skip_bytes
  2. Detects member completion when the decompressor expects a new header and all buffers are empty
  3. Updates anchor_offset to the current raw file position for safe resume

As noted in the PR discussion (by cosmo0920), there's a known corner case with multi-member gzip + multiline where a shutdown between the in-memory skip_bytes increment and the DB persist can cause small duplication on restart. This is technically difficult to eliminate and is appropriately documented as a limitation rather than a bug.

plugins/in_tail/tail_db.c (7)

28-34: LGTM!

Clean callback implementation for detecting query results. This addresses the previous review feedback about using pragma_table_info for reliable column existence detection.


61-107: Robust schema migration using pragma_table_info.

This addresses the previous review feedback:

  • Uses pragma_table_info to reliably detect column existence
  • Properly distinguishes between query failures (returns NULL with error log) and missing columns (triggers migration)
  • Using flb_plg_debug for migration messages is appropriate per the PR discussion

182-232: LGTM! Proper type handling for skip/anchor columns.

The extended signature correctly uses:

  • int64_t for offset and anchor (matching sqlite3_column_int64 return type)
  • uint64_t for skip (matching file->skip_bytes type)

This addresses the previous review concern about potential truncation on platforms where off_t is 32-bit. The added cleanup at lines 204-205 before returning on error is also good practice.


262-263: LGTM!

Correctly binds skip_bytes and anchor_offset to the INSERT statement parameters.


323-359: LGTM! Correct restoration of gzip resume state from DB.

The logic properly:

  1. Retrieves skip_bytes and anchor_offset from the database
  2. Initializes skipping_mode and exclude_bytes when skip_bytes > 0, enabling the skip logic in flb_tail_file_chunk()
  3. Uses correct types matching the db_file_exists signature

373-375: LGTM!

Binding order correctly matches the SQL_UPDATE_OFFSET statement: offset, skip, anchor for SET clause, then db_id for WHERE clause.


444-444: Critical fix: Reset db_id to prevent DB entry resurrection.

Setting db_id = FLB_TAIL_DB_ID_NONE after deletion works in conjunction with the db_id > FLB_TAIL_DB_ID_NONE check in flb_tail_file_remove() to prevent the bug where a deleted file's DB entry could be recreated if offset rewinding occurs after deletion.



coderabbitai[bot] avatar Dec 09 '25 02:12 coderabbitai[bot]

@codex review

jinyongchoi avatar Dec 09 '25 03:12 jinyongchoi

@codex review

jinyongchoi avatar Dec 09 '25 05:12 jinyongchoi

Codex Review: Didn't find any major issues. 🚀

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

@coderabbitai review

jinyongchoi avatar Dec 09 '25 07:12 jinyongchoi

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

coderabbitai[bot] avatar Dec 09 '25 07:12 coderabbitai[bot]

@edsiper @leonardo-albertovich @cosmo0920 Hello, This PR is ready for review.

Accurately mapping decompressed buffer positions back to compressed file offsets is not feasible with the current streaming decompression approach. Therefore, I have scoped this PR to address the data loss issue for uncompressed files only.

For compressed files, I have added a warning log when unprocessed buffered data exists upon exit. I have also updated the PR title and description to reflect this scope.

If there are any good ideas for supporting compressed files, I would be happy to incorporate them. However, given the scope and complexity, I believe it would be better addressed in a separate PR.

Thanks.

jinyongchoi avatar Dec 09 '25 08:12 jinyongchoi

@codex review

jinyongchoi avatar Dec 09 '25 11:12 jinyongchoi

Codex Review: Didn't find any major issues. Bravo.


@codex review

jinyongchoi avatar Dec 14 '25 11:12 jinyongchoi

@codex review

jinyongchoi avatar Dec 15 '25 06:12 jinyongchoi

@codex review

jinyongchoi avatar Dec 15 '25 08:12 jinyongchoi

Codex Review: Didn't find any major issues. You're on a roll.


@codex review

jinyongchoi avatar Dec 15 '25 14:12 jinyongchoi

Codex Review: Didn't find any major issues. Another round soon, please!


@cosmo0920

In addition to the existing implementation, the following changes were added:

  • Fixed data loss in the in_tail plugin when tailing gzip files across restarts
  • Added DB columns (skip_bytes, anchor_offset) to persist the uncompressed position
  • Implemented a skip mechanism to resume from the correct position in the decompressed stream
  • Reset tracking fields on file rotation

Please let me know if you have any feedback or see any issues with this approach.

Thanks!

jinyongchoi avatar Dec 16 '25 03:12 jinyongchoi

@cosmo0920 Thanks for the review! I've addressed the feedback in two commits:

  • a6afba7: code style fix. Fixed else formatting to match project style.

  • 1263901: initialize stream_offset for compressed files.
    • DB path: added the missing stream_offset = skip_bytes initialization when restoring compressed files from the DB.
    • Non-DB path: explicitly initialize exclude_bytes and stream_offset to 0 for clarity and consistency.
    • Includes a comment noting the limitation for multi-member gzip files.

Please let me know if there are any other concerns.

jinyongchoi avatar Dec 16 '25 13:12 jinyongchoi

I found a potential issue, but it could be skipped in this PR:

Interaction with multiline + gzip (MEDIUM risk, but acceptable)

There’s a subtle corner case:

Scenario:

  1. Multi-member gzip
  2. Mid-member processing
  3. Shutdown happens after decompression but before anchor update
  4. DB has:
    1. anchor_offset β†’ previous member
    2. skip_bytes β†’ partially updated
  5. Restart resumes correctly most of the time, but…

Risk:

If shutdown occurs between

  • skip_bytes += processed_bytes;

and

  • flb_tail_db_file_offset()

then the DB may lag by one chunk.

This results in small duplication, not loss.

This scenario is still the main risk of falling behind the correct offset when tailing gzipped files, but it only causes duplicated contents, not loss. It would be technically difficult to solve, so we need to describe this corner case if we accept the behavior.

Other risks still exist in this patch, but they are relatively small compared to this one.

Plus, it's just a logging question, but should we log the database alteration when starting this operation? This is just my curiosity.

cosmo0920 avatar Dec 17 '25 07:12 cosmo0920

Thanks for the detailed analysis. I fully agree with your opinion. Although there is a risk of duplication in case of abrupt shutdown, it is technically difficult to solve completely, and duplication is definitely better than data loss.

Also, regarding your question about the logs, I have changed the log level of the database migration messages to debug. This ensures they are not too noisy during normal operation while still being available for troubleshooting if needed.

Finally, should I add a note about the limitation (potential duplication on crash) to the documentation? I think adding a warning/note to the 'Database file' section would be helpful for users.

Let me know what you think! Thanks!

jinyongchoi avatar Dec 18 '25 05:12 jinyongchoi

Finally, should I add a note about the limitation (potential duplication on crash) to the documentation? I think adding a warning/note to the 'Database file' section would be helpful for users.

Let me know what you think! Thanks!

I suppose we need Note annotations depicting the possibility of database corruption in the official documentation, via a corresponding documentation PR. This could be a corner case, but it's technically hard to solve cleanly.

cosmo0920 avatar Dec 19 '25 06:12 cosmo0920

Finally, should I add a note about the limitation (potential duplication on crash) to the documentation? I think adding a warning/note to the 'Database file' section would be helpful for users. Let me know what you think! Thanks!

I suppose we need Note annotations depicting the possibility of database corruption in the official documentation, via a corresponding documentation PR. This could be a corner case, but it's technically hard to solve cleanly.

Got it! I'll create a separate PR for the documentation. Thanks!

jinyongchoi avatar Dec 19 '25 07:12 jinyongchoi