
[ECS Fargate] [request]: Partial_message field not included in platform version 1.4.0

Open PettitWesley opened this issue 3 years ago • 8 comments

Community Note

  • Please vote on this issue by adding a 👍 reaction to the original issue to help the community and maintainers prioritize this request
  • Please do not leave "+1" or "me too" comments, they generate extra noise for issue followers and do not help prioritize the request
  • If you are interested in working on this issue or have submitted a pull request, please leave a comment

Tell us about your request

In every container runtime that I am aware of, stdout/stderr messages from containers are split once they exceed a certain size. In Docker, CRI, and containerd, IIRC the limit is always 16 KB. This means that if your container emits a message larger than 16 KB, it will be split into multiple messages.
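To make the splitting behaviour concrete, here is a minimal Ruby sketch. It only mimics the effect described above and is not the actual Docker or containerd code; `LIMIT` and `split_log` are hypothetical names, and the 16 KB value is the limit recalled above.

```ruby
# Illustrative only: mimics the splitting described above, not the actual
# runtime implementation. LIMIT and split_log are hypothetical names.
LIMIT = 16 * 1024 # 16 KB, the limit recalled above

# Split one log message into chunks of at most `limit` bytes.
# Note: byteslice cuts on byte boundaries, so a multibyte character
# could end up spread across two chunks.
def split_log(message, limit = LIMIT)
  chunks = []
  offset = 0
  while offset < message.bytesize
    chunks << message.byteslice(offset, limit)
    offset += limit
  end
  chunks
end

parts = split_log("x" * 40_000)
p parts.map(&:bytesize) # => [16384, 16384, 7232]
```

A 40,000-byte message therefore arrives as three separate records, and nothing in the chunks themselves says they belong together unless the runtime adds a flag.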

You can then use a tool like Fluentd (Fluent Bit support is something I am working on: https://github.com/aws/aws-for-fluent-bit/issues/25) to concatenate those records back into one. However, this relies on the runtime setting a flag that tells you the series of messages you are receiving was originally one message.

The Fluentd Plugin Concat can join these back together: https://github.com/fluent-plugins-nursery/fluent-plugin-concat

To understand how this works, let's look at the example from the Fluentd plugin unit test: https://github.com/fluent-plugins-nursery/fluent-plugin-concat/blob/master/test/plugin/test_filter_concat.rb#L551

  sub_test_case "partial_key" do
    test "filter with docker style events" do
      config = <<-CONFIG
        key message
        partial_key partial_message
        partial_value true
      CONFIG
      messages = [
        { "container_id" => "1", "message" => "start", "partial_message" => "true" },
        { "container_id" => "1", "message" => " message 1", "partial_message" => "true" },
        { "container_id" => "1", "message" => " message 2", "partial_message" => "true" },
        { "container_id" => "1", "message" => "end", "partial_message" => "false" },
        { "container_id" => "1", "message" => "start", "partial_message" => "true" },
        { "container_id" => "1", "message" => " message 3", "partial_message" => "true" },
        { "container_id" => "1", "message" => " message 4", "partial_message" => "true" },
        { "container_id" => "1", "message" => "end", "partial_message" => "false" },
      ]
      filtered = filter(config, messages, wait: 3)
      expected = [
        { "container_id" => "1", "message" => "start\n message 1\n message 2\nend" },
        { "container_id" => "1", "message" => "start\n message 3\n message 4\nend" },
      ]
      assert_equal(expected, filtered)
    end
  end
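The behaviour this test exercises can be sketched in plain Ruby, without Fluentd. This is not the plugin's implementation: `concat_partials` is a hypothetical helper, buffering per `container_id` is an assumption made for the illustration, and the `"\n"` separator is taken from the expected output in the test.

```ruby
# Hypothetical helper, not fluent-plugin-concat's code. It buffers values of
# `key` per container_id (an assumption) while the partial flag is set, and
# emits one merged record when a record without the flag arrives.
def concat_partials(records, key: "message", partial_key: "partial_message",
                    partial_value: "true", separator: "\n")
  buffers = Hash.new { |hash, id| hash[id] = [] }
  merged_records = []
  records.each do |record|
    id = record["container_id"]
    buffers[id] << record[key]
    next if record[partial_key] == partial_value # series not finished yet

    merged = record.reject { |name, _| name == partial_key }
    merged[key] = buffers.delete(id).join(separator)
    merged_records << merged
  end
  merged_records
end

messages = [
  { "container_id" => "1", "message" => "start", "partial_message" => "true" },
  { "container_id" => "1", "message" => " message 1", "partial_message" => "true" },
  { "container_id" => "1", "message" => " message 2", "partial_message" => "true" },
  { "container_id" => "1", "message" => "end", "partial_message" => "false" },
]
p concat_partials(messages).map { |r| r["message"] }
# => ["start\n message 1\n message 2\nend"]
```

The point is that the join only works when the runtime sets `partial_message`; without that field, every chunk looks like a complete message.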

Which service(s) is this request for?

ECS Fargate

Are you currently working around this issue?

I am not aware of any way to work around this.

PettitWesley avatar Nov 04 '21 21:11 PettitWesley

I suspect this will require changes in the shim loggers: https://github.com/aws/amazon-ecs-shim-loggers-for-containerd

PettitWesley avatar Nov 04 '21 21:11 PettitWesley

I should note that my original examples are not complete. This is what real split logs should look like; note the multiple partial message fields:

{"source"=>"stdout", "log"=>"{"payload": "0123456789......01234567890abcdef", "partial_message"=>"true", "partial_id"=>"dc37eb08b4242c41757d4cd995d983d1cdda4589193755a22fcf47a638317da0", "partial_ordinal"=>"1", "partial_last"=>"false", "container_id"=>"a96998303938eab6087a7f8487ca40350f2c252559bc6047569a0b11b936f0f2", "container_name"=>"/hopeful_taussig"}]
{"container_name"=>"/hopeful_taussig", "source"=>"stderr", "log"=>"{"payload": "..01234567890abcdef", "partial_message"=>"true", "partial_id"=>"ecccce95711776e6a06d631af8e9227686446814eba7a87cb59b36bbaaad8b58", "partial_ordinal"=>"1", "partial_last"=>"false", "container_id"=>"a96998303938eab6087a7f8487ca40350f2c252559bc6047569a0b11b936f0f2"}]
{"partial_last"=>"false", "container_id"=>"a96998303938eab6087a7f8487ca40350f2c252559bc6047569a0b11b936f0f2", "container_name"=>"/hopeful_taussig", "source"=>"stdout", "log"=>"ghijklmnopqrstuvwxyz..0123456789", "partial_message"=>"true", "partial_id"=>"dc37eb08b4242c41757d4cd995d983d1cdda4589193755a22fcf47a638317da0", "partial_ordinal"=>"2"}]
{"log"=>"0...012", "partial_message"=>"true", "partial_id"=>"dc37eb08b4242c41757d4cd995d983d1cdda4589193755a22fcf47a638317da0", "partial_ordinal"=>"3", "partial_last"=>"false", "container_id"=>"a96998303938eab6087a7f8487ca40350f2c252559bc6047569a0b11b936f0f2", "container_name"=>"/hopeful_taussig", "source"=>"stdout"}]
{"source"=>"stderr", "log"=>"ghijklmnopqrstuvwxyz..0123456789", "partial_message"=>"true", "partial_id"=>"ecccce95711776e6a06d631af8e9227686446814eba7a87cb59b36bbaaad8b58", "partial_ordinal"=>"2", "partial_last"=>"false", "container_id"=>"a96998303938eab6087a7f8487ca40350f2c252559bc6047569a0b11b936f0f2", "container_name"=>"/hopeful_taussig"}]
{"partial_id"=>"ecccce95711776e6a06d631af8e9227686446814eba7a87cb59b36bbaaad8b58", "partial_ordinal"=>"3", "partial_last"=>"false", "container_id"=>"a96998303938eab6087a7f8487ca40350f2c252559bc6047569a0b11b936f0f2", "container_name"=>"/hopeful_taussig", "source"=>"stderr", "log"=>"0...012", "partial_message"=>"true"}]
{"container_id"=>"a96998303938eab6087a7f8487ca40350f2c252559bc6047569a0b11b936f0f2", "container_name"=>"/hopeful_taussig", "source"=>"stdout", "log"=>"34567890..abcdefghijklmnopqrstuv", "partial_message"=>"true", "partial_id"=>"dc37eb08b4242c41757d4cd995d983d1cdda4589193755a22fcf47a638317da0", "partial_ordinal"=>"4", "partial_last"=>"false"}]
{"partial_last"=>"false", "container_id"=>"a96998303938eab6087a7f8487ca40350f2c252559bc6047569a0b11b936f0f2", "container_name"=>"/hopeful_taussig", "source"=>"stderr", "log"=>"34567890..abcdefghijklmnopqrstuv", "partial_message"=>"true", "partial_id"=>"ecccce95711776e6a06d631af8e9227686446814eba7a87cb59b36bbaaad8b58", "partial_ordinal"=>"4"}]
{"container_id"=>"a96998303938eab6087a7f8487ca40350f2c252559bc6047569a0b11b936f0f2", "container_name"=>"/hopeful_taussig", "source"=>"stdout", "log"=>"wxyz....01234567890.", "event_id": 692, "counter": 0, "global_counter": 0, "time": "2022-01-30 23:36:32.001007"}", "partial_message"=>"true", "partial_id"=>"dc37eb08b4242c41757d4cd995d983d1cdda4589193755a22fcf47a638317da0", "partial_ordinal"=>"5", "partial_last"=>"true"}]
{"log"=>"wxyz....01234567890.", "event_id": 692, "counter": 0, "global_counter": 0, "time": "2022-01-30 23:36:32.001007"}", "partial_message"=>"true", "partial_id"=>"ecccce95711776e6a06d631af8e9227686446814eba7a87cb59b36bbaaad8b58", "partial_ordinal"=>"5", "partial_last"=>"true", "container_id"=>"a96998303938eab6087a7f8487ca40350f2c252559bc6047569a0b11b936f0f2", "container_name"=>"/hopeful_taussig", "source"=>"stderr"}]
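As a rough illustration of how these extra fields allow interleaved stdout and stderr chunks to be put back together, here is a Ruby sketch. It is not the shim logger or Fluentd implementation; `reassemble` and `chunk` are hypothetical helpers, and the short IDs `"A"`, `"B"`, and `"c1"` stand in for the long hashes above.

```ruby
# Hypothetical sketch: group chunks by partial_id, order them by
# partial_ordinal, and emit the joined log when partial_last is "true".
def reassemble(records)
  pending = Hash.new { |hash, id| hash[id] = [] }
  complete = []
  records.each do |record|
    unless record["partial_message"] == "true"
      complete << record # not a split message, pass it through
      next
    end
    id = record["partial_id"]
    pending[id] << record
    next unless record["partial_last"] == "true"

    parts = pending.delete(id).sort_by { |r| r["partial_ordinal"].to_i }
    complete << {
      "source" => record["source"],
      "container_id" => record["container_id"],
      "log" => parts.map { |r| r["log"] }.join # chunks are fragments, so no separator
    }
  end
  complete
end

# Build a chunk record with the same field names as the logs above.
def chunk(source, id, ordinal, last, log)
  { "source" => source, "log" => log, "partial_message" => "true",
    "partial_id" => id, "partial_ordinal" => ordinal.to_s,
    "partial_last" => last.to_s, "container_id" => "c1" }
end

records = [
  chunk("stdout", "A", 1, false, "abc"),
  chunk("stderr", "B", 1, false, "123"),
  chunk("stdout", "A", 2, false, "def"),
  chunk("stderr", "B", 2, true, "456"),
  chunk("stdout", "A", 3, true, "ghi"),
]
p reassemble(records).map { |r| [r["source"], r["log"]] }
# => [["stderr", "123456"], ["stdout", "abcdefghi"]]
```

Keying on `partial_id` rather than on `container_id` alone is what keeps the stdout and stderr series from being mixed, since both come from the same container.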

PettitWesley avatar Feb 07 '22 23:02 PettitWesley

What will it take to get this prioritised? Is there any ETA? It seems that related discussions have been rumbling on since at least November 2020: https://github.com/aws/aws-for-fluent-bit/issues/100

marksumm avatar Mar 09 '22 12:03 marksumm

@marksumm I cannot provide an ETA or launch date. However, this hasn't been forgotten and I am working on it.

PettitWesley avatar Mar 15 '22 04:03 PettitWesley

@marksumm implementation complete: https://github.com/aws/amazon-ecs-shim-loggers-for-containerd/pull/24

PettitWesley avatar Apr 18 '22 06:04 PettitWesley

Great news! Thanks for your work on this.

marksumm avatar Apr 18 '22 06:04 marksumm

@PettitWesley - What is this issue pending on now?

rcollette avatar Nov 08 '23 19:11 rcollette

Can you please close this issue as it's been implemented?

remiflament avatar Apr 23 '24 09:04 remiflament