ethfoo comments

Results: 102 comments of ethfoo

Kafka source disable auto-commit will occur re-consume

This flow chart gives a simple description: ![image](https://user-images.githubusercontent.com/3616466/160340430-55ab6304-fdeb-4e56-ae5d-f182cb5bd6d0.png) Loggie has at-least-once semantics, so an event may be delivered more than once. It seems that `franz-go` Kafka...

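The interceptors json_decode configuration does not match the source code

You are looking at the standalone json_decode interceptor. Its implementation has been moved to the jsonDecode processor of the normalize interceptor, and the standalone interceptor will be deprecated later, so it does not appear in the documentation.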

How can the Kafka Source output the message key?

Not supported for now; we can consider adding support for it.

The problem that a Pipeline cannot have multiple Sinks

#123

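The problem that a Pipeline cannot have multiple Sinks

> Perhaps only the opentelemetry-collector approach, where multiple pipelines reuse one source, can solve this problem? As I recall, although opentelemetry-collector has only one source in its configuration, it actually generates multiple pipelines at runtime. That is essentially the same as configuring multiple pipelines in Loggie to get multiple sinks. (Maybe the configuration just looks a bit more concise?)
>
> The **most fundamental** challenge in implementing multiple sinks is synchronizing progress (collect -> send) between the source and the sinks.
>
> A. **Reworking a single source**: with only one source, that source has to be aware of the send progress (offset) of every sink in order to guarantee no loss (at least once). This demands too much of the source and is not appropriate, and it may also create a strong dependency on the sinks.
>
> B. **A single persistent queue**: as a fallback, if there is a persistent queue, the source only has to send to the queue to consider delivery successful, so the source's collection progress can be synchronized on the source -> queue side. With multiple sinks, the queue -> sink transaction still has to be guaranteed, and the implementation complexity moves into the queue. (Data can only be deleted from the queue after all sinks have acked?)
>
> C. **One persistent queue per sink**: even with multiple queues, each bound to one sink, the send progress of the sinks is still out of sync (an unavailable sink can be understood as a sink whose send progress is slow). The usual strategy is probably to block the source once any queue is full; otherwise the source still has to track and maintain the acks of multiple queues. Either way, the sinks still affect one another.
>
> So overall, configuring another pipeline to get multiple sinks probably has only one drawback, the resources of the extra source, and compared with the resources a persistent queue takes, that may even be less. The trade-off here needs further discussion. It looks like you have dug into this quite deeply; feel free to add the Loggie bot so we can discuss in detail. [https://loggie-io.github.io/docs/getting-started/overview/#_3](https://loggie-io.github.io/docs/getting-started/overview/#_3)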
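To make the progress-synchronization point in option A concrete, here is a minimal Go sketch, not Loggie code. It assumes a hypothetical `ackTracker` that remembers the highest offset each sink has acknowledged; the source may only advance its committed offset to the minimum across all sinks, which is why one slow or unavailable sink holds everything back.

```go
package main

import "fmt"

// ackTracker is a hypothetical helper (not part of Loggie) that records,
// per sink, the highest offset that sink has acknowledged.
type ackTracker struct {
	acked map[string]int64
}

// newAckTracker registers the given sinks with nothing acknowledged yet.
func newAckTracker(sinks ...string) *ackTracker {
	tr := &ackTracker{acked: make(map[string]int64, len(sinks))}
	for _, s := range sinks {
		tr.acked[s] = -1 // -1 means "no offset acknowledged yet"
	}
	return tr
}

// ack records that sink has delivered everything up to and including offset.
// Unknown sinks and stale (lower) offsets are ignored.
func (tr *ackTracker) ack(sink string, offset int64) {
	if cur, ok := tr.acked[sink]; ok && offset > cur {
		tr.acked[sink] = offset
	}
}

// committable returns the highest offset acknowledged by every sink,
// i.e. the minimum over all sinks, or -1 if some sink has acked nothing.
func (tr *ackTracker) committable() int64 {
	first := true
	var lowest int64 = -1
	for _, off := range tr.acked {
		if first || off < lowest {
			lowest = off
			first = false
		}
	}
	return lowest
}

func main() {
	tr := newAckTracker("kafka", "elasticsearch")
	tr.ack("kafka", 100)
	tr.ack("elasticsearch", 40)
	fmt.Println(tr.committable()) // limited by the slower sink

	tr.ack("elasticsearch", 120)
	fmt.Println(tr.committable()) // now limited by kafka
}
```

The same minimum rule would apply to options B and C; only the place where the bookkeeping lives (source or queue) changes.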

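Requirement scenario for hot-reloading configuration

In essence, this is Loggie's discovery module providing a configuration delivery implementation outside of Kubernetes; depending on your needs, it could be HTTP or a webhook. There are probably two forms here:

1. Watch mode: similar to list & watch in the k8s client-go
2. Receiving requests: start an HTTP server, and a third-party service sends the configuration via POST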
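The "receiving requests" form could look roughly like the Go sketch below. It is an illustration under assumptions, not Loggie's discovery module: `handlePush`, `configHandler`, the 1 MiB body limit and the `application/yaml` content type are all hypothetical choices, and `main` uses an in-process test server to stand in for the third-party service.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"strings"
)

// handlePush (hypothetical) validates one pushed configuration, hands it
// to store, and returns the HTTP status code to reply with.
func handlePush(method string, body []byte, store func([]byte) error) int {
	if method != http.MethodPost {
		return http.StatusMethodNotAllowed
	}
	if len(body) == 0 {
		return http.StatusBadRequest
	}
	if err := store(body); err != nil {
		return http.StatusInternalServerError
	}
	return http.StatusOK
}

// configHandler adapts handlePush to net/http, reading at most 1 MiB.
func configHandler(store func([]byte) error) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		body, err := io.ReadAll(io.LimitReader(r.Body, 1<<20))
		if err != nil {
			w.WriteHeader(http.StatusBadRequest)
			return
		}
		w.WriteHeader(handlePush(r.Method, body, store))
	})
}

func main() {
	// The store callback is where a real agent would persist and reload
	// its pipelines; here it only passes the payload back to main.
	received := make(chan []byte, 1)
	store := func(b []byte) error { received <- b; return nil }

	srv := httptest.NewServer(configHandler(store))
	defer srv.Close()

	// Simulate the third-party service pushing a configuration.
	resp, err := http.Post(srv.URL, "application/yaml", strings.NewReader("pipelines: []\n"))
	if err != nil {
		panic(err)
	}
	resp.Body.Close()
	fmt.Println(resp.StatusCode, string(<-received))
}
```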

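Requirement scenario for hot-reloading configuration

For the scenario of periodic HTTP polling, the overall flow agreed on after discussion is:

1. When Loggie starts, it reports the local node identification information to the server's register API
2. After startup, Loggie periodically polls the server's config API and writes the result to the local configuration file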
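The two steps above can be sketched in Go as follows. This is a hedged illustration, not the implementation that was eventually written: the `/register` and `/config` paths, the `node` parameter, the file name and the 50 ms interval are all assumptions made for the example, and an in-process test server stands in for the real config server.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"net/url"
	"os"
	"path/filepath"
	"time"
)

// syncOnce fetches the configuration and writes it only when it differs
// from the last version seen. It returns the version now considered current.
func syncOnce(fetch func() ([]byte, error), write func([]byte) error, last []byte) ([]byte, bool, error) {
	cfg, err := fetch()
	if err != nil {
		return last, false, err
	}
	if bytes.Equal(cfg, last) {
		return last, false, nil
	}
	if err := write(cfg); err != nil {
		return last, false, err
	}
	return cfg, true, nil
}

// register reports the node identifier to the (hypothetical) register API.
func register(client *http.Client, base, node string) error {
	resp, err := client.PostForm(base+"/register", url.Values{"node": {node}})
	if err != nil {
		return err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return fmt.Errorf("register: unexpected status %s", resp.Status)
	}
	return nil
}

func main() {
	// Stand-in for the config server, so the sketch runs on its own.
	mux := http.NewServeMux()
	mux.HandleFunc("/register", func(w http.ResponseWriter, r *http.Request) {})
	mux.HandleFunc("/config", func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprint(w, "pipelines: []\n")
	})
	srv := httptest.NewServer(mux)
	defer srv.Close()

	client := &http.Client{Timeout: 5 * time.Second}

	// Step 1: report the node identity once at startup.
	if err := register(client, srv.URL, "node-1"); err != nil {
		panic(err)
	}

	// Step 2: poll the config API on a timer and write to a local file.
	path := filepath.Join(os.TempDir(), "loggie-pipelines-example.yml")
	fetch := func() ([]byte, error) {
		resp, err := client.Get(srv.URL + "/config?node=node-1")
		if err != nil {
			return nil, err
		}
		defer resp.Body.Close()
		return io.ReadAll(resp.Body)
	}
	write := func(b []byte) error { return os.WriteFile(path, b, 0o644) }

	var last []byte
	ticker := time.NewTicker(50 * time.Millisecond)
	defer ticker.Stop()
	for i := 0; i < 3; i++ { // a real agent would loop until shutdown
		<-ticker.C
		var changed bool
		var err error
		last, changed, err = syncOnce(fetch, write, last)
		fmt.Println("changed:", changed, "err:", err)
	}
}
```

Skipping the write when the content is unchanged is a design choice of this sketch; it avoids triggering a file-based reload on every poll.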

feat(sink): add pulsar sink

@carrotshub Please resolve the conflicts.

goccy/go-yaml deal with blank string uncorrectly

Please refer to [this](https://loggie-io.github.io/docs/user-guide/use-in-kubernetes/collect-container-logs/#_5) document first when parsing the container log. The two YAML libraries do have some differences, and I'm still trying to figure out how to handle them.

goccy/go-yaml deal with blank string uncorrectly

https://github.com/goccy/go-yaml/issues/309