fluentd
fluentd copied to clipboard
Incorrect msgpack format?
Describe the bug
Please see this related issue for more detail: https://github.com/fluent/fluent-logger-forward-node/issues/37
In attempting to use a 3rd party plugin to decode 'out_forward' requests from a fluentd server, there seems to be issues when decoding the packed message.
I am aware that out_forward plugin sends events by the PackedForward mode. It encloses event records with msgpack str format instead of bin format for a backward compatibility reason.
The library seems to work fine with fluent bit.
I wasn't (and still, honestly, am not completely) sure whether the issue was within fluentd, using some non-standard msgpack thing, the msgpack library or the fluent library I'm using.
Using fluent/fluent-logger-forward-node
(see this issue: https://github.com/fluent/fluent-logger-forward-node/issues/37 ) which uses "@msgpack/msgpack": "^2.7.1". I've tried several different versions of msgpack and the fluent-logger-forward-node lib with no real better results.
More information about the issue and my attempts to track it down are listed in the issue linked above.
To Reproduce
- Setup a fluentd server with out_foward pushing something to localhost
- Create a simple app with the library as a forward server and listen for forwarded events
- No events are emitted due to incorrect parsing of the message.
Code for my node test app, note that the local client inside this test app works fine; the messages it sends get emitted by the server just fine. It also seems like, for the most part (although I did seem to have some issues, see other issue) fluent bit works fine as an initial forwarder as well.
#!/usr/bin/node
'use strict';
const FluentServer = require("@fluent-org/logger").FluentServer;
const FluentClient = require("@fluent-org/logger").FluentClient;
const server = new FluentServer({ listenOptions: { port: 24420 }});
const logger = new FluentClient("tag_prefix", {
socket: {
host: "localhost",
port: 24420,
timeout: 3000, // 3 seconds
}
});
let main = async ()=>{
await server.listen();
logger.emit('tag', { message: 'message' });
};
server.on('entry', (a,b,c) => console.log(a,b,c));
main()
fluentd/td-agent/calyptia config (tried several versions)
<match system.**>
@type forward
send_timeout 60s
recover_wait 10s
hard_timeout 60s
<server>
name fluent_node
host 127.0.0.1
port 24420
weight 60
</server>
<secondary>
@type file
...
</secondary>
</match>
Now, I've also tried taking the message that's packed within the main message (the main, containing message seems to decode properly) and attempted to decode it with other things, such as an online msgpack service (albiet not knowing which lib they're using) with no real luck.
It seems to decode the entire packet as a utf8 string; my first assumption, after reading the protocol spec and it saying that the actual message was encoded as mentioned above, however, the bytes from that string seem to not ever decode properly themselves alone.
Expected behavior
The packet should be able to be decoded correctly by a msgpack library, even if the contained messages need to be re-decoded again, they seem to not be decodeable.
Your Environment
- Fluentd version: docker:edge, other versions w/td-agent and calyptia-fluentd
- TD Agent version: v3, v4 and calyptia-fluentd
- Operating system: ubuntu 20.04, debian 10
- Kernel version: likely n/a
Your Configuration
<match td.*.*>
@type tdlog
@id output_td
apikey YOUR_API_KEY
auto_create_table
<buffer>
@type file
path /var/log/td-agent/buffer/td
</buffer>
<secondary>
@type file
path /var/log/td-agent/failed_records
</secondary>
</match>
<match debug.**>
@type stdout
@id output_stdout
</match>
<source>
@type exec
tag system.loadavg
command cat /proc/loadavg | cut -d ' ' -f 1,2,3
run_interval 15s
<parse>
@type tsv
keys avg1,avg5,avg15
delimiter " "
</parse>
</source>
<source>
@type syslog
@id apidevtest_syslog
port 5140
bind 127.0.0.1
tag system
</source>
<match **>
@type copy
<store>
@type forward
send_timeout 60s
recover_wait 10s
hard_timeout 60s
keepalive true
# heartbeat_type none
require_ack_response false
verify_connection_at_startup true
<server>
name fluent_node
host 127.0.0.1
port 24420
weight 60
</server>
<secondary>
@type file
path /var/log/td-agent/secondary
</secondary>
</store>
<store>
@type file
@id output_file
path /var/log/td-agent/copy
</store>
</match>
Your Error Log
No fluentd error log that applies; see the linked issue above for outputs from attempted debugging.
Additional context
No response
Closed thinking that the issue was somewhere else, re-opening because maybe it IS in here; see: https://github.com/fluent/fluent-logger-forward-node/issues/37
To quote @jamiees2's excellent synopsis of the issue:
Yeah, I think the actual problem is in msgpack-ruby, at least @msgpack/msgpack seems to be doing the right thing according to the spec.
The thing is, bytes like 0x92 aren't even serializable in UTF-8, so I'm not sure how this would ever have worked. I also got wildly different results when I ran with TEXT_DECODING=force, which forces the msgpack decoder to use TextDecoder instead of their custom one, so it's implementation dependent. I initially looked at ways to reverse the utf8 decoding process, but it looked like it was lossy, and also seemed harder than just messing with the parsing instance.
The data was never in UTF8, fluentd was encoding it in this way. All it does is encode it in msgpack bin format, but stick a 0xd9/0xda/0xdb in front instead of a 0xc4/0xc5/0xc6.
I'm actually somewhat confused why fluentd sends its data in this way, it says it's for backwards compatibility reasons, but fluentd accepts bin formatted data as well. I imagine msgpack-ruby had a bug in it that caused this behavior, and once it got fixed they couldn't change both at once? idk
..thoughts?
This issue has been automatically marked as stale because it has been open 90 days with no activity. Remove stale label or comment or this issue will be closed in 30 days
This issue was automatically closed because of stale in 30 days