
protobuf parsing error when sending pyroscope-dotnet info to vector

Open jsonarso opened this issue 6 months ago • 5 comments

A note for the community

  • Please vote on this issue by adding a 👍 reaction to the original issue to help the community and maintainers prioritize this request
  • If you are interested in working on this issue or have submitted a pull request, please leave a comment

Problem

I'm developing a PoC for connecting pyroscope-dotnet to Vector. I have this example app running in a simple pod within a k8s cluster.

The PYROSCOPE_SERVER_ADDRESS env var is set and points to a Vector http_server source configured with protobuf decoding, like so:

pyroscope:
    type: http_server
    address: 0.0.0.0:4040
    path: /ingest
    decoding:
      codec: protobuf
      protobuf:
        desc_file: /vector-data-files/profile.desc
        message_type: perftools.profiles.Profile

As you can see, I'm referencing a profile.desc generated from this .proto file using the protoc CLI tool like this:

protoc --descriptor_set_out=profile.desc --include_imports profile.proto

profile.desc is packaged into a ConfigMap so the pod is able to consume it.
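
For reference, the ConfigMap can be created straight from the descriptor file with something like the following and then mounted into the Vector pod at /vector-data-files (the ConfigMap name and namespace here are just illustrative):

kubectl create configmap profile-desc --from-file=profile.desc -n monitoring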

It seems that both sides are able to communicate, but when checking the logs I see the following:

The sample app is erroring as follows: [2024-01-27 00:30:20.158 | info | PId: 1 | TId: 13] PyroscopePprofSink 400

Vector logs show:

{"error":"Error parsing protobuf: DecodeError { description: \"invalid wire type: ThirtyTwoBit (expected LengthDelimited)\", stack: [] }","error_code":"decoder_deserialize","error_type":"parser_failed","host":"aks-default-42715425-vmss000028","internal_log_rate_limit":true,"message":"Failed deserializing frame.","metadata":{"kind":"event","level":"ERROR","module_path":"vector::internal_events::codecs","target":"vector::internal_events::codecs"},"pid":1,"source_type":"internal_logs","stage":"processing","timestamp":"2024-01-27T00:30:20.158087899Z","vector":{"component_id":"pyroscope-ingest","component_kind":"source","component_type":"http_server"}}

Configuration

apiVersion: v1
kind: Namespace
metadata:
  name: poc
---
apiVersion: v1
kind: Pod
metadata:
  name: pyropoc
  namespace: poc
  annotations:
    app: pyropoc
spec:
  containers:
  - name: pyropoc
    image: ""
    env:
      - name: ASPNETCORE_URLS
        value: http://*:5000
      - name: PYROSCOPE_SERVER_ADDRESS
        value: http://vector-agent.monitoring.svc.cluster.local:4040
    ports:
      - containerPort: 5000
        protocol: TCP
        name: http

Version

0.29.0

Debug Output

No response

Example Data

No response

Additional Context

No response

References

https://github.com/grafana/pyroscope-dotnet/issues/56

jsonarso avatar Jan 29 '24 16:01 jsonarso

Hi @jsonarso!

I'm not familiar with Pyroscope, but the error is indicating that Vector isn't receiving length-delimited protobuf messages. Is it possible to configure the client to frame the protobuf messages in that manner?

jszwedko avatar Jan 29 '24 19:01 jszwedko

Hey @jszwedko, thanks for the quick response.

I don't have a lot of experience with it either. It uses an agent based on dd-trace-dotnet to retrieve continuous profiling info from dotnet apps.

I only see some basic environment variables for configuration but nothing related to framing. https://grafana.com/docs/pyroscope/latest/configure-client/language-sdks/dotnet/

Could the Vector http_server framing feature help in this scenario, or does that happen after decoding? Just wondering: https://vector.dev/docs/reference/configuration/sources/http_server/#framing
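
From the docs it looks like framing is applied before decoding (it splits the payload into frames that the decoder then deserializes), so I guess something along these lines would at least control how the body is chunked (just a sketch based on that docs page, option names assumed from there):

pyroscope:
    type: http_server
    address: 0.0.0.0:4040
    path: /ingest
    framing:
      method: bytes          # hand the whole request body to the decoder as a single frame
    decoding:
      codec: protobuf
      protobuf:
        desc_file: /vector-data-files/profile.desc
        message_type: perftools.profiles.Profile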

jsonarso avatar Jan 29 '24 20:01 jsonarso

Mmm, yeah, I see. I'm struggling to find some docs on the protocol pyroscope uses to forward data. It might be worth asking their community about it.

jszwedko avatar Jan 30 '24 15:01 jszwedko

I think things are becoming a bit clearer after reading this: https://grafana.com/docs/pyroscope/latest/configure-server/about-server-api/#pprof-format

What I've found is that the dotnet profiler sends a multipart/form-data HTTP request which contains both a .pprof file and a JSON sample type config.

----cpp-httplib-multipart-data-yJjxnAh1Fskvttvw
Content-Disposition: form-data; name="profile"; filename="profile.pprof"

W
O	

 !"#$���	"""""""""""""""""	"	"
"
"""""
"
""""""""""""""""""""""""""""""""""""" " "!"!"""""#"#"$"$* * * * *	 * 
*
 * *	 *
 * * *
 * * * * * * * * * * * * *  *! *" *# *$ *% * & *!' *"( *#) *$* 22nanoseconds2cpu2Microsoft.Extensions.Hosting2MMicrosoft.Extensions.Hosting.Internal!Host.<DisposeAsync>g__DisposeAsync|16_02GMicrosoft.Extensions.Hosting.Internal!Host.<DisposeAsync>d__16.MoveNext2System.Private.CoreLib2RSystem.Runtime.CompilerServices!AsyncMethodBuilderCore.Start..........`
----cpp-httplib-multipart-data-yJjxnAh1Fskvttvw
Content-Disposition: form-data; name="sample_type_config"; filename="sample_type_config.json"

{
  "alloc_samples": {
    "units": "objects",
    "display-name": "alloc_objects"
  },
  "alloc_size": {
    "units": "bytes",
    "display-name": "alloc_space"
  },
  "cpu": {
    "units": "samples",
    "sampled": true
  },
  "exception": {
    "units": "exceptions",
    "display-name": "exceptions"
  },
  "lock_count": {
    "units": "lock_samples",
    "display-name": "mutex_count"
  },
  "lock_time": {
    "units": "lock_nanoseconds",
    "display-name": "mutex_duration"
  },
  "wall": {
    "units": "samples",
    "sampled": true
  },
  "inuse_objects": {
    "units": "objects",
    "display-name": "inuse_objects",
    "aggregation": "average"
  },
  "inuse_space": {
    "units": "bytes",
    "display-name": "inuse_space",
    "aggregation": "average"
  }
}
----cpp-httplib-multipart-data-yJjxnAh1Fskvttvw--

Interesting things I've noticed:

  1. Without any encoding set, Vector splits the request into multiple events/messages.
  2. When setting the http server encoding to binary, I receive the whole message in one event.
  3. There's some other encoding happening behind the scenes, because Vector shows messages as follows:
{
  "message":"----cpp-httplib-multipart-data-HOQwocFKsiFUtIar\r\nContent-Disposition: form-data; name="profile"; filename="profile.pprof"\r\n\r\n\n\u0004\b\u0002\u0010\u0001\u0012\u000b\n\u0003\u0001\u0002\u0003\u0012\u0004���\u0004"\u0006\b\u0001"\u0002\b\u0001"\u0006\b\u0002"\u0002\b\u0002"\u0006\b\u0003"\u0002\b\u0003\u0006\b\u0001\u0010\u0004 \u0003\u0006\b\u0002\u0010\u0005 \u0003\u0006\b\u0003\u0010\u0006 \u00032\u00002\u000bnanoseconds2\u0003cpu2\u0016System.Private.CoreLib2System.Threading!WaitHandle.WaitOneNoCheck2>System.Threading!PortableThreadPool.GateThread.GateThreadStart2%System.Threading!Thread.StartCallbackZ\u0004\b\u0002\u0010\u0001`\u0001\r\n----cpp-httplib-multipart-data-HOQwocFKsiFUtIar\r\nContent-Disposition: form-data; name="sample_type_config"; filename="sample_type_config.json"\r\n\r\n{\n "alloc_samples": {\n "units": "objects",\n "display-name": "alloc_objects"\n },\n "alloc_size": {\n "units": "bytes",\n "display-name": "alloc_space"\n },\n "cpu": {\n "units": "samples",\n "sampled": true\n },\n "exception": {\n "units": "exceptions",\n "display-name": "exceptions"\n },\n "lock_count": {\n "units": "lock_samples",\n "display-name": "mutex_count"\n },\n "lock_time": {\n "units": "lock_nanoseconds",\n "display-name": "mutex_duration"\n },\n "wall": {\n "units": "samples",\n "sampled": true\n },\n "inuse_objects": {\n "units": "objects",\n "display-name": "inuse_objects",\n "aggregation": "average"\n },\n "inuse_space": {\n "units": "bytes",\n "display-name": "inuse_space",\n "aggregation": "average"\n }\n}\r\n----cpp-httplib-multipart-data-HOQwocFKsiFUtIar--\r\n"
}

This seems like a receive-and-forward scenario, but of course I need to keep the HTTP request format as it's generated by the profiler... any ideas?

jsonarso avatar Feb 01 '24 03:02 jsonarso

Interesting, thanks for those additional details. I think the best you'll be able to do with just Vector is to receive the whole payload as one event (the binary encoding). I'm guessing that won't enable you to do the processing you want, though it would allow a simple pass-through.

Alternatively, this might be a place where you'd need a sidecar to receive the requests, parse them, and forward them to Vector.
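
Very roughly, such a sidecar could look something like this (just a sketch to illustrate the idea; the route, port, and Vector URL are assumptions, and the Vector side would need a matching framing/decoding setup):

# Hypothetical sidecar: terminate the Pyroscope multipart upload, pull out
# the raw pprof bytes, and forward only that part to the Vector http_server
# source. Names, ports, and the Vector URL below are illustrative.
from flask import Flask, request, Response
import requests

app = Flask(__name__)
VECTOR_URL = "http://vector-agent.monitoring.svc.cluster.local:4040/ingest"  # assumed Vector endpoint

@app.route("/ingest", methods=["POST"])
def ingest():
    # The profiler sends the pprof payload as the "profile" part of the form data.
    profile = request.files.get("profile")
    if profile is None:
        return Response("missing profile part", status=400)
    raw = profile.read()  # raw protobuf-encoded perftools.profiles.Profile
    resp = requests.post(
        VECTOR_URL,
        data=raw,
        headers={"Content-Type": "application/octet-stream"},
        timeout=5,
    )
    return Response(status=resp.status_code)

if __name__ == "__main__":
    app.run(host="0.0.0.0", port=4041)  # the profiler would be pointed here

The profiler's PYROSCOPE_SERVER_ADDRESS would then point at the sidecar instead of at Vector directly.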

jszwedko avatar Feb 01 '24 19:02 jszwedko