
Bug: Parca server panics when receiving OTLP Profile signal

Open AntoxaBarin opened this issue 4 months ago • 2 comments

Panic stacktrace:

*Parca logo*
level=info name=parca ts=2025-08-20T07:39:32.111257566Z caller=factory.go:53 msg="loading bucket configuration"
level=info name=parca ts=2025-08-20T07:39:32.114592063Z caller=badgerlogger.go:36 msg="Set nextTxnTs to 0"
level=info name=parca ts=2025-08-20T07:39:32.11654715Z caller=server.go:90 msg="starting server" addr=:7070
panic: runtime error: index out of range [2] with length 0

goroutine 245 [running]:
github.com/parca-dev/parca/pkg/normalizer.(*labelNames).addOtelAttributesFromTable(0xc00069c980, {0x0, 0x0, 0x0}, {0xc000d398b0, 0x3, 0x3})
        /home/parca/pkg/normalizer/otel.go:88 +0x13c
github.com/parca-dev/parca/pkg/normalizer.getAllLabelNames(0xc00098fa40)
        /home/parca/pkg/normalizer/otel.go:147 +0x3d9
github.com/parca-dev/parca/pkg/normalizer.OtlpRequestToArrowRecord({0x8792850, 0xc0065bf560}, 0xc00098fa40, 0xc0006c70e0, {0x87706a0, 0xb63a260})
        /home/parca/pkg/normalizer/otel.go:51 +0x1b7
github.com/parca-dev/parca/pkg/profilestore.(*ProfileColumnStore).Export(0xc000320540, {0x8792850, 0xc0065bf560}, 0xc00098fa40)
...
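
Reading the trace: the first slice argument to addOtelAttributesFromTable is {0x0, 0x0, 0x0}, i.e. a nil slice of length 0, while the second has length 3. Indexing an empty table with any of those indices produces exactly this runtime error. A minimal sketch that reproduces the message, assuming the function indexes the table directly with each index (lookupUnchecked and the index values are hypothetical, not Parca code):

```go
package main

import "fmt"

// lookupUnchecked indexes table with every value in indices without any
// bounds check and returns the text of the panic, if one occurs.
// Hypothetical helper for illustration only.
func lookupUnchecked(table []string, indices []int32) (msg string) {
	defer func() {
		if r := recover(); r != nil {
			msg = fmt.Sprint(r) // runtime errors print their Error() text
		}
	}()
	for _, i := range indices {
		_ = table[i] // panics when i is outside the bounds of table
	}
	return ""
}
```

Calling lookupUnchecked(nil, []int32{2, 0, 1}) returns the same "index out of range [2] with length 0" text seen in the panic above.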

From OTLP proto specification (commit 3ec4649f):

message Sample {
  // locations_start_index along with locations_length refers to a slice of locations in Profile.location_indices.
  int32 locations_start_index = 1;
  // locations_length along with locations_start_index refers to a slice of locations in Profile.location_indices.
  // Supersedes location_index.
  int32 locations_length = 2;
  // The type and unit of each value is defined by Profile.sample_type.
  repeated int64 values = 3;
  // References to attributes in ProfilesDictionary.attribute_table. [optional]
  repeated int32 attribute_indices = 4;

  // Reference to link in ProfilesDictionary.link_table. [optional]
  // It can be unset / set to 0 if no link exists, as link_table[0] is always a 'null' default value.
  int32 link_index = 5;

  // Timestamps associated with Sample represented in nanoseconds. These
  // timestamps should fall within the Profile's time range.
  repeated fixed64 timestamps_unix_nano = 6;
}
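
As a side note on the first two fields, this is how I read the start index and length pair: together they select a window of Profile.location_indices. A rough sketch with a bounds check, assuming plain int32 slices (sampleLocations is a hypothetical helper, not part of the generated proto code):

```go
package main

import "fmt"

// sampleLocations returns the window of locationIndices selected by a
// sample's start index and length, or an error if the window does not fit.
// Hypothetical helper for illustration only.
func sampleLocations(locationIndices []int32, start, length int32) ([]int32, error) {
	if start < 0 || length < 0 {
		return nil, fmt.Errorf("negative start %d or length %d", start, length)
	}
	end := int64(start) + int64(length) // widen to avoid int32 overflow
	if end > int64(len(locationIndices)) {
		return nil, fmt.Errorf("range [%d:%d] exceeds %d location indices", start, end, len(locationIndices))
	}
	return locationIndices[start:end], nil
}
```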

pkg/normalizer/otel.go:136-156:

func getAllLabelNames(req *otelgrpcprofilingpb.ExportProfilesServiceRequest) []string {
	allLabelNames := newLabelNames()

	for _, rp := range req.ResourceProfiles {
		allLabelNames.addOtelAttributes(rp.Resource.Attributes)

		for _, sp := range rp.ScopeProfiles {
			allLabelNames.addOtelAttributes(sp.Scope.Attributes)

			for _, p := range sp.Profiles {
				allLabelNames.addOtelAttributesFromTable(sp.Scope.Attributes, p.AttributeIndices)

				for _, sample := range p.Sample {
					allLabelNames.addOtelAttributesFromTable(sp.Scope.Attributes, sample.AttributeIndices)
					//                                       ^^^^^^^^^^^^^^^^^^^
				}
			}
		}
	}

	return allLabelNames.sorted()
}

Why do we look up the Sample's attributes in InstrumentationScope.attributes? Shouldn't we use ProfilesDictionary.attribute_table instead?
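
To illustrate what I mean, here is a rough sketch of a lookup that resolves the indices against the dictionary's attribute table and returns an error instead of panicking on a bad index. The keyValue type and the labelNamesFromTable function are hypothetical stand-ins; the real generated OTLP types and the way Parca reaches the dictionary from the request are not shown here and may differ:

```go
package main

import "fmt"

// keyValue is a hypothetical stand-in for one entry of
// ProfilesDictionary.attribute_table.
type keyValue struct {
	Key   string
	Value string
}

// labelNamesFromTable records in seen the key of every attribute that
// indices reference in attributeTable. It returns an error when an index
// falls outside the table rather than panicking.
func labelNamesFromTable(attributeTable []keyValue, indices []int32, seen map[string]struct{}) error {
	for _, idx := range indices {
		if idx < 0 || int(idx) >= len(attributeTable) {
			return fmt.Errorf("attribute index %d out of range for table of length %d", idx, len(attributeTable))
		}
		seen[attributeTable[idx].Key] = struct{}{}
	}
	return nil
}
```

With this shape, an empty table plus a non-empty index list yields an error that the Export handler could return to the client, instead of crashing the server.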

AntoxaBarin, Aug 20 '25 07:08

This is a bug. Do you want to open a fix for it?

brancz, Aug 20 '25 09:08

@brancz, thank you for the response!

I tried to fix it locally, but the profiles still don't show up in the UI:

Image

Debug logs:

level=debug name=parca ts=2025-08-20T10:16:44.904804276Z caller=server.go:303 msg="finished call" protocol=grpc grpc.component=server grpc.service=opentelemetry.proto.collector.profiles.v1development.ProfilesService grpc.method=Export grpc.method_type=unary peer.address=127.0.0.1:49950 grpc.start_time=2025-08-20T13:16:44+03:00 grpc.request.deadline=2025-08-20T13:16:49+03:00 grpc.code=OK grpc.time_ms=0.928

level=debug name=parca ts=2025-08-20T10:16:49.718823892Z caller=server.go:303 msg="finished call" protocol=grpc grpc.component=server grpc.service=opentelemetry.proto.collector.profiles.v1development.ProfilesService grpc.method=Export grpc.method_type=unary peer.address=127.0.0.1:49950 grpc.start_time=2025-08-20T13:16:49+03:00 grpc.request.deadline=2025-08-20T13:16:54+03:00 grpc.code=OK grpc.time_ms=1.551

level=debug name=parca ts=2025-08-20T10:17:00.33797933Z caller=server.go:303 msg="finished call" protocol=grpc grpc.component=server grpc.service=opentelemetry.proto.collector.profiles.v1development.ProfilesService grpc.method=Export grpc.method_type=unary peer.address=127.0.0.1:49950 grpc.start_time=2025-08-20T13:17:00+03:00 grpc.request.deadline=2025-08-20T13:17:05+03:00 grpc.code=OK grpc.time_ms=1.922

It seems to be a bigger problem, and I can't fix it right now :(

AntoxaBarin, Aug 20 '25 10:08