buck2
buck2 copied to clipboard
API design for BuckEvent data
Today, the build telemetry is done pretty much through Scribe (Meta's internal message queue system) into Scuba (Meta's telemetry platform).
This resulted in Buck2 implementing the EventSink trait for Scribe only:
/// A sink for events, easily plumbable to the guts of systems that intend to produce events consumeable by
/// higher-level clients. Sending an event is synchronous.
pub trait EventSink: Send + Sync {
/// Sends an event into this sink, to be consumed elsewhere. Explicitly does not return a Result type; if sending
/// an event does fail, implementations will handle the failure by panicking or performing some other graceful
/// recovery; callers of EventSink are not expected to handle failures.
fn send(&self, event: BuckEvent);
/// Sends a control event into this sink, to be consumed elsewhere. Control events are not sent to gRPC clients.
fn send_control(&self, control_event: ControlEvent);
/// Collects stats on this sink (e.g. messages accepted, rejected).
fn stats(&self) -> Option<EventSinkStats>;
}
In here:
-
BuckEvent
is defined inbuck_data/data.proto
which is great for backward compatibility -
ControlEvent
is an enum/// An event that can be produced by Buck2 that is not intended to be presented to the user, but rather is used to /// communicate with other parts of Buck2. #[derive(Clone, From)] pub enum ControlEvent { /// A command result, produced upon completion of a command. CommandResult(Box<CommandResult>), /// A progress event from this command. Different commands have different types. PartialResult(PartialResult), }
where
CommandResult
andPartialResult
are both defined inbuck2_cli_proto/daemon.proto
. Not sure yet how backward compatible guarentee this is just yet. -
EventSinkStats
is a struct defined inbuck2_events
crate, not guarantee to be backward compatible.
It would be nice if we could define the telemetry data in a unified interface/data format similar to Bazel's BEP so that third parties could implement the server side for it. At the very least, having the data structures defined in a unified protobuf would help assuring that the future changes are backward compatible.
Is this what you're looking for?
https://github.com/facebook/buck2/blob/main/app/buck2_data/data.proto
You mention the data from daemon.proto, I think this is generally not data we'd expect to be used for telemetry.
I think buck2_data/data.proto is considered a backward compatible specification of the buck2 events. As @krallin noted, the daemon.proto stuff isn't, but that's probably okay for this use case. I'd expect the parts related to the event sink that interact with daemon.proto stuff or rust internal things would just be that we could add a grpc events sink implementation that would be able to send the events to some grpc service (i.e. something similar to https://github.com/googleapis/googleapis/blob/master/google/devtools/build/v1/publish_build_event.proto). I don't think that we'd try to match (or approximately match) the BEP directly, but an analysis of what that would look like could be interesting (i could maybe see someone prototyping some stuff with a simple proxy that converted a subset of buckevents into equivalent BEP events).
In addition, I think just providing a bit of documentation about the buck events (from data.proto) somewhat like that linked BEP page would be useful.
What do you folks think if we just start with creating root//app/buck2_eventserver_proto
with this
syntax = "proto3";
import "google/protobuf/duration.proto";
import "data.proto";
package buck.eventserver;
message BuckEventRequest {
BuckEvent event = 1;
};
message BuckEventResponse {
// TBD
};
service EventServer {
rpc send(stream BuckEventRequest) returns (stream BuckEventResponse);
}
Should be the minimal mapping to EventSink.send()
trait in the current buck2_events
crate.
Once we have a client/server stub in place, the rest could be iterated from there.
Are there any plans for Buck2 to support Build Event Protocol? This protocol is adopted by most remote execution services like EngFlow, BuildBuddy, BuildBarn, and so on.
As we started seeing upticks in Buck2 interest from BuildBuddy side, I sent https://github.com/facebook/buck2/pull/685 to kickstart some discussion about this.
The initial design is intentionally minimal so that we could get a bare minimum POC working. I hope that we can add more to the API as the usage matures over time.
The RPC was heavily inspired by PublishBuildToolEventStream RPC on Bazel side.
As I look more into the current usage of ThriftScribeSink in https://github.com/facebook/buck2/pull/686/files, a few questions/thoughts pop up:
-
It's strange that
send_now
andsend_messages_now
are overly used in several places instead ofsend
. I guess this has something to do with Scribe's queue inside Meta could be slow? -
Those 2 fn are also
async
which made it harder to refactor them into a separate Trait. -
Most likely in the OSS grpc implementation, we will just stub these events to call
send
instead as I don't expect there would be a separate API for queue/non-queue for the initial design.
(I'll respond to that last message in the PR)