Add a single run interface
This is similar to https://github.com/cloudprober/cloudprober/issues/23, or at least requires the same sort of primitives.
Cloudprober is designed to run in a continuous mode, but it would be nice to be able to run it just once (or a given number of times, like the -c flag of ping) and generate a report or a pass/fail signal.
For certain probe types, e.g. PING and UDP, a single run interface will be harder to support, but for others, e.g. HTTP, DNS, GRPC, TCP, and EXTERNAL, it should be doable. This issue tracks the implementation of such an interface for the latter probe types.
Gave more thought to single run interface at the binary level and below:
Interface
- Binary flags: --single_mode, --count
- The current flow is cmd/cloudprober → cloudprober.Start → prober.Start → probe.Start. In single run mode it will be cmd/cloudprober → cloudprober.Run(count) → prober.Run(count) → probe.Run(count).
- Probes will need to support a Run interface. We can add another interface called ProbeWithRun.
- In single run mode, we’ll not write metrics to the shared data channel, instead we’ll just return EventMetrics. We’ll summarize EventMetrics after run and return output in JSON as well as human-readable text format.
Comments
The Run interface will be hard for ping and UDP style probes, but that’s okay. We don’t have to build them all at once.
Probe's run interface will look like this:
type Probe interface {
	Init(name string, opts *options.Options) error
	Start(ctx context.Context, dataChan chan *metrics.EventMetrics)
	Run(ctx context.Context) (bool, []byte, error) // successOrFail, JSON-formatted metrics, error
}
Older interface will look like this:
type ProbeWithoutRun interface {
	Init(name string, opts *options.Options) error
	Start(ctx context.Context, dataChan chan *metrics.EventMetrics)
}
I think the final result format is going to be interesting here. Do we return metrics at the end, or just success/fail and perhaps the overall duration?
Or maybe both? We can do the following:
- Mark exit status successful if all probes succeeded.
- Allow caller to parse the metrics output to figure out more details, e.g. we can return metrics in json format:
[{ name: probe1 dst: target-1 success: total: latency: }, {} ]
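To make the "both" option concrete, here is a sketch of a result type and exit status computation. The JSON keys follow the list above; the Go types, the latency unit (milliseconds), and the names probeResult and report are assumptions, not a settled format.

```go
package main

import (
	"encoding/json"
)

// probeResult carries the fields proposed above. The Go types and the
// latency unit (milliseconds) are assumptions.
type probeResult struct {
	Name    string  `json:"name"`
	Dst     string  `json:"dst"`
	Success int64   `json:"success"`
	Total   int64   `json:"total"`
	Latency float64 `json:"latency"`
}

// report returns the JSON-encoded results and a process exit code:
// 0 if every probe succeeded on every attempt, 1 otherwise.
func report(results []probeResult) ([]byte, int, error) {
	out, err := json.Marshal(results)
	if err != nil {
		return nil, 1, err
	}
	code := 0
	for _, r := range results {
		// A probe that never ran is treated as a failure.
		if r.Total == 0 || r.Success < r.Total {
			code = 1
			break
		}
	}
	return out, code, nil
}
```

The binary would print the JSON to stdout and pass the code to os.Exit, so scripts can check the status first and parse the output only when they need details.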
One use of this functionality could be running cloudprober to verify a deployment. It may be useful to extend it with things like:
- Wait for at least 3 consecutive successes, for up to 5 minutes.
- Override the probe interval and timeout on the command line
Added first draft of single run interface in #1081.