[Draft] telemetry: Revamp telemetry data collection workflow

Open ankatiyar opened this issue 1 year ago • 5 comments

Description

I'm opening this issue to collect feedback and ideas on improving the data collection workflow in kedro-telemetry. TODO: I'll add findings from my work on the spike https://github.com/kedro-org/kedro/issues/2522.

Context

Why is this change important to you? How would you use it? How can it benefit other users?

Possible Implementation

(Optional) Suggest an idea for implementing the addition or change.

Possible Alternatives

(Optional) Describe any alternative solutions or features you've considered.

ankatiyar avatar Oct 05 '23 14:10 ankatiyar

Related: #333

astrojuanlu avatar Oct 05 '23 15:10 astrojuanlu

In https://github.com/kedro-org/kedro/issues/2519 we're running in circles again to make users install kedro-telemetry without trying too hard. It's something that initially appeared in https://github.com/kedro-org/kedro/issues/2522, although @ankatiyar provided a solution to that problem already.

In my opinion, we should make the telemetry collection mechanism a mandatory dependency of Kedro, while still keeping the current opt-in flow for actually enabling such collection. I know this might ruffle some feathers but as long as we keep the opt-in flow explicit and robust, I don't think we're breaking any promises.

Otherwise I think it's better to not collect any telemetry at all.

astrojuanlu avatar Oct 20 '23 12:10 astrojuanlu

One data point: in the past 30 days, kedro-telemetry had 13% as many downloads as kedro.

  • https://www.pepy.tech/projects/kedro
  • https://www.pepy.tech/projects/kedro-telemetry

astrojuanlu avatar Dec 18 '23 15:12 astrojuanlu

I agree with @astrojuanlu: I think it would be better to incorporate telemetry directly into the Kedro codebase. This approach would involve prompting users for their opt-in consent during their first command execution if a consent environment variable hasn’t been set already. We would then record their response in that variable.
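
To make the first-run flow concrete, here is a minimal sketch in Python. It is an illustration only, not how Kedro or kedro-telemetry actually works: the variable name `EXAMPLE_TELEMETRY_CONSENT` and the functions `parse_consent` and `resolve_consent` are hypothetical, and writing to `os.environ` only lasts for the current process, so a real implementation would need some persistent store.

```python
import os
from typing import Callable, MutableMapping, Optional

# Hypothetical variable name, chosen for illustration only.
CONSENT_VAR = "EXAMPLE_TELEMETRY_CONSENT"

_YES = {"1", "true", "yes", "y"}
_NO = {"0", "false", "no", "n"}


def parse_consent(raw: str) -> Optional[bool]:
    """Map a raw string to True/False, or None if it is not a clear answer."""
    value = raw.strip().lower()
    if value in _YES:
        return True
    if value in _NO:
        return False
    return None


def resolve_consent(
    env: MutableMapping[str, str] = os.environ,
    ask: Callable[[str], str] = input,
) -> bool:
    """Return the stored consent, prompting once if nothing is stored yet."""
    stored = parse_consent(env.get(CONSENT_VAR, ""))
    if stored is not None:
        # A previous answer exists, so do not prompt again.
        return stored

    answer = parse_consent(ask("Share anonymous usage data? [y/N]: "))
    # Opt-in: anything other than an explicit "yes" counts as "no".
    consent = answer is True
    # Record the answer (in-process only; persistence is out of scope here).
    env[CONSENT_VAR] = "1" if consent else "0"
    return consent
```

Passing `env` and `ask` as parameters keeps the sketch testable without touching the real environment or standard input; for example, `resolve_consent({}, lambda _: "y")` exercises the prompt path with a plain dictionary.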

Currently, the prompt for telemetry participation occurs only after the plugin's installation, leading to confusion among users about the necessity of installing the plugin if they are not interested in participating. I think it would be more logical and user-friendly to inquire about telemetry participation at the first run of a Kedro command and only after obtaining the user's consent proceed with the plugin installation. However, from what I understand, this method might introduce technical uncertainties.

So I think a more reliable approach might be to integrate the telemetry plugin directly into Kedro itself.

DimedS avatar Feb 08 '24 14:02 DimedS

👍🏽 In a first phase, we'll focus on clarifying the current scope of data collection and fixing outstanding issues. We will continue working towards that goal later on, and draft our communications plan for users accordingly.

astrojuanlu avatar Feb 11 '24 23:02 astrojuanlu

I will open a new issue with the next steps.

astrojuanlu avatar Jun 05 '24 08:06 astrojuanlu