
Count of mentions for taps and targets in slack, overall and per channel

Open MeltyBot opened this issue 3 years ago • 1 comments

Migrated from GitLab: https://gitlab.com/meltano/squared/-/issues/2

Originally created by @aaronsteers on 2021-11-01 23:37:51


@pnadolny13 and @tayloramurphy

In another thread, Derek pulled this metric:

Search slack for psycopg2 there's 110 messages across 5 channels, 7 people.

I'd love to see a dbt transform that takes our slack message history and searches for:

  1. all known taps and targets (a full list can be pulled from Hub json endpoint or yaml source).
  2. all listed variant owners (e.g. transferwise, meltanolabs, etc.; as above, this can be gotten from machine-readable sources).
  3. certain known problem words like python 3.10, psycopg2, etc. when they can be indicative of user issues.
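A minimal sketch of the keyword search described above, written in Python rather than as a dbt model so it stays self-contained. The keyword list, the message fields (`channel`, `text`), and the sample messages are all hypothetical placeholders; a real list would come from the Hub's machine-readable sources, and real messages from the Slack history.

```python
import re
from collections import Counter

# Hypothetical keyword list covering taps/targets, variant owners, and problem words.
KEYWORDS = ["tap-mysql", "target-postgres", "transferwise", "psycopg2", "python 3.10"]


def compile_keyword(keyword):
    # Lookarounds stop "tap-mysql" from matching inside "tap-mysql-custom".
    return re.compile(r"(?<![\w-])" + re.escape(keyword) + r"(?![\w-])", re.IGNORECASE)


PATTERNS = {kw: compile_keyword(kw) for kw in KEYWORDS}


def count_mentions(messages):
    """Count messages mentioning each keyword, overall and per channel."""
    overall = Counter()
    per_channel = Counter()
    for msg in messages:
        for kw, pattern in PATTERNS.items():
            if pattern.search(msg["text"]):
                overall[kw] += 1
                per_channel[(msg["channel"], kw)] += 1
    return overall, per_channel


# Made-up sample data, for illustration only.
messages = [
    {"channel": "troubleshooting", "text": "psycopg2 fails to build for target-postgres"},
    {"channel": "troubleshooting", "text": "Same psycopg2 error on Python 3.10."},
    {"channel": "general", "text": "Anyone using tap-mysql-custom?"},
]
overall, per_channel = count_mentions(messages)
```

Each message counts at most once per keyword here, which is only a first step toward the thread-level and user-level counts discussed under "Other thoughts".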

Happy to contribute time to this but wanted to get it logged.

Known caveats: These metrics would of course have lots of false positives, and they can't be used on their own for decision making or as a "wall of shame" per se (because the volume of users on the best taps will naturally also create more mentions). The idea is to provide directional feedback and a lens to observe improvements and changes over time.

Other thoughts:

  • This could be done in small iterations like starting with just 10 or so taps and targets.
  • Deduping: Eventually we'd want to dedupe instances of the keyword in messages to "number of threads" and/or "number of users mentioning". A single message could reference the same tap name 4 times, and the thread could go on to mention it 10-40 times. It's more helpful to count each thread as '1 mentioning thread' or as '3 mentioning users' than to count 40 literal mentions of the tap name over the course of a few-hour conversation.
  • Filtering the data by specific channels like #troubleshooting could also give a clearer signal.
  • Variant/maintainer mentions are tricky because there's not a good way to tell the difference between the messages "The acme variant of tap-mysql is failing." versus "Your variant of tap-mysql is old; try acme variant instead." I think it will still be good data, although harder to interpret.
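The deduping and channel-filtering ideas above could look roughly like the following sketch. The message fields (`channel`, `thread_ts`, `user`, `text`) and the sample rows are assumptions for illustration, not the actual schema of the Slack data, and the plain substring match is a simplification.

```python
def dedupe_mentions(messages, keyword, channel=None):
    """Summarize mentions of one keyword as literal hits, threads, and users.

    If `channel` is given, only messages from that channel are considered.
    """
    needle = keyword.lower()
    literal_mentions = 0
    threads = set()
    users = set()
    for msg in messages:
        if channel is not None and msg["channel"] != channel:
            continue
        hits = msg["text"].lower().count(needle)
        if hits:
            literal_mentions += hits
            # Thread ids are assumed unique only within a channel.
            threads.add((msg["channel"], msg["thread_ts"]))
            users.add(msg["user"])
    return {
        "literal_mentions": literal_mentions,
        "mentioning_threads": len(threads),
        "mentioning_users": len(users),
    }


# Made-up sample data: one busy thread plus one unrelated mention.
messages = [
    {"channel": "troubleshooting", "thread_ts": "t1", "user": "u1",
     "text": "tap-mysql fails; is tap-mysql broken?"},
    {"channel": "troubleshooting", "thread_ts": "t1", "user": "u2",
     "text": "I see the same with tap-mysql"},
    {"channel": "troubleshooting", "thread_ts": "t1", "user": "u1",
     "text": "Pinning tap-mysql helped"},
    {"channel": "general", "thread_ts": "t2", "user": "u3",
     "text": "tap-mysql release is out"},
]
everywhere = dedupe_mentions(messages, "tap-mysql")
troubleshooting_only = dedupe_mentions(messages, "tap-mysql", channel="troubleshooting")
```

In this sample, the four literal mentions in #troubleshooting collapse to a single mentioning thread with two mentioning users, which is the signal the deduping bullet is after.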

Example use cases:

  • After switching to the pipelinewise fork for a list of 4 taps, does the number of #troubleshooting mentions go down or up over the following 2 months?
  • If we identify that psycopg2 is a significant issue for new-user experience, how well are we doing at promoting variants that don't require pre-installing?
  • How many people mention dbt in #troubleshooting channels or overall? Is that number growing or shrinking over time?
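Questions like these come down to a per-month trend of deduplicated mentions. A rough sketch, assuming each message carries a hypothetical ISO `date` string alongside `channel`, `thread_ts`, and `text` (none of these field names come from the real dataset):

```python
from collections import defaultdict


def monthly_thread_counts(messages, keyword, channel):
    """Count distinct threads per month that mention `keyword` in `channel`."""
    needle = keyword.lower()
    threads_by_month = defaultdict(set)
    for msg in messages:
        if msg["channel"] != channel:
            continue
        # Plain substring match, so "dbt" also matches e.g. "dbt-core".
        if needle in msg["text"].lower():
            month = msg["date"][:7]  # "YYYY-MM" prefix of an ISO date
            threads_by_month[month].add(msg["thread_ts"])
    return {month: len(ids) for month, ids in sorted(threads_by_month.items())}


# Made-up sample data, for illustration only.
messages = [
    {"channel": "troubleshooting", "thread_ts": "t1", "date": "2021-09-14", "text": "dbt run fails"},
    {"channel": "troubleshooting", "thread_ts": "t1", "date": "2021-09-14", "text": "which dbt version?"},
    {"channel": "troubleshooting", "thread_ts": "t2", "date": "2021-10-02", "text": "dbt models missing"},
    {"channel": "troubleshooting", "thread_ts": "t3", "date": "2021-10-20", "text": "DBT docs question"},
    {"channel": "general", "thread_ts": "t4", "date": "2021-10-21", "text": "dbt is great"},
]
trend = monthly_thread_counts(messages, "dbt", "troubleshooting")
```

Comparing the counts for the months before and after a change (such as a variant switch) would give the directional signal described in the caveats, subject to the same false-positive concerns.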
