squared
Count of mentions for taps and targets in slack, overall and per channel
Migrated from GitLab: https://gitlab.com/meltano/squared/-/issues/2
Originally created by @aaronsteers on 2021-11-01 23:37:51
@pnadolny13 and @tayloramurphy
In another thread, Derek pulled this metric:
Search Slack for `psycopg2`: there's 110 messages across 5 channels, 7 people.
I'd love to see a dbt transform that takes our slack message history and searches for:
- all known taps and targets (a full list can be pulled from Hub json endpoint or yaml source).
- all listed variant owners (e.g. `transferwise`, `meltanolabs`, etc.; as above, this can be gotten from machine-readable sources).
- certain known problem words like `python 3.10`, `psycopg2`, etc. when they can be indicative of user issues.
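As a rough illustration of the matching logic (not the dbt model itself), here is a minimal Python sketch that counts how many messages mention each keyword, overall and per channel. The `channel` and `text` field names, the sample rows, and the short keyword list are assumptions made up for the example; the real list would come from the Hub.

```python
import re
from collections import Counter

# Hypothetical sample rows; real data would come from the Slack message history.
MESSAGES = [
    {"channel": "troubleshooting", "text": "tap-mysql fails: psycopg2 won't build on python 3.10"},
    {"channel": "troubleshooting", "text": "Same psycopg2 error here, and again psycopg2 after reinstall"},
    {"channel": "general", "text": "Has anyone tried the transferwise variant of tap-mysql?"},
]

# Illustrative keyword list (assumption); the full list could be pulled from the Hub.
KEYWORDS = ["tap-mysql", "transferwise", "psycopg2", "python 3.10"]


def keyword_pattern(keyword):
    """Case-insensitive pattern that avoids matching inside longer names."""
    return re.compile(r"(?<![\w-])" + re.escape(keyword) + r"(?![\w-])", re.IGNORECASE)


def count_mentioning_messages(messages, keywords):
    """Count messages (not occurrences) per keyword, overall and per channel."""
    patterns = {kw: keyword_pattern(kw) for kw in keywords}
    overall = Counter()
    per_channel = Counter()
    for msg in messages:
        for kw, pattern in patterns.items():
            if pattern.search(msg["text"]):
                overall[kw] += 1
                per_channel[(msg["channel"], kw)] += 1
    return overall, per_channel


overall, per_channel = count_mentioning_messages(MESSAGES, KEYWORDS)
for kw, n in overall.most_common():
    print(kw, n)
```

The lookarounds are one possible way to keep `tap-mysql` from also matching a longer name such as `tap-mysql-foo`; a SQL version would need an equivalent regex or tokenization step.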
Happy to contribute time to this but wanted to get it logged.
Known caveats: These metrics would of course have lots of false positives, and can't be used on their own for decision making or as a "wall of shame" per se (because the volume of users on the best taps will naturally also create more mentions). The idea is to provide directional feedback and a lens to observe improvements and changes over time.
Other thoughts:
- This could be done in small iterations like starting with just 10 or so taps and targets.
- Deduping: Eventually we'd want to dedupe instances of the keyword in messages down to "number of threads" and/or "number of users mentioning". A single message could reference the same tap name 4 times, and the thread could go on to mention it another 10-40 times. It's more helpful to count each thread as '1 mentioning thread' or as '3 mentioning users', versus counting 40 literal mentions of the tap name over the course of a few-hour conversation.
- Filtering the data by specific channels like `#troubleshooting` could also give a more clear signal.
- Variant/maintainer mentions are tricky because there's not a good way to tell the difference between the messages "The `acme` variant of tap-mysql is failing." versus "Your variant of `tap-mysql` is old; try `acme` variant instead." I think it will still be good data, although harder to interpret.
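To make the deduping idea concrete, here is a small sketch that compares literal mentions with mentioning threads and mentioning users for one keyword. The `thread_ts` and `user` field names and the sample rows are assumptions for illustration, and the matching is a plain case-insensitive substring count.

```python
# Hypothetical rows: `thread_ts` identifies the thread, `user` the author.
MESSAGES = [
    {"thread_ts": "t1", "user": "u1", "text": "tap-mysql is failing, tap-mysql logs attached"},
    {"thread_ts": "t1", "user": "u2", "text": "Which tap-mysql variant?"},
    {"thread_ts": "t1", "user": "u1", "text": "The default tap-mysql one"},
    {"thread_ts": "t2", "user": "u3", "text": "tap-mysql works for me"},
    {"thread_ts": "t2", "user": "u3", "text": "unrelated follow-up"},
]


def dedupe_mentions(messages, keyword):
    """Return literal, per-thread, and per-user mention counts for one keyword."""
    needle = keyword.lower()
    literal = 0
    threads = set()
    users = set()
    for msg in messages:
        hits = msg["text"].lower().count(needle)
        if hits:
            literal += hits
            threads.add(msg["thread_ts"])
            users.add(msg["user"])
    return {
        "literal_mentions": literal,
        "mentioning_threads": len(threads),
        "mentioning_users": len(users),
    }


summary = dedupe_mentions(MESSAGES, "tap-mysql")
print(summary)
```

With these sample rows, 5 literal mentions collapse to 2 mentioning threads and 3 mentioning users.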
Example use case:
- After switching to `pipelinewise` fork for a list of 4 taps, does the number of `#troubleshooting` mentions go down or up over the 2 months following?
- If we identify that `psycopg2` is a significant issue for new-user experience, how well are we doing at promoting variants that don't require pre-installing?
- How many people mention `dbt` in `#troubleshooting` channels or overall? Is that number growing or shrinking over time?
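Questions like these come down to a per-month count, optionally filtered to one channel. A minimal sketch, assuming each message row carries a `channel`, a `thread_ts`, and a Slack-style epoch-seconds `ts` string (the field names, the `_ts` helper, and the sample rows are all made up for the example):

```python
from collections import defaultdict
from datetime import datetime, timezone


def _ts(year, month, day):
    """Hypothetical helper: build an epoch-seconds string for sample data."""
    return str(datetime(year, month, day, tzinfo=timezone.utc).timestamp())


# Hypothetical sample rows.
MESSAGES = [
    {"channel": "troubleshooting", "thread_ts": "a", "ts": _ts(2021, 11, 3), "text": "dbt run fails"},
    {"channel": "troubleshooting", "thread_ts": "a", "ts": _ts(2021, 11, 3), "text": "which dbt version?"},
    {"channel": "troubleshooting", "thread_ts": "b", "ts": _ts(2021, 11, 20), "text": "dbt docs question"},
    {"channel": "general", "thread_ts": "c", "ts": _ts(2021, 11, 21), "text": "we love dbt"},
    {"channel": "troubleshooting", "thread_ts": "d", "ts": _ts(2021, 12, 5), "text": "dbt again"},
]


def monthly_mentioning_threads(messages, keyword, channel=None):
    """Count distinct mentioning threads per UTC month, optionally for one channel."""
    needle = keyword.lower()
    threads_by_month = defaultdict(set)
    for msg in messages:
        if channel is not None and msg["channel"] != channel:
            continue
        # Plain substring match, so "dbt" also matches e.g. "dbt-core".
        if needle not in msg["text"].lower():
            continue
        sent = datetime.fromtimestamp(float(msg["ts"]), tz=timezone.utc)
        threads_by_month[sent.strftime("%Y-%m")].add(msg["thread_ts"])
    return {month: len(threads) for month, threads in sorted(threads_by_month.items())}


in_troubleshooting = monthly_mentioning_threads(MESSAGES, "dbt", channel="troubleshooting")
overall = monthly_mentioning_threads(MESSAGES, "dbt")
print(in_troubleshooting, overall)
```

Comparing the months before and after a change (such as a variant switch) would then be a matter of reading off the relevant keys, keeping the false-positive caveats above in mind.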