
Solving GitHub notifications - like a Meltano Pro! 💪

Open aaronsteers opened this issue 2 years ago • 6 comments

I'm still plagued by GitHub's poor notification system and I think I have a plan to solve it with Meltano:

EL

  1. Create a tap-github EL pipeline to pull all comments ~~and all comment reactions~~ from all repos in the meltano and MeltanoLabs orgs, including private projects.
    • Comments without text body: https://github.com/meltano/squared/issues/399
    • Reactions are probably not possible: https://github.com/MeltanoLabs/tap-github/issues/158
    • Rather than modify the existing tap-github flow, it probably makes sense to add tap-github-comments as derived, with the rules: issue_comments.*, !issue_comments.body
  2. Create a custom mapping function that uses regular expressions to extract only the list of @mentions from each comment, but to not pull in the full text of each comment.
    • https://github.com/meltano/sdk/issues/971
    • This honors the sensitivity of comments while still capturing what's important: who tagged who in a comment, and whether or not the tagged individual has replied or reacted to the comment.
    • A future iteration may detect cc or fyi preceding the @mention on the same line, and annotate the data to indicate that the comment does not necessarily require a response.
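
A minimal sketch of the mention extraction described in step 2, written as plain Python rather than as an actual stream map. The function name `extract_mentions`, the output keys `login` and `is_fyi`, and the username pattern are all assumptions made for illustration; the pattern only approximates GitHub's login rules, and a team mention such as @org/team would be captured as just the org part.

```python
import re

# Hypothetical pattern: "@" followed by a login made of letters, digits, and hyphens.
# The lookbehind skips "@" inside email addresses such as foo@example.com.
MENTION_RE = re.compile(r"(?<![A-Za-z0-9_])@([A-Za-z0-9][A-Za-z0-9-]*)")

# "cc" or "fyi" as a standalone word, in any letter case.
FYI_RE = re.compile(r"\b(?:cc|fyi)\b", re.IGNORECASE)


def extract_mentions(body: str) -> list[dict]:
    """Return the @mentions in a comment body, without keeping the body itself."""
    mentions = []
    for line in body.splitlines():
        for match in MENTION_RE.finditer(line):
            # Only text before the mention on the same line counts as a cc/fyi marker.
            preceding = line[: match.start()]
            mentions.append(
                {
                    "login": match.group(1),
                    "is_fyi": bool(FYI_RE.search(preceding)),
                }
            )
    return mentions


if __name__ == "__main__":
    sample = "@pnadolny13 can you review?\ncc @aaronsteers (mail: foo@example.com)"
    for mention in extract_mentions(sample):
        print(mention)
```

Only the login and the fyi flag leave the function, so the comment text never needs to be stored downstream.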

Transform

  1. Import a seed file with the list of our Meltano team members (if not easily extractable from other sources).
  2. ~~Import a seed file of emote reactions, and how they should be interpreted.~~
    • ~~e.g. :+1: (👍 ) might be counted as a response; :thinking: (🤔) is not.~~
  3. Create a SQL query that shows the following for each team member:
    1. List of @mentions with URL link to each comment.
    2. A status flag on each @mention: answered, acknowledged, unanswered
    3. A time-to-response metric on each mention - which is the simple date math calculation of time elapsed between comment and response.
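
A rough sketch of the query in step 3, run against an in-memory SQLite database so it is self-contained. The table names `mentions` and `comments`, their columns, and the example.invalid URLs are hypothetical; a real model would depend on what the tap emits. The sketch treats a mention as answered when the mentioned person later comments on the same issue, and it leaves out the acknowledged status because that depends on reaction data, which may not be available.

```python
import sqlite3

# For each mention, find the earliest later comment by the mentioned person on the same issue.
QUERY = """
SELECT
    m.mentioned_login,
    m.comment_url,
    CASE WHEN MIN(r.created_at) IS NULL THEN 'unanswered' ELSE 'answered' END AS status,
    ROUND((julianday(MIN(r.created_at)) - julianday(m.created_at)) * 24, 1) AS hours_to_response
FROM mentions AS m
LEFT JOIN comments AS r
    ON r.issue_url = m.issue_url
   AND r.author_login = m.mentioned_login
   AND r.created_at > m.created_at
GROUP BY m.mentioned_login, m.comment_url, m.created_at
ORDER BY m.created_at
"""


def build_demo_db() -> sqlite3.Connection:
    """Create an in-memory database with a few made-up rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE mentions (mentioned_login TEXT, comment_url TEXT, issue_url TEXT, created_at TEXT)"
    )
    conn.execute("CREATE TABLE comments (issue_url TEXT, author_login TEXT, created_at TEXT)")
    conn.executemany(
        "INSERT INTO mentions VALUES (?, ?, ?, ?)",
        [
            ("aaronsteers", "https://example.invalid/issues/2#c1",
             "https://example.invalid/issues/2", "2022-09-13 12:00:00"),
            ("pnadolny13", "https://example.invalid/issues/1#c7",
             "https://example.invalid/issues/1", "2022-09-15 05:00:00"),
        ],
    )
    conn.executemany(
        "INSERT INTO comments VALUES (?, ?, ?)",
        [("https://example.invalid/issues/1", "pnadolny13", "2022-09-16 17:00:00")],
    )
    return conn


def mention_report(conn: sqlite3.Connection) -> list[tuple]:
    """Return (login, comment_url, status, hours_to_response) for every mention."""
    return conn.execute(QUERY).fetchall()


if __name__ == "__main__":
    for row in mention_report(build_demo_db()):
        print(row)
```

Timestamps are stored as ISO-style text so that string comparison and `julianday` agree on ordering.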

Report

  1. Create a report in evidence.dev which renders the above as static html.
  2. Whenever an individual's report changes, deliver it via Slack, static site hosting over VPN, or some other means.

Run all of the above every 4 or 6 hours. Depending on incremental EL cost, this may be runnable even more frequently.

aaronsteers avatar Sep 15 '22 05:09 aaronsteers

@aaronsteers interesting - I can do that EL part for sure. I'm running into challenges with rate limiting again, like you did a while ago, so that's going to get harder, especially as we want more data.

What part of GitHub notifications is still plaguing you haha? I have some challenges too but curious what yours are exactly. Is it that viewing but not answering takes it out of your notifications so you lose track?

pnadolny13 avatar Sep 16 '22 17:09 pnadolny13

> What part of GitHub notifications is still plaguing you haha? I have some challenges too but curious what yours are exactly. Is it that viewing but not answering takes it out of your notifications so you lose track?

The challenge is that the mention reason is not specific to the comment in which the person is mentioned. Every subsequent comment on a ticket where I have been mentioned is also labeled a mention, even if I've already marked the original mention as done. As far as I am able to determine, GitHub provides no means of distinguishing a direct mention from a mention earlier in the thread. The upshot is that I get tons of spam, and it's impossible for me to see just the issues where I was recently mentioned and have not already replied or otherwise addressed the mention.

What I need is a more accurate representation of a direct mention event, and not the GitHub-provided definition of "mentioned at some prior point in the thread".

aaronsteers avatar Sep 16 '22 17:09 aaronsteers

Here's an example:

On https://github.com/MeltanoLabs/Meta/issues/17?notification_referrer_id=NT_kwDOART0-7M0MzkyNDc5NjI4OjE4MTUwNjUx:

  • 3 days ago you cc'd me here.
  • I had no action, so I marked it done (happy for the fyi, by the way).
  • Today, when you merged, commented, or committed, I got another mention tagged as '5 minutes ago' - but I was not mentioned.

aaronsteers avatar Sep 16 '22 17:09 aaronsteers

Documented all possible dismissal approaches and only Unsubscribe works:

  • https://github.com/meltano/squared/issues/404#issuecomment-1249655252

If and only if I use the Unsubscribe feature will the mention reason be dismissed from the issue and not resurface on every subsequent action on the issue.

Also important to note: the Unsubscribe button is only available if I get to the issue from the Notifications screen and not if I arrive at the issue (or PR/discussion) by another means.

aaronsteers avatar Sep 16 '22 18:09 aaronsteers

Note that with our updated table component, you could link back to the original GH issues from the table, if that would be helpful: datatable-rowlink-external

archiewood avatar Feb 08 '23 18:02 archiewood

@pnadolny13 - I feel like you are naturally making progress towards the goals of this ticket.

Do you think we should bifurcate our extracts into private vs public repos, with one instance of tap-github getting public repo data and one instance of tap-github getting private repo data? In theory, we could add a config option on tap-github, something like a three-state value: include_private_repos = true|false|'only'.
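
To make the proposed three-state value concrete, here is a small sketch of how such a filter could behave. It is not an existing tap-github setting; `include_private_repos` is only the option floated above, and the helper name `keep_repo` is hypothetical.

```python
from typing import Literal, Union

# Hypothetical three-state setting: True, False, or the string "only".
IncludePrivate = Union[bool, Literal["only"]]


def keep_repo(is_private: bool, include_private_repos: IncludePrivate) -> bool:
    """Decide whether a repo should be extracted under the proposed setting."""
    if include_private_repos == "only":
        # Private-only instance: skip every public repo.
        return is_private
    if include_private_repos is True:
        # Combined instance: take public and private repos alike.
        return True
    # Default (False): public repos only.
    return not is_private


if __name__ == "__main__":
    for setting in (True, False, "only"):
        print(setting, keep_repo(True, setting), keep_repo(False, setting))
```

With this shape, one tap instance configured with false and another with 'only' would split public and private data without overlap.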

Alternatively, we could introduce binary inclusion/obfuscation logic to stream maps, which could nullify, obfuscate, or filter data selectively if the parent repo is private, but that might be overly complex. If the stream maps are too difficult to debug, we could inadvertently pull data that we don't want to pull.

aaronsteers avatar Feb 08 '23 19:02 aaronsteers