Feature request: GitHub link as formatted code
The feature request is simple to describe: when someone posts a link to a file in a GitHub repo with a region delimiter (e.g. from line X to line Y), the bot should post the code contained between those lines.
e.g. https://github.com/dotnet/runtime/blob/master/src/libraries/System.Private.CoreLib/src/System/Collections/Generic/List.cs#L396-L412
should be shown in the channel as
// Ensures that the capacity of this list is at least the given minimum
// value. If the current capacity of the list is less than min, the
// capacity is increased to twice the current capacity or to min,
// whichever is larger.
//
private void EnsureCapacity(int min)
{
    if (_items.Length < min)
    {
        int newCapacity = _items.Length == 0 ? DefaultCapacity : _items.Length * 2;
        // Allow the list to grow to maximum possible capacity (~2G elements) before encountering overflow.
        // Note that this check works even when _items.Length overflowed thanks to the (uint) cast
        if ((uint)newCapacity > Array.MaxArrayLength) newCapacity = Array.MaxArrayLength;
        if (newCapacity < min) newCapacity = min;
        Capacity = newCapacity;
    }
}
For people who work with Slack: Slack already has this feature in its GitHub bot.
Several considerations here:
- Will we support full GitHub content URLs, i.e. ones without regions?
- Will we support other GitHub URLs, e.g. if the raw link is posted?
- How will we embed GitHub content that exceeds Discord's embed max length? Truncate and link to the original GitHub content?
The feature can have a limit that prevents it from generating overly large content; it just has to check the size before submitting to the channel (and/or the limit could be a bot setting). IIRC on Slack, oversized content wasn't shown at all; only the normal link was shown.
Anyway, this feature is oriented toward code snippets. You can see the Slack feature on GitHub, with some screenshots too: https://github.com/integrations/slack
Another thing to consider is how highlighting is handled. Do we assume C# code, or attempt to ascertain the language from the URL?
In my Slack experience, un-highlighted code was enough, because often it's just a matter of showing a few lines of code without requiring the user to navigate to the link.
We can take the extension from the URL and use it as the syntax highlighting language (the x in ```x): in the best case it's recognized and it works, in the worst case it's not recognized and you get no syntax highlighting.
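A minimal sketch of that idea, in Python for brevity. The helper names `language_hint` and `fence` are hypothetical, and whether Discord recognizes a given extension as a highlighting language is not checked here; an unrecognized hint should simply render without highlighting.

```python
import posixpath
from urllib.parse import urlsplit

FENCE = "`" * 3  # three backticks, built this way to keep the example readable


def language_hint(url: str) -> str:
    """Return the file extension from the URL path (no dot, lowercased), or ''."""
    path = urlsplit(url).path  # drops the '#L396-L412' fragment
    extension = posixpath.splitext(path)[1]
    return extension[1:].lower()


def fence(code: str, url: str) -> str:
    """Wrap code in a fenced block tagged with the extension taken from the URL."""
    return f"{FENCE}{language_hint(url)}\n{code}\n{FENCE}"
```

Files without an extension produce an empty hint, which yields a plain, un-highlighted block.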
Seems pretty doable. Whipped up a quick proof of concept using Octokit.
https://paste.mod.gg/fuxovubogu.cs
Definitely not perfect: it would still need to handle Inzanit's points above, it needs better validation, and we'd likely want to use the authenticated GitHub APIs instead of the unauthenticated ones.
Converting from github.com/author/repo/blob/... links to raw.githubusercontent.com/author/repo/... and then fetching the appropriate lines from the resulting raw text file sounds like significantly less code (and doesn't require bringing in a library).
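A rough sketch of the URL rewrite and line selection, in Python for brevity. The regex and the names `BLOB_URL`, `to_raw` and `select_lines` are assumptions made for illustration, only links with a `#L…` region are handled, and the HTTP fetch of the raw file is deliberately left out.

```python
import re

# Assumed shape: https://github.com/<owner>/<repo>/blob/<ref>/<path>#L<start>[-L<end>]
BLOB_URL = re.compile(
    r"https://github\.com/(?P<owner>[^/\s]+)/(?P<repo>[^/\s]+)/blob/(?P<rest>[^#\s]+)"
    r"#L(?P<start>\d+)(?:-L(?P<end>\d+))?"
)


def to_raw(url: str):
    """Return (raw_url, start_line, end_line), or None if the link has no region."""
    match = BLOB_URL.fullmatch(url)
    if match is None:
        return None
    start = int(match["start"])
    end = max(start, int(match["end"] or start))  # a single-line link has no end
    raw_url = (
        "https://raw.githubusercontent.com/"
        f"{match['owner']}/{match['repo']}/{match['rest']}"
    )
    return raw_url, start, end


def select_lines(text: str, start: int, end: int) -> str:
    """Return lines start..end (1-based, inclusive) of the fetched file text."""
    return "\n".join(text.splitlines()[start - 1:end])
```

For the `List.cs` link from the original request, `to_raw` would return the raw URL together with the range 396 to 412, and `select_lines` would then be applied to the downloaded text.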
Web scraping is always a tricky topic, and not all services welcome scrapers.
According to GitHub's policies:
You may scrape the website for the following reasons:
- Researchers may scrape public, non-personal information from the Service for research purposes, only if any publications resulting from that research are open access.
- Archivists may scrape the Service for public data for archival purposes.
And it isn't super clear to me that we would qualify as researchers or archivists. But I'm certainly not a lawyer. I just know that using their API is much more likely to be allowed by their terms.
I don't think that even qualifies as scraping, as the github.com to raw.githubusercontent.com replacement can be done locally with a simple regex.
According to their policy, it does:
Scraping refers to extracting data from our Service via an automated process, such as a bot or webcrawler. It does not refer to the collection of information through our API.
Perhaps worth reviewing their policy to see if this has changed at all.
That said, many sites provide scripts in the form of raw GH URLs intended to be obtained with wget or curl and piped directly into a shell. I don't see how this is much different.
I also think it could be argued that scraping a single file relevant to a discussion could be considered archiving - to retain the context in the channel - think providing information in addition to a link on SO, instead of dropping just a link.
Sounds reasonable to me - let's go with sylveon's idea.
So I guess to summarize:
- Convert to GitHub raw URL
- Use the file extension as the language for the ```lang block
- Truncate once we reach Discord's embed limit
- Probably not worth it to support links without regions at the moment, since we don't know if the code we'd be showing in the embed would be useful (e.g. for C# files, we'd just be embedding a ton of using ... lines and maybe the very start of a class definition)
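The truncation step from the summary could look roughly like this, in Python for brevity. The function name `truncate_snippet` and the default `limit` of 2000 characters are assumptions; the real limit should be taken from Discord's documentation and should leave room for the surrounding code fence.

```python
def truncate_snippet(code: str, url: str, limit: int = 2000) -> str:
    """Cut code at a line boundary so the result fits in limit characters.

    Assumes limit is larger than the appended notice; the default of 2000
    is a placeholder, not a verified Discord value.
    """
    if len(code) <= limit:
        return code
    notice = f"\n... (truncated, full source: {url})"
    budget = limit - len(notice)
    kept = []
    used = 0
    for line in code.splitlines():
        cost = len(line) + 1  # +1 for the newline that joins the lines
        if used + cost > budget:
            break
        kept.append(line)
        used += cost
    return "\n".join(kept) + notice
```

Cutting at whole lines avoids showing half a statement, and the appended link points readers back to the original GitHub content.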
This is a really old issue - nowadays Discord will show text files as embeds that can be expanded. I suggest just doing that now (get the raw file, upload it to Discord).
This had crossed my mind, sounds like a better option.