MODiX

feature request: github link as formatted code

Open MithrilMan opened this issue 5 years ago • 14 comments

The feature request is simple to describe: when someone posts a link to a file in a GitHub repo with a region delimiter (e.g. from line X to line Y), the bot should post the code contained between those lines.

e.g. https://github.com/dotnet/runtime/blob/master/src/libraries/System.Private.CoreLib/src/System/Collections/Generic/List.cs#L396-L412

should be shown in the channel as

// Ensures that the capacity of this list is at least the given minimum
// value. If the current capacity of the list is less than min, the
// capacity is increased to twice the current capacity or to min,
// whichever is larger.
//
private void EnsureCapacity(int min)
{
	if (_items.Length < min)
	{
		int newCapacity = _items.Length == 0 ? DefaultCapacity : _items.Length * 2;
		// Allow the list to grow to maximum possible capacity (~2G elements) before encountering overflow.
		// Note that this check works even when _items.Length overflowed thanks to the (uint) cast
		if ((uint)newCapacity > Array.MaxArrayLength) newCapacity = Array.MaxArrayLength;
		if (newCapacity < min) newCapacity = min;
		Capacity = newCapacity;
	}
}

For people who work with Slack: Slack already has this feature in its GitHub bot.

MithrilMan avatar Jan 12 '20 23:01 MithrilMan

Several considerations here:

  • Will we support full GitHub content URLs, i.e. ones without regions?
  • Will we support other GitHub URLs, e.g. if the raw link is posted?
  • How will we embed GitHub content that exceeds Discord's maximum embed length? Truncate and link to the original GitHub content?

patrickklaeren avatar Jan 12 '20 23:01 patrickklaeren

The feature can have limits that prevent it from generating content that is too big; it just has to check the size before submitting to the channel (and/or the limit could be a bot setting). IIRC, on Slack it wasn't shown at all in that case; only the normal link was shown.

Anyway, this feature is oriented toward code snippets. You can see the Slack feature on GitHub here; the page contains some screenshots too: https://github.com/integrations/slack

MithrilMan avatar Jan 13 '20 00:01 MithrilMan

Another thing to consider is how highlighting is handled. Do we assume C# code, or attempt to ascertain the language from the URL?

Jay-Madden avatar Jan 13 '20 00:01 Jay-Madden

In my Slack experience, un-highlighted code was enough, because often it's just a matter of showing a few lines of code without requiring the user to navigate to the link.

MithrilMan avatar Jan 13 '20 00:01 MithrilMan

We can take the extension from the URL and use it as the syntax highlighting language (the x in ```x): in the best case it's recognized and it works; in the worst case it's not recognized and you get no syntax highlighting.

sylveon avatar Jan 13 '20 01:01 sylveon
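
A minimal Python sketch of the extension idea above, under stated assumptions: the helper names `fence_language` and `as_code_block` are hypothetical, and nothing here checks whether Discord's highlighter actually recognizes the resulting tag. An unrecognized or empty tag simply yields an unhighlighted block.

```python
from pathlib import PurePosixPath
from urllib.parse import urlsplit

FENCE = "`" * 3  # three backticks, built this way to keep the example readable


def fence_language(url: str) -> str:
    """Return the file extension from the URL path, without the dot, or ''."""
    path = urlsplit(url).path  # the '#L396-L412' fragment is not part of the path
    return PurePosixPath(path).suffix.lstrip(".").lower()


def as_code_block(code: str, url: str) -> str:
    """Wrap code in a fenced block tagged with the extension-derived language."""
    return f"{FENCE}{fence_language(url)}\n{code}\n{FENCE}"
```

For the List.cs link from the original request this yields the tag `cs`; a file without an extension gets an untagged block.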

Seems pretty doable. Whipped up a quick proof of concept using Octokit.

https://paste.mod.gg/fuxovubogu.cs

Definitely not perfect: it would still need to handle Inzanit's points above and needs better validation, and we'd likely want to use the authenticated GitHub APIs instead of the unauthenticated ones.

Scott-Caldwell avatar Jan 13 '20 03:01 Scott-Caldwell

Converting from github.com/author/repo/blob/... links to raw.githubusercontent.com/author/repo/... and then fetching the appropriate lines from the resulting raw text file sounds like significantly less code (and doesn't require bringing in a library).

sylveon avatar Jan 13 '20 04:01 sylveon
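
A rough Python sketch of the conversion described above, not a definitive implementation. The regex `BLOB_URL` and the helpers `parse_blob_link` and `select_lines` are hypothetical names, the pattern only handles https links with an `#L<start>` or `#L<start>-L<end>` anchor, and the actual download of the raw file is left out.

```python
import re

# Assumed link shape: github.com/<owner>/<repo>/blob/<ref>/<path>#L<start>[-L<end>]
BLOB_URL = re.compile(
    r"https://github\.com/(?P<owner>[^/]+)/(?P<repo>[^/]+)/blob/"
    r"(?P<rest>[^#\s]+)#L(?P<start>\d+)(?:-L(?P<end>\d+))?"
)


def parse_blob_link(url: str):
    """Return (raw_url, start, end) for a blob link with a line anchor, else None."""
    m = BLOB_URL.fullmatch(url)
    if m is None:
        return None
    start = int(m["start"])
    end = int(m["end"] or start)  # a single-line anchor has no end part
    raw = f"https://raw.githubusercontent.com/{m['owner']}/{m['repo']}/{m['rest']}"
    return raw, start, end


def select_lines(text: str, start: int, end: int) -> str:
    """Return lines start..end of text (1-based, inclusive)."""
    return "\n".join(text.splitlines()[start - 1:end])
```

The raw URL is obtained by dropping the `blob/` segment and swapping the host, so no request to github.com itself is needed; `select_lines` would then be applied to the downloaded text.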

Web scraping is always a tricky topic, and not all services welcome scrapers.

According to GitHub's policies:

You may scrape the website for the following reasons:

  • Researchers may scrape public, non-personal information from the Service for research purposes, only if any publications resulting from that research are open access.
  • Archivists may scrape the Service for public data for archival purposes.

And it isn't super clear to me that we would qualify as researchers or archivists. But I'm certainly not a lawyer. I just know that using their API is much more likely to be allowed by their terms.

Scott-Caldwell avatar Jan 13 '20 04:01 Scott-Caldwell

I don't think that even qualifies as scraping, as the github.com to raw.githubusercontent.com replacement can be done locally with a simple regex.

sylveon avatar Jan 13 '20 05:01 sylveon

According to their policy, it does:

Scraping refers to extracting data from our Service via an automated process, such as a bot or webcrawler. It does not refer to the collection of information through our API.

Scott-Caldwell avatar Jan 13 '20 05:01 Scott-Caldwell

Perhaps worth reviewing their policy to see if this has changed at all.

That said, many sites provide scripts in the form of raw GitHub URLs intended to be fetched with wget or curl and piped directly into a shell. I don't see how this is much different.

I also think it could be argued that scraping a single file relevant to a discussion could be considered archiving - to retain the context in the channel - think of providing information in addition to a link on SO, instead of dropping just a link.

thaumanovic avatar Jul 01 '23 11:07 thaumanovic

Sounds reasonable to me - let's go with sylveon's idea.

So I guess to summarize:

  • Convert to GitHub raw URL
  • Use the file extension as the language for the ```lang block
  • Truncate once we reach Discord's embed limit
  • Probably not worth it to support links without regions at the moment, since we don't know if the code we'd be showing in the embed would be useful (e.g. for C# files, we'd just be embedding a ton of using ... lines and maybe the very start of a class definition)

Scott-Caldwell avatar Jul 04 '23 14:07 Scott-Caldwell
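
The truncation step in the summary above could look roughly like this Python sketch. The function name `format_snippet`, the wording of the truncation notice, and the default `limit` of 4096 characters are assumptions; the real limit should be taken from Discord's documentation for wherever the text ends up (embed description or message content).

```python
FENCE = "`" * 3  # three backticks


def format_snippet(code: str, lang: str, source_url: str, limit: int = 4096) -> str:
    """Build a fenced block within limit, dropping whole trailing lines if needed."""
    header = f"{FENCE}{lang}\n"
    footer = f"\n{FENCE}"
    notice = f"\n(truncated, full source: {source_url})"
    if len(header) + len(code) + len(footer) <= limit:
        return header + code + footer
    # Characters left for code once the fence and the notice are accounted for.
    budget = limit - len(header) - len(footer) - len(notice)
    kept, used = [], 0
    for line in code.splitlines():
        cost = len(line) + (1 if kept else 0)  # +1 for the joining newline
        if used + cost > budget:
            break
        kept.append(line)
        used += cost
    return header + "\n".join(kept) + footer + notice
```

Cutting on line boundaries keeps the closing fence intact, so the block still renders. If the limit is too small to hold even the fence and the notice, this sketch returns an empty block plus the notice and may exceed the limit.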

This is a really old issue - nowadays Discord will show text files as embeds that can be expanded. I suggest just doing that now (get the raw file, upload it to Discord).

sylveon avatar Jul 05 '23 05:07 sylveon

This had crossed my mind, sounds like a better option.

thaumanovic avatar Jul 05 '23 06:07 thaumanovic