
Allow multiple output formats

Open luckydonald opened this issue 5 years ago • 17 comments

Hey, I really like the HTML version for its ease of use, but the JSON version for better processing.

Since scraping Discord might be a bad thing, I'd rather avoid running the export twice for everything.

However, if I specify -f json -f html, I get the error:

Target type is not enumerable and can't accept more than one value.

Another possible solution would be a command that generates an HTML file from the JSON export offline.
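
A minimal sketch of what such an offline conversion could look like, written as a standalone Python script rather than as part of the tool. It assumes the JSON export has a top-level `messages` array whose entries carry `timestamp`, `author.nickname`, and `content`; these field names and the function names `json_export_to_html` and `convert_file` are assumptions for illustration, not the tool's documented schema or API.

```python
import html
import json
from pathlib import Path


def json_export_to_html(export: dict) -> str:
    """Render an already-downloaded JSON export as a bare-bones HTML page."""
    rows = []
    for message in export.get("messages", []):
        # Field names are assumed; adjust them to the real export schema.
        timestamp = html.escape(str(message.get("timestamp", "")))
        author = html.escape(str(message.get("author", {}).get("nickname", "")))
        content = html.escape(str(message.get("content", "")))
        rows.append(f"<p><time>{timestamp}</time> <b>{author}</b>: {content}</p>")
    body = "\n".join(rows)
    return f"<!DOCTYPE html>\n<html><body>\n{body}\n</body></html>\n"


def convert_file(json_path: str, html_path: str) -> None:
    """Read a JSON export from disk and write the HTML to html_path; no network access."""
    export = json.loads(Path(json_path).read_text(encoding="utf-8"))
    Path(html_path).write_text(json_export_to_html(export), encoding="utf-8")
```

Because it only reads a file that is already on disk, running it any number of times sends no further requests to Discord.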

luckydonald avatar Feb 11 '20 22:02 luckydonald

If you can log in via a "standard" bot token (not a selfbot), my Bash script might help you: https://github.com/Tyrrrz/DiscordChatExporter/issues/264

It parallelizes exporting a whole Guild, using all available cores to speed up the process. In your case, you may only need to export a single channel (or several of them) in both JSON and HTML.

MatiasMFM2001 avatar Feb 13 '20 06:02 MatiasMFM2001

If this were to be implemented, how should we handle output file naming? Say we are doing an export for a single format. Using the CLI, we specify a full output path for the output file (includes path to the directory + a name for the file, including file extension) using -o. Here's how we would do that:

-c <channel id> -t <token> -o C:\Users\Me\Desktop\hello.html -f HtmlDark

OK, looks good.

Now say we want to output to multiple formats, and I want to provide names for the files. It doesn't make sense for me to provide a file extension, since each file will have a different extension to match their format.

-c <channel id> -t <token> -o C:\Users\Me\Desktop\hello -f htmldark json plaintext

Here, I'd like to get hello.html, hello.json, and hello.txt, but, uh-oh, hello is going to be interpreted as a directory, not a filename. The program will compute default filenames for the files and then stick them into the hello directory inside the desktop.

What should we do about this? Do we provide a separate option for directory path, -d (what happens if a full path is provided for -o, but it is incongruent/irreconcilable with the value for -d)? Do we ask the user to stick a placeholder value in for the file extension (more complicated)? Do we ask the user for 3 values for -o to match the 3 values given to -f (that's a lot of copy-pasting of the directory path!)?

andrewkolos avatar Oct 13 '20 03:10 andrewkolos

I'd be happy to implement this feature but would like some guidance on the above. Personally, I am biased toward adding the new option flag, -d.

andrewkolos avatar Oct 13 '20 20:10 andrewkolos

I think C:\Users\Me\Desktop\hello could be interpreted as a file, and C:\Users\Me\Desktop\hello\ as a folder. This has an obvious disadvantage: too little margin for error.

Do we provide a separate option for directory path, -d (what happens if a full path is provided for -o, but it is incongruent/irreconcilable with the value for -d)?

-d could take priority over -o. So, if -d is specified, -o could just be the output filename(s), without an extension.

MatiasMFM2001 avatar Oct 13 '20 20:10 MatiasMFM2001

I think C:\Users\Me\Desktop\hello could be interpreted as a file, and C:\Users\Me\Desktop\hello\ as a folder. This has an obvious disadvantage: too little margin for error.

This is something I did not consider. What I do like about it is that it keeps all the output info in one place (i.e. one option), but yeah its behavior wouldn't be obvious and the user could confuse it and mess up. Having the separate option -d (or maybe -od, to keep it more related to -o) makes this option of specifying a directory more discoverable. It could still be potentially confusing to have the behavior of -o modified by another option.

I am not particularly biased to either option, but I think I'll go with the former (ending with \ to signal a directory). Maybe that will change as I am tinkering with it or more opinions on the subject are provided.

andrewkolos avatar Oct 13 '20 21:10 andrewkolos

Now that I think about it again, I am not sure the \ approach is super sound. Unless we make this change only affect multiple-format exports, it would change how single-format exports work. To my understanding, the app currently interprets C:\Users\Me\Desktop\hello as a directory, not a hello file in Desktop. Would this be a bad breaking change?

edit:grammar

andrewkolos avatar Oct 13 '20 21:10 andrewkolos

What I do like about it is that it keeps all the output info in one place (i.e. one option), but yeah its behavior wouldn't be obvious and the user could confuse it and mess up.

Exactly

It could still be potentially confusing to have the behavior of -o modified by another option.

Hmm, without -d, the default folder could be the current working directory (i.e. ./ in Unix), and with -d, the specified folder.

EDIT: Full command would be something like this:

# Output = Files "/path/to/output/folder/Filename.json" and "/path/to/output/folder/Filename.html"
-c <channel id> -t <token> -d "/path/to/output/folder" -o "Filename" -f json -f html

# Output = Files "./Filename.json" and "./Filename.html"
-c <channel id> -t <token> -o "Filename" -f json -f html
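
A rough Python sketch of how that resolution could work, only to make the proposal concrete. The function name `resolve_outputs` and the `EXTENSIONS` table are hypothetical; the extension mapping is an illustrative guess, not something taken from the tool's source.

```python
from pathlib import PurePosixPath
from typing import List, Optional

# Illustrative mapping from format name to file extension (an assumption).
EXTENSIONS = {
    "html": ".html",
    "htmldark": ".html",
    "json": ".json",
    "plaintext": ".txt",
    "csv": ".csv",
}


def resolve_outputs(name: str, formats: List[str], directory: Optional[str] = None) -> List[str]:
    """Return one output path per format, inside `directory` or the current directory."""
    # Without -d, fall back to the current working directory.
    base = PurePosixPath(directory) if directory is not None else PurePosixPath(".")
    paths = []
    for fmt in formats:
        # -o supplies only the base name; the extension comes from the format.
        paths.append(str(base / (name + EXTENSIONS[fmt.lower()])))
    return paths
```

Note that two formats sharing an extension (for example two HTML themes) would map to the same path; this sketch does not handle that case.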

MatiasMFM2001 avatar Oct 13 '20 21:10 MatiasMFM2001

Now that I think about it again, I am not sure the \ approach is super sound. Unless we make this change only affect multiple-format exports, it would change how single-format exports work. To my understanding, the app currently interprets C:\Users\Me\Desktop\hello as a directory, not a hello file in Desktop. Would this be a bad breaking change?

Yup, any change would break others' scripts... But that could be mitigated if users are notified of the change.

MatiasMFM2001 avatar Oct 13 '20 21:10 MatiasMFM2001

Would this be a bad breaking change?

Yes, it would:

https://github.com/Tyrrrz/DiscordChatExporter/blob/09acfcff59b0b63481fcc8d47958fe8a9c175ae9/DiscordChatExporter.Domain/Exporting/ExportRequest.cs#L80-L85

I would suggest that (in case of multi-format export) the user needs to provide a directory to -o and if the provided output is not a directory, then an error should be thrown.

If you are exporting multiple files, treating lack of extension in the path as a directory is reasonable, since it doesn't make sense to provide a file name in such a case.
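
To illustrate the rule, here is a small Python sketch (not the project's C# code). The function name `classify_output` is hypothetical, and the single-format branch simply follows the behaviour described earlier in this thread, where a path without an extension is treated as a directory.

```python
from pathlib import PureWindowsPath
from typing import List


def classify_output(output: str, formats: List[str]) -> str:
    """Classify the -o value as 'file' or 'directory' under the suggested rule."""
    # A path counts as a file only if its last component has an extension.
    has_extension = PureWindowsPath(output).suffix != ""
    if len(formats) > 1:
        if has_extension:
            # Multi-format export: a single file name cannot hold several formats.
            raise ValueError("multi-format export requires a directory for -o")
        return "directory"
    return "file" if has_extension else "directory"
```

One caveat of any extension-based check: a directory whose name contains a dot (say, `my.exports`) would be mistaken for a file.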

Tyrrrz avatar Oct 14 '20 12:10 Tyrrrz

I'm gonna bump this with a suggestion: Instead of specifying it multiple times or having multiple args, they could be comma separated values. So if you wanted csv and json, you'd do -f csv,json. If you wanted htmldark and json, you'd do -f htmldark,json.

solonovamax avatar Feb 14 '21 03:02 solonovamax

You can also do -f csv json instead of -f csv -f json, but you can't do -f csv,json. This is how the underlying CLI library was designed.
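
As an analogy only, the same behaviour can be reproduced with Python's standard `argparse` module, which is not the CLI library this project uses. The program name `export-sketch` and the function `parse_formats` are made up for the example.

```python
import argparse
from typing import List


def parse_formats(argv: List[str]) -> List[str]:
    """Parse one or more -f values from an argument list."""
    parser = argparse.ArgumentParser(prog="export-sketch")
    # nargs="+" accepts "-f csv json"; action="extend" (Python 3.8+) also
    # merges repeated occurrences such as "-f csv -f json" into one list.
    parser.add_argument("-f", "--format", dest="formats", nargs="+",
                        action="extend", required=True)
    return parser.parse_args(argv).formats
```

Here `parse_formats(["-f", "csv", "json"])` and `parse_formats(["-f", "csv", "-f", "json"])` both yield two formats, while `parse_formats(["-f", "csv,json"])` yields the single value `csv,json`; supporting the comma form would need an explicit split on top.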

Tyrrrz avatar Feb 14 '21 23:02 Tyrrrz

You can also do -f csv json instead of -f csv -f json, but you can't do -f csv,json. This is how the underlying CLI library was designed.

But will this make a separate request for csv and then for json? Wouldn't that look sus to Discord?

Any progress on exporting multiple formats from the GUI without re-requesting the data for each format?

r1bnc avatar Mar 02 '21 05:03 r1bnc

But will this make a separate request for csv and then for json?

The whole point is to avoid it.

Tyrrrz avatar Mar 02 '21 20:03 Tyrrrz

But will this make a separate request for csv and then for json?

The whole point is to avoid it.

I know. I was asking whether this has been done in the current version. If not, I was thinking of writing my own parser for the JSON or HTML files. Someone has already made a script that parses the HTML into a grepable, IRC-style log.

r1bnc avatar Mar 03 '21 09:03 r1bnc

No, in the current version this is not supported.

Tyrrrz avatar Mar 03 '21 15:03 Tyrrrz

@Tyrrrz Putting up a $15 bounty to your buymeacoffee if this gets added, would be invaluable for archival.

rebane2001 avatar Oct 14 '21 07:10 rebane2001

While not a solution to the issue, I thought I'd share this Bash one-liner I made:

jq '.messages[] | .timestamp[11:16] + .author.nickname + (if .type != "Default" then " {" + .type + "}" else "" end) + ": " + .content + (if .attachments[0] then (.attachments[] | "\n" + .fileName + " > " + .url) else "" end)' *.json | sed 's/\"/"/g' | sed 's/^"\(.....\)\(.*\)"$/[\x1b[36m\1\x1b[0m] \x1b[31m\2/g' | sed 's/\\n/\n          /g' | sed 's/\(: \|{\)/\x1b[0m&/g' | sed 's/{.*\?}/\x1b[33m&\x1b[0m/g'

It takes a JSON format export and formats it to be readable in a terminal:
[two images of the resulting terminal output]
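
For anyone who would rather not chain jq and sed, the same idea can be sketched in Python. This is a loose port without the colour codes, not an exact equivalent; it assumes the same fields the one-liner reads (`messages`, `timestamp`, `author.nickname`, `type`, `content`, `attachments[].fileName`, `attachments[].url`), and the function name `format_messages` is made up.

```python
from typing import List


def format_messages(export: dict) -> List[str]:
    """Format each message of a JSON export as '[HH:MM] nickname: content'."""
    lines = []
    for message in export["messages"]:
        # Characters 11-15 are assumed to be HH:MM of an ISO-style timestamp.
        clock = message["timestamp"][11:16]
        # Non-default message types are shown in braces after the nickname.
        kind = "" if message["type"] == "Default" else " {" + message["type"] + "}"
        lines.append(f"[{clock}] {message['author']['nickname']}{kind}: {message['content']}")
        for attachment in message.get("attachments", []):
            # Indent attachment lines so they sit under the message text.
            lines.append(" " * 10 + f"{attachment['fileName']} > {attachment['url']}")
    return lines
```

Printing the result with `print("\n".join(format_messages(export)))` gives plain text that can be piped to grep or less.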

rebane2001 avatar Apr 19 '22 09:04 rebane2001

Closing as out of scope since the project no longer accepts new features

Tyrrrz avatar Feb 15 '23 21:02 Tyrrrz