youtube-comment-downloader
Recommended program for viewing & editing comments
Hi, newbie question: can you recommend a text editor (Windows) for viewing this JSON file? In Notepad++ it looks like a mess.
BTW, thanks for the script, it's very useful.
I join the request. Something to view the comments in a well-arranged way would come in handy.
I wrote a small little script for this: https://github.com/sinepgnol/YTB-comments-from-JSON-to-a-well-arranged-way
I wrote the same script with jq for the CLI (WTFPL):
cat "comment.json" |
sed '$!s/$/,/; 1s/^/[/; $s/$/]/' |
jq -r '
map( .cid |= sub("\\..*";"")) |
group_by(.cid) |
map(
[
(.[0:1] |
map(
.author |= sub("^"; "---\n") |
.text |= gsub("\n"; "\n ") |
.text |= sub("^"; "\n ")
)
)[]
,
(.[1:] |
map(
.author |= sub("^"; " ") |
.text |= gsub("\n"; "\n ") |
.text |= sub("^"; "\n ")
)
)[]
]
)
[][] |
.author + " (" + .votes + " : " + .time + "):" + .text
' |
sed '$G; $s/$/---/'
I overlooked Windows: cat and sed (and also jq) are not installed on Windows by default.
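For readers without cat, sed, or jq, the same idea can be sketched in Python. This is only a rough equivalent of the jq script above, not a port of it. It assumes the file holds one JSON object per line with the fields `cid`, `author`, `votes`, `time`, and `text`, and that a reply's `cid` has the form `<parent>.<child>`. The function names and the exact indentation widths are illustrative choices. Unlike jq's `group_by`, which sorts groups by key, this sketch keeps threads in the order they first appear.

```python
import json


def load_comments(lines):
    """Parse line-delimited JSON: one comment object per non-empty line."""
    return [json.loads(line) for line in lines if line.strip()]


def thread_id(comment):
    """Assumed cid layout: a reply is '<parent>.<child>'; keep the parent part."""
    return comment["cid"].split(".", 1)[0]


def format_threads(comments):
    """Group comments by thread and render them as indented plain text."""
    threads = {}
    for comment in comments:
        # dicts keep insertion order, so threads stay in file order
        threads.setdefault(thread_id(comment), []).append(comment)

    out = []
    for group in threads.values():
        out.append("---")
        for i, comment in enumerate(group):
            # first entry of a group is treated as the top-level comment
            head_pad = "" if i == 0 else " " * 4
            text_pad = " " * 2 if i == 0 else " " * 6
            out.append(
                f'{head_pad}{comment["author"]} '
                f'({comment["votes"]} : {comment["time"]}):'
            )
            out.extend(text_pad + line for line in comment["text"].split("\n"))
    out.append("---")
    return "\n".join(out)


# Possible usage (file name is a placeholder):
# with open("comment.json", encoding="utf-8") as f:
#     print(format_threads(load_comments(f)))
```

Using an f-string for the header means `votes` may be either a string or a number without raising a type error.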
For some reason your script printed each comment 15 times.
Adding
| sed 's/^ *//g' | awk '{$1=$1}1' | awk ' !x[$0]++' | sed 's/\(.*):\)/\n\1/g'
solved the issue.
Thank you for replying! I see no problem in my environment (jq-1.6, BSD sed and BSD cat). I think your original JSON file contains the original comments looped 15 times. Run grep "some cid" /path/to/comment.json to check. If that is the case, adding sort -u at the start of my script will solve the issue.
I'm interested in your additional commands. Could you tell me how | sed 's/^ *//g' | awk '{$1=$1}1' | awk ' !x[$0]++' | sed 's/\(.*):\)/\n\1/g' works?
You are right, the original comment.json contained the same comments multiple times. I wonder why.
The first sed deletes the spaces at the start of each line, because identical comments had different numbers of leading spaces; awk can then print each duplicated line only once. The last sed adds a blank line before the line with the author's name, to separate two comments. If I had used only sort -u at the end, it would have messed everything up, because every line would be sorted and the author's name would no longer be associated with the right comment. I wasn't aware that the problem originated in the JSON file, so adding sort -u at the beginning is better, but it could still be an issue because the comments would no longer be sorted by time.
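The order-preserving trick behind awk '!x[$0]++' (print a line only the first time it is seen) can also be applied to the raw file before formatting, which avoids the reordering that sort -u causes. A minimal sketch in Python, assuming one JSON object per line and that the `cid` field uniquely identifies a comment; `dedupe_by_cid` is a hypothetical helper name:

```python
import json


def dedupe_by_cid(lines):
    """Keep the first line seen for each cid, preserving the original order."""
    seen = set()
    unique = []
    for line in lines:
        if not line.strip():
            continue  # skip blank lines
        cid = json.loads(line)["cid"]
        if cid not in seen:
            seen.add(cid)
            unique.append(line)
    return unique
```

Because nothing is sorted, replies stay next to their parent comment and the original time order is kept.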
I learned some awk techniques. Thanks!
That was a bug, but it has been fixed. See #68.
The spaces at the start of a line distinguish the comment types. Normal comments have 2 spaces and replies have 6 spaces. ~~This script does not sort by time because that would break the relation between normal comments and replies.~~
But you are right, sort -u will break the relation. Use sed -n '1h; 1!G; /^\(.*\)\n\1/q; P' instead of sort -u to solve this issue.
~~I think sort -u does not break the relation because of the cid rule, but~~ using sed -n '1h; 1!G; /^\(.*\)\n\1/q; P' is a good idea because sort -u is slow. Or your JSON file does not contain any replies because of the bug.
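As I read it, the sed command above stores the first line, prints lines as they come, and quits as soon as the first line shows up again, so only the first loop of a repeated file is kept. A rough Python sketch of that behaviour follows. It uses exact line equality, whereas the sed regex is closer to a prefix match, and `first_loop` is a hypothetical name:

```python
def first_loop(lines):
    """Return lines up to (not including) the first repeat of the first line."""
    iterator = iter(lines)
    try:
        first = next(iterator)
    except StopIteration:
        return []  # empty input

    kept = [first]
    for line in iterator:
        if line == first:
            break  # the file has started repeating itself
        kept.append(line)
    return kept
```

This stops reading at the first repetition instead of scanning the whole file, which is why it is cheaper than sorting.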
Wow, don’t tell me! TextEdit also works.
On 1 Mar 2021, at 19:07, archmord wrote:
You can view the JSON file in the Firefox browser.
@terrypa Use VSCode. It can also auto-format JSON files.
Perhaps the author could add a command-line option to pretty-print the output, e.g. --pretty 1
The newest version has a --pretty option, which will change the output format to indented JSON.
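For a file that was already downloaded in the older one-object-per-line format, similar indented output can be produced afterwards. The sketch below is not the tool's own implementation and its output layout may differ from what --pretty writes; it only uses Python's standard json module, and `pretty_print_lines` is a hypothetical helper name:

```python
import json


def pretty_print_lines(lines, indent=4):
    """Turn line-delimited JSON into one indented JSON array (as a string)."""
    comments = [json.loads(line) for line in lines if line.strip()]
    # ensure_ascii=False keeps non-ASCII comment text readable
    return json.dumps(comments, indent=indent, ensure_ascii=False)


# Possible usage (file names are placeholders):
# with open("comment.json", encoding="utf-8") as src:
#     text = pretty_print_lines(src)
# with open("comment.pretty.json", "w", encoding="utf-8") as dst:
#     dst.write(text)
```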