new cc format for msnbc?
Carlos:
Can you take a look at this ts file?
If you remember, you created a custom version of ccextractor for the Hauppauge TV tuner.
There now seem to be periods inserted into the text.
Attached are the files.
You can download the ts file here:
https://www.dropbox.com/scl/fi/4b1y86efag39sjnmm65hs/all_in_with_chris_hayes_20250326_1958.ts?rlkey=tyid6blj5hvsbyhg1mxs9nvr8&st=557jrkq8&dl=0
Again, this is from a Hauppauge tv tuner.
I have experienced issues from a potential hacker.
Sincerely, William Johnston
I'm not super involved in the code these days but someone that is currently active will take a look ASAP.
Playing with the ccx_encoders_srt.c file, I found this "solution" that removes periods in places like the example you showed, @williamj77. However, it also strips periods from the ends of all sentences, which affects expected output in other cases. That's why I'm not making a pull request: it can affect other use cases. Still, it might help someone else refine the logic.
Hello,
Is the output correct?
If so, can you send me the Windows exe?
I am more of a C#/Java developer.
Sincerely, William Johnston
From: David Sent: Wednesday, April 23, 2025 6:19 PM To: CCExtractor/ccextractor Cc: William Johnston ; Mention Subject: Re: [CCExtractor/ccextractor] new cc format for msnbc? (Issue #1681)
321david123 left a comment (CCExtractor/ccextractor#1681), attaching ccx_encoders_srt.c.zip and Screenshot.2025-04-23.at.5.17.25.PM.png.
It does work, @williamj77. Here's the complete file:
If you want to try it yourself on other files, you just have to replace the encoder with the one in my last comment and follow the standard build instructions. Hope this helps!
If the problem is with the input, we shouldn't do anything. Periods can be removed by a post script if needed.
If the problem is that we're not processing the input correctly, then we should figure out what's going on. But if players such as VLC display the periods, then they're just there and there's nothing for us to fix.
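For anyone who wants to go the post-processing route mentioned above, here is a minimal Python sketch. It is not the srt-cleaner tool or the patched encoder from this thread. It assumes, without having inspected the actual output, that the unwanted periods appear as lone "." tokens with whitespace (or a line edge) on both sides; the function names are made up for illustration. Sentence-final periods and ellipses are left alone.

```python
import re

# Assumption: a stray period is a lone "." with whitespace (or a line edge)
# on both sides. Periods attached to a word, and ellipses, are not matched.
STRAY_PERIOD = re.compile(r"(?<!\S)\.(?!\S)")
# SRT timing lines look like "00:00:01,000 --> 00:00:03,000".
TIMING_LINE = re.compile(r"^\d{2}:\d{2}:\d{2},\d{3} --> ")


def clean_caption_line(line: str) -> str:
    """Drop lone periods and collapse the spaces they leave behind."""
    without_periods = STRAY_PERIOD.sub("", line)
    return re.sub(r" {2,}", " ", without_periods).strip()


def clean_srt(text: str) -> str:
    """Clean caption text lines; pass cue numbers and timing lines through."""
    cleaned = []
    for line in text.splitlines():
        if line.strip().isdigit() or TIMING_LINE.match(line):
            cleaned.append(line)
        else:
            cleaned.append(clean_caption_line(line))
    return "\n".join(cleaned)
```

To use it, read the .srt file into a string, pass it to clean_srt, and write the result to a new file. If the real artifacts look different (for example, periods glued to the following word), the STRAY_PERIOD pattern would need to change.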
Carlos:
The cc is correct for a different tv channel.
And I have been experiencing hacker issues.
Again, I was wondering if the output is correct for the updated code.
Sincerely, William Johnston
From: Carlos Fernandez Sanz Sent: Friday, April 25, 2025 11:31 AM To: CCExtractor/ccextractor Cc: William Johnston ; Mention Subject: Re: [CCExtractor/ccextractor] new cc format for msnbc? (Issue #1681)
Can anyone build and send me the exe for the updated code?
I am not a native C++ developer anymore.
Hello William,
Here is a binary of the latest CCExtractor code. Hope this helps :)
CCExtractor latest: https://drive.google.com/file/d/1VetQZd559QRFrGFG-_HvB3BKKRIpg39W/view?usp=share_link
Thanks for the exe.
Can you include this updated code?
From: Vatsal Keshav Sent: Monday, April 28, 2025 4:15 PM To: CCExtractor/ccextractor Cc: William Johnston ; Mention Subject: Re: [CCExtractor/ccextractor] new cc format for msnbc? (Issue #1681)
Sure, @321david123's code seems to output the same captions as before. It'll work great with a little refining. Until then, here's a post-processing binary for removing unnecessary periods: srt-cleaner (https://drive.google.com/file/d/1YsXf_y5mRu7JNSSFwxR9eXeVaHofQuUR/view?usp=share_link), GitHub: https://github.com/vats004/srt-cleaner
Use it like this:
$ srt-cleaner your_input.srt your_output.srt
Edit:
I tried running it with if (ccx_options.hauppauge_mode) { /* period-removing logic */ }, but @321david123's logic works without that.
Here is a binary of the code as of 29 Apr 2025 plus the period-removing logic contributed by @321david123:
updated ccxr
Thanks again.
But please take a look at the srt file with added periods.
Again, can you create an exe with the updated code?
From: Vatsal Keshav Sent: Tuesday, April 29, 2025 7:19 AM To: CCExtractor/ccextractor Cc: William Johnston ; Mention Subject: Re: [CCExtractor/ccextractor] new cc format for msnbc? (Issue #1681)
Hi, if you're using Windows, you could just run the Docker build to test different files.
Here are the instructions to run the main branch. Replace /path/to/video/ with the location of your file, then copy and paste into a terminal. The $(pwd) and $(id -u) substitutions are POSIX shell syntax, so on Windows these commands assume a shell such as WSL rather than cmd.exe.
git clone https://github.com/CCExtractor/ccextractor.git
cd ccextractor/docker
docker build --platform linux/amd64 -t ccextractor .
cp /path/to/video/all_in_with_chris_hayes_20250326_1958.ts .
docker run --rm -v "$(pwd):$(pwd)" -w "$(pwd)" --user $(id -u):$(id -g) ccextractor:latest all_in_with_chris_hayes_20250326_1958.ts --hauppauge -o output.srt
For testing another file, just run:
docker run --rm -v "$(pwd):$(pwd)" -w "$(pwd)" --user $(id -u):$(id -g) ccextractor:latest <YOURFILE> --hauppauge -o output.srt
If you want to run it with 321david123's new SRT encoder, I've made a branch with the updated code (credit for the code goes to 321david123):
git clone https://github.com/steel-bucket/ccextractor/ -b 321david123-FIX
cd ccextractor/docker
docker build --platform linux/amd64 -t ccextractor .
cp /path/to/video/all_in_with_chris_hayes_20250326_1958.ts .
docker run --rm -v "$(pwd):$(pwd)" -w "$(pwd)" --user $(id -u):$(id -g) ccextractor:latest all_in_with_chris_hayes_20250326_1958.ts --hauppauge -o output.srt
This is for testing files. If there's a need for an exe, we can prepare one.
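Since the docker run line relies on shell substitutions, one way to sidestep shell differences is to build the argument list in a small script. The sketch below is only an illustration: docker_run_args is a made-up helper, and mounting the video's folder at /work inside the container is an assumption of this sketch (the commands above mirror the host path instead). The image name and the --hauppauge and -o arguments are taken from the commands in this comment.

```python
import os
from pathlib import Path


def docker_run_args(video: Path, output_name: str = "output.srt") -> list[str]:
    """Build the argument list for the `docker run` call shown in the thread."""
    host_dir = video.resolve().parent
    container_dir = "/work"  # hypothetical mount point inside the container
    args = [
        "docker", "run", "--rm",
        "-v", f"{host_dir}:{container_dir}",
        "-w", container_dir,
    ]
    # os.getuid/os.getgid exist only on POSIX systems, so --user is skipped elsewhere.
    if hasattr(os, "getuid"):
        args += ["--user", f"{os.getuid()}:{os.getgid()}"]
    args += ["ccextractor:latest", video.name, "--hauppauge", "-o", output_name]
    return args
```

The list can then be handed to subprocess.run(docker_run_args(Path("your_file.ts")), check=True), which passes each argument to Docker without going through a shell.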
Hello,
I don’t like docker on my machine.
Is there a workaround?
From: Deepnarayan Sett Sent: Tuesday, April 29, 2025 1:13 PM To: CCExtractor/ccextractor Cc: William Johnston ; Mention Subject: Re: [CCExtractor/ccextractor] new cc format for msnbc? (Issue #1681)