ccextractor appears to ignore -xmltv parameter
In console mode, both versions 0.94 and 0.89, with the following command
.\ccextractorwinfull.exe C:\F\TestFullTS.ts -xmltv N,
where N=1, 2, or 3, produce only an .srt file, the same as if -xmltv were omitted. If this is as designed, what is missing from my syntax to get an EPG.xmltv file?
Your command syntax looks correct: .\ccextractorwinfull.exe C:\F\TestFullTS.ts -xmltv N
However, XMLTV generation requires specific conditions to be met:
Primary Requirements for XMLTV Output---->
- EPG Data Must Be Present: The transport stream file must contain actual EPG (Electronic Program Guide) data in the correct format:
  - DVB streams: EPG data in PID 0x12 (EIT - Event Information Tables)
  - ATSC streams: EPG data in PIDs ≥ 0x1000
- Stream Must Contain Required Tables:
  - SDT (Service Description Table): Contains channel/service information
  - EIT (Event Information Table): Contains program scheduling data
What's Likely Happening---->
The most probable reason you're not getting an XMLTV file is that your TestFullTS.ts file doesn't contain EPG data. When there's no EPG data to process, CCExtractor will:
- Still process the captions (creating the .srt file)
- Skip XMLTV generation (since there's no EPG data to convert)
Solution (most likely) ---->
XMLTV generation is completely dependent on the source file containing EPG data. If your TS file doesn't have embedded Electronic Program Guide information, CCExtractor will only extract captions (.srt) and won't generate any XMLTV output.
Try testing with a different TS file that you know contains EPG data, or verify that your current file actually has program guide information embedded in the transport stream.
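One rough way to check a capture before running CCExtractor is to count packets per PID. The sketch below is only an illustration in Python, not part of CCExtractor: it assumes an aligned buffer of 188-byte packets, the helper names are hypothetical, and it uses the PID ranges quoted above (0x12 for DVB EIT, PIDs ≥ 0x1000 for ATSC). Packets on those PIDs do not prove that usable EIT sections are present; they only show that the file is worth a closer look.

```python
from collections import Counter

TS_PACKET_SIZE = 188
SYNC_BYTE = 0x47
NULL_PID = 0x1FFF          # stuffing packets, never EPG
DVB_EIT_PID = 0x0012       # DVB EIT PID quoted in this thread
ATSC_MIN_EPG_PID = 0x1000  # lower bound quoted in this thread for ATSC


def count_pids(data: bytes) -> Counter:
    """Count packets per PID in a buffer of back-to-back 188-byte packets."""
    counts = Counter()
    for pos in range(0, len(data) - TS_PACKET_SIZE + 1, TS_PACKET_SIZE):
        if data[pos] != SYNC_BYTE:
            continue  # misaligned packet; a real tool would resynchronize
        # The PID is the low 5 bits of byte 1 followed by all of byte 2.
        pid = ((data[pos + 1] & 0x1F) << 8) | data[pos + 2]
        counts[pid] += 1
    return counts


def candidate_epg_pids(counts: Counter) -> list:
    """Return the PIDs that fall in the ranges named in this thread."""
    return sorted(
        pid for pid in counts
        if pid == DVB_EIT_PID or ATSC_MIN_EPG_PID <= pid < NULL_PID
    )


def make_packet(pid: int) -> bytes:
    """Build a dummy packet with the given PID (payload is zero filler)."""
    header = bytes([SYNC_BYTE, (pid >> 8) & 0x1F, pid & 0xFF, 0x10])
    return header + bytes(TS_PACKET_SIZE - len(header))


if __name__ == "__main__":
    # Synthetic data; for a real capture, read the file in binary mode instead.
    sample = (make_packet(0x0012) + make_packet(0x1E00)
              + make_packet(0x0100) + make_packet(NULL_PID))
    print([hex(pid) for pid in candidate_epg_pids(count_pids(sample))])
```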
Thanks for your reply. I apologize for not supplying more information about file TestFullTS.ts which was created via hdhomerun_config.exe by specifying only the channel and saving the full TS. TSReader confirmed just now that TestFullTS.ts does have the needed tables, but I see that its events (now several days past) are not included in TSReader's html report.
So I created a new file, tested it with TSReader (see attached report) and the -xmltv parameter still does not output a .xml XMLTV file but rather just the .srt file (or files, if I specify also -multiprogram). Are you sure that this functionality currently works?
Hmm. I see that you mention "SDT" as a necessary table, whereas my TS appears to have service descriptions instead in "VCT" and "TVCT" tables. Can this be the problem?
Dealing with different broadcast standards---->
- DVB (European standard): Uses SDT (Service Description Table) for channel information
- ATSC (North American standard): Uses VCT/TVCT (Virtual Channel Table/Terrestrial Virtual Channel Table) for channel information
The codebase has different levels of support for each---->
- DVB/SDT support: Mature and well-tested
- ATSC/VCT support: Present but less robust
What's Happening---->
- The codebase is designed for both standards but the ATSC implementation has gaps
- VCT tables are being detected and parsed (channel info extracted)
- The mapping between VCT data and EIT events may be failing
- No XMLTV output because the codebase can't associate events with channels properly.
Bottom Line---->
VCT/TVCT vs SDT might be the problem. XMLTV functionality works well with DVB/SDT streams but has limitations with ATSC/VCT streams. This does not appear to be a user error.
Thanks, that explanation makes sense. Can I help you to improve ATSC support by supplying more sample TS files or other information? E.g., I could supply links to several-minute HDHR-saved files or more html outputs from TSReader, etc.
Also, my knowledge of C is limited and I know nothing of Rust, but I am a competent coder in other languages so if you can pinpoint the source code area where the VCT/TVCT decoding is done I may be able to spot problems.
Please advise.
Let me look into it? Is it okay with you?
Go for it
Problem---->
- ATSC streams with programs (nb_program > 0) would NOT generate XMLTV files
- EPG data was stored in TS_PMT_MAP_SIZE fallback but never output
- Result: Empty XMLTV files despite having valid EPG data
After solution---->
- ATSC streams with programs WILL generate XMLTV files
- EPG data from TS_PMT_MAP_SIZE fallback would now be included in output
- Result: Complete XMLTV files with channel and program information
Expected User Experience---->
After the fix, when running: .\ccextractorwinfull.exe C:\F\TestFullTS.ts -xmltv 1
The user should get---->
- SRT file (as before)
- EPG.xmltv file
The XMLTV file would contain both the regular channel/program data AND the ATSC-specific data that was previously being ignored.
This is the plan for the changes to be made. Is there any other problem that needs to be solved? If so, please describe it. If not, I would like to proceed.
Your solution appears perfect. For clarification: I think that the EPG.xmltv file content will be limited to events for the program of the -pn parameter (or the automatically selected program if no -pn). Is that correct?
I look forward to testing this new version, thanks!
I think that the below was posted here by mistake and then deleted by tmdeveloper007. Correct?
================================= Subject: Re: [CCExtractor/ccextractor] ccextractor appears to ignore -xmltv parameter (Issue #1759)
tmdeveloper007 left a comment (CCExtractor/ccextractor#1759) https://github.com/CCExtractor/ccextractor/issues/1759#issuecomment-3506556155
- GitHub issue #1759 https://github.com/CCExtractor/ccextractor/issues/1759 : ATSC XMLTV generation failed for ATSC streams with VCT/TVCT tables
- CMake build system couldn't integrate Rust components due to Corrosion dependency failures
- Rust toolchain incompatibility (edition 2024 vs Rust 1.73)
- Build system fragmentation between C and Rust components
I came across these problems while reviewing the codebase. If you have run into any others, please list them so they can be addressed.
Just wanted to know if anybody else was facing any problems.
Is there any progress on this issue? Can I supply any data to help?
Hi @TPeterson94070! I would like to work on a fix for this issue. Could you please re-upload the sample channel5FullTS.ts (or a short clip)? The TSReader report is visible, but the actual transport stream isn’t available anymore, and it would help verify the fix properly. Thanks!
Welcome!
Here is a link to the original sample: https://drive.google.com/file/d/1iC2jDCGJOr_XvKrCi3MAdxBbhaN7m6Jl/view?usp=sharing
And here is another old full TS sample link: https://drive.google.com/file/d/1DlqcrplHXaUb9DLZfIYfmXVKj4Uu4hMt/view?usp=sharing
These both contain EIT packets, but of course they're all in the past and would not appear in an EPG table. When you are ready to test a sample with "live" events I can provide a new link.
Thanks for providing those samples! I'll surely let you know when I'm ready to test with live events.
Hi @TPeterson94070 ,
I've implemented full support for ATSC EIT (0xCB) and VCT (0xC8) parsing, which restores XMLTV generation for ATSC streams. Both channel5FullTS.ts and ch12FullTS.ts now produce valid, populated XMLTV output.
Running:
./ccextractor channel5FullTS.ts --xmltv 1
now produces a valid XMLTV file containing:
- channel listings extracted from the VCT
- correct program schedules from EIT-0/1/2/3
- proper start/stop UTC timestamps
- titles and subtitles
- unique ts-meta-id values matching the EIT event IDs
Changes made:
- Fixed inverted CHECK_OFFSET logic that prevented ATSC EIT parsing completion
- Modified EPG_output() to always output events from fallback storage, not just when nb_program==0
- Extended support for all ATSC EIT tables (0xCB-0xD0) and Cable VCT (0xC9)
I'm attaching the generated XMLTV output for reference.
There are still a couple of accuracy issues visible in the output that are outside the scope of this fix, for example:
- one program entry contains an incorrect date/time (2047…)
- channel IDs currently appear as numeric values (3, 4, 5, etc.) instead of full channel names from the VCT (e.g. "ABC7", "Localish", etc.)
- a program mapped to channel="0" near the end
These likely require additional work in:
- VCT channel name extraction
- EIT time conversion / MJD handling
I think addressing those would be best done in a follow-up PR, since they are not directly related to recognizing and parsing the ATSC tables.
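On the time-conversion point, a small sketch may help frame the follow-up. As far as I recall from ATSC A/65, MJD is the DVB encoding, while an ATSC EIT start_time is a count of GPS seconds since 1980-01-06 00:00:00 UTC, corrected by the GPS_UTC_offset carried in the System Time Table. The Python below assumes that layout; the function names are hypothetical and the default offset of 18 seconds is an assumption, since a real decoder would read it from the stream.

```python
from datetime import datetime, timedelta, timezone

GPS_EPOCH = datetime(1980, 1, 6, tzinfo=timezone.utc)
ASSUMED_GPS_UTC_OFFSET = 18  # seconds; assumption, normally read from the STT


def atsc_time_to_utc(start_time: int,
                     gps_utc_offset: int = ASSUMED_GPS_UTC_OFFSET) -> datetime:
    """Convert ATSC GPS seconds to an aware UTC datetime."""
    return GPS_EPOCH + timedelta(seconds=start_time - gps_utc_offset)


def xmltv_timestamp(moment: datetime) -> str:
    """Format a UTC datetime in the XMLTV 'YYYYMMDDhhmmss +0000' style."""
    return moment.astimezone(timezone.utc).strftime("%Y%m%d%H%M%S +0000")


if __name__ == "__main__":
    # Round trip for the 10:00 PST (18:00 UTC) event mentioned in this thread.
    wanted = datetime(2025, 11, 26, 18, 0, 0, tzinfo=timezone.utc)
    gps_seconds = int((wanted - GPS_EPOCH).total_seconds()) + ASSUMED_GPS_UTC_OFFSET
    start = atsc_time_to_utc(gps_seconds)
    stop = start + timedelta(hours=1)  # length_in_seconds = 3600
    print(xmltv_timestamp(start), xmltv_timestamp(stop))
```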
I'm also ready to test these changes against live broadcast streams to confirm behavior beyond the provided sample file. Let me know if you'd like me to continue with that work next.
Hi, @x15sr71 !
That looks like excellent progress! Here is a link to a new full TS file for a channel with 5 programs and EIT data from now (10:00 PST 26 Nov) to 10:00 PST 28 Nov:
ch29FullTS.ts. And here is the TSReader html view of that file: ch29FullTS.htm. I chose this channel because it has longer EIT records than most seem to have, so it gives the longest "future" to the .ts file. This is how it appears in TSReader:
Please let me know how I can try out your new version.
Hi @TPeterson94070, Thanks for providing the sample TS files and the TSReader HTML output. I’ve tested them on my end, and I’m now getting the expected .xml XMLTV output.
I’ve opened this PR so you can verify the behavior with your own test streams as well. Please let me know what results you get or if you notice anything that still needs adjustment. I’m happy to make any additional changes based on your findings.
Hi @x15sr71 !
Thanks for your great work on this. I am anxious to try it out on more samples but, IIUC, I need to build an executable from your repo to test it locally. Unfortunately, I don't have experience with the tools used to make that build. Am I overlooking a shortcut? If so, please point me in the right direction to find it.
@x15sr71 , I think there may be an EIT parsing error in the current fix. All of the <program> items in the test TS file xml outputs have the same values for <title> and <sub-title>. I would expect such fields to be distinct in general. Also, note that the TSReader html output of an EIT event seems to have different names and fields, with "Name" corresponding to <title> I think and a "Description" rather than a <sub-title>. See the following example html item:
Starts: 11/26/2025 10:00:00 AM
Length: 01:00:00
EIT Source: n/a
Name: The Price Is Right
Description: A Thanksgiving spectacular overflowing with cash, cars and luxury vacations.
Descriptor: ATSC Content Advisory Descriptor
ATSC Content Advisory Descriptor:
Region 1 Rating: TV-G Description: TV-G
Descriptor: ATSC AC-3 audio Descriptor
ATSC AC3 Descriptor
Sample Rate: 48 or 44.1 or 32 KHz Bitrate: 384 Kbps (exact)
Bitstream mode: complete main Audio Coding Mode: 3/2 5 L, C, R, SL, SR
Descriptor: ATSC AC-3 audio Descriptor
ATSC AC3 Descriptor
Sample Rate: 48 or 44.1 or 32 KHz Bitrate: 192 Kbps (exact)
Bitstream mode: dialogue Audio Coding Mode: 2/0 L, R
Hi @x15sr71 !
With the help of numerous AI chats I've learned how to run the build scripts in the repo's Actions. I've discovered that the Build for Windows script fails because of a hash mismatch from one of its referenced resource downloads, so I wasn't able to test using Windows. (I see that you reported the Windows build issue in PR1769) However, the Linux build did work, so I was able to run PR1773 using WSL and confirm the same results as you've reported.
It seems that there is still significant work remaining to get the EPG items to comport with what TSReader shows. Are you still interested in fixing this issue?
Hi @TPeterson94070,
I apologize for the delayed response - I've been dealing with university end-semester exams but am now fully back and committed to resolving this issue.
Progress Update
I've successfully implemented ATSC ETT (Extended Text Table) parsing and fixed the subtitle duplication issue you reported. The XMLTV output now correctly generates:
- <title> for event names (from EIT title_text)
- <desc> for extended descriptions (from ETT extended_text_message)
- No duplicate subtitle fields
Important: The original codebase incorrectly routed table_id 0xCC (ETT) to the EIT decoder, which is why extended descriptions were never extracted. I've implemented a dedicated ETT parser from scratch.
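The routing idea can be sketched as a dispatch on table_id. This is a hypothetical Python illustration, not the CCExtractor C code: the handler functions are stand-ins, and only the table IDs named in this thread (0xC8, 0xC9, 0xCB, 0xCC) are included.

```python
def decode_vct(section: bytes):
    """Stand-in for a VCT decoder (channel information)."""
    return ("VCT", len(section))


def decode_eit(section: bytes):
    """Stand-in for an EIT decoder (event titles and times)."""
    return ("EIT", len(section))


def decode_ett(section: bytes):
    """Stand-in for an ETT decoder (extended text)."""
    return ("ETT", len(section))


# One handler per table_id, so 0xCC can never fall through to the EIT decoder.
HANDLERS = {
    0xC8: decode_vct,  # terrestrial VCT
    0xC9: decode_vct,  # cable VCT
    0xCB: decode_eit,
    0xCC: decode_ett,
}


def route_section(section: bytes):
    """Dispatch a PSIP section on its first byte; unknown tables return None."""
    if not section:
        return None
    handler = HANDLERS.get(section[0])
    return handler(section) if handler else None


if __name__ == "__main__":
    for table_id in (0xC8, 0xCB, 0xCC, 0x00):
        print(hex(table_id), route_section(bytes([table_id, 0x00, 0x00])))
```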
Current Status with ch29FullTS.ts
When processing your test file, the XML output currently shows no <desc> tags. This is expected behavior due to timing - here's why:
Root Cause: Event ID Timing Mismatch
ATSC broadcasters transmit EIT and ETT tables on different schedules:
- EIT repeats every few seconds with events for the next 3-12 hours
- ETT cycles slowly through descriptions for a subset of events
Your 2-minute sample captured:
- EIT events: 0x13C6, 0x1413, 0x1414, 0x1416, etc. (currently airing programs)
- ETT descriptions: 0x0F52, 0x0F5A, 0x0FD6, 0x0F12, etc. (past/future programs)
Zero event ID overlap = No matched descriptions
Implementation Verification (Synthetic Test)
To validate my new ETT implementation, I temporarily injected a synthetic event with ID 0x00020F12 (matching one of the ETT messages in your stream). Results:
-> ETT MATCHED: full_id=0x00020F12 in pmt_map=1 title='SYNTHETIC TEST EVENT'
-> ETT TEXT: Lorelai and Rory work at the diner while Luke arra... (lang=eng)
The XML correctly generated:
<desc lang="eng">Lorelai and Rory work at the diner while Luke arranges a funeral.</desc>
This validates my implementation:
- ETT table parsing (new functionality)
- Event matching by (source_id << 16) | event_id
- Text extraction from multiple_string_structure format
- XML <desc> tag generation
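The matching key can be illustrated with a short Python sketch. The first helper is the composition described above. The second reflects my reading of ATSC A/65, which should be treated as an assumption to check against the spec rather than a statement about the PR: an event ETM_id carries source_id in bits 31..16, event_id in bits 15..2, and the value 0b10 in the two low bits. Both function names are hypothetical. If that layout applies, the low 16 bits of an ETM_id need a 2-bit shift before they can be compared with an EIT event_id.

```python
def full_event_id(source_id: int, event_id: int) -> int:
    """Key as described in this thread: (source_id << 16) | event_id."""
    return ((source_id & 0xFFFF) << 16) | (event_id & 0xFFFF)


def split_event_etm_id(etm_id: int):
    """Split an ETM_id under the assumed A/65 layout.

    Returns (source_id, event_id, is_event_etm).
    """
    source_id = (etm_id >> 16) & 0xFFFF
    event_id = (etm_id >> 2) & 0x3FFF       # 14-bit event_id in bits 15..2
    is_event_etm = (etm_id & 0x3) == 0x2    # low bits 0b10 mark an event ETM
    return source_id, event_id, is_event_etm


if __name__ == "__main__":
    # The synthetic test ID from this thread.
    print(hex(full_event_id(0x0002, 0x0F12)))
    # One of the ETM IDs reported later in this thread.
    source_id, event_id, is_event = split_event_etm_id(0x00014F0E)
    print(source_id, hex(event_id), is_event)
```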
Request for Testing
Could you provide a 15-30 minute recording sample? I have the complete ETT implementation on my local machine that needs validation before pushing to the PR. Once I can confirm it works with a longer sample that has overlapping EIT/ETT cycles, I'll push the complete implementation for you to test on your hardware.
What's Currently in the PR
The draft PR currently contains:
- Corrected EIT bounds checks (fixes < offset_end → > offset_end logic errors)
- Extended EIT table ID support (0xCD–0xD0)
- XMLTV formatting improvements (proper programme/title/desc tags)
- Fallback storage checks for ATSC streams
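The bounds-check direction in the first item can be shown generically. This Python sketch is not the CCExtractor macro; read_bytes is a hypothetical helper that refuses a read only when it would run past the end of the section. With the comparison inverted, every valid read would be rejected and parsing would stop early.

```python
def read_bytes(buf: bytes, offset: int, size: int):
    """Return (data, new_offset), or (None, offset) if the read would overrun."""
    offset_end = len(buf)
    if offset + size > offset_end:  # overrun: stop parsing this section
        return None, offset
    return buf[offset:offset + size], offset + size


if __name__ == "__main__":
    section = b"\x04ABCD"            # one length byte followed by four bytes of text
    length = section[0]
    print(read_bytes(section, 1, length))      # fits inside the section
    print(read_bytes(section, 1, length + 1))  # would overrun, so it is refused
```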
What's Ready Locally (Not Yet Pushed)
- Brand new ETT parser (EPG_ATSC_decode_ETT() - previously missing from codebase)
- ETT text extraction from multiple_string_structure format
- Event matching between EIT and ETT by full_event_id
- Proper <desc> tag generation from ETT extended text
- Table routing fix (separates 0xCC from EIT cases)
I'm confident the new implementation is correct based on my testing, but I'd like to validate it against a real-world sample with overlapping EIT/ETT data before pushing to the PR. This will allow us to see real <desc> tags populated from broadcast ETT data.
Hi @x15sr71 !
Welcome back. I hope your exams went well.
I'll generate and post a new full-TS 30-minute sample file today.
When you repost PR1773 for me to test, please let me know which, if any, of the other 32 pending PRs I need to add to complete your fix. I hope somebody can fix the Windows-build script issue soon!
I've posted a new 30-minute full-TS file here from the same channel as the previous clip.
Hi @TPeterson94070,
XMLTV results from your 20251205ch29FullTS.ts (30-min sample):
- 331 programs (5 VCT channels)
- 322 titles (97% EIT coverage)
- 5 descriptions
Please review. If good, I'll push complete fix to #1773.
Note: only 5 descriptions is expected. ATSC ETT tables are sparse; broadcasters transmit extended text intermittently and usually only for select programmes.
This will remain a self-contained PR, no additional PRs or dependencies required.
Hi @x15sr71 !
Thanks for the update. I've compared your xml with TSReaderLite's html export, 20251205ch29FullTS.htm, that I created today from the 30-minute file and there are some significant differences. Unfortunately, the html only shows the EIT data for programs that are current or future, so I should have preserved it yesterday when I captured it. But the attached one made just now shows 234 "Description:" entries (one for each program, with "n/a" for those not having data) and only 25 "Description: n/a" entries. This means that PR1773 must be still missing many descriptions. {EDIT: I discovered that "Description:" occurs more than once/EIT item. The actual number of programs, determined from "Starts:" or "EIT source:" is 156}
For further testing, I'll replace the existing 30-minute file with another today and will generate the html file immediately so you can have a complete version for comparison with PR1773.
I've made a new 30-minute capture from rf channel 29 and posted it at: 20251206ch29FullTS.ts.
Here is the TSReaderLite html export, 20251206ch29FullTS.htm, from it, showing 298 programs with 42 "n/a" descriptions.
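Since "Description:" can occur more than once per item, a count anchored on line starts is less error-prone than a plain substring count. The Python sketch below assumes the export has been reduced to plain text with one field per line, as in the excerpt quoted earlier in this thread; the real HTML would need its tags stripped first. The function name and the second sample event are hypothetical.

```python
def summarize_export(text: str) -> dict:
    """Count events and per-event descriptions in a plain-text export."""
    lines = [line.strip() for line in text.splitlines()]
    events = sum(1 for line in lines if line.startswith("Starts:"))
    # Only lines that begin with the label are per-event descriptions;
    # the content advisory line has "Description:" in the middle.
    descriptions = [line for line in lines if line.startswith("Description:")]
    missing = sum(1 for line in descriptions if line == "Description: n/a")
    return {
        "events": events,
        "with_description": len(descriptions) - missing,
        "missing": missing,
        "raw_description_hits": text.count("Description:"),
    }


SAMPLE = """\
Starts: 11/26/2025 10:00:00 AM
Name: The Price Is Right
Description: A Thanksgiving spectacular overflowing with cash, cars and luxury vacations.
Region 1 Rating: TV-G Description: TV-G
Starts: 11/26/2025 11:00:00 AM
Name: Hypothetical Show
Description: n/a
"""

if __name__ == "__main__":
    print(summarize_export(SAMPLE))
```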
@TPeterson94070,
Thanks for the detailed comparison with TSReader! You've caught something important that I need to explain.
What's Actually Happening
TSReader is showing you EIT data (the short descriptions that come with every event in table_id 0xCB-0xD0). Those 234 "Description:" fields you're seeing are actually from the EIT packets themselves - they're the brief summaries broadcasters include with each program listing.
My implementation is specifically looking for ETT data (Extended Text Table, table_id 0xCC), which contains longer, more detailed descriptions. These are transmitted separately and much more sparsely than the EIT data.
EIT vs ETT - The Key Distinction
| Tool | Data Source | Sample 1 (Dec 5) | Sample 2 (Dec 6) |
|---|---|---|---|
| TSReader | EIT (table_id 0xCB-0xD0) | 209 descriptions (234 total, 25 n/a) | 256 descriptions (298 total, 42 n/a) |
| CCExtractor | ETT (table_id 0xCC) | 5 descriptions | 5 descriptions |
Both tools are correct, we're just looking at different ATSC tables!
TSDuck Analysis - Sample 1: 20251205ch29FullTS.ts
I ran TSDuck on your first stream (30-min) to extract all ETT sections:
Command used:
tstables --usa --tid 204 20251205ch29FullTS.ts | grep -iE "extended|text|ett" > ett_analysis_20251205.txt
Attached: ett_analysis1205.txt - contains complete dump of ALL ETT sections from your 30-minute stream
The results show hundreds of ETT sections cycling through the broadcast, but during your 30-minute capture window, only 5 of them happened to match up with EIT events that were also present in the capture. This is completely normal for ATSC - broadcasters cycle through ETT descriptions slowly, so you only get matches when the timing lines up.
The 5 descriptions that matched (Dec 5):
- "Jack trusts Nikki with a secret..." (ETM 0x00014F0E, PID 0x1E00)
- "In the deaths of a wealthy Southern..." (ETM 0x00024EDE, PID 0x1E00)
- "Realizing that she can no longer..." (ETM 0x00034F76, PID 0x1E00)
- "Back Pain? Hip? Sleep well..." (ETM 0x00044EFE, PID 0x1E06)
- "Two medieval knights escort..." (ETM 0x00054F62, PID 0x1E0A)
TSDuck Analysis - Sample 2: 20251206ch29FullTS.ts
Command used:
tstables --usa --tid 204 20251206ch29FullTS.ts | grep -iE "extended|text|ett" > ett_analysis_20251206.txt
Attached: ett_analysis1206.txt - complete ETT section dump
The 5 descriptions that matched (Dec 6):
- "Braden Smith and the No. 1 Boilermakers..." (ETM 0x00014F0E, PID 0x1E00)
- "A tycoon's theme-park plans end in murder." (ETM 0x00024EDE, PID 0x1E00)
- "A billiards hall is at risk of closing..." (ETM 0x00034F76, PID 0x1E00)
- "A series of slashings is linked to a scalpel..." (ETM 0x00044EFE, PID 0x1E06)
- Similar pattern for 5th entry (ETM 0x00054F62, PID 0x1E0A)
Why the Difference?
- TSReader counts: 209/256 EIT descriptions (short summaries included with every event)
- My output shows: 5 ETT descriptions that matched with EIT events present in the capture window
Overlap explanation for the first sample file’s XML output:
"Back Pain..." and "Two medieval knights..." appear in both EIT and ETT because broadcasters often reuse short EIT summaries as ETT extended text. My parser correctly extracts only the ETT version (table_id 0xCC) for <desc> tags.
If you'd also like the shorter EIT descriptions in XMLTV output (in addition to ETT), I can add that as an enhancement, but the current implementation follows the ATSC standard where:
- <title> = EIT event name
- <desc> = ETT extended description (when available)
Let me know how you'd like to proceed: should I extend the parser to include both EIT and ETT descriptions, or would you prefer I keep the current ETT-only behavior and push the fix so you can test it?
@x15sr71 , thanks for the detailed explanation. My ultimate use-case is to use ccextractor's xmltv output to construct an EPG for PVR use. As such, including the short descriptions augmented with extended ones, when available, would be most useful. So, I'd like to have both if possible.
BTW, I understand your likely reflexive spelling of "programme", but since we're talking about ATSC, I think that the spelling in your first xml ("program") was more appropriate. ;)
@TPeterson94070,
Thanks for clarifying your use-case, that's really helpful context. Based on your PVR requirements, I'll add EIT short-description extraction while keeping the output XMLTV-compliant.
XMLTV structure:
- <title>: Event name (from EIT)
- <sub-title>: Short description (from EIT, what TSReader shows as "Description:")
- <desc>: Extended description (from ETT, when available)
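As a rough picture of the resulting entry, here is a Python sketch that builds one element with the standard library. It is an illustration only, not the CCExtractor output code: build_programme is a hypothetical helper, the channel id and the extended description are placeholders, and the element order (title, sub-title, desc) follows my understanding of the XMLTV DTD.

```python
import xml.etree.ElementTree as ET


def build_programme(channel, start, stop, title,
                    sub_title=None, desc=None, lang="eng"):
    """Build one XMLTV <programme> element; optional fields are skipped if empty."""
    programme = ET.Element(
        "programme", {"start": start, "stop": stop, "channel": channel}
    )
    ET.SubElement(programme, "title", {"lang": lang}).text = title
    if sub_title:
        ET.SubElement(programme, "sub-title", {"lang": lang}).text = sub_title
    if desc:
        ET.SubElement(programme, "desc", {"lang": lang}).text = desc
    return programme


if __name__ == "__main__":
    element = build_programme(
        channel="1",                       # placeholder channel id
        start="20251126180000 +0000",      # 10:00 PST expressed in UTC
        stop="20251126190000 +0000",
        title="The Price Is Right",
        sub_title="A Thanksgiving spectacular overflowing with cash, "
                  "cars and luxury vacations.",
        desc="Placeholder for ETT extended text, when the stream carries it.",
    )
    print(ET.tostring(element, encoding="unicode"))
```

When no ETT text matched an event, passing desc=None simply omits the <desc> element.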
I'll push the complete implementation to #1773 soon so you can test against your latest samples and verify the output matches your expectations.
And yes, good catch on "programme" vs "program"! 😄 I'll keep <programme> in the XMLTV since that's what the spec requires, but I appreciate the ATSC irony!