[FEAT] Port all the `common_*` files, teletext and networking portions of `libccx` to rust
In raising this pull request, I confirm the following (please check boxes):
- [x] I have read and understood the contributors guide.
- [x] I have checked that another pull request for this purpose does not exist.
- [x] I have considered, and confirmed that this submission will be valuable to others.
- [x] I accept that this submission may not be used, and the pull request closed at the will of the maintainer.
- [x] I give this submission freely, and claim no ownership to its content.
- [ ] I have mentioned this change in the changelog.
My familiarity with the project is as follows (check one):
- [ ] I have never used CCExtractor.
- [ ] I have used CCExtractor just a couple of times.
- [ ] I absolutely love CCExtractor, but have not contributed previously.
- [x] I am an active contributor to CCExtractor.
This PR aims to port most of the functions in the `common_*` files, `teletext.h`, `telxcc.c`, `networking.c`, and certain parts of `utility.c` (such as logging) to Rust. Certain parts of the code are left blank where they depend on code that this PR does not aim to convert. Those parts will have to be revisited once their dependencies are also converted to Rust.
I have placed the new Rust code inside a new crate named `lib_ccxr`, which is a normal Rust library crate. This is because a static library crate does not allow doctests. It also helps in separating the FFI code from the idiomatic Rust code. Since the Rust approach and the C approach to solving a problem are very different, I have used 3 layers of indirection for porting:
1. The C-FFI layer
This layer will have the same function names as defined in C, but with the prefix `ccxr_`. These functions will handle the conversion between C-native types and Rust-native types and then pass the values on to the C-like Rust layer. These are the functions defined in the `ccx_rust` crate under the `libccxr_exports` module.
2. The C-like Rust layer
This layer will have the same function names as defined in C but will work with Rust-native types. The code written using these functions would still be procedural but in Rust, hence the name C-like Rust. The entire task of these functions will be to translate the procedural code to idiomatic Rust code. These are the functions defined in the `c_functions.rs` file in the appropriate modules inside the `lib_ccxr` crate.
3. The Idiomatic Rust layer
This layer will be the code written as one is supposed to write idiomatic Rust. It will have complete documentation and tests. This code will be situated in the `lib_ccxr` crate.
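To make the three layers concrete, here is a minimal sketch of how one function could pass through them. Everything in it is hypothetical and chosen only for illustration: the C function `count_non_control`, the helper `non_control_chars`, and the export `ccxr_count_non_control` are not taken from this PR or from `lib_ccxr`. A real export would also need to be marked `#[no_mangle]`; that attribute is left out so the sketch compiles unchanged across Rust editions.

```rust
use std::os::raw::{c_char, c_int};
use std::ffi::CStr;

// Layer 3: idiomatic Rust (hypothetical helper).
// Returns an iterator so callers can compose it with other adapters.
pub fn non_control_chars<'a>(text: &'a str) -> impl Iterator<Item = char> + 'a {
    text.chars().filter(|c| !c.is_control())
}

// Layer 2: C-like Rust (hypothetical).
// Keeps the name of the imagined C function but takes Rust-native types
// and delegates to the idiomatic layer.
pub fn count_non_control(text: &str) -> usize {
    non_control_chars(text).count()
}

// Layer 1: C-FFI (hypothetical).
// Same name with the `ccxr_` prefix; converts C-native types to
// Rust-native types and hands them to the C-like layer.
// Returns -1 for a null pointer or for input that is not valid UTF-8.
//
// Safety: `s` must be null or point to a NUL-terminated string.
pub unsafe extern "C" fn ccxr_count_non_control(s: *const c_char) -> c_int {
    if s.is_null() {
        return -1;
    }
    let c_str = unsafe { CStr::from_ptr(s) };
    match c_str.to_str() {
        // Saturate instead of wrapping if the count exceeds c_int::MAX.
        Ok(text) => count_non_control(text).min(c_int::MAX as usize) as c_int,
        Err(_) => -1,
    }
}
```

With this split, only the FFI layer contains `unsafe` code and C types, so the other two layers can be documented and tested with ordinary doctests.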
@PunitLodha Can you review this PR? I have split the PR into multiple meaningful commits, so that at least each individual commit is easy to review.
Could you make each commit a separate PR? If you can't, that means each change is not correctly isolated from the rest.
We need each commit to be self-contained (at least don't break anything) and be reversible by itself - we don't want to merge a huge PR with lots of commits that must exist together.
(we're still paying for playing fast-and-loose in the past)
CCExtractor CI platform finished running the test files on windows. Below is a summary of the test results:
| Report Name | Tests Passed |
| --- | --- |
| Broken | 13/13 |
| CEA-708 | 2/14 |
| DVB | 2/7 |
| DVD | 3/3 |
| DVR-MS | 2/2 |
| General | 22/27 |
| Hauppage | 3/3 |
| MP4 | 3/3 |
| NoCC | 10/10 |
| Options | 81/87 |
| Teletext | 7/21 |
| WTV | 13/13 |
| XDS | 31/34 |
It seems that not all tests were passed completely. This is an indication that the output of some files is not as expected (but might be according to you).
Your PR breaks these cases:
- ccextractor -autoprogram -out=srt -latin1 f1422b8bfe...
- ccextractor -datapid 5603 -autoprogram -out=srt -latin1 -teletext 85c7fc1ad7...
- ccextractor -autoprogram -out=srt -latin1 85271be4d2...
- ccextractor -autoprogram -out=ttxt -latin1 1974a299f0...
- ccextractor -autoprogram -out=ttxt -latin1 132d7df7e9...
- ccextractor -autoprogram -out=ttxt -latin1 99e5eaafdc...
- ccextractor -autoprogram -out=ttxt -latin1 -ucla 7aad20907e...
- ccextractor -autoprogram -out=ttxt -latin1 -ucla 70000200c0...
- ccextractor -autoprogram -out=ttxt -latin1 c0d2fba8c0...
- ccextractor -autoprogram -out=ttxt -latin1 006fdc391a...
- ccextractor -autoprogram -out=ttxt -latin1 e92a1d4d2a...
- ccextractor -autoprogram -out=ttxt -latin1 7e4ebf7fd7...
- ccextractor -autoprogram -out=ttxt -latin1 9256a60e4b...
- ccextractor -autoprogram -out=ttxt -latin1 27d7a43dd6...
- ccextractor -autoprogram -out=ttxt -latin1 297a44921a...
- ccextractor -autoprogram -out=ttxt -latin1 efbe129086...
- ccextractor -autoprogram -out=ttxt -latin1 eae0077731...
- ccextractor -autoprogram -out=ttxt -latin1 e2e2b501e0...
- ccextractor -autoprogram -out=ttxt -latin1 c6407fb294...
- ccextractor -autoprogram -out=ttxt -latin1 -datets dcada745de...
- ccextractor -autoprogram -out=srt -latin1 -tpage 398 5d5838bde9...
- ccextractor -autoprogram -out=srt -latin1 -teletext -tpage 398 3b276ad8bf...
- ccextractor -autoprogram -out=ttxt -xds -latin1 -ucla e274a73653...
- ccextractor -autoprogram -out=ttxt -latin1 -ucla -xds b22260d065...
- ccextractor -autoprogram -out=ttxt -latin1 -ucla -xds 88cd42b89a...
- ccextractor -svc 1 -out=txt -nobom -noru ea83ff7bcb...
- ccextractor -svc 1 -out=txt f17524b53f...
- ccextractor -svc 1 -out=txt 80848c45f8...
- ccextractor -svc 1 -out=txt -nobom -noru b5d6aad89f...
- ccextractor -svc 1[EUC-KR] -out=txt -noru b5d6aad89f...
- ccextractor -svc 1 -out=srt da904de35d...
- ccextractor -svc 1 -out=sami da904de35d...
- ccextractor -svc 1[EUC-KR] b5d6aad89f...
- ccextractor -svc 1[EUC-KR] -noru b5d6aad89f...
- ccextractor -svc all da904de35d...
- ccextractor -svc all[EUC-KR] b5d6aad89f...
- ccextractor -svc 1,2[UTF-8],3[EUC-KR],54 -out=txt da904de35d...
- ccextractor -svc 1 c83f765c66...
- ccextractor --capfile /repository/Dictionary/MattS_dictionary.txt c83f765c66...
- ccextractor -608 -out=srt c83f765c66...
- ccextractor -708 -out=srt c83f765c66...
- ccextractor -xdsdebug -out=srt c83f765c66...
- ccextractor -in=es dc7169d7c4...
- ccextractor -stdout -quiet -nofc 79a51f3500...
- ccextractor -stdout -quiet -nofc 767b546f96...
Check the result page for more info.
CCExtractor CI platform finished running the test files on linux. Below is a summary of the test results:
| Report Name | Tests Passed |
| --- | --- |
| Broken | 13/13 |
| CEA-708 | 2/14 |
| DVB | 4/7 |
| DVD | 3/3 |
| DVR-MS | 2/2 |
| General | 24/27 |
| Hauppage | 3/3 |
| MP4 | 2/3 |
| NoCC | 10/10 |
| Options | 79/87 |
| Teletext | 21/21 |
| WTV | 1/13 |
| XDS | 26/34 |
It seems that not all tests were passed completely. This is an indication that the output of some files is not as expected (but might be according to you).
Your PR breaks these cases:
- ccextractor -autoprogram -out=srt -latin1 85271be4d2...
- ccextractor -autoprogram -out=ttxt -latin1 1974a299f0...
- ccextractor -autoprogram -out=ttxt -latin1 132d7df7e9...
- ccextractor -autoprogram -out=ttxt -latin1 99e5eaafdc...
- ccextractor -out=srt -latin1 f23a544ba8...
- ccextractor -out=srt -latin1 97cc394d87...
- ccextractor -out=srt -latin1 10f0f77cf4...
- ccextractor -out=srt -latin1 df3b4d62d3...
- ccextractor -out=srt -latin1 d7e7dbdf68...
- ccextractor -out=srt -latin1 76734ac4a7...
- ccextractor -out=srt -latin1 c791382c94...
- ccextractor -out=srt -latin1 f673b2f916...
- ccextractor -out=srt -latin1 da75bdee47...
- ccextractor -out=srt -latin1 bd6f33a669...
- ccextractor -out=srt -latin1 0e5e6b26be...
- ccextractor -out=srt -latin1 a226cc302d...
- ccextractor -autoprogram -out=smptett -latin1 -ucla e274a73653...
- ccextractor -autoprogram -out=ttxt -xds -latin1 -ucla e274a73653...
- ccextractor -autoprogram -out=ttxt -latin1 -ucla -xds b22260d065...
- ccextractor -autoprogram -out=srt -latin1 -ucla b22260d065...
- ccextractor -autoprogram -out=ttxt -latin1 -xds -ucla c813e713a0...
- ccextractor -autoprogram -out=srt -latin1 -ucla c813e713a0...
- ccextractor -autoprogram -out=srt -latin1 -ucla c8dc039a88...
- ccextractor -autoprogram -out=ttxt -latin1 -ucla -xds 88cd42b89a...
- ccextractor -svc 1 -out=txt -nobom -noru ea83ff7bcb...
- ccextractor -svc 1 -out=txt f17524b53f...
- ccextractor -svc 1 -out=txt 80848c45f8...
- ccextractor -svc 1 -out=txt -nobom -noru b5d6aad89f...
- ccextractor -svc 1[EUC-KR] -out=txt -noru b5d6aad89f...
- ccextractor -svc 1 -out=srt da904de35d...
- ccextractor -svc 1 -out=sami da904de35d...
- ccextractor -svc 1[EUC-KR] b5d6aad89f...
- ccextractor -svc 1[EUC-KR] -noru b5d6aad89f...
- ccextractor -svc all da904de35d...
- ccextractor -svc all[EUC-KR] b5d6aad89f...
- ccextractor -svc 1,2[UTF-8],3[EUC-KR],54 -out=txt da904de35d...
- ccextractor -autoprogram -out=srt -latin1 -1 a65d39ccb3...
- ccextractor -svc 1 c83f765c66...
- ccextractor -out=spupng c83f765c66...
- ccextractor --capfile /repository/Dictionary/MattS_dictionary.txt c83f765c66...
- ccextractor -nobi c83f765c66...
- ccextractor -608 -out=srt c83f765c66...
- ccextractor -goppts -out=srt c83f765c66...
- ccextractor -in=es dc7169d7c4...
- ccextractor -autoprogram -out=srt -bom -latin1 8849331dda...
- ccextractor -stdout -quiet -nofc 79a51f3500...
- ccextractor -stdout -quiet -nofc 767b546f96...
Check the result page for more info.
@PunitLodha @cfsmp3 I have tried to split this into multiple PRs: #1551, #1552, #1553, #1554, #1555, #1556, #1557, #1558, #1559, #1560.