Add option to check R-W pack parity during refine
This implements a `--refine-subchannel-rw` option that enables R-W pack P parity checking.
This provides a small improvement when refining CD+G discs: over 6 refine passes on an old disc (Lou Reed - New York), it went from 627 errored packs (237 CD+G) to 554 (205). Without the R-W check it only got down to 621 (230) over the same 6 passes. (These pack error counts are from redumper-extract-rw, which will correct the errors.)
A pack is interleaved across three sectors, but if there are only one or two errors in a pack it is possible to locate them, so only the affected sectors need to be retried (see the sketch below). These errors could be corrected anyway, but the check lets us get some more use out of refine for discs that use R-W.
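To sketch the idea (the names and the delay table below are placeholders for illustration, not the actual code in this PR):

```cpp
#include <cstdio>

// Hypothetical delay table: for pack symbol i, DELAY[i] says how many sectors
// after the pack's first sector that symbol is stored. The real pattern comes
// from the R-W interleave in the spec; these values are placeholders only.
constexpr int DELAY[24] = {/* spec-defined, spanning 3 sectors */};

// Once the RS decoder has located the bad symbol(s), only the sector(s)
// actually holding them need to go back into the refine queue.
void report_sectors_to_retry(int pack_first_lba, const int *bad_symbols, int n)
{
    for (int i = 0; i < n; ++i)
        std::printf("retry LBA %d\n", pack_first_lba + DELAY[bad_symbols[i]]);
}
```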
That's a lot of code. I will check it out over a weekend.
At a high level, what is the benefit of checking whether an R-W pack is valid (across 3 sectors) vs. the subchannel Q check (across 1 sector)?
I'll explain: if Q is damaged, its CRC doesn't match, i.e. for each sector we check the Q CRC and mark the sector for refine if it's incorrect. Due to the way the subchannel is bit interleaved, a bad Q CRC usually means that part of the R-W pack is also bad. Rereading that sector's subchannel until the Q CRC matches usually fixes that sector's 1/3 of the R-W pack as well.
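For reference, that per-sector check amounts to a CRC-16/CCITT over the first 10 Q bytes compared against the stored complemented CRC. A minimal standalone sketch, not the actual redumper code:

```cpp
#include <cstdint>

// CRC-16/CCITT (x^16 + x^12 + x^5 + 1, init 0) over the first 10 bytes of the
// deinterleaved Q subchannel; the stored 16-bit CRC is recorded complemented.
bool subq_crc_ok(const uint8_t q[12])
{
    uint16_t crc = 0;
    for (int i = 0; i < 10; ++i)
    {
        crc ^= static_cast<uint16_t>(q[i]) << 8;
        for (int bit = 0; bit < 8; ++bit)
            crc = (crc & 0x8000) ? static_cast<uint16_t>((crc << 1) ^ 0x1021)
                                 : static_cast<uint16_t>(crc << 1);
    }
    return crc == static_cast<uint16_t>(~((q[10] << 8) | q[11]));
}
```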
To summarize, there are two ways to check whether the whole subchannel is valid:
1. check the Q CRC of a given sector
2. check the R-W parity of 3 sectors (we'll get to that a bit later; for simplicity I omit the details here)

So far I don't understand why we would do (2), which is much more complicated, if (1) is enough to establish that the subchannel is incorrect.
Another question: CD-TEXT is stored in R-W (in the lead-in). Internally, CD-TEXT is structured into packs and there is the same kind of 16-bit CRC field, for example: https://github.com/superg/redumper/blob/main/cd/toc.ixx#L421
Is there any correlation between this CD-TEXT structure and the R-W pack structure?
Thanks for looking over it!
> Due to the way the subchannel is bit interleaved, a bad Q CRC usually means that part of the R-W pack is also bad. Rereading that sector's subchannel until the Q CRC matches usually fixes that sector's 1/3 of the R-W pack as well.
It's true that rereading for a bad Q might fix R-W, but my goal is to retry sectors with only R-W errors.
From an example dump: the initial dump had Q: 156, R-W: 547 (sectors with errors of each type). A few refine passes later it's down to Q: 98, R-W: 463. That's 58 fewer Q and 84 fewer R-W. Even if each reread for Q corrected one R-W sector (not a given), the other 84-58=26 would not have been retried.
Statistical aside: if your model is that an error replaces a whole byte with another byte at random, there's still a ~1/2 chance of the Q bit being correct when the P-W byte is replaced (127 of the 255 wrong bytes keep Q), while that chance is only 1/85 for R-W (3 of the 255 wrong bytes keep all six R-W bits), suggesting ~40 times as many R-W errors. The odds are a lot better than that, though, since the error is roughly independent for each bit. With error probability p for one bit you expect an error probability of 1-(1-p)^6 for a 6-bit symbol, and (1-(1-p)^6)/p approaches 6 at low error rates (around a reasonable 1/100,000). (The limit is also 6 when computed over a whole 96-byte subcode sector.) In practice I see ~3-5 times more sectors with R-W errors than Q.
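If you want to check those constants, here's the arithmetic from the aside as a standalone program (nothing from the PR itself):

```cpp
#include <cmath>
#include <cstdio>

int main()
{
    // Whole-byte-replacement model: of the 255 wrong values for a subchannel
    // byte, 127 keep the Q bit and 3 keep all six R-W bits.
    std::printf("P(Q correct)   = %f\n", 127.0 / 255.0); // ~0.498, i.e. ~1/2
    std::printf("P(R-W correct) = %f\n", 3.0 / 255.0);   // ~0.0118, i.e. ~1/85

    // Independent-bit model: error probability of a 6-bit symbol vs one bit.
    double p = 1e-5;
    double symbol_err = 1.0 - std::pow(1.0 - p, 6.0);
    std::printf("(1-(1-p)^6)/p  = %f\n", symbol_err / p); // -> 6 as p -> 0
}
```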
> Is there any correlation between this CD-TEXT structure and the R-W pack structure?
I haven't implemented checking CD-Text, but as I understand it CD-Text doesn't use the same R-W pack structure:
There is no interleave or sector crossing; each 24-byte pack of CD-Text is self-contained, and the sector is divided evenly into 4 of them. CD-Text data is bit packed directly into the 6 bits of R-W, so a 24-byte pack holds 24*6/8=18 bytes of CD-Text payload.
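A minimal sketch of that repacking (the function is mine for illustration, not something from redumper):

```cpp
#include <cstdint>

// Repack 24 six-bit R-W symbols (low 6 bits of each byte) into the
// 18 bytes of CD-Text payload they carry: 24*6 = 18*8 = 144 bits.
void repack_6to8(const uint8_t rw[24], uint8_t out[18])
{
    uint32_t acc = 0;
    int bits = 0, o = 0;
    for (int i = 0; i < 24; ++i)
    {
        acc = acc << 6 | (rw[i] & 0x3f); // append the next 6-bit symbol
        bits += 6;
        while (bits >= 8)                // emit every completed byte
        {
            bits -= 8;
            out[o++] = static_cast<uint8_t>(acc >> bits);
        }
    }
}
```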
As you say there is only the 16-bit CRC; this is error detection, not correction. CD-Text handles errors through repetition, since there is so little data: each pack is repeated several times and the drive picks, for each sequence number, a copy with a good CRC. I think this is similar to how the TOC is encoded in Q?
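That picking step could look roughly like this sketch. The layout assumptions (sequence number in header byte 2, complemented CRC-16 over the first 16 bytes) are from my reading of the format, and `pack_crc_ok` is left as a stub:

```cpp
#include <cstdint>
#include <map>
#include <vector>

// Hypothetical 18-byte CD-Text pack: 4 header bytes, 12 text bytes, and a
// complemented CRC-16/CCITT over the first 16 bytes in the last two.
struct CDTextPack { uint8_t bytes[18]; };

bool pack_crc_ok(const CDTextPack &p); // same CRC-16 shape as the Q check above

// Repetition-based recovery: keep, for each sequence number (header byte 2),
// the first repeated copy whose CRC checks out.
std::map<int, CDTextPack> pick_good_packs(const std::vector<CDTextPack> &all)
{
    std::map<int, CDTextPack> good;
    for (const auto &p : all)
        if (pack_crc_ok(p) && !good.count(p.bytes[2]))
            good.emplace(p.bytes[2], p);
    return good;
}
```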
So in theory you could use that to detect R-W errors where CD-Text is present. It's a bit simpler because you don't need to consider the interleave; you only need to reread exactly the sector with the error.
> Even if each reread for Q corrected one R-W sector (not a given), the other 84-58=26 would not have been retried.
Also note that this isn't a case of R-W being corrected due to a reread of a neighboring sector. We're able to mark just one of those sectors bad for R-W if we can locate the error within the pack, and that was the case for all errors in this dump. In practice even 2 errors in a pack is rare on my discs.
> From an example dump: the initial dump had Q: 156, R-W: 547 (sectors with errors of each type). A few refine passes later it's down to Q: 98, R-W: 463. That's 58 fewer Q and 84 fewer R-W. Even if each reread for Q corrected one R-W sector (not a given), the other 84-58=26 would not have been retried.
>
> Statistical aside: if your model is that an error replaces a whole byte with another byte at random, there's still a ~1/2 chance of the Q bit being correct when the P-W byte is replaced (127 of the 255 wrong bytes keep Q), while that chance is only 1/85 for R-W (3 of the 255 wrong bytes keep all six R-W bits), suggesting ~40 times as many R-W errors. The odds are a lot better than that, though, since the error is roughly independent for each bit. With error probability p for one bit you expect an error probability of 1-(1-p)^6 for a 6-bit symbol, and (1-(1-p)^6)/p approaches 6 at low error rates (around a reasonable 1/100,000). (The limit is also 6 when computed over a whole 96-byte subcode sector.) In practice I see ~3-5 times more sectors with R-W errors than Q.
All makes sense, thanks for the clarification.
> There is no interleave or sector crossing; each 24-byte pack of CD-Text is self-contained, and the sector is divided evenly into 4 of them. CD-Text data is bit packed directly into the 6 bits of R-W, so a 24-byte pack holds 24*6/8=18 bytes of CD-Text payload.
I see, so this is yet another structure within R-W.
> As you say there is only the 16-bit CRC; this is error detection, not correction. CD-Text handles errors through repetition, since there is so little data: each pack is repeated several times and the drive picks, for each sequence number, a copy with a good CRC. I think this is similar to how the TOC is encoded in Q?
Exactly, each TOC entry is repeated 3 times, and additionally the entries cycle repeatedly until the pre-gap (LBA -75). I think the drive uses the Q CRC to choose which TOC entries are correct. It makes sense that CD-TEXT is encoded in parallel in the R-W of those same TOC subchannel sectors. We are able to extract the whole raw lead-in subchannel using PLEXTOR, but I actually never looked into R-W there, as we get the CD-TEXT data parsed more easily using the read_toc SCSI command.
Now that I understand it a bit more I will go over the code and get it merged. Sorry this is taking so long; I'm really busy lately and just too tired to read meaningful code at the end of the day, conversations are easier :).
A couple of questions: the GF64 math, did you port that from Rust? I'm asking to make sure I don't taint redumper with an incompatible license, if there is one.
Also thanks for writing the "test"; there is literally no good testing infra here, I just slapped something together quickly to make sure I don't introduce regressions in core components like MSF handling and descrambling.
> Now that I understand it a bit more I will go over the code and get it merged. Sorry this is taking so long; I'm really busy lately and just too tired to read meaningful code at the end of the day, conversations are easier :).
No rush, and thanks a lot whenever you can get to it. I have the CI builds for the rips I wanted to do in the meantime.
> A couple of questions: the GF64 math, did you port that from Rust? I'm asking to make sure I don't taint redumper with an incompatible license, if there is one.
It's not ported. I mostly linked to gf256 because it has a great tutorial explaining the data types and algorithms, especially in the Galois field and Reed-Solomon modules. I found those very helpful; I occasionally also used the Wikipedia articles I linked (though they tend to be too general), and Berlekamp's original "Nonbinary BCH decoding" paper was helpful for Berlekamp-Massey (even if I still don't completely understand the algorithm).
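For anyone skimming the thread, the core field arithmetic is tiny. A minimal GF(64) multiply, assuming the x^6 + x + 1 reducing polynomial the subcode RS codes are usually described with (an assumption here, not a confirmed detail of this PR's code):

```cpp
#include <cstdint>
#include <cstdio>

// Multiplication in GF(2^6), reducing by x^6 + x + 1 (0b1000011).
uint8_t gf64_mul(uint8_t a, uint8_t b)
{
    uint8_t r = 0;
    while (b)
    {
        if (b & 1)
            r ^= a;      // add (XOR) the current shift of a
        b >>= 1;
        a <<= 1;         // multiply a by x
        if (a & 0x40)    // degree 6 reached: reduce
            a ^= 0x43;   // x^6 = x + 1 (mod x^6 + x + 1)
    }
    return r;
}

int main()
{
    // Sanity check: x * x^5 = x^6, which reduces to x + 1 = 3.
    std::printf("%d\n", gf64_mul(2, 0x20)); // prints 3
}
```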
> Also thanks for writing the "test"; there is literally no good testing infra here, I just slapped something together quickly to make sure I don't introduce regressions in core components like MSF handling and descrambling.
Errors are fairly rare, so I didn't have confidence I'd implemented it right until I tested every detectable error location and a few unrecoverable error patterns. The two-error case can take a while to run in a debug build, though.
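The single-error sweep is roughly this shape (`rs_decode_pack` is a stand-in for the PR's decoder, not a real redumper function):

```cpp
#include <array>
#include <cstdint>

using Pack = std::array<uint8_t, 24>; // 24 six-bit symbols

// Stand-in for the PR's RS decoder: corrects the pack in place, reports the
// corrected symbol positions, and returns false if the pack is unrecoverable.
bool rs_decode_pack(Pack &pack, int corrected_positions[2], int &count);

// Every position and every nonzero 6-bit error value must decode back to the
// original pack with the right location reported.
bool test_single_errors(const Pack &good)
{
    for (int pos = 0; pos < 24; ++pos)
        for (uint8_t err = 1; err < 64; ++err)
        {
            Pack bad = good;
            bad[pos] ^= err;
            int where[2], n = 0;
            if (!rs_decode_pack(bad, where, n) || n != 1 || where[0] != pos || bad != good)
                return false;
        }
    return true; // the two-error sweep nests two more loops over pos/err
}
```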
There has been no update here since January, and the dump code has been rewritten since, so this PR requires rework. Feel free to open another PR if you're still interested in getting this in.