LIB-11: raw (low level) ISRC reads
Most of the implementation for raw reading of CDs to get the Q sub-channels and the ISRCs in them is platform independent.
The only thing that is platform specific is scsi_cmd to issue a direct scsi command.
Raw reads are only optional in the spec (so I would keep other options as backup solution) and we might need to test for the support somehow. Additionally: virtually all programs use the (more high level) "sub-channel command" approach. So that approach is tested and our "raw approch" would need more testing.
We also have to test and probably fix building again since we use new files and have new internal dependencies.
First tests show, that raw reading really fixes LIB-11
Sub-Tasks:
- [x] platform independent part
- [x] Linux
- [x] Windows
- [ ] OS X / Darwin
- [x] Check for Command availability
- [x] Check CRCs
note to testers: There is a discisrc command you can use to test libdiscid without external tools.
For Windows testers there is a binary available: libdiscid-raw-win32.zip
On Linux we used SG_IO, on Windows we probably should use IOCTL_SCSI_PASS_THROUGH_DIRECT.
No clue how it would work for Darwin, but it should work.
Like I said, implementing scsi_cmd for every platform is enough. We could als get MCNs on platforms where we don't have an easier solution. (But there seems to be no data quality problem with MCNs).
I also tried read isrcs using the normal (READ SUB-CHANNEL) method, but trying multiple times.
That actually helped with some of the problems, but not all of the test cases.
Not really sure why github shows this pull-request still having 6 commits. 4 of these are already in master as can also be seen in https://github.com/metabrainz/libdiscid/compare/master...isrc_raw
Found some info about the CRC: The CRC-16 is on CONTROL, ADR and DATA-Q. MSB is "first out". On disc parity bits are inverted, syndrome compared with 0, polynomial: x^16+x^12+x^5+1
That should be enough information to code something, but I didn't start, since I first have to look up how CRC actually works.
Windows code is updated now, so scsi_cmd on Windows can be implemented.
I also added the test suite from master, making breakage more evident.
I implemented CRC checking and also changed to reading each sector separately.
Interestingly both changes enhance the quality considerably. I didn't observe CRC mismatches after switching to separate sector reads and we now read the whole track (worst case) to find an ISRC with a matching CRC. An empty ISRC is "valid" in that context and turns up as an empty string (as it did before).
Reading sectors separately improves the performance considerably, since we stop reading sectors when an ISRC is found. Previously we read 150-279 sectors at once and checked for ISRCs in bulk.
Full reading (raw ISRCs + TOC) now takes ca. 6 seconds, which is not more (sometimes even a second less) than reading ISRCs "non-raw" (tested on Linux, isrc_raw vs. master). So when the raw read is available, we should use it.
Testing for raw read availability (or just falling back to non-raw if it fails) is still TODO. Implementing scs_cmd for Windows and possibly Mac is also TODO.
This is a follow up to my comment at https://github.com/metabrainz/libdiscid/pull/38#issuecomment-22335115, but as my findings are not Windows specific I continue the discussion here.
I have tested the CD drive, which was unable to read the ISRCs on Windows, on Linux (with the same disc). It shows the same behavior: Long reads, no ISRCs, no error messages. It looks like the calls SCSI all succeed, but it does not return any ISRCs (probably it doesn't return any subchannel data).
This shows a second problem, which I verified with a CD containing no ISRCs: Currently this implementation reads all sectors of each track (see https://github.com/metabrainz/libdiscid/pull/6/files#L4R292) until it finds a ISRC. If there is no ISRC this leads to reading basically all sectors on the disc which results in the long reading time. I think the number of sectors tried for a track should be limited. Is there some rule on how many sectors should hold the ISRC in the Q sub-channel?
Did you try the drive again on Windows after I added the debug out? I should also add this debug out to the main isrc_raw branch. Anyways, that should show if no data is returned or all data is zero.
About releases without ISRCs. I was under the impression the ISRC field is mandatory, but can be empty. Maybe that isn't the case. I'll have to check that.
The spec says there should be one ISRC every 100 sectors (I thought always, but that should be when there are non-zero ISRCs given) so we can just break when we read like 150 sectors without seeing an ISRC.
I can reproduce something like that with one of my discs. Interestingly there are some CRC errors reported, but not enough that this would be for every 100th sector: The last track then does display an error message Warning: Cannot get ISRC code for track 14 and one about a CRC mismatch for all zeros. The other CRC mismatches are all non-zero Oo.
This is the output I get with the drive on Windows for every sector read:
scsi cmd bytes returned: 0
zero data returned by scsi_cmd
So there is no data returned. Do you have any idea already on how to perform some kind of drive feature detection? Or what I could check for my drive?
I've also tested reading the disc without ISRCs with my working drive, this time limiting the sectors read to 150 per track. discisrc finished in about 10 seconds, which is even faster than the old implementation (about 1 minute, I'm happy we implemented the feature selection). It's possible we will miss some ISRCs on some discs, but I suggest we start with a lower number of sectors and increase it when there are problems reported. The discs I tested all gave good results with 150 sectors.
See pull request #39
Unfortunately I didn't read much about scsi command availability detection yet. I remember reading something about that function being part of some MMC suite and that you could check if that suite is available somehow. This possibly works together with the "sense data" returned for scsi commands, which isn't yet implemented on Windows (it works a bit weird there), but on Linux. (Have a look at scsi_cmd there). Googling for "mmc_r10a" should yield a document about multimedia scsi commands. That is the spec the commands are defined in, but I am not sure if that was the spec were I read about feature detection.
However, the tests show that we should be fine just checking for the case when no bytes are returned, which possibly means the command is not supported, but definately means we are not getting ISRCs (out of no data).
Discs without ISRCs should take like 10 seconds now, rather than 5 minutes before. The spec says there should be one every 100 sectors (valid CRC or not) so we stop when there isn't one the first 110 sectors).
I merged the Windows code for raw ISRCs into this branch now.
@phw: You had a drive that could find ISRCs on a disc with the normal method, but not the raw method. Can you retest this now? Relevant changes: When no ISRCs (even invalid ones) are found in the first 110 sectors, then we don't search for more. So there shouldn't be a 5 minute read. I also added some debug output in bc102db0794fd120c94d0600ccfc25e2c26e9097. Hopefull that will give some output indicating that something with the command itself isn't working (wrong return type, wrong amount of data returned, all data is zero).
The code changs look good and work as expected. Now reading discs without ISRCs finishes fast.
But no news for the bad drive, it still finds no ISRCs with the raw method. Did you add anything new to the debug output? I didn't see anything and I still get the same result for every read:
scsi cmd bytes returned: 0
zero data returned by scsi_cmd
Not sure how to debug that further.
@phw: We fall back to normal reads now when the scsi command doing raw ISRCs fails. Please retest.
Currently warnings are printed (to stderr) for every track where raw ISRCs fail, since we don't actually know why it fails. The command can possibly fail for tracks due to weird offsets given in the TOC or other weird things.
The fallback is implemented in mb_scsi_read_track_isrc_raw() and falls back to mb_scsi_read_track_isrc(). So non-scsi isrc functions could be dropped. However, I don't want to do that until we have more testing done.
I also implemented "real" scsi feature detection now.
@phw: If the command for raw read really isn't available, then only one warning is shown and non-raw isrc reads are used. However, from the specs I got the impression that any device supporting CDs should also support that command. So possibly your problem is a different one. A bug in the driver?
Either way, I would like to know what the output is with your drive. The detection, fall-back and check for zero length data returned are implemented for Linux and Windows. Possibly the results are different (if these are driver issues).
That looks good, good work. Here is the debug output for my bad drive (in case you wonder, I have put the feature detection outside the for loop as I noted in the comments):
data requested, but none returned
Warning: could not fetch features
data requested, but none returned
Warning: could not fetch features
Warning: raw ISRCs not available, using ISRCs given by subchannel read
WARNING: can't read subchannel data!
WARNING: can't read subchannel data!
WARNING: can't read subchannel data!
WARNING: can't read subchannel data!
WARNING: can't read subchannel data!
WARNING: can't read subchannel data!
WARNING: can't read subchannel data!
WARNING: can't read subchannel data!
WARNING: can't read subchannel data!
WARNING: can't read subchannel data!
WARNING: can't read subchannel data!
WARNING: can't read subchannel data!
So for that drive it falls through to the old ISRC reading code. As expected my second drive prints no warnings and just works well with the raw SCSI code.
I have not yet tested the bad drive under Linux, I will do that later.
Stalker-X (IRC) has the same warnings on windows:
data requested, but none returned
Warning: could not fetch features
Warning: raw ISRCs not available, using ISRCs given by subchannel read
WARNING: can't read subchannel data!data requested, but none returned
Warning: could not fetch features
WARNING: can't read subchannel data!data requested, but none returned
Warning: could not fetch features
WARNING: can't read subchannel data!data requested, but none returned
Warning: could not fetch features
(using libdiscid-w32-scsi.zip today) (Plextor px-708a and px-716a)
The drive having problems for phw was MATSHITA DVD-RAM UJ892.
I currently think the issues with some drives could be due to alignment problems. I will try to implement gathering and using of the alignment of the adapter.
The same drive that doesn't work for phw on windows, also doesn't work with raw ISRCs on Linux. Scsi commands to work, the read reaw feature is detected, but no ISRCs can be found. No warning is displayed, so the "no data returned" detection either doesn't trigger or doesn't work on Linux.
I should probably check if all returned data is 0 on linux or similar.
Just adding this here: for the most recent MMC specs, google for mmc3r10g (MMC-3), mmc4r05a (MMC-4), mmc5r04 (MMC-5) and mmc6r02g (MMC-6). Later versions mainly add wording for bluray etc, not many actual changes. However, READ SUB-CHANNEL was actually dropped from the standard in MMC-5; it just refers you back to MMC-4. I'm not sure whether this is a case of "it's stable" or "it's only for cd's so just about obsolete", but it may mean that devices are no longer required to support them to comply with the MMC standard.
In my .NET implementation of libdiscid (https://github.com/zastai/MusicBrainz), I switched the Linux implementation to use SG_IO for all requests (i.e. also for the MCN and TOC) since this allowed reusing the MMC-based structures I used on Windows. I tried using SPTI on Windows (so I could share the CDB structures too), but couldn't get it to work; no great loss, and I'll probably try again when the rest of the platforms work. BSD is up next; hopefully that will allow MMC-based operation too (SCIOCCOMMAND seems promising, although it requires write access to the device I think).
Side note 1: BSD (free/open/net) seem to have ioctls for subchannel data too (CDIOCREADSUBCHANNEL). I'll look into them and maybe submit a PR if I get it to work.
Side note 2: libdiscid's device detection for Net/OpenBSD cd devices is suboptimal; it should use getrawpartition() from libutil to find out which letter identifies the "whole disk" partition (in my case, this returns 2 ('c') on openbsd and 3 ('d') on netbsd) and then enumerate /dev/rcdnx based on that (with n between 0 and some maximum; 9 is probably ok). Systems with more than 2 cd devices are probably relatively rare (and in fact require explicitly running MAKEDEV to create the nodes for cd2 and up), but it would be good to support them. When I get the C# suff working, I'll look at updating libdiscid and submitting a PR.