HCADecoder
Loop start/end position may be wrong (slightly off)
Currently it seems to be:
- loop start sample offset = loop start block index * 0x400
- loop end sample offset = loop end block index * 0x400
The loop header section seems to be interpreted as:
- loop start block index: big endian unsigned 32-bit integer
- loop end block index: big endian unsigned 32-bit integer
- loop cycle count (a value of 128 means infinite): big endian unsigned 16-bit integer
- loop r01 (sorry, but I can't understand what "r01" means): big endian unsigned 16-bit integer
However, according to VGAudio:
https://github.com/Thealexbarney/VGAudio/blob/9d8f6ea04c83cccccb3dd7851a631bbd53a8dbbe/src/VGAudio/Codecs/CriHca/HcaInfo.cs#L35
public int LoopStartSample => LoopStartFrame * 1024 + PreLoopSamples - InsertedSamples;
public int LoopEndSample => (LoopEndFrame + 1) * 1024 - PostLoopSamples - InsertedSamples;
https://github.com/Thealexbarney/VGAudio/blob/9d8f6ea04c83cccccb3dd7851a631bbd53a8dbbe/src/VGAudio/Containers/Hca/HcaReader.cs#L189
private static void ReadLoopChunk(BinaryReader reader, HcaStructure structure)
{
    structure.Hca.Looping = true;
    structure.Hca.LoopStartFrame = reader.ReadInt32();
    structure.Hca.LoopEndFrame = reader.ReadInt32();
    structure.Hca.PreLoopSamples = reader.ReadInt16();
    structure.Hca.PostLoopSamples = reader.ReadInt16();
    structure.Hca.SampleCount = Math.Min(structure.Hca.SampleCount, structure.Hca.LoopEndSample);
}
(Sorry for mistyping something above. Now it should be corrected)
I'm not very sure which interpretation of the loop header is correct. Or maybe both make some sense?
@Thealexbarney
Well, think about it a little. If your first two points are true then no HCA file could have a loop that's not a multiple of 0x400. This would be extremely restrictive and result in funky loop points if they couldn't be multiples of 0x400.
The structure of the loop block is
int LoopStartFrame;
int LoopEndFrame;
short PreLoopSamples;
short PostLoopSamples;
The pre-loop samples are the number of samples in the loop start frame that come before the loop point. The post-loop samples are the number of samples in the loop end frame that come after the loop point.
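To make the layout above concrete, here is a minimal Python sketch (not code from VGAudio or HCADecoder). It assumes the 12 bytes after the loop chunk signature are stored big endian in the order listed, and it applies the two VGAudio formulas quoted earlier. The function names and the example values are made up for illustration.

```python
import struct

SAMPLES_PER_FRAME = 1024  # 0x400, the frame size used in the quoted formulas


def parse_loop_payload(payload: bytes) -> dict:
    """Unpack the loop fields (assumed layout: two int32, two int16, big endian)."""
    start_frame, end_frame, pre, post = struct.unpack(">iihh", payload[:12])
    return {
        "start_frame": start_frame,
        "end_frame": end_frame,
        "pre_loop_samples": pre,
        "post_loop_samples": post,
    }


def loop_sample_range(loop: dict, inserted_samples: int) -> tuple:
    """Apply the LoopStartSample / LoopEndSample formulas quoted from HcaInfo.cs."""
    start = (loop["start_frame"] * SAMPLES_PER_FRAME
             + loop["pre_loop_samples"] - inserted_samples)
    end = ((loop["end_frame"] + 1) * SAMPLES_PER_FRAME
           - loop["post_loop_samples"] - inserted_samples)
    return start, end


# Hypothetical values, packed only to exercise the parser.
payload = struct.pack(">iihh", 3, 10, 128, 64)
loop = parse_loop_payload(payload)
print(loop_sample_range(loop, inserted_samples=128))  # (3072, 11072)
```

With these example values the block-index-only reading (index * 0x400) would give a loop end of 10240, which is why the two interpretations disagree.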
Or maybe both make some sense? Nah, that decoder's interpretation of those values in the structure is completely wrong.
My decoder/encoder should be completely correct and has been thoroughly tested against CRI's decoders/encoders that are available.
If your first two points are true
Well, actually it's not "my" point - I didn't know HCA at all until I came across https://github.com/y2361547758/hca.js, which is a TypeScript port of this project (https://github.com/Nyagamon/HCADecoder). Even now I still have no idea how the stock/official HCA decoder works (which would require reverse engineering).
no HCA file could have a loop that's not a multiple of 0x400. This would be extremely restrictive and result in funky loop points if they couldn't be multiples of 0x400.
That's actually exactly what I had thought of. However, I was unable to imagine where the more accurate loop start/end positions could be stored until I came across your project (https://github.com/Thealexbarney/VGAudio).
I think Nyagamon's interpretation of loop header may make some sense because every (although the number is very few) HCA (which is infinitely looped in game) I have examined seems to have loop.PreLoopSamples == 0x0080. Therefore I guess maybe it makes sense that 0x0080 probably means "loop count is infinite".
By the way, I wonder whether these are signed or unsigned integers? It doesn't seem to make sense to use negative values here.
Even now I still have no idea how the stock/official HCA decoder works (which would require reverse engineering).
I've reverse engineered the HCA encoder/decoder. The implementation in VGAudio is functionally the same as the official one is, producing the exact same data output for both encoding and decoding, so it makes a good reference.
(Note: I replaced the IMDCT implementation CRI uses with a faster one. The only difference in the output from the current master VGAudio build will be due to tiny rounding differences. When using the IMDCT implementation CRI uses the outputs are identical.)
I think Nyagamon's interpretation of loop header may make some sense because every (although the number is very few) HCA (which is infinitely looped in game) I have examined seems to have loop.PreLoopSamples == 0x0080. Therefore I guess maybe it makes sense that 0x0080 probably means "loop count is infinite".
No, that's because of how the encoder works. The encoder inserts a subframe of audio at the start because decoding a subframe requires some of the data from the previous subframe. Then the encoder adds enough samples to align the loop start to the beginning of a frame so the minimum amount of processing is needed to seek to the loop point.
This results in the loop start being one subframe past the start of a frame since the decoder needs data from the previous subframe to decode the next one.
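One way to read the description above, sketched in Python. This is an illustration of the arithmetic only, not the actual encoder code: the function name and the exact padding rule are assumptions. It shows why PreLoopSamples would come out as 0x80 for any loop start under that reading.

```python
SUBFRAME_SAMPLES = 128
FRAME_SAMPLES = 1024


def plan_inserted_samples(loop_start: int) -> tuple:
    """Return (inserted_samples, loop_start_frame, pre_loop_samples).

    Assumed rule: pad so the original loop start lands on a frame boundary,
    then add one more subframe for the decoder's dependency on the previous
    subframe.
    """
    align_pad = -loop_start % FRAME_SAMPLES      # samples needed to reach a frame boundary
    inserted = SUBFRAME_SAMPLES + align_pad      # extra leading subframe plus alignment
    shifted = loop_start + inserted              # loop start position in the encoded stream
    return inserted, shifted // FRAME_SAMPLES, shifted % FRAME_SAMPLES


inserted, frame, pre = plan_inserted_samples(5000)
print(inserted, frame, pre)  # 248 5 128

# Round trip through the LoopStartSample formula quoted from VGAudio.
assert frame * FRAME_SAMPLES + pre - inserted == 5000
```

Under this assumption the pre-loop sample count is always 128, regardless of where the loop starts, which matches the 0x0080 seen in the examined files without it having anything to do with a loop count.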
BTW, be sure to account for the InsertedSamples and AppendedSamples when doing everything. These are empty samples added to the beginning and end of the actual audio because encoding to HCA requires the number of samples to be a multiple of the frame size.
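As a rough illustration of what accounting for them means on the decoding side, here is a small Python sketch. The helper name is hypothetical, and it assumes the decoder has already produced one flat sequence of samples per channel and knows both padding counts.

```python
def trim_padding(decoded, inserted_samples, appended_samples):
    """Drop the padding at both ends of a decoded sample sequence."""
    end = len(decoded) - appended_samples
    return decoded[inserted_samples:end]


# Two frames of dummy data: 128 inserted samples in front, 20 appended at the end.
decoded = list(range(2048))
audio = trim_padding(decoded, inserted_samples=128, appended_samples=20)
print(len(audio), audio[0], audio[-1])  # 1900 128 2027
```

Loop positions computed with the formulas above are already relative to the trimmed audio, since they subtract InsertedSamples.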
By the way, I wonder whether these are signed or unsigned integers? It doesn't seem to make sense to use negative values here.
They're signed, but it doesn't really matter since they won't get anywhere near the limit of the signed types.
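A quick Python check of that point, using the 0x0080 value mentioned earlier in the thread (the variable names are only for illustration): for small values the signed and unsigned readings of the same two bytes agree.

```python
import struct

raw = bytes([0x00, 0x80])                  # big endian 0x0080
signed, = struct.unpack(">h", raw)         # read as int16
unsigned, = struct.unpack(">H", raw)       # read as uint16
print(signed, unsigned)                    # 128 128

# The two readings only diverge once the top bit is set (0x8000 and above).
top_bit = bytes([0x80, 0x00])
print(struct.unpack(">h", top_bit)[0], struct.unpack(">H", top_bit)[0])  # -32768 32768
```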
Thank you very much!
decoding a subframe requires some of the data from the previous subframe.
- Is a subframe 128-sample long (and, a frame consists of 8 subframes)? I once observed this in hex editor but I'm still not sure how long the "influence" would last.
- Is any successive (following) data also needed to decode one frame? I once heard that (I)MDCT has reference to both previous and successive data.
To be honest I know almost nothing about signal processing etc... I feel sorry if my noob questions occupied your time. However these two questions should be the last ones I want to ask.
Again, thanks a lot!
- Is a subframe 128-sample long (and, a frame consists of 8 subframes)?
Yes. This is true for all HCA files.
Side thought: Oops, I just noticed that naming inconsistency
public const int SubframesPerFrame = 8;
public const int SubFrameSamplesBits = 7;
- Is any successive (following) data also needed to decode one frame? I once heard that (I)MDCT has reference to both previous and successive data
Oversimplifying enough to answer your question, decoding subframe (SF) N requires only the encoded data from both SF N-1 and SF N. It doesn't need any data from any other SF, so it doesn't need data from either SF N-2 or SF N+1.
For example, decoding the audio in subframe 4 requires only the encoded data from both SF 3 and SF 4. It doesn't need any data from SF 2, SF 5 or any other SF.
This is why an extra subframe is inserted at the beginning during encoding and thrown out as garbage when decoding.
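The dependency rule can be written down as a small Python sketch. The function is hypothetical and only does index arithmetic, with no actual decoding. Sample positions are taken to be positions in the encoded stream, so they include any inserted samples.

```python
SUBFRAME_SAMPLES = 1 << 7          # 128, from SubFrameSamplesBits = 7
SUBFRAMES_PER_FRAME = 8
FRAME_SAMPLES = SUBFRAME_SAMPLES * SUBFRAMES_PER_FRAME  # 1024


def subframes_needed(first_sample: int, last_sample: int) -> range:
    """Encoded subframe indices needed to decode samples first..last (inclusive)."""
    first_sf = first_sample // SUBFRAME_SAMPLES
    last_sf = last_sample // SUBFRAME_SAMPLES
    # Each subframe also needs the one right before it; nothing comes before SF 0.
    return range(max(first_sf - 1, 0), last_sf + 1)


print(list(subframes_needed(512, 639)))  # samples of SF 4 -> [3, 4]
print(list(subframes_needed(0, 127)))    # SF 0 has no predecessor -> [0]
```

The second call shows the case the inserted subframe exists for: SF 0 has no previous subframe to draw on, so its decoded output is discarded.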