wav icon indicating copy to clipboard operation
wav copied to clipboard

Write() only accepts an IntBuffer?

Open asciifaceman opened this issue 5 years ago • 13 comments

Having only a truncated IntBuffer feels extremely limiting, writing wav should accept a Float buffer - otherwise why are you using a lossless format.

asciifaceman avatar May 01 '19 05:05 asciifaceman

I agree that we should support writing a float buffer, at the time I wrote the API that wasn't my main concern since I mainly deal with int PCM data and couldn't find an elegant API. I am more than open to suggestions, as a matter of fact we have/had an issue for that. Your input would be greatly appreciated.

However, I think your argument about lossless vs lossy format is bit off. A lossy format would compress data resulting in signal loss. That's not the same problem as ints vs floats which offer more resolution/dynamic and only when the source is a float PCM.

mattetti avatar May 01 '19 05:05 mattetti

I suppose my argument was that the truncation of the float intrinsically affects the resultant waveform since the rounding effect changes (to a factor of a whole integer vs. a floating sub whole number round) - but poorly worded. Loss and increased noise floor can come from rounding errors without encoding, and floats contain a higher dynamic range inherently (a higher noise floor can be introduced by a bad int conversion).

I've been thinking about this, doing a lot of playing with the azul3d audio engine. Their floating model involves (typically) a PCM range from -1 to 1, however I've seen this broken on the occasional wav file. However, they are silently truncating to int16 and was unfinished / left in a half state.

I suppose an intermediary step would be to do similarly, accept a buffer of any (acceptable) types and then silently convert to int in the Writer to at least allow support to start - then investigate if they can be written raw.

var pi float64 = math.Pi
binary.Write(buf, binary.LittleEndian, pi)

binary should be able to LE float64s too , according to godoc

asciifaceman avatar May 01 '19 06:05 asciifaceman

Yes we also have a transform that converts the format but I agree that we should support f32 wav/aiff files. It's more a matter of API than anything else.

mattetti avatar May 01 '19 06:05 mattetti

Here is a similar conversation with more details about how to convert https://github.com/go-audio/audio/issues/18

mattetti avatar May 01 '19 06:05 mattetti

I'm kind of working on the interim solution in a fork, and there might be a change required to the upstream audio package.

My idea is to default to float64 and a new Interface for Data and using getter/setters so you only have to deal with its actual type once you get to the switch statements for adding buffer. Truncation only has to happen if you leave float64. I can whip up some introductory PRs, starting with the changes to the audio library so you can see what I mean (I realize this would be cascading changes, just want to hopefully show the idea)

asciifaceman avatar May 01 '19 06:05 asciifaceman

One thing to keep in mind is that float64 is extremely rare in audio and it might be annoying/weird to ask to copy a float32 buffer to float64 just because of the API. I'd love to see what you have I in mind, also be careful about performance. If most is cases rely on ints, we want to be careful about not using a much slower data type (especially on devices like the Pi)

mattetti avatar May 01 '19 06:05 mattetti

Hmm, I would wonder if float32 default would be better then.

I would argue that float32 is considered faster for processing audio, which is why most DAWs (including audacity / protools / ableton etc) has migrated to Float PCM data from fixed integers over the years.

Obviously I look forward to testing this, vs appealing to authority.

asciifaceman avatar May 01 '19 06:05 asciifaceman

Also it could mean collapsing the multiple buffers into a single buffer that just accepts Data interface

asciifaceman avatar May 01 '19 06:05 asciifaceman

Realized quickly how uphill of a battle it would be to refactor, so I'm going to instead try to spike an alternative as an example to show you sometime in the near future when it's done

asciifaceman avatar May 01 '19 07:05 asciifaceman

I had made some changes in package aiff but it didn't work well . After encoding I heard lots of noise. Do you have any ideas? https://github.com/chfanghr/aiff/blob/master/encoderf32.go

chfanghr avatar May 04 '19 02:05 chfanghr

I don't see you using the PCMScale transform defined here: https://github.com/go-audio/transforms/blob/master/pcm_scale.go#L27

You have values that are between -1 and +1 as floats and you want to encode them as int based on the bitdepth set on the encoder. You therefore need to convert/transform your buffer to scale it to the int range. That's what this function does.

mattetti avatar May 04 '19 05:05 mattetti

Addendum:

the more I think about it, I still think the interface should be generalized to accept any of the options. While I personally believe float should be the base format the lib operates in, to maintain parity with modern audio software, at least accepting and silently dealing with f32 would be good so the operator could choose what they were working with at least. Changing the underlying structure to f32 would be daunting (I've spent my time reading through and seeing just how wide of an impact that would be and it basically rewrites your entire library)

Conclusion (imho):

Write() should accept a buffer interface, discover type and convert silently as needed - preferably starting from float since it has the highest precision factoring for rounding errors. I opened a PR against go-audio/audio implementing a Type() method on the Buffer interface so that packages such as Wav can discover the type to prevent needless conversions. Then, within wav it would be implemented something like:

// Write encodes and writes the passed buffer to the underlying writer.
// Don't forger to Close() the encoder or the file won't be valid.
func (e *Encoder) Write(buf audio.Buffer) error {
	if !e.wroteHeader {
		if err := e.writeHeader(); err != nil {
			return err
		}
	}

	var intBuf audio.IntBuffer
	// Quietly convert to int
	if buf.Type() != reflect.Int {
		intBuf := buf.AsIntBuffer()
	} else {
		intBuf = buf
	}

	

	if !e.pcmChunkStarted {
		// sound header
		if err := e.AddLE(riff.DataFormatID); err != nil {
			return fmt.Errorf("error encoding sound header %v", err)
		}
		e.pcmChunkStarted = true

		// write a temporary chunksize
		e.pcmChunkSizePos = e.WrittenBytes
		if err := e.AddLE(uint32(42)); err != nil {
			return fmt.Errorf("%v when writing wav data chunk size header", err)
		}
	}

	return e.addBuffer(intBuf)
}

What do you think Matt? I tried to consider the least amount of changes to at least support Write() accepting FloatBuffer without actually changing much - I suppose a first step in the right direction

asciifaceman avatar May 04 '19 06:05 asciifaceman

@mattetti I think you misunderstood what I try to do. According to aiff documentation aiff can store pcm data in float32 . So what I tried to do is to directly store float32 pcm data into a aiff file implemented in encodef32.go as I mentioned above.

chfanghr avatar May 05 '19 01:05 chfanghr

Best regards, can I ask a simple question here? Why is my WAV not processing?

`package main

import ( "fmt" "log" "os"

"github.com/go-audio/audio"
"github.com/go-audio/wav"

)

func main() {

file, err := os.Open("test2.wav")
if err != nil {
	log.Fatal(err)
}
defer file.Close()

wavDecoder := wav.NewDecoder(file)
_, err = file.Seek(0, 0)
if err != nil {
	fmt.Println("Error opening the WAV file:", err)
	return
}

buffer := audio.IntBuffer{}
samples, err := wavDecoder.PCMBuffer(&buffer)
if err != nil {
	fmt.Println(err)
	return
}

floats := make([]float64, len(buffer.Data))
for i := 0; i < len(buffer.Data); i++ {
	floats[i] = float64(buffer.Data[i])
}
fmt.Printf("Is this file valid: %t\n", wavDecoder.IsValidFile())
fmt.Printf("PCM data was accessed: %t\n", wavDecoder.WasPCMAccessed())
fmt.Println(samples)
fmt.Println(len(buffer.Data))
fmt.Println(floats)

} ` output: Is this file valid: true PCM data was accessed: true 0 0 []

segur7g avatar Mar 10 '23 02:03 segur7g

I haven't touched this issue in a long time and it's being misused now so I'm going to close it.

Since I opened this issue go generics have come out with 1.18 which would be a much better way to go about what was being discussed here than anything I came up with back then. I might revisit in the future and if I do it will be in another issue that is more targeted.

asciifaceman avatar Mar 10 '23 03:03 asciifaceman