pd-lua icon indicating copy to clipboard operation
pd-lua copied to clipboard

preserve signal pointers

Open ben-wes opened this issue 1 year ago • 29 comments

closes #62

... not sure if this is the best way to solve the issue - but it was obviously not a good idea to forward the whole t_signal struct pointer to the perform function since those pointers seem to possibly get overwritten there.

this PR adds a new sig_info struct to the object struct which holds the vector pointer and channel count of the signals.

keeping this as a draft for now. it fixes the issue for me - but it might not be a good solution?

ben-wes avatar Oct 17 '24 19:10 ben-wes

obviously, i'll need to put back the compile-time checks for PD_MULTICHANNEL - will do.

ben-wes avatar Oct 17 '24 21:10 ben-wes

put back the compile-time checks for PD_MULTICHANNEL

done. tested with pd0.53-2.

ben-wes avatar Oct 17 '24 21:10 ben-wes

This doesn't work for me. The basic multichannel.pd example (included in pdlua/examples/multichannel) now fails with this error message:

test~.pd_lua: 9: signal_setmultiout: invalid signal pointer. must be called from dsp method

I compiled against vanilla 0.55.1, but I get the same error when compiling and running with purr-data.

Now there might be something wrong with the example, but it's a very simple example and worked without a hitch before.

agraef avatar Oct 17 '24 22:10 agraef

i'll check. thanks for the feedback!

ben-wes avatar Oct 17 '24 22:10 ben-wes

should be fixed. x->sp assignment in dsp was missing after i removed it in a previous attempt - things got botched a bit here. hope it's ok now - but i'll check once more tomorrow.

ben-wes avatar Oct 17 '24 22:10 ben-wes

Ok, thx, looks good to me now. Maybe @timothyschoen can have a look at well. Tim?

agraef avatar Oct 17 '24 23:10 agraef

i'm unsure about the getbytes and freebytes of x->sig_info, i admit. see here for example:

https://github.com/agraef/pd-lua/pull/63/files#diff-be05f819a0338ff6475de5edabb88815ee6956c1944dd154823801493d4b4539R1189-R1193

probably, it doesn't make sense to free the current size in the dsp function - this should be the previous size, right? i'd change this then if that sounds right.

ben-wes avatar Oct 18 '24 00:10 ben-wes

Yes, so you also need to keep track of the allocated size somewhere. AFAICT, that's for Pd's own internal tracking of allocated memory if it was compiled with DEBUGMEM.

Or use calloc/free instead. That's what Pd itself uses internally, but it won't allow Pd to track the memory usage.

agraef avatar Oct 18 '24 04:10 agraef

need to keep track of the allocated size somewhere

done with 67f679f

ben-wes avatar Oct 18 '24 06:10 ben-wes

Works for me. Not sure what to test with the example in #63, though. Did you check that the clone stuff still works?

agraef avatar Oct 18 '24 09:10 agraef

Did you check that the clone stuff still works?

yep. it does behave as expected now for me. what i'm doing in this demo here crashed pd with the previous version if only the second subpatch was connected:

https://github.com/user-attachments/assets/b7657f9c-5cb5-4604-96a1-9a7b7f123f12

ben-wes avatar Oct 18 '24 09:10 ben-wes

Hold the horses. In purr-data, the test-crashd.pd from #63 actually crashes when closing the patch:

free(): invalid pointer
Pd: signal 6

This doesn't happen in vanilla, so I suspect that there's some difference between purr-data's clone and vanilla's. But I don't know how to debug this. :(

agraef avatar Oct 18 '24 09:10 agraef

The patch in your video looks different from the one you posted in #63. Can you please provide it? Does it use any 3rd party external?

agraef avatar Oct 18 '24 09:10 agraef

i think i might have freed that sig_info array in the wrong context (pdlua_free instead of pdlua_object_free). changed with 5027f71

will attach this new test in a minute. hold on ...

ben-wes avatar Oct 18 '24 09:10 ben-wes

this should include the 3 required files (test patch, clone patch and lua object):

ben-wes avatar Oct 18 '24 09:10 ben-wes

Well, test-crashd.pd also crashes in purr-data without your changes, so there's probably something wrong in purr-data's clone.

But your most recent change, moving the deallocation to pdlua_object_free, looks reasonable anyway.

agraef avatar Oct 18 '24 09:10 agraef

Thanks for the updated test and the video, that clarifies things.

However, try as I might, I can't reproduce the crash that you described with Pd 0.55-1 running stock pd-lua 0.12.22.

Do you have a screencast of Pd actually crashing without the changes in this PR?

agraef avatar Oct 18 '24 10:10 agraef

Your new example also seems to work fine for me in purr-data with pd-lua 0.12.22. (Apart from crashing purr-data when I close the patch, which is obviously a separate issue in purr-data's clone; I'll look into this asap.)

agraef avatar Oct 18 '24 10:10 agraef

Do you have a screencast of Pd actually crashing

there you go:

https://github.com/user-attachments/assets/969c556c-21dd-45ab-abd2-1e0f570feced

... it's well possible that my fixes here aren't complete btw. the root problem causing this PR was that (seemingly) when i forwarded just the sp pointer via dsp_add (which is the case in 0.12.22), its data was not always as expected (and presumably caused the crashes) in the perform function.

maybe, the current solution is also not good and i should go back to dsp_addv and forward the signal vectors and a channel counts array?

ben-wes avatar Oct 18 '24 10:10 ben-wes

maybe, the current solution is also not good and i should go back to dsp_addv and forward the signal vectors and a channel counts array?

@agraef : i don't think it'll make a difference in your tests - but anyway: could you give the version in https://github.com/ben-wes/pd-lua/tree/fix/stablesignals_simple a try?

ben-wes avatar Oct 18 '24 12:10 ben-wes

there you go:

Thanks, I can reproduce the crash now.

could you give the version in https://github.com/ben-wes/pd-lua/tree/fix/stablesignals_simple a try?

That branch is broken for me, in both vanilla and purr-data. Specifically, the osci3d~ example suddenly doesn't work any more. Instead, it spits out this strange error message as soon as you turn on dsp:

lua: clock dispatcher: osci3d~.pd_lua: 133: attempt to perform arithmetic on a nil value (local 'y')

The fix/stablesignals branch doesn't do that. But admittedly I didn't really test that branch very thoroughly either so far.

So I think that your PR needs a lot more scrutiny than I can muster right now. I'm away over the weekend, so I'll keep this PR open until I return. I'd recommend that you really put the PR through its paces during this time, testing all existing functionality, including older single-channel, and even non-signal pd-lua code, pdx live-coding etc., anything you can think of. It doesn't matter whether breakage results from the basic multi-channel support or this PR, our aim should be to really get it right this time, ok?

agraef avatar Oct 18 '24 13:10 agraef

Code looks good to me, except I think that we're leaking sig_info when the object is deleted. But other than that, I think this good!

I'll test this in plugdata now.

timothyschoen avatar Oct 18 '24 14:10 timothyschoen

It works inside plugdata too. I've tested with AddressSanitizer enabled, no issues :)

timothyschoen avatar Oct 18 '24 14:10 timothyschoen

I think that we're leaking sig_info when the object is deleted. But other than that, I think this good!

I don't think we do, have a look at pdlua.c:1934.

agraef avatar Oct 18 '24 16:10 agraef

I also found and fixed the purr-data clone issue (which, as I suspected, was unrelated).

Will do some more testing of this PR with the other examples tonight if I find the time.

Ben, what's your take, is this ready to be merged? Or would you prefer to have another go at the simplified version?

agraef avatar Oct 18 '24 16:10 agraef

It works inside plugdata too. I've tested with AddressSanitizer enabled, no issues :)

Good to know. Thanks for testing!

agraef avatar Oct 18 '24 16:10 agraef

would you prefer to have another go at the simplified version?

oh! glad to see how things are moving here and thanks to both of you! ... i made one more force push on the other branch which is just this commit ahead of the one we're discussing here: https://github.com/agraef/pd-lua/commit/cbe4d87928af48bebe4ee279ff5c790ca1105252

... i would prefer that other version for simplicity since that sig_info struct feels a bit overcomplicated for what i wanted to achieve. so if that one looks good to you, i'd add it to this branch here as well for the PR. both versions are fine for me though and might have advantages (maybe this one here is more "readable").

ben-wes avatar Oct 18 '24 17:10 ben-wes

ah ... i didn't test everything yet probably - but i made builds with and without mc (against pd0.53-2 and 0.55-1) and tested with these versions as well. all 4 vanilla tests worked for me here.

ben-wes avatar Oct 18 '24 17:10 ben-wes

i made one more force push on the other branch which is just this commit ahead of the one we're discussing here: https://github.com/agraef/pd-lua/commit/cbe4d87928af48bebe4ee279ff5c790ca1105252

But I already tested this and found it to break osci3d~, see above.

agraef avatar Oct 18 '24 18:10 agraef

But I already tested this and found it to break osci3d~

sorry, I didn't make that clear enough. I reproduced that issue and the force-pushed version includes a fix for this. at least here, osci3d worked then.

ben-wes avatar Oct 18 '24 19:10 ben-wes