Adobe-Runtime-Support icon indicating copy to clipboard operation
Adobe-Runtime-Support copied to clipboard

AIR Runtime on macOS corrupts SecureRTMPChannel

Open busitech opened this issue 1 year ago • 62 comments

The AIR Runtime corrupts SecureRTMPChannel connections on macOS, but not on Windows. Our application makes RemoteObject calls to the server over this channel, which begins to hang forever on macOS after a few transactions. The initial connection is usually successful, and sometimes even a few successful remote calls can be made before the hang occurs.

Compiler: AIR SDK 32.0.0.116 AIR Runtime: 32.0.0.116 to 50.2.1.1 Platform: macOS Ventura 13.3.1

busitech avatar Apr 26 '23 04:04 busitech

@ajwfrost Is there a particular reason this serious bug report has received zero interaction in almost one month's time?

busitech avatar May 23 '23 02:05 busitech

Hi

Apologies for the delay, we did see this one but there have been a lot of other activities going on; there is quite an effort to set up a reproduction for this issue so it's not yet been investigated. If you're able to provide any sample code/SWF file that we could use to run this on a mac and to make it easier to reproduce, it would greatly help.

thanks

Andrew

ajwfrost avatar May 23 '23 05:05 ajwfrost

Hello @ajwfrost,

Thank you very much for your reply. Please help us understand what you have, and what you are missing, so we can help fill in the gaps so you can quickly recreate this.

When a fresh build of the AIR Runtime is compiled, are there regression tests of the RTMP and RTMPS communication protocols that must pass before release? If so, I am wondering, do these tests actually connect out and transfer data as part of the test? In our tests, almost everything fails, which is why I was hoping this would be easy to recreate with only the bare minimum of existing tests that you might already have at your disposal.

I think the fix for the issue will be in the low level RTMPS communication routines of the AIR Runtime, in an area of the code which is specific to macOS, because the application runs perfectly on Windows. RTMP works perfectly on macOS, but RTMPS hangs almost immediately, sometimes not even completing the handshake that bootstraps the encryption by negotiating the cipher suite and exchanging keys. Therefore, the problem must be in the area of encryption and decryption, since RTMPS is just the encrypted version of RTMP.

In order to actually use RTMPS the way we do, you would need a server that speaks the protocol, and the SWC that activates the RTMPS protocol in the AIR Runtime. Do you have a preference in this regard?

In our case, this layer is the Adobe LiveCycle Data Services (LCDS) product, and of course, we only have the binary version. Do you have access to the Adobe LCDS product, and if so, do you use it when regression testing the AIR Runtime?

Adobe LCDS was distributed with sample applications. I haven't tried those yet, but I assume that those will be a good starting point for recreating the issue quickly, if you can use LCDS on your end.

We are ready to assist once we understand what you need to proceed.

Thank you,

Matthew

busitech avatar May 24 '23 04:05 busitech

Hi

In order to actually use RTMPS the way we do, you would need a server that speaks the protocol, and the SWC that activates the RTMPS protocol in the AIR Runtime. Do you have access to the Adobe LCDS product?

This is our main problem... we do not have any instance of LiveCycle or AMS which means that we aren't able to run the RTMP based testing. It's an area we've not really touched and it's not actually widely used now..

I notice that you've mentioned the issue also existed in the 32.0 runtime, which at least makes sense in terms of it not being a regression that we'd inadvertently introduced. Given your descriptions above, probably the best thing is if I can talk to Adobe and see if they can support us with LiveCycle, or if this is also something we could test with AMS then we can talk with Veriskope..

Although it might actually be that the quickest/best option is for us to work directly with you for debugging, if you have the environment and reproduction already set up. Your analysis has highlighted the particular area of code where we'd need to do the digging so this could be an option perhaps? If we sent you a build of the macOS runtime that output a lot of logging for this, we may be able to pinpoint the issue or progressively narrow it down to fix it..?

thanks

Andrew

ajwfrost avatar May 24 '23 09:05 ajwfrost

Yes, we'd be happy to run the debug build here and send you the logs produced.

busitech avatar May 24 '23 09:05 busitech

@ajwfrost I have more information for you. There are actually two bugs. The client side is falling back to RTMPT because of a second Adobe server side bug that prevents RTMPS from connecting. It is the RTMPT fallback strategy which does not work well on the macOS side, while the Windows platform falls back to RTMPT successfully.

busitech avatar Jun 01 '23 07:06 busitech

Okay thanks -> curious as I'm not 100% sure what code there is platform-specific. We'll look into it and try to get you a debug build of the macOS runtime shortly..

ajwfrost avatar Jun 05 '23 08:06 ajwfrost

@ajwfrost It has been close to 6 months since your last message about getting us a debug build. We were looking forward to using your debug build of AIR for macOS to assist you with debugging.

As the macOS operating system continues to advance, the AIR runtime for macOS seems to be getting worse as time goes on. RTMP and RTMPS connections are becoming unstable and practically usable.

Is it intended that these protocols will be maintained?

busitech avatar Nov 29 '23 02:11 busitech

@busitech apologies for that! This one had completely fallen off our radar I'm afraid, do please feel free to chase up on things. From our perspective we would have hoped that these protocols would continue to work as-is, so if there are changes in the OS that are impacting them, then I'd hope we can work around them. The other option we'd be looking at is to pull the RTMP-related functionality out of the private Adobe source code so that it could be implemented in a separate, open source library (and kept up to date with what Veovera are doing).

Anyway -> let me kick this activity off again and try to get you something that outputs more details around this area...

thanks

ajwfrost avatar Nov 29 '23 20:11 ajwfrost

Hi @busitech

Would you be able to download a test version of the AIR runtime from the below link, which has a fair amount of debug trace added around the RTMP/RTMPS functionality. This may not immediately help, but it should show us whereabouts things are going wrong so that we can add some more targeted debugging in the specific area. https://transfer.harman.com/message/3XvxIKiDFG0LJkNN3lp2K0

thanks

ajwfrost avatar Dec 12 '23 15:12 ajwfrost

Hello @ajwfrost

Thank you for sending the test version of the AIR runtime. It produced the debug trace log shown below:

[0:02.937] Starting mac network detector on thread 0x600002358180
[0:21.666] NetConnection - connection request for rtmps://rtmpserver:2443
[0:21.666] Connection: local 0, protocol 4, domain rtmpserver, app name 
[0:21.666] NetConnectionIO -> connect rtmpserver:2443, protocol 4
[0:21.666] Proxy type is best
[0:21.666] Broadcast fpad on port 67
[0:21.666] Net connection succeeded
[0:21.668] NetConnection IO thread 0x6000023c6140 server rtmpserver port 2443 protocol 4
[0:21.668] NetConnection IO - connecting with RTMPS, native 1 (1), try connect 0 (1)
[0:21.668] OSX SSL socket connecting(rtmpserver, 2443, 0)
[0:21.698] Sending SWF hash 0 0x3bdb44f79db0
[0:21.721] OSX SSL socket initating SSL(rtmpserver)
[0:21.743] Sending SWF hash 0 0x3bdb44f79db0
[0:21.785] Sending SWF hash 0 0x3bdb44f79db0
[0:21.826] Sending SWF hash 0 0x3bdb44f79db0
[0:21.827] OSX SSL socket - SSL connection results 8
[0:21.827] NetConnectionIO::DoConnect rtmps protocol, nagle 0, native SSL(1), connect failed.

[0:21.827] SSL result = (null)
[0:21.827] NetConnection IO - connecting with RTMPS, native 1 (1), try connect 1 (1)
[0:21.827] NetConnectionIO::DoConnect rtmps protocol, CONNECT method proxy, tunnel open failed.

[0:21.827] SSL result = (null)
[0:21.827] NetConnection IO - connecting with RTMPS, native 0 (1), try connect 0 (1)
[0:21.869] Sending SWF hash 0 0x3bdb44f79db0
[0:21.909] Sending SWF hash 0 0x3bdb44f79db0
[0:21.952] Sending SWF hash 0 0x3bdb44f79db0
[0:21.993] Sending SWF hash 0 0x3bdb44f79db0
[0:22.031] NetConnection IO connected=0
[0:22.031] Sending SWF hash 0 0x3bdb44f79db0
[0:36.702] NetConnection - connection request for rtmps://rtmpserver:2443
[0:36.702] Connection: local 0, protocol 4, domain rtmpserver, app name 
[0:36.702] NetConnectionIO -> connect rtmpserver:2443, protocol 4
[0:36.702] Proxy type is http
[0:36.702] Broadcast fpad on port 67
[0:36.702] Net connection succeeded
[0:36.702] NetConnection IO thread 0x6000023c7340 server rtmpserver port 2443 protocol 4
[0:36.703] NetConnection IO - connecting with RTMPS, native 0 (1), try connect 0 (1)
[0:36.742] Sending SWF hash 0 0x3bdb44f79db0
[0:36.783] Sending SWF hash 0 0x3bdb44f79db0
[0:36.827] Sending SWF hash 0 0x3bdb44f79db0
[0:36.869] Sending SWF hash 0 0x3bdb44f79db0
[0:36.869] NetConnection IO connected=0
[0:36.909] Sending SWF hash 0 0x3bdb44f79db0
[0:36.911] [FaultEvent fault=[RPC Fault faultString="Send failed" faultCode="Client.Error.MessageSend" faultDetail="Channel.Connect.Failed error undefined url:'rtmps://rtmpserver:2443'"] messageId="89FCA69C-730E-CDF9-0343-8A7A915E0979" type="fault" bubbles=false cancelable=true eventPhase=2]

busitech avatar Dec 21 '23 09:12 busitech

Thanks for that. So it looks like we're first trying to connect using "native" SSL within the RTMP handling code, but that this fails: OSX SSL socket - SSL connection results 8 So then we try to connect via HTTPS, we've not go so much logging in that part but I suspect it's the same issue as caused a problem for the RTMPS.

Result 8 is a self-signed certificate... we did change some handling here a while ago for people to be able to decide for themselves whether to accept self-signed certs. Looking at the code here though, it originally accepted this (within the RTMPS handling) and we have the same override. So actually, with an SSL connection result of 8, we should have accepted that as connected...

Although I think I've also just spotted a separate bit of code that has picked up that response value and causing the socket to shut down...

We can review this part again and see if there's an issue with the connection being killed off due to the self-signed certificate on the server.

thanks

ajwfrost avatar Dec 21 '23 18:12 ajwfrost

@ajwfrost We won't be able to work on the problem this issue was opened to solve until this new problem of not being able to connect at all in 50.2.4.0 is resolved. We opened this issue when 50.2.1.1 was still current, and that version would connect just fine but then the connection would soon go down randomly after a few minutes. We are eager to work on that issue, but this new issue in the later build will need to come first since it prevents connection. I might be able to set up a server with a real certificate and possibly get a head start on that issue, if that allows 50.2.4.0 to connect. If it would be easy for you to send a debug version of 50.2.1.1, that would also work. Thank you!

busitech avatar Dec 21 '23 19:12 busitech

Okay thanks, understood - I did wonder at this because you'd mentioned it was a more sporadic issue and had been around for a while, sorry I'd missed the fact that we'd just made it worse and immediately reproducible!

I'll upload an updated version of the latest runtime that allows the SSL connection, and we can then start delving into why it may be dropping after a while.

thanks

ajwfrost avatar Dec 21 '23 19:12 ajwfrost

Hi again Here's a version that should work with self-signed certificates on the server.. at least on macOS. Looking at it, I think the same thing may now be happening on Windows too, we can correct for that in our next proper release though. thanks https://transfer.harman.com/message/5HdEE9yYMHhk8JaZYpMUSf

ajwfrost avatar Dec 21 '23 22:12 ajwfrost

@ajwfrost It looks like no progress was made with today's version. Nevertheless, thank you for giving it a shot so quickly.

[0:05.424] Starting mac network detector on thread 0x600003c74300
[0:11.003] NetConnection - connection request for rtmps://localhost:2443
[0:11.003] Connection: local 0, protocol 4, domain localhost, app name 
[0:11.003] NetConnectionIO -> connect localhost:2443, protocol 4
[0:11.003] Proxy type is best
[0:11.003] Broadcast fpad on port 67
[0:11.003] Net connection succeeded
[0:11.006] NetConnection IO thread 0x7000020a5000 server localhost port 2443 protocol 4
[0:11.006] NetConnection IO - connecting with RTMPS, native 1 (1), try connect 0 (1)
[0:11.006] OSX SSL socket connecting(localhost, 2443, 0)
[0:11.006] OSX SSL socket initating SSL(localhost)
[0:11.039] Sending SWF hash 0 0x21bf1b101db0
[0:11.081] Sending SWF hash 0 0x21bf1b101db0
[0:11.081] OSX SSL socket - SSL connection results 8
[0:11.081] OSX SSL socket - CLOSING
[0:11.081] TCP connecting via SSL -> 8
[0:11.081] NetConnectionIO::DoConnect rtmps protocol, result 8, nagle 0, native SSL(1), connect succeeded.

[0:11.081] NetConnection IO connected=1
[0:11.081] NetConnection IO - starting read thread 0x700002128000
[0:11.082] NetConnection IO - bytes written = 1, message 0x0
[0:11.121] Sending SWF hash 0 0x21bf1b101db0
[0:26.042] NetConnection - connection request for rtmps://localhost:2443
[0:26.042] Connection: local 0, protocol 4, domain localhost, app name 
[0:26.042] NetConnectionIO -> connect localhost:2443, protocol 4
[0:26.042] Proxy type is http
[0:26.042] Broadcast fpad on port 67
[0:26.042] Net connection succeeded
[0:26.043] NetConnection IO thread 0x7000020a5000 server localhost port 2443 protocol 4
[0:26.043] NetConnection IO - connecting with RTMPS, native 0 (1), try connect 0 (1)
[0:26.080] Sending SWF hash 0 0x21bf1b101db0
[0:26.121] NetConnection IO connected=0
[0:26.121] Sending SWF hash 0 0x21bf1b101db0
[0:26.121] [ChannelFaultEvent faultCode="Channel.Connect.Failed" faultString="error" faultDetail="undefined url:'rtmps://localhost:2443'" channelId="my-rtmps" type="channelFault" bubbles=false cancelable=false eventPhase=2]

busitech avatar Dec 22 '23 01:12 busitech

In the mean time, I have installed a CA-signed certificate on a server, successfully connected, and was able to reproduce another issue, but I'm not sure if it is a new issue, or our original main issue.

What I found easy to reproduce today is an infinite loop and hang when a debugger is connected, such as IntelliJ IDEA, and RemoteObject calls are made. I reproduced this issue when connecting with both RTMPS and RTMP using the AIR Runtime v50.2.4.0. Previously, when using the AIR Runtime v50.2.1.1, we were able to revert to RTMP to avoid our main issue. This makes me wonder if this could be a new issue, or just a really easy way to get the same issue to happen, that now affects both network protocols.

In any event, this one is extremely consistent, I can repro in two minutes.

I hope what is below helps you get closer to the root cause, and we can test further as needed.

After RTMP or RTMPS connection, I see this line in the console being logged continuously, even though no network traffic is occurring (the message implies sending). You will notice how these messages are milliseconds apart:

[0:42.666] Sending SWF hash 0 0x19403743e348
[0:42.708] Sending SWF hash 0 0x19403743e348
[0:42.751] Sending SWF hash 0 0x19403743e348
[0:42.793] Sending SWF hash 0 0x19403743e348
[0:42.836] Sending SWF hash 0 0x19403743e348
[0:42.877] Sending SWF hash 0 0x19403743e348

After one or two successful server RemoteObject calls, all debug messages stop being logged, and CPU utilization goes sky high on the adl process. There is apparently an infinite loop in progress now, inside the runtime. The server has just finished responding to a request from the client, and the client received the message, processed it, and updated the UI. The bug is sensitive to timing, because the same number of tries is not always

The debugging functions are stopped or hung now, and the app is totally malfunctioning.

While in this state, the application is still able to send and receive Command messages (subscribe, unsubscribe) in response to certain UI events, and the server acknowledges, and even one remoting call from every remote object will send, but incoming remote object messages seem to be going into a queue that is no longer processing what is being added to the queue.

When the mouse hovers over the grid control, the background color change will follow the mouse as it usually does, so the UI itself is not frozen. The app seems hung with respect to incoming messages. Any UI events which normally fire more outbound network traffic will do so, and the server even responds, as long as the object that sends the message does not have an outstanding request. Once a given remote object has sent a message that expects a response, the response is received from the server, but since the message never gets processed, that remote object becomes hung too.

The only thing the debugger can do now is kill the app. Pause doesn't work, and of course breakpoints don't work either.

I am having trouble reproducing this if I start adl from the command line, but since that one trace message is constantly being logged, the logging fills up Scout's memory over time, so we don't have a very good shot at using Scout over a long period of time. I turned off some of the unnecessary recording just to get more time, and I'm going to leave a session run overnight and see what happens.

Here are the last trace messages that are seen in the console:

[trace] Sending SWF hash 0 0x3da12ed4c348 <---- 1,000's of these messages above....
[trace] Sending SWF hash 0 0x3da12ed4c348
[trace] Sending SWF hash 0 0x3da12ed4c348
[trace] Sending SWF hash 0 0x3da12ed4c348
[trace] Sending SWF hash 0 0x3da12ed4c348
[trace] Sending SWF hash 0 0x3da12ed4c348
[trace] NetConnection call via data stream 512 bytes
[trace] NetConnection IO - bytes written = 1, message 0x0
[trace] NetConnection IO - bytes written = 1, message 0x0
[trace] NetConnection IO - bytes written = 1, message 0x0
[trace] NetConnection IO - bytes written = 1, message 0x6000017f4980
[trace] NetConnection IO - read 4096 bytes
[trace] NetConnection IO - read 4096 bytes
[trace] NetConnection IO - read 2350 bytes
[trace] NetConnection call via data stream 512 bytes
[trace] NetConnection IO - bytes written = 1, message 0x0
[trace] NetConnection IO - bytes written = 1, message 0x0
[trace] NetConnection IO - bytes written = 1, message 0x60000178f340
[trace] NetConnection IO - read 156 bytes
[trace] NetConnection call via data stream 512 bytes
[trace] NetConnection IO - bytes written = 1, message 0x0
[trace] NetConnection IO - bytes written = 1, message 0x0
[trace] NetConnection call via data stream 512 bytes <----This is the last message that will be logged.

busitech avatar Dec 22 '23 07:12 busitech

Thanks for the quick feedback! An updated version again: https://transfer.harman.com/message/lBH7PdBBpGHoRiiCkYvAim This one should hopefully correct the RTMPS ssl set-up... apologies for not quite getting that right last time!

The "Sending SWF hash" is actually not sending a hash (the '0' value stops this) so I've moved that so it's only reported if we do actually send a hash. This forms part of the 'idle cycle' i.e. the stuff we check within the runtime's main thread loop which also drives the AS3 events and animation cycle, hence it appearing quite a lot!

The issues you're describing sound like something is stuck within the 'read' thread, i.e. we are receiving data, maybe adding it to a queue but not processing that, and maybe then something is in a tight loop...

Would it be possible to enable core dumps on your machine, and then whilst it's hanging, kill the process from a terminal? you should be able to click on "reply" on the above message link to be able to then upload a dump file to return so that we can check the call stacks...

thanks

ajwfrost avatar Dec 22 '23 16:12 ajwfrost

You have RTMPS working with this update, thank you! I have a full core dump saved (9.5GB), and I have sent you the backtrace of all threads, by replying to the message. Please let me know if you need more information.

busitech avatar Dec 23 '23 00:12 busitech

@ajwfrost I believe we have the intermittent issue recreated, and the log messages captured have explained more about what is going on under the covers.

The NetConnection logs an "Address changed" event, but the address being connected to during our test was localhost, and this address never changes.

[57:53.356] NetConnection IO - processing read buffer, message 0x6000009bdc40, remaining 0
[57:53.356] NetConnection IO - message id 17, stream 0
[58:09.221] Address changed (thread %p)    <---- Impossible for localhost !!!
[58:09.221] NetConnection IO - read thread is quitting
[58:09.276] NetConnection - connection request for rtmps://localhost:2443  <---- reconnecting to localhost
[58:09.276] Connection: local 0, protocol 4, domain localhost, app name 

In the log, we also notice that the memory address of the NetConnection IO thread and the address of the next "starting read thread" are always the same values each time the same messages are logged, even though there is also a message indicating that the read thread is quitting. I'm not sure if this is interesting or not, but I thought I'd mention it.

busitech avatar Dec 27 '23 04:12 busitech

Hi

Thanks for checking this, apologies for the delay due to the Christmas/New Year period! With that hint about the "address changed", it seems a little odd, because the socket should only be reset if we're no longer able to pick up the remote connection. The callback comes from the OS itself but we should ignore this as long as we're still active - and there's even code there that checks if it's a local/loopback address, in which case it should definitely then continue!

So, a little more debug needed I'm afraid .. below is an updated version with some additional debugging around this area. https://transfer.harman.com/message/vbUOL53Iup1t2QhmXBg1zh

many thanks

Andrew

ajwfrost avatar Jan 15 '24 16:01 ajwfrost

Andrew, I have your updated debugging information:

NetConnection IO - processing read buffer, message 0x6000020aca00, remaining 0
NetConnection IO - message id 17, stream 0
Address changed (thread %p)
NetConnection IO - NetworkChanged callback
********************* NETWORK CHANGED tSocketIO: a81ab600******************************

** tcpAddress: [::1]:51172 working: 0

Send network change AS3 event
NetConnection IO - read thread is quitting
NetConnection - connection request for rtmps://localhost:2443
Connection: local 0, protocol 4, domain localhost, app name 
NetConnectionIO -> connect localhost:2443, protocol 4
Proxy type is best
Broadcast fpad on port 67
Net connection succeeded
NetConnection IO thread 0x700002136000 server localhost port 2443 protocol 4
NetConnection IO - connecting with RTMPS, native 1 (1), try connect 0 (1)
OSX SSL socket connecting(localhost, 2443, 0)
OSX SSL socket initating SSL(localhost)
OSX SSL socket - SSL connection results 8
TCP connecting via SSL -> 8
NetConnectionIO::DoConnect rtmps protocol, result 8, nagle 0, native SSL(1), connect succeeded.

NetConnection IO connected=1
NetConnection IO - starting read thread 0x7000022bf000
NetConnection IO - bytes written = 1, message 0x0
NetConnection IO - read 3073 bytes

I wanted to clarify that the issue happens to both localhost connections and connections to external addresses, in case there is any doubt about that.

Also, just a reminder that we also shared the stack trace which explains how the runtime crashes when the debugger is attached, which is a separate issue. We have to run the adl command from the command line to test with the later versions of the runtime. (Attaching IDEA causes a hang within a few seconds of running the program.)

Thank you.

busitech avatar Jan 16 '24 01:01 busitech

Thanks for this.

So the bit where it says working: 0 is concerning, but then we're not seeing the subsequent traces that we had put in where it's looking to see whether or not to close the connection based on whether the remote address is still accessible. So I'm wondering if something else is blocking it or disconnecting it .. do you have any AS3 handler for the NetConnection.Connect.NetworkChange event?

We've added a bit more checking of the network situation here and trying to find out exactly why it's dropping out of the loop. And we're not sending out that AS3 event now.

Sorry to ask for another attempt, thanks for your time on this so far! https://transfer.harman.com/message/1i4nhN1oPhxGdKxLPga7qN

The crash with debugger we will look at shortly, that may well just be because of changes we've added in for this debugging session..

thanks

ajwfrost avatar Jan 16 '24 18:01 ajwfrost

Hi @busitech - just wondering if you'd been able to capture some logs of this going wrong? We've done some testing here, the network changed event happens e.g. if we add a VPN connection, but the original network path still shows as valid (i.e. we've not been able to get it to show working: 0)

It may be an IPv4 vs IPv6 thing, but if you have the logs it would be handy, I would hope this is the final one needed! and I was then hoping we could get a fix in for our next release, which I've been postponing slightly in case we can squeeze this in!

thanks

ajwfrost avatar Jan 19 '24 09:01 ajwfrost

Hi @ajwfrost we went to work immediately on the last release you provided, and we are wondering if one of the changes you made could have resolved the issue, because we haven't seen the issue again since installing that last debug version you provided.

busitech avatar Jan 19 '24 23:01 busitech

Interesting! I will double-check the diffs, but my recollection was that the only functional change we made there was to swallow that NetConnection.Connect.NetworkChange event. So if you're able to look if there's a handler added for the NetStatus event to do something there, it may be good.

It would still be useful if we could see the log around that "NETWORK CHANGED" message, so that we can double-check how the IPv6 addresses are being handled - I still can't quite see how we would have ended up with working: 0 in the log, which happens prior to that network change as3 event... and I feel is probably the key to the whole thing!

thanks

ajwfrost avatar Jan 19 '24 23:01 ajwfrost

Not only are we not seeing disconnects in this version of the debug runtime, we also haven't seen the messages you're looking for either, which is why we've had nothing to send your way. The interesting log entries are not showing up now, but we're watching for them nonetheless, and will send one if we see it. You might want to send a debug version with more problems and less changes that are acting like fixes, because it is like we've lost our ability to recreate the issue.

busitech avatar Jan 20 '24 00:01 busitech

Hello @ajwfrost ,

We have some log data for you!

[trace] ********************* NETWORK CHANGED tSocketIO: 3e194c00******************************
[trace] MacNetworkDetector: checking if 0x7fb43e194df8 is a working local interface
[trace] IPv6 - checking address is set 1
[trace] IPv6 - family is 30 (INET 2, INET6 30)
[trace] IP string is [::1]
[trace] Storage family is %d
[trace] Comparing [fd7a:bcc9:1c03:0:0:0:0:2] with [::1] (ipv6)
[trace] Comparing [fe80:0:0:0:c49:f376:b658:2c53] with [::1] (ipv6)
[trace] Returning 0 for 'valid' from 'is working local interface'
[trace] ** tcpAddress: [::1]:64762 working: 0
[trace] DON'T Send network change AS3 event
[trace] Not shutdown and not using HTTP
[trace] Not using RTMFP
[trace] Not RTMFP, we have a remote address, local = 1
[trace] Finished network changed handler
[trace] Address changed (thread %p)
[trace] NetConnection IO - NetworkChanged callback
[trace] Checking remote address of our TCP socket...
[trace] Got a remote address
[trace] Set IP address: loopback 1, link local 0
[trace] ********************* NETWORK CHANGED tSocketIO: 3e194c00******************************
[trace] MacNetworkDetector: checking if 0x7fb43e194df8 is a working local interface
[trace] IPv6 - checking address is set 1
[trace] IPv6 - family is 30 (INET 2, INET6 30)
[trace] IP string is [::1]
[trace] Storage family is %d
[trace] Comparing [fe80:0:0:0:c49:f376:b658:2c53] with [::1] (ipv6)
[trace] Returning 0 for 'valid' from 'is working local interface'
[trace] ** tcpAddress: [::1]:64762 working: 0
[trace] DON'T Send network change AS3 event
[trace] Not shutdown and not using HTTP
[trace] Not using RTMFP
[trace] Not RTMFP, we have a remote address, local = 1
[trace] Finished network changed handler
[trace] Address changed (thread %p)
[trace] NetConnection IO - NetworkChanged callback
[trace] Checking remote address of our TCP socket...
[trace] Got a remote address
[trace] Set IP address: loopback 1, link local 0
[trace] ********************* NETWORK CHANGED tSocketIO: 3e194c00******************************
[trace] MacNetworkDetector: checking if 0x7fb43e194df8 is a working local interface
[trace] IPv6 - checking address is set 1
[trace] IPv6 - family is 30 (INET 2, INET6 30)
[trace] IP string is [::1]
[trace] Storage family is %d
[trace] Comparing [fd93:2ede:9e4f:0:0:0:0:2] with [::1] (ipv6)
[trace] Comparing [fe80:0:0:0:c49:f376:b658:2c53] with [::1] (ipv6)
[trace] Returning 0 for 'valid' from 'is working local interface'
[trace] ** tcpAddress: [::1]:64762 working: 0
[trace] DON'T Send network change AS3 event
[trace] Not shutdown and not using HTTP
[trace] Not using RTMFP
[trace] Not RTMFP, we have a remote address, local = 1
[trace] Finished network changed handler
[trace] Address changed (thread %p)
[trace] NetConnection IO - NetworkChanged callback
[trace] Checking remote address of our TCP socket...
[trace] Got a remote address
[trace] Set IP address: loopback 1, link local 0
[trace] ********************* NETWORK CHANGED tSocketIO: 3e194c00******************************
[trace] MacNetworkDetector: checking if 0x7fb43e194df8 is a working local interface
[trace] IPv6 - checking address is set 1
[trace] IPv6 - family is 30 (INET 2, INET6 30)
[trace] IP string is [::1]
[trace] Storage family is %d
[trace] Comparing [fe80:0:0:0:c49:f376:b658:2c53] with [::1] (ipv6)
[trace] Returning 0 for 'valid' from 'is working local interface'
[trace] ** tcpAddress: [::1]:64762 working: 0
[trace] DON'T Send network change AS3 event
[trace] Not shutdown and not using HTTP
[trace] Not using RTMFP
[trace] Not RTMFP, we have a remote address, local = 1
[trace] Finished network changed handler

busitech avatar Jan 20 '24 06:01 busitech

Good evening @ajwfrost

Here are a couple more with the timestamps enabled. We're unsure why the problem seems to be back.

[1:19.115] NetConnection IO - read 159 bytes
[1:19.115] NetConnection IO - processing read buffer, message 0x6000029c9400, remaining 0
[1:19.115] NetConnection IO - message id 17, stream 0
[1:19.115] Read thread is looping
[2:12.712] Address changed (thread %p)
[2:12.712] NetConnection IO - NetworkChanged callback
[2:12.712] Checking remote address of our TCP socket...
[2:12.712] Got a remote address
[2:12.712] Set IP address: loopback 1, link local 0
[2:12.712] ********************* NETWORK CHANGED tSocketIO: 428faa00******************************

[2:12.712] MacNetworkDetector: checking if 0x7fe8428fabf8 is a working local interface
[2:12.712] IPv6 - checking address is set 1
[2:12.712] IPv6 - family is 30 (INET 2, INET6 30)
[2:12.712] IP string is [::1]
[2:12.712] Storage family is %d
[2:12.712] Comparing [fe80:0:0:0:c49:f376:b658:2c53] with [::1] (ipv6)
[2:12.712] Comparing [fd1f:c3e5:bb49:0:0:0:0:2] with [::1] (ipv6)
[2:12.712] Returning 0 for 'valid' from 'is working local interface'
[2:12.712] ** tcpAddress: [::1]:55554 working: 0

[2:12.712] DON'T Send network change AS3 event
[2:12.712] Not shutdown and not using HTTP
[2:12.712] Not using RTMFP
[2:12.712] Not RTMFP, we have a remote address, local = 1
[2:12.712] Finished network changed handler
[2:42.752] Address changed (thread %p)
[2:42.752] NetConnection IO - NetworkChanged callback
[2:42.752] Checking remote address of our TCP socket...
[2:42.752] Got a remote address
[2:42.752] Set IP address: loopback 1, link local 0
[2:42.752] ********************* NETWORK CHANGED tSocketIO: 428faa00******************************

[2:42.752] MacNetworkDetector: checking if 0x7fe8428fabf8 is a working local interface
[2:42.752] IPv6 - checking address is set 1
[2:42.752] IPv6 - family is 30 (INET 2, INET6 30)
[2:42.752] IP string is [::1]
[2:42.752] Storage family is %d
[2:42.752] Comparing [fe80:0:0:0:c49:f376:b658:2c53] with [::1] (ipv6)
[2:42.752] Returning 0 for 'valid' from 'is working local interface'
[2:42.752] ** tcpAddress: [::1]:55554 working: 0

[2:42.752] DON'T Send network change AS3 event
[2:42.752] Not shutdown and not using HTTP
[2:42.752] Not using RTMFP
[2:42.752] Not RTMFP, we have a remote address, local = 1
[2:42.752] Finished network changed handler

busitech avatar Jan 20 '24 06:01 busitech

Thanks for those! It seems fairly certain that this code isn't working how we would want it to:

[2:12.712] MacNetworkDetector: checking if 0x7fe8428fabf8 is a working local interface
[2:12.712] IPv6 - family is 30 (INET 2, INET6 30)
[2:12.712] IP string is [::1]
[2:12.712] Comparing [fe80:0:0:0:c49:f376:b658:2c53] with [::1] (ipv6)
[2:12.712] Comparing [fd1f:c3e5:bb49:0:0:0:0:2] with [::1] (ipv6)
[2:12.712] Returning 0 for 'valid' from 'is working local interface'
[2:12.712] ** tcpAddress: [::1]:55554 working: 0

This is handling some change in the local network conditions and should be giving us the IP addresses of the available network interfaces, and then checking that the one we were using for our rtmp socket is still avaialble.

The problem seems to be that our existing socket has a local address of ::1.. although potentially, that is correct and it's the list that is missing this value, judging by the result of ifconfig on a mac here..

Anyway - let us check on this aspect, I think we should have enough details to find/fix it now! Many thanks for your help with the debugging :-)

ajwfrost avatar Jan 20 '24 07:01 ajwfrost