devtools icon indicating copy to clipboard operation
devtools copied to clipboard

Do not mark service extensions as 'added' so eagerly

Open srawlins opened this issue 6 months ago • 1 comments

Fixes https://github.com/flutter/devtools/issues/9203

The code here had a race condition. After an app performs a hot restart, we enter _addServiceExtension twice for each service extension that is to be re-added. I'm not sure why twice. But I believe the code at the top of _addServiceExtension:

if (!_serviceExtensions.add(name)) {
  // If the service extension was already added we do not need to add it
  // again. This can happen depending on the timing between when extension
  // added events were received and when we requested the list of all
  // service extensions already defined for the isolate.
  return;
}

is here specifically so that the second call just quietly short circuits. However, https://github.com/flutter/devtools/issues/9203 is the result of the situation where we add to _serviceExtensions too eagerly: in the first call, we add a service extension, say "ext.dart.io.httpEnableTimelineLogging", to _serviceExtensions, and then we find out that _isolateManager.mainIsolate.value is null, and we short-circuit return! When the second call comes in, and _isolateManager.mainIsolate.value is not null, we've already added to _serviceExtensions so we short circuit again.

The fix is to make some of these functions return whether or not they successfully added/restored a service extension, as they all have several conditions that result in a short circuit. If they return true, then we can:

_serviceExtensions.add(name);
_hasServiceExtension(name).value = true;

srawlins avatar Jun 24 '25 21:06 srawlins

CC @bkonyi @kenzieschmoll

srawlins avatar Jun 24 '25 21:06 srawlins

@kenzieschmoll have you seen these bots time out at 10 hours before? Have I likely introduced a regression?

srawlins avatar Jun 30 '25 22:06 srawlins

have you seen these bots time out at 10 hours before? Have I likely introduced a regression?

I have not seen this. Usually things timeout within 20-30 min if they are going to timeout.

kenzieschmoll avatar Jun 30 '25 22:06 kenzieschmoll

have you seen these bots time out at 10 hours before? Have I likely introduced a regression?

I have not seen this. Usually things timeout within 20-30 min if they are going to timeout.

It looks like a couple of the hanging bots have some test failures in the inspector. Can you try to run these tests locally to see if you can reproduce the hang?

kenzieschmoll avatar Jun 30 '25 22:06 kenzieschmoll

It looks like a couple of the hanging bots have some test failures in the inspector. Can you try to run these tests locally to see if you can reproduce the hang?

I only see a sea of green checkboxes. What test failures do you see? Where do I find "in the inspector?"

srawlins avatar Jun 30 '25 23:06 srawlins

I only see a sea of green checkboxes. What test failures do you see? Where do I find "in the inspector?"

Weird I don't see them anymore. devtools/packages/devtools_app/test/screens/inspector/layout_explorer/flex/arrow_test.dart and devtools/packages/devtools_app/test/screens/inspector_v2/inspector_error_navigator_test.dart had some failures in the logs when I looked earlier (although the check box was still green).

If you run flutter test test/screens/inspector/ test/screens/inspector_v2/ locally, that will run all inspector tests. Be sure you are on the same version of Flutter that the bots use dt update-flutter-sdk --update-on-path. If all those pass locally, try running all tests locally flutter test test/, and then perhaps cancel the CI workflows and run them again.

kenzieschmoll avatar Jun 30 '25 23:06 kenzieschmoll

Thanks @kenzieschmoll ! There was a bug in my fix; in _restoreExtensionFromDevice, I had this:

if (!extensions.serviceExtensionsAllowlist.containsKey(name)) {
  return false;
}

because this early return means we are not actually restoring the service extension. But this triggers timeouts in those inspector tests.

Since the bug I'm addressing is specifically about the preparedness of the isolates in the test app, I've flipped this to return true, and I've changed the comments to note what the return value means.

srawlins avatar Jul 01 '25 15:07 srawlins

Fixes #9203

The code here had a race condition. After an app performs a hot restart, we enter _addServiceExtension twice for each service extension that is to be re-added. I'm not sure why twice. But I believe the code at the top of _addServiceExtension:

if (!_serviceExtensions.add(name)) {
  // If the service extension was already added we do not need to add it
  // again. This can happen depending on the timing between when extension
  // added events were received and when we requested the list of all
  // service extensions already defined for the isolate.
  return;
}

is here specifically so that the second call just quietly short circuits. However, #9203 is the result of the situation where we add to _serviceExtensions too eagerly: in the first call, we add a service extension, say "ext.dart.io.httpEnableTimelineLogging", to _serviceExtensions, and then we find out that _isolateManager.mainIsolate.value is null, and we short-circuit return! When the second call comes in, and _isolateManager.mainIsolate.value is not null, we've already added to _serviceExtensions so we short circuit again.

The fix is to make some of these functions return whether or not they successfully added/restored a service extension, as they all have several conditions that result in a short circuit. If they return true, then we can:

_serviceExtensions.add(name);
_hasServiceExtension(name).value = true;

service upload for some files & documents to control our access and controls to detact the android app development the documentary this rules to make the money easily and tech controversy

canvaprolifetime877-bit avatar Nov 10 '25 08:11 canvaprolifetime877-bit