extremely slow build

Open shreyas1599 opened this issue 3 years ago • 18 comments

Running flutter pub run build_runner build --delete-conflicting-outputs takes at least 15-20 minutes for every change in a GraphQL file. I tried modifying build.yaml and adding generate_for, but it does not seem to make any difference.

      ferry_generator|req_builder:
        generate_for:
          - lib/graphql/*.*

I assumed using generate_for will skip running on files not within the specified directory but I still get the following:

[WARNING] No actions completed for 42.2s, waiting on: <files_not_in_generate_for>

Is there any field in options that I can use to run build_runner only for a specific directory or any recommendations to speed up the build? Here's the entire build.yaml.

Thanks a lot!

shreyas1599 avatar Jan 28 '21 09:01 shreyas1599

> Running flutter pub run build_runner build --delete-conflicting-outputs at least 15-20 minutes for every change in a graphql file.

Wow. I thought my build was taking too long at ~3 minutes.

I believe there are two factors contributing to the slow build times:

  1. built_value generates a lot of boilerplate (most of which is generated by the built_value_generator)
  2. build_runner isn't great about caching / selectively rebuilding only what has changed

With respect to "1", if and when Dart gets records, we'd likely be able to eliminate built_value altogether. However, that probably won't happen very soon.

"2" is really beyond the scope of ferry. There may already be some solutions that ferry users can use to improve this, but I haven't spent much time investigating it. If you find that generate_for or any other configuration enables you to reduce your build time, we should definitely add a note to the docs.

smkhalsa avatar Jan 28 '21 11:01 smkhalsa

Ah, I see. Thanks for your reply. Another problem with the 15 minutes is that my computer hangs, and if I try to do anything else while the build is running (even with just one tab open in the browser), the build takes a few minutes more. Maybe my computer is slow, but I find it bearable with two emulators open and running, whereas it is unusable while I run the build :/

generate_for doesn't seem to be working. I'll try and see if I can find something to speed up the process and comment here. 15 minutes is just too long to get anything done :/

(Sidenote: ferry is great! Thanks for this!)

shreyas1599 avatar Jan 28 '21 11:01 shreyas1599

@shreyas1599 If it helps you, I moved my GraphQL API into a separate Dart project. That way I only have to re-run build_runner for ferry when the actual GraphQL-related code changes, which is not that often (at least for me). It also prevents dev-dependency conflicts with other code generation packages.

I also use this script to put all the generated graphql code in a single library (but look out for naming clashes):

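import 'dart:io';

// Prints an export directive for every generated req/var/data file under
// lib/, so the output can be saved as a single library file.
void main() {
  // Matches generated files such as foo.req.gql.dart or foo.data.gql.dart.
  final suffix = RegExp(r'^.+(req|var|data)\.gql\.dart$');

  // Allow running the script from inside the bin/ directory.
  if (Directory.current.path.endsWith('bin')) {
    Directory.current = Directory.current.parent;
  }

  final files = Directory('lib')
      .listSync(recursive: true)
      .whereType<File>()
      .where((element) => suffix.hasMatch(element.path));

  print('');
  // Export the schema file.
  print("export 'src/schema/schema.schema.gql.dart';");

  for (final file in files) {
    // Drop the leading "lib/" (4 characters) so the path is relative to lib/.
    print("export '${file.path.substring(4)}';");
  }
}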

knaeckeKami avatar Feb 11 '21 11:02 knaeckeKami

I was able to reduce my build time from 15 minutes back to 3 minutes by measuring and then reducing the memory consumption in VSCode.

friebetill avatar Mar 10 '21 14:03 friebetill

With some of the suggestions mentioned here, I was able to get it down to under 7 minutes. Seems manageable. I don't think I can get it down further as my computer seems to have slowed down due to age. Closing this. Thanks for the help.

shreyas1599 avatar Jun 12 '21 07:06 shreyas1599

Hi everyone,

I also use ferry for a project with an 11,000-line schema (still growing). The last generation run took about 10 hours, with almost 100% RAM usage and 2 of 8 cores at 100%. In the end the build failed with a message like "out of memory".

I also tried to run the generation process inside a github runner - same result.

VSCode RAM reducing techniques were applied.

Is there a way to change the generator source code to cope with growing schema files? Ferry is really nice but it wants my RAM and CPU to scale with the schema file size, which is quite a limitation.

If there are ideas on how to solve this problem please let me know. If you could tell me where to start, I'd also be happy to contribute to the ferry project.

All the best, Thomas

Edit 1: Schema on its own takes about 30+ minutes. With about 25 queries and 15 mutations it takes about 4 hours. Versions are:

  • Flutter: 2.10.4
  • ferry_generator: ^0.5.0-dev.10
  • gql_code_builder: ^0.5.1-alpha+1645425888395
  • build_runner: ^2.1.7

Generation always stops with logs trying to blame "built_value_generator:built_value"

Any idea on how we can save RAM in this build process?

TMInnovations avatar Mar 30 '22 17:03 TMInnovations

I have been seeing similar long codegen lengths and posted this issue, and a few things I tried to resolve it.

Edit:

For anyone looking for steps to reproduce my performance measurement approach, try running the following script:

export BUILD_MAX_WORKERS_PER_TASK=1

echo "Clearing old build_runner state"
flutter clean &> /dev/null

echo "Removing old generated files"
rm -rf lib/__generated__ \
       lib/src/graphql/resolvers/__generated__ \
       lib/src/graphql/resolvers/fragments/__generated__

echo "Installing packages"
# Note you need these packages in your dev_dependencies to run build_runner with `--track-performance`
# build_test: any
# build_web_compilers: any
flutter pub get

echo "Running build runner with $BUILD_MAX_WORKERS_PER_TASK parallel workers"
flutter pub run build_runner serve --delete-conflicting-outputs --track-performance

# Open a browser window to http://localhost:8080/$perf
# Note the page will only actually load once build_runner completes

michael-golfi avatar Apr 11 '22 17:04 michael-golfi

There was just a new release of built_value that might help somewhat.

If this is still an issue for you, could you try

built_value_generator: ^8.4.3

and see if this improves things?

knaeckeKami avatar Jan 11 '23 13:01 knaeckeKami

It will be a significant bottleneck for this awesome project. Do we have any plan or alternative?

dehypnosis avatar Aug 04 '23 15:08 dehypnosis

Unfortunately, to me, it seems the bottleneck is either built_value or build_runner with big files in general.

Potentially it could be feasible to move away from built_value and let ferry generate all code, but this would be a big task which I do not have the capacities for right now.

knaeckeKami avatar Aug 04 '23 16:08 knaeckeKami

@knaeckeKami Thank you for your answer.

However, I think the code generation strategy itself can be improved. I want to share something I found while digging into this issue.

In GraphQL, it is common to use a fragment repeatedly, and fragments can include other fragments. Currently, gql_code_builder generates a separate built_value class for every field that has a selectionSet. This makes graph-style composition inefficient: identical models are generated over and over instead of being reused.

For example:

fragment FooFragment on Foo {
   a
   b
   c
}

fragment BarFragment on Bar {
   a
   b
   c
   foo {
      ...FooFragment
   }
}

In this case, every model that uses BarFragment generates a duplicate class for foo { ...FooFragment }. With deeper nesting, the amount of generated code grows exponentially.
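To make the growth concrete, here is a minimal Dart 3 sketch that counts how many nested data classes a selection would need under a simplified reading of the two strategies discussed in this thread. The types Selection, Field, and Spread, and the fragment names F0 to F4, are assumptions made up for this illustration. They are not gql_code_builder's real AST or naming, and each fragment's own data class (generated in both strategies) is not counted.

```dart
// Toy model of selection sets; these classes are illustrative assumptions,
// not gql_code_builder's real AST types.
sealed class Selection {}

class Field extends Selection {
  Field(this.name, [this.children = const []]);
  final String name;
  final List<Selection> children; // empty for scalar fields
}

class Spread extends Selection {
  Spread(this.fragment);
  final String fragment;
}

typedef Fragments = Map<String, List<Selection>>;

/// Every field with a selection set gets its own class, and fragment spreads
/// are expanded in place, so nested classes are generated again at each use.
int countWithoutReuse(List<Selection> selections, Fragments fragments) {
  var count = 0;
  for (final s in selections) {
    switch (s) {
      case Field(:final children) when children.isNotEmpty:
        count += 1 + countWithoutReuse(children, fragments);
      case Field():
        break; // scalar field: no class needed
      case Spread(:final fragment):
        count += countWithoutReuse(fragments[fragment]!, fragments);
    }
  }
  return count;
}

/// Same walk, but a field whose selection set is exactly one fragment spread
/// reuses that fragment's data class instead of getting a new one.
int countWithReuse(List<Selection> selections, Fragments fragments) {
  var count = 0;
  for (final s in selections) {
    switch (s) {
      case Field(children: [Spread()]):
        break; // reuse the fragment's existing data class
      case Field(:final children) when children.isNotEmpty:
        count += 1 + countWithReuse(children, fragments);
      case Field():
        break; // scalar field: no class needed
      case Spread(:final fragment):
        count += countWithReuse(fragments[fragment]!, fragments);
    }
  }
  return count;
}

/// F0 holds only scalars; each higher level nests the previous level twice.
Fragments buildFragments(int depth) => {
      'F0': [Field('a'), Field('b')],
      for (var i = 1; i <= depth; i++)
        'F$i': [
          Field('left', [Spread('F${i - 1}')]),
          Field('right', [Spread('F${i - 1}')]),
        ],
    };

void main() {
  final fragments = buildFragments(4);
  for (var i = 1; i <= 4; i++) {
    final selection = [Spread('F$i')];
    print('F$i: ${countWithoutReuse(selection, fragments)} without reuse, '
        '${countWithReuse(selection, fragments)} with reuse');
  }
}
```

In this toy model the count without reuse roughly doubles with each nesting level (2, 6, 14, 30), while the count with reuse stays at 0 because every nested field is a single spread. A field that selects anything next to the spread still gets its own class in this sketch.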

For example, in my case:

scheme $ wc -l *
      51 query.admin.graphql
     258 query.community.graphql
     565 query.graphql
      67 scalar.dart
      11 schema.dart
    1095 schema.graphql
    2047 total
generated $ wc -l *
     364 query.admin.ast.gql.dart
    1810 query.admin.data.gql.dart
   11845 query.admin.data.gql.g.dart
     292 query.admin.req.gql.dart
    1663 query.admin.req.gql.g.dart
     122 query.admin.var.gql.dart
     643 query.admin.var.gql.g.dart
    3520 query.ast.gql.dart
    1661 query.community.ast.gql.dart
   58155 query.community.data.gql.dart
  348729 query.community.data.gql.g.dart
    1216 query.community.req.gql.dart
    7084 query.community.req.gql.g.dart
     563 query.community.var.gql.dart
    3129 query.community.var.gql.g.dart
   18976 query.data.gql.dart
  112482 query.data.gql.g.dart
    2654 query.req.gql.dart
   15563 query.req.gql.g.dart
    1204 query.var.gql.dart
    6348 query.var.gql.g.dart
    8735 schema.ast.gql.dart
    1613 schema.schema.gql.dart
    9783 schema.schema.gql.g.dart
    2829 serializers.gql.dart
    6097 serializers.gql.g.dart
  627080 total

About 2,000 lines of GraphQL code generate 627,080 lines of Dart code. Even though some of my queries are quite complex, it is hard to call this efficient code generation.

I am working on refactoring part of gql_code_builder: https://github.com/gql-dart/gql/blob/e021a3774f77701a534c427d702df4b7781c4a66/codegen/gql_code_builder/lib/src/operation/data.dart#L120 to reduce redundant code generation.

FYI, I modified the code so that it does not generate a data class for a field whose selectionSet is a single FragmentSpreadNode. The field reuses the already generated ...FragmentData class instead. The result:

generated $ wc -l *
     364 query.admin.ast.gql.dart
     982 query.admin.data.gql.dart
     292 query.admin.req.gql.dart
    1663 query.admin.req.gql.g.dart
     122 query.admin.var.gql.dart
     643 query.admin.var.gql.g.dart
    3520 query.ast.gql.dart
    1661 query.community.ast.gql.dart
    2517 query.community.data.gql.dart
    1216 query.community.req.gql.dart
    7084 query.community.req.gql.g.dart
     563 query.community.var.gql.dart
    3129 query.community.var.gql.g.dart
    4569 query.data.gql.dart
   29187 query.data.gql.g.dart
    2654 query.req.gql.dart
   15563 query.req.gql.g.dart
    1204 query.var.gql.dart
    6348 query.var.gql.g.dart
    8735 schema.ast.gql.dart
    1613 schema.schema.gql.dart
    9783 schema.schema.gql.g.dart
     787 serializers.gql.dart
     927 serializers.gql.g.dart
  105126 total

This reduced the generated code to about 17% of the original (105,126 of 627,080 lines). The build originally took about 15-20 minutes. It now takes only 1-2 minutes.

However, this is not finished yet, and I don't know enough about the GraphQL AST and the gql-dart/built_value/ferry internals to complete it. Could we work on this approach together? I think it might not take a lot of time.

dehypnosis avatar Aug 05 '23 18:08 dehypnosis

Thanks for your analysis. I knew that this caused some duplication but I did not think about the exponential increase in case of heavy nesting.

Indeed, I am open to fixing that.

We need to consider the case where there are additional fields, other than the fragment

   foo {
      ...FooFragment
      additionalField
   }

For now, it's probably enough to just optimize the common case of a single inline fragment spread and leave the rest as is.

If you could share the work you did, it would be appreciated.

Note that I am in the process of moving to a new apartment, so I have a lot going on right now, it might take some time until I respond.

knaeckeKami avatar Aug 05 '23 20:08 knaeckeKami

Good! I am considering calculating a hash of each selection set (field/spread names and types) to create and reuse a unique model. As for nested fields:

  • I can replace all GAny_nested_foo symbols with GFooFragmentData. This seems the most efficient, but it is not backward compatible at the application level.
  • ~~Or I can keep generating the redundant nested field data models, but have them mix in or extend a reusable model.~~
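The hashing idea can be sketched roughly as follows. This is only an illustration under assumptions: the Sel class, the ModelRegistry class, and the class names passed to it are hypothetical, and a canonical string stands in for a real hash. Nothing here reflects how gql_code_builder actually represents or names selections.

```dart
// Hypothetical, simplified selection model; not the gql AST.
class Sel {
  const Sel(this.name, this.type, [this.children = const []]);
  final String name; // field name, or "...FragmentName" for a spread
  final String type; // GraphQL type name of the field
  final List<Sel> children; // nested selection set, empty for scalars
}

/// Builds an order-independent canonical key for a selection set from
/// field/spread names, types, and (recursively) nested selections.
String canonicalKey(List<Sel> selections) {
  final parts = [
    for (final s in selections)
      '${s.name}:${s.type}{${canonicalKey(s.children)}}',
  ]..sort();
  return parts.join(',');
}

/// Hands out one class name per distinct selection-set shape.
class ModelRegistry {
  final _names = <String, String>{};

  /// Returns the name registered for an identical selection set, or
  /// registers [suggestedName] if this shape has not been seen before.
  String classFor(String suggestedName, List<Sel> selections) =>
      _names.putIfAbsent(canonicalKey(selections), () => suggestedName);
}

void main() {
  final registry = ModelRegistry();
  const foo = [Sel('a', 'Int'), Sel('b', 'String')];
  const fooReordered = [Sel('b', 'String'), Sel('a', 'Int')];

  print(registry.classFor('GQueryData_foo', foo));
  // Same shape in a different order: the first name is reused.
  print(registry.classFor('GOtherData_bar_foo', fooReordered));
  // Different shape: a new name is registered.
  print(registry.classFor('GOtherData_baz', const [Sel('a', 'Int')]));
}
```

Treating field order as irrelevant is a design choice of this sketch. A real implementation would also have to decide how aliases, arguments, and directives affect whether two selection sets count as identical.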

I want to hear your idea about this.

I am stuck on getting the design right, and the existing built class creation logic needs some changes to support extending or mixing in the reusable models. I will share any results ASAP. Thank you for considering this, and good luck with your move.

dehypnosis avatar Aug 06 '23 04:08 dehypnosis

Thanks to the contributions of @dehypnosis , the change to reuse classes for selections with only a single inline fragment spread is merged and can be opted in by using ferry_generator 0.8.2-dev.1 and adding

          data_class_config:
            reuse_fragments: true

to the config of ferry_generator|graphql_builder

knaeckeKami avatar Aug 14 '23 13:08 knaeckeKami

@knaeckeKami Thank you for the integration work. I am already testing this feature from your dev release. While using this version in my project, I found a bug in the generated data code for InlineFragments: a type conversion runtime error during deserialization. I have fixed that issue in my forked repository and am still testing it with my actively developed project. I plan to keep testing this feature until the end of this month to make it stable. If you plan a release before then, please let me know so I can open a PR for the fix.

FYI, this is the commit for the fix.


I also have a question about the current implementation of the InlineFragment when extension. It seems that WhenExtension now generates when/maybeWhen methods that require a mapper function for each InlineFragmentSpreadNode, and these can include Interface nodes too.

But an Interface doesn't have a __typename of its own, so I don't think it makes sense to require mapper functions for Interface inline fragment spreads. What do you think about this?

dehypnosis avatar Aug 15 '23 13:08 dehypnosis

> Thank you for the integration work. I am already testing this feature from your dev release. While I was using this version in my project, I found a bug related to generated data codes for InlineFragments, which was a kind of type conversion runtime error during unserialization. And I have fixed that issue in my forked repository, and still testing with my actively developing project. Until the end of this month, I am going to test this feature with my project to make it stable. And If you have any release plan before then, please let me know to make a PR for the fix.

Cool! Feel free to open PRs when you're ready!

> And I have a question about current implementation of InlineFragment when extension. It seems, now WhenExtension generates when/maybeWhen method requiring mapper functions for each InlineFragmentSpreadNodes. And here can be some Interface nodes included too. But Interface doesn't have __typename, so I think It doesn't make sense to require mapper functions for Interface inline fragment spreads. What do you think about this?

Oh yeah, I didn't think about the when extensions yet.

Generally, I think with Dart 3.0, they are not as valuable anymore since we have pattern matching. But still, I would like to keep them to avoid breaking users.

And I only chose to use a switch over the typename to avoid a chain of is-checks, I think now with Dart 3.0 we can rewrite this to use pattern matching instead of typename-checks.
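As a rough illustration of the difference, the sketch below contrasts a switch over a typename string with a Dart 3 switch expression over a sealed type. The classes Hero, Human, and Droid and both describe functions are hypothetical stand-ins written for this example. Real generated ferry classes are built_value classes and are not necessarily sealed, so this shows only the language feature, not ferry's output.

```dart
// Hypothetical stand-ins for generated data classes; real ferry output differs.
sealed class Hero {
  String get typename;
}

class Human implements Hero {
  Human(this.homePlanet);
  final String homePlanet;

  @override
  String get typename => 'Human';
}

class Droid implements Hero {
  Droid(this.primaryFunction);
  final String primaryFunction;

  @override
  String get typename => 'Droid';
}

/// Pre-Dart-3 style: branch on the typename string, then cast manually.
String describeByTypename(Hero hero) {
  switch (hero.typename) {
    case 'Human':
      return 'human from ${(hero as Human).homePlanet}';
    case 'Droid':
      return 'droid built for ${(hero as Droid).primaryFunction}';
    default:
      return 'unknown';
  }
}

/// Dart 3 style: a switch expression over the sealed type. The compiler
/// checks exhaustiveness, and fields are destructured without casts.
String describeByPattern(Hero hero) => switch (hero) {
      Human(:final homePlanet) => 'human from $homePlanet',
      Droid(:final primaryFunction) => 'droid built for $primaryFunction',
    };

void main() {
  final heroes = <Hero>[Human('Tatooine'), Droid('protocol')];
  for (final hero in heroes) {
    print(describeByTypename(hero));
    print(describeByPattern(hero));
  }
}
```

With a sealed hierarchy, adding a new subtype turns a missing case into a compile-time error, whereas the string-based version silently falls through to its default branch.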

knaeckeKami avatar Aug 17 '23 19:08 knaeckeKami

Stumbled across the same issue. Of all the workarounds, here's a very simple yet effective solution that should be included in the docs.

All I did was tweak build.yaml, nothing else:

targets:
  $default:
    builders:
      ferry_generator|graphql_builder:
        enabled: true
        generate_for:
          - "lib/*.graphql"
          - "lib/**/*.graphql"
        options:
          schema: project|lib/schema.graphql
          data_class_config:
            reuse_fragments: true
      ferry_generator|serializer_builder:
        enabled: true
        generate_for: 
          - "lib/*"
          - "lib/**"
        options:
          schema: project|lib/schema.graphql

So changes are that I use reuse_fragments and also limit generated stuff to avoid scanning pub files.

Both changes were necessary together, otherwise, no significant improvement was seen. I was able to divide my compilation time by 5 and I guess it can be even more if you have several fragments.

Masadow avatar Mar 12 '24 20:03 Masadow

There's a new version of built_value_generator which significantly reduces build time. To use it, add

built_value_generator: ^8.9.2 

to your dev_dependencies

knaeckeKami avatar Apr 04 '24 12:04 knaeckeKami