casa icon indicating copy to clipboard operation
casa copied to clipboard

Fix slow factory_bot factory

Open compwron opened this issue 4 years ago • 3 comments

Screen Shot 2021-02-17 at 7 42 17 PM

compwron avatar Feb 18 '21 03:02 compwron

@compwron This looks like a red herring.

I pulled in TestProf to investigate the issue a little bit and altered a test to just create a judge:

it { create(:judge) }

Output:

Slow factory: casa_org takes 0.847061394 seconds using strategy create
Slow factory: judge takes 0.867943986 seconds using strategy create
.How many objects did factory_bot create? (probably too many- let's tune some factories...)
{:casa_org=>{:total=>1, :create=>1}, :judge=>{:total=>1, :create=>1}}


Finished in 0.98594 seconds (files took 0.18834 seconds to load)
1 example, 0 failures

Randomized with seed 55736

[TEST PROF INFO] Factories usage

 Total: 2
 Total top-level: 1
 Total time: 00:00.870 (out of 00:01.241)
 Total uniq factories: 2

   total   top-level     total time      time per call      top-level time               name

       1           1        0.8706s            0.8706s             0.8706s              judge
       1           0        0.8472s            0.8472s             0.0000s           casa_org

OK, so casa_org is actually what's taking the time here, not judge itself. Let's re-focus:

it { create(:casa_org) }
Slow factory: casa_org takes 0.845811442 seconds using strategy create
.How many objects did factory_bot create? (probably too many- let's tune some factories...)
{:casa_org=>{:total=>1, :create=>1}}


Finished in 0.96789 seconds (files took 0.18845 seconds to load)
1 example, 0 failures

Randomized with seed 16713

[TEST PROF INFO] Factories usage

 Total: 1
 Total top-level: 1
 Total time: 00:00.852 (out of 00:01.222)
 Total uniq factories: 1

   total   top-level     total time      time per call      top-level time               name

       1           1        0.8528s            0.8528s             0.8528s           casa_org

Confirmed that it's casa_org. So what's taking time? The factory itself is really simple, with no associations, so what about any hooks?

Only one I can see:

  before_create :set_slug
  # ...
  def set_slug
    self.slug = name.parameterize
  end

Looks fine, but let's try commenting it out anyway:

.How many objects did factory_bot create? (probably too many- let's tune some factories...)
{:casa_org=>{:total=>1, :create=>1}}


Finished in 0.15149 seconds (files took 0.20526 seconds to load)
1 example, 0 failures

Randomized with seed 1050

[TEST PROF INFO] Factories usage

 Total: 1
 Total top-level: 1
 Total time: 00:00.031 (out of 00:00.435)
 Total uniq factories: 1

   total   top-level     total time      time per call      top-level time               name

       1           1        0.0319s            0.0319s             0.0319s           casa_org

So that was it, but why? Well, without going for a deeper dive, it looks like the first call on ActiveSupport::Inflector does some rather expensive setup which isn't the case in further runs.

See:

it do
  ActiveSupport::Inflector.parameterize('hi')
   create(:casa_org)
end
.How many objects did factory_bot create? (probably too many- let's tune some factories...)
{:casa_org=>{:total=>1, :create=>1}}


Finished in 0.9714 seconds (files took 0.19682 seconds to load)
1 example, 0 failures

Randomized with seed 41786

[TEST PROF INFO] Factories usage

 Total: 1
 Total top-level: 1
 Total time: 00:00.029 (out of 00:01.244)
 Total uniq factories: 1

   total   top-level     total time      time per call      top-level time               name

       1           1        0.0293s            0.0293s             0.0293s           casa_org

The actual factory creation time is down to negligible levels, whilst the full test (i.e. including the call to parameterize) accounts for the vast majority.

Whilst the casa_case factory is considerably more complex, it too has the same set_slug method with similar results:

it do
  ActiveSupport::Inflector.parameterize('hi')
  create(:casa_case)
end
Finished in 1.05 seconds (files took 0.1898 seconds to load)
1 example, 0 failures

Randomized with seed 24297

[TEST PROF INFO] Factories usage

 Total: 4
 Total top-level: 1
 Total time: 00:00.084 (out of 00:01.311)
 Total uniq factories: 4

   total   top-level     total time      time per call      top-level time               name

       1           1        0.0842s            0.0842s             0.0842s          casa_case
       1           0        0.0178s            0.0178s             0.0000s           casa_org
       1           0        0.0110s            0.0110s             0.0000s       hearing_type
       1           0        0.0102s            0.0102s             0.0000s              judge

shkm avatar Jan 21 '22 13:01 shkm

Wow, this is a great analysis! :)

We are (barely, slowly) moving to use slugs as URLs instead of incrementing integers, for usability and also to avoid revealing our incrementing integers... so I am not sure how to fix this issue.

I guess it's time to think hard about how to do set_slug faster, or skip some slow part of it in tests, or preload... something. 😅 If you have any ideas, I am very very interested!

compwron avatar Jan 21 '22 15:01 compwron

I don't think this is much of an issue, @compwron. The takeaway, if I'm not incorrect, is that it just makes the numbers coming from factory creation misleading. If that's a problem, we could always do whatever it is the inflector does at the start of our test runs instead.

shkm avatar Jan 21 '22 15:01 shkm

closing as a pain for now, we can come back to this thanks shkm for looking into it!

compwron avatar Apr 07 '23 18:04 compwron