evalv3: Excessive Slowness or Potential Infinite Loop During Evaluation

Open massive opened this issue 1 year ago • 3 comments

What version of CUE are you using (cue version)?

$ cue version

cue version v0.9.1-0.20240610152000-475f692480d6

go version go1.21.10
      -buildmode exe
       -compiler gc
     CGO_ENABLED 1
          GOARCH arm64
            GOOS darwin
cue.lang.version v0.9.0

Does this issue reproduce with the latest stable release?

No, that outright crashes.

What did you do?

This new issue is related to the issue at https://github.com/cue-lang/cue/issues/3194.

The commit addressing that issue resolved the reported panic. However, the same code still does not work with evalv3: the CUE evaluation is either running extremely slowly or stuck in an infinite loop.

Without evalv3, the code completes in about 3 seconds on an M2 Mac laptop.

________________________________________________________
Executed in    2.56 secs    fish           external
   usr time    4.71 secs   51.00 micros    4.71 secs
   sys time    0.11 secs  460.00 micros    0.11 secs

With evalv3, it does not seem to finish at all, or at least not within a reasonable timeframe:

time CUE_EXPERIMENT=evalv3 cue export tests/fixtures/kafka_topics.cue
^C
________________________________________________________
Executed in  331.93 secs    fish           external
   usr time  344.85 secs   89.00 micros  344.85 secs
   sys time   18.43 secs  704.00 micros   18.42 secs

The issue can be reproduced with the private code supplied to @mvdan during the investigation of #3194. I can also provide a new sample upon request.

What did you expect to see?

JSON output

What did you see instead?

Hanging process

massive avatar Jun 10 '24 17:06 massive

This appears to still occur as of c683420c0797090a1445b453e3318a4570fc4742; I gave up after one minute. Memory usage also seems to slowly grow; it had already grown to 3GiB after one minute.

mvdan avatar Aug 13 '24 09:08 mvdan

Still reproduces with cue version v0.11.0-alpha.1.

ghost avatar Sep 10 '24 11:09 ghost

Still seems to happen as of 860906a7f025d8ea766aa29623c4ce4381740f89. It seems to be some sort of infinite loop at this point, because memory usage is pretty stable - I couldn't notice a continuous increase after two minutes - whereas CPU usage remains at 100% for a single core. Just as before, I gave up after two minutes, because evalv2 still finishes after three seconds.

So, even though the bug remains, it seems like at least the memory usage problem is gone.

mvdan avatar Oct 17 '24 13:10 mvdan

In 0.11.0, running the command no longer runs indefinitely; instead, it panics:

❯ CUE_EXPERIMENT=evalv3 cue eval tests/fixtures/kafka_topics.cue
panic: incDependent: already closed: 0x1400e7e7b00 [recovered]
	panic: incDependent: already closed: 0x1400e7e7b00 [recovered]
	panic: incDependent: already closed: 0x1400e7e7b00 [recovered]
	panic: incDependent: already closed: 0x1400e7e7b00 [recovered]
	panic: incDependent: already closed: 0x1400e7e7b00 [recovered]
	panic: incDependent: already closed: 0x1400e7e7b00 [recovered]
	panic: incDependent: already closed: 0x1400e7e7b00

massive avatar Dec 09 '24 06:12 massive

We are tracking what is likely the same panic at https://github.com/cue-lang/cue/issues/3528.

mvdan avatar Dec 12 '24 14:12 mvdan

We are back in the infinite loop:

❯ cue version
cue version v0.12.0-alpha.1

go version go1.22.10
      -buildmode exe
       -compiler gc
     CGO_ENABLED 1
          GOARCH arm64
            GOOS darwin
cue.lang.version v0.12.0
❯ time CUE_EXPERIMENT=evalv3 cue eval tests/fixtures/kafka_topics.cue
^C
________________________________________________________
Executed in  166.90 secs    fish           external
   usr time  165.31 secs  104.00 micros  165.31 secs
   sys time    3.16 secs  598.00 micros    3.16 secs

massive avatar Dec 19 '24 18:12 massive

The panic, tracked at #3528, should indeed have been fixed - but only on master, a day after your last comment.

The endless evaluation still seems to happen as of e4c4b8e8a15fa5f36e1f0d02379b7b9d8563f506. It seems to be doing something, because my memory usage is climbing very slowly, at about one gigabyte per minute. So, presumably, my machine would run out of memory after a while. I didn't wait longer than a minute or so.

mvdan avatar Dec 23 '24 12:12 mvdan

The endless evaluation with very slow memory usage increase still seems to happen as of bb1be011fa01f799c44d37e6b36861e6bf03cf86.

mvdan avatar Feb 17 '25 23:02 mvdan

@massive the performance issue seems to have been resolved; evalv3 is now significantly faster than evalv2 as of 2c002aef38a46c0911450c7e26787f3d7bab75f7:

$ time CUE_EXPERIMENT=evalv3=0 cue export tests/fixtures/kafka_topics.cue | wc -l
225

real	0m3.162s
user	0m7.005s
sys	0m0.197s
$ time CUE_EXPERIMENT=evalv3=1 cue export tests/fixtures/kafka_topics.cue | wc -l
201

real	0m0.416s
user	0m0.775s
sys	0m0.054s

However, I note that some of the output lines are missing on evalv3, meaning that the exported value is slightly different. That's possibly a bug; @massive would you be okay with me raising that as a new issue with a reduced reproducer, closing this performance issue as resolved?
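To narrow down which fields differ between the two exports, rather than only comparing line counts, the JSON outputs can be compared structurally. The following is a minimal sketch, not part of CUE itself: the helper names `key_paths` and `missing_paths` are hypothetical, and it assumes both exports have been saved as JSON text.

```python
import json


def key_paths(value, prefix=""):
    """Yield a path string for every field and list element in a decoded JSON value."""
    if isinstance(value, dict):
        for key, child in value.items():
            path = f"{prefix}.{key}" if prefix else key
            yield path
            yield from key_paths(child, path)
    elif isinstance(value, list):
        for index, child in enumerate(value):
            path = f"{prefix}[{index}]"
            yield path
            yield from key_paths(child, path)


def missing_paths(old_json, new_json):
    """Return the sorted paths that exist in old_json but not in new_json.

    Hypothetical helper: old_json would be the evalv2 export and
    new_json the evalv3 export, both passed as JSON text.
    """
    old = set(key_paths(json.loads(old_json)))
    new = set(key_paths(json.loads(new_json)))
    return sorted(old - new)
```

For example, after redirecting the output of each `cue export` run above to a file, passing the contents of the evalv3=0 file as `old_json` and the evalv3=1 file as `new_json` would list the paths that are missing under evalv3. Note that this only detects missing fields and elements, not fields whose values changed.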

mvdan avatar Mar 19 '25 16:03 mvdan

Also, would you be able to try evalv3 at master and let me know if it works for you, or if you see issues like the missing fields I show above?

mvdan avatar Mar 19 '25 16:03 mvdan

@mvdan, yes, the slowness seems to be resolved. Given that we have been stuck on 0.4.3 for performance reasons, the project is now failing due to a myriad of other things, such as the change to list concatenation. But the most important point is that we now seem to have a path to upgrade.

I would appreciate it if you created a follow-up ticket for the bug you mention.

massive avatar Mar 19 '25 17:03 massive

Thanks! Will do so, and close this issue once that's done.

mvdan avatar Mar 19 '25 17:03 mvdan

Reduced as https://github.com/cue-lang/cue/issues/3836 :) I will close this issue as resolved then, as the performance seems good now.

mvdan avatar Mar 20 '25 18:03 mvdan

To follow up here, @massive - evalv3 as of cfbeb48088c9674480496969ad85a93602e9778c, having fixed https://github.com/cue-lang/cue/issues/3838, does make your project work faster and correctly now:

$ for n in 0 1; do time CUE_DEBUG=openinline=0 CUE_EXPERIMENT=evalv3=$n cue export tests/fixtures/kafka_topics.cue | wc -l; done
225

real	0m3.074s
user	0m6.942s
sys	0m0.163s
225

real	0m0.883s
user	0m1.658s
sys	0m0.097s

Would you be able to add your project to https://cuelabs.dev/unity/ so that we can continuously test it for regressions from now on?

mvdan avatar Mar 23 '25 11:03 mvdan

@mvdan I appreciate you asking, but our project is private, so I'm afraid I can't add it there.

massive avatar Mar 25 '25 05:03 massive