
`plv8` extension creation crashes postgres sometimes

Open kelvich opened this issue 3 years ago • 10 comments

I've managed to reproduce this twice. The rough procedure is as follows:

  1. Create a new staging project
  2. Run `create extension postgis;`
  3. Run `create extension plv8;`

With some probability step 3 will result in a crash.
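
The steps above could be scripted roughly as follows. This is a minimal sketch and not part of the original report: it assumes `psql` is on the PATH and that you have a connection string for a freshly created staging project. The helper names (`psql_command`, `run_repro`) are hypothetical.

```python
import subprocess

# Statements from the steps above, in the order they were run.
STATEMENTS = [
    "create extension postgis;",
    "create extension plv8;",
]


def psql_command(dsn, statement):
    """Build the argv for running a single statement through psql."""
    # -X skips ~/.psqlrc; ON_ERROR_STOP asks psql to stop on the first error.
    return ["psql", "-X", "-v", "ON_ERROR_STOP=1", "-c", statement, dsn]


def run_repro(dsn, runner=subprocess.run):
    """Run each statement in turn.

    Returns the first statement whose psql invocation exits non-zero
    (for example because the backend crashed), or None if all succeed.
    """
    for statement in STATEMENTS:
        result = runner(psql_command(dsn, statement), capture_output=True, text=True)
        if result.returncode != 0:
            return statement
    return None
```

Since the crash is intermittent, a single clean run does not prove much. Running this against several fresh projects gives a rough idea of how often step 3 fails.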

My try 1, late-surf-318322:

https://observer.zenith.tech/d/IJSLHBOnk/neon-logs-compute-nodes?orgId=1&var-environment=staging&var-region=virginia&var-project=compute-node-late-surf-318322&var-search=&var-exclude=substing%20to%20exclude&from=1661776289601&to=1661777410709

My try 2, silent-union-873343:

https://observer.zenith.tech/d/IJSLHBOnk/neon-logs-compute-nodes?orgId=1&var-environment=staging&var-region=virginia&var-project=compute-node-silent-union-873343&var-search=&var-exclude=substing%20to%20exclude&from=1661806532323&to=1661806628863

In both cases it was with the following log messages:

	
2022-08-29 23:56:05	compute-node-silent-union-873343	2022-08-29 20:56:05.502 GMT [14] LOG:  server process (PID 33) was terminated by signal 7: Bus error
2022-08-29 23:56:05	compute-node-silent-union-873343	2022-08-29 20:56:05.502 GMT [14] DETAIL:  Failed process was running: create extension plv8;
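
To check other compute-node logs for the same pattern, something like the following could help. It is only a sketch: it assumes the log lines keep the exact wording shown above, and the function name `find_crashes` is hypothetical.

```python
import re

# Matches the postmaster line reporting a backend killed by a signal.
CRASH_RE = re.compile(
    r"server process \(PID (?P<pid>\d+)\) was terminated by signal "
    r"(?P<signal>\d+): (?P<name>.+)$"
)
# Matches the follow-up DETAIL line naming the statement that was running.
DETAIL_RE = re.compile(r"DETAIL:\s+Failed process was running: (?P<query>.+)$")


def find_crashes(lines):
    """Return one dict per crash, pairing the signal with the failed query."""
    crashes = []
    for line in lines:
        line = line.rstrip()
        crash = CRASH_RE.search(line)
        if crash:
            crashes.append({
                "pid": int(crash["pid"]),
                "signal": int(crash["signal"]),
                "name": crash["name"],
                "query": None,  # filled in by the next DETAIL line, if any
            })
            continue
        detail = DETAIL_RE.search(line)
        if detail and crashes and crashes[-1]["query"] is None:
            crashes[-1]["query"] = detail["query"]
    return crashes
```

Fed the two lines above, this yields a single entry with PID 33, signal 7 (`Bus error`), and the query `create extension plv8;`.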

kelvich avatar Aug 29 '22 21:08 kelvich

My tests indicate that we crash consistently when I execute the following (composite) statement:

CREATE EXTENSION postgis; CREATE EXTENSION plv8;

MMeent avatar Aug 30 '22 12:08 MMeent

I've reproduced it with

$ CREATE EXTENSION plv8;
$ CREATE EXTENSION postgis; -- no error yet
$ CREATE OR REPLACE FUNCTION test ()
>    RETURNS int
>    IMMUTABLE
>    LANGUAGE plv8
>AS $$ return "test".length + 1|0; $$;
<server closed the connection unexpectedly
<        This probably means the server terminated abnormally
<        before or while processing the request.

So it seems to go wrong when creating a new PLv8 function after installing PostGIS (creating PLv8 functions is something PLv8 also does during its installation procedure).

MMeent avatar Aug 30 '22 14:08 MMeent

> I've reproduced it with

Locally? Is there a backtrace?

kelvich avatar Aug 30 '22 15:08 kelvich

I haven't yet been able to get this image running locally, so this was in staging

MMeent avatar Aug 30 '22 15:08 MMeent

And that was with PLv8/3.1.4 too, so it's not something that's fixed with that release

MMeent avatar Aug 30 '22 15:08 MMeent

Did this really get fixed by PR #2366? I think GitHub interpreted "this may or may not fix #2361" as just "fix #2361", and closed this issue automatically :-).

hlinnaka avatar Sep 01 '22 11:09 hlinnaka

Well, we won't see PLv8 crash Postgres anymore, considering that we no longer include it in the final image. But indeed, I haven't yet found the origin of the issue, nor fixed the true underlying problem.

MMeent avatar Sep 01 '22 11:09 MMeent

Are you working on this, @SergeyMelnikov? Were you trying to reproduce this?

stepashka avatar Sep 12 '22 15:09 stepashka

I just set the default k8s limits on staging to a 1:4 ratio (m6i.2xlarge instance) -- 1 CPU and 4 GB of RAM. So we can try enabling plv8 again and check whether it was an OOM or not.

ololobus avatar Sep 19 '22 15:09 ololobus

we'll increase the amount of memory, deploy on staging and check if this still fails

stepashka avatar Sep 19 '22 15:09 stepashka

are we still working on this?

klink avatar Oct 14 '22 15:10 klink

Yes, we found a bug in plv8 itself: https://github.com/plv8/plv8/pull/504

kelvich avatar Oct 14 '22 15:10 kelvich