pg_squeeze icon indicating copy to clipboard operation
pg_squeeze copied to clipboard

[master branch] pg_squeeze sends database into recovery code

Open ramkly opened this issue 1 year ago • 5 comments

I installed pg_squeeze from the master branch, it caused the database to go into recovery mode when I ran squeeze.squeeze_table. I tried reinstalling it with REL1_6 by following these steps:

Dropped the extension from my database.
Deleted pg_squeeze.so from the PostgreSQL lib folder.
Installed pg_squeeze-REL1_6.
Reinstalled the pg_squeeze extension in my database.

However, the issue persists, and squeeze.squeeze_table still causes the database to go into recovery mode. Did I miss something, or should I consider installing an older version?

ramkly avatar Apr 26 '24 13:04 ramkly

To "send database into recovery mode" probably means "to crash it". How is this issue different from https://github.com/cybertec-postgresql/pg_squeeze/issues/70 ?

Any progress in getting the core dump and/or the backtrace<

ahouska avatar Apr 29 '24 06:04 ahouska

I'm still curious if I can reproduce the problem. Can you please share the definition (CREATE TABLE) of the table which fails to get processed? And also the output of SELECT version(). (Besides the actual server version, it should report the OS and architecture.)

ahouska avatar Apr 30 '24 04:04 ahouska

OS details:

NAME="openSUSE Leap"
VERSION="15.3"
ID="opensuse-leap"
ID_LIKE="suse opensuse"
VERSION_ID="15.3"
PRETTY_NAME="openSUSE Leap 15.3"
ANSI_COLOR="0;32"
CPE_NAME="cpe:/o:opensuse:leap:15.3"
BUG_REPORT_URL="https://bugs.opensuse.org"
HOME_URL="https://www.opensuse.org/" 

db version:

PostgreSQL 14.11 on x86_64-pc-linux-gnu, compiled by gcc (SUSE Linux) 7.5.0, 64-bit

Table structure:

 CREATE TABLE "vehicle"."last_status" (
  "vehicle_id" int8 NOT NULL,
  "status_id" int4,
  "status_updated" timestamptz(6),
  "longitude" float8,
  "latitude" float8,
  "location_updated" timestamptz(6),
  "speed" int2,
  "ignition" int2,
  "terminal_event_id" int8,
  "pnd_online" bool,
  "event_ts" timestamptz(6),
  "position_description_id" int8,
  "odometer" int8,
  "tracker_vehicle_id" int8,
  "idle_start" timestamptz(6),
  "last_valid_gps_ts" timestamptz(6),
  "rpm" int4,
  "third_party_sequence" int8 DEFAULT 0,
  "altitude" int4,
  "gps_fix_type" int2,
  "clock" int4,
  "last_identification_tag_id" varchar,
  "last_tag_ts" timestamptz(6),
  "bearing" int2,
  "last_ign_on_ts" timestamptz(6),
  "last_ign_off_ts" timestamptz(6),
  "terminal_event_type_id" int8,
  "event_description" varchar(20),
  "gps_accuracy" int2,
  "exceptions" int4,
  "input_state" int4,
  "temp1" float8,
  "temp2" float8,
  "temp3" float8,
  "temp4" float8,
  "event_geom" geometry,
  "driver_duty_state" varchar(2),
  "event_ts_last_visible" timestamptz(6),
  "driver_name" varchar,
  "input_state2" int4,
  "dynamic1" int8,
  "dynamic2" int8,
  "dynamic3" int8,
  "dynamic4" int8,
  "road_speed" int2,
  "input_state2_diagnostic_mask" int4,
  "input_state2_diagnostic_mask_counter" int4 DEFAULT 0,
  "oil_temp" int4,
  "water_temp" int4,
  "privacy_identification_tag" bool DEFAULT false,
  "fuel_level" int4,
  "vext" numeric(6,3),
  "vgsm" numeric(6,3),
  "user_id" int8,
  "enter_tile_ts" timestamptz(6),
  "tile_id" int8,
  "fuel_ts" timestamptz(6),
  "fuel_precentage_left" int4,
  "last_valid_latitude" float8,
  "last_valid_longitude" float8,
  "last_valid_position" int8,
  "battery_level" int4,
  "fuel_use" int4,
  "booking_status_id" int8,
  "booking_id" int8,
  "ev_battery_percentage_left" int4,
  "ev_battery_ts" timestamptz(6),
  "ev_charging_status" varchar,
  "ev_charging_status_ts" timestamptz(6),
  "is_charging" bool,
  "charging_start_ts" timestamptz(6),
  "brake_position" int4,
  "recieved_ts" timestamptz(6),
  "original_vehicle_id" int8,
  "input_state3" int4,
  "enter_geofence_ts" timestamptz(6),
  "geofence_id" varchar,
  "input_state2_event_ts" timestamptz(6),
  "client_driver_id" char(1) ,
  "malfuncation_state" int4,
  "terminal_id" int8,
  "brake_pressure" int4,
  "brake_bits" int4,
  "battery_charge" int8,
  CONSTRAINT "terminal_stataus_pkey" PRIMARY KEY ("vehicle_id")
);

After I downgraded to REL1_5, pg_squeeze started working. The only issue I faced with REL1_5 was related to the function squeeze.get_heap_freespace(), which doesn't return a valid amount of table-free space. However, with REL1-6, I encountered the same issue on some other servers. So, I have a plan to downgrade pg_squeeze on all servers.

ramkly avatar Apr 30 '24 07:04 ramkly

One more idea though: can you please share the REL1_6 library (pg_squeeze.so)? I'd like to try test it myself on opensuse 15.3.

And also, please send me the contents of shared_preload_libraries configuration parameter. And if there are other extensions there than pg_squeeze, please tell me their versions.

ahouska avatar May 03 '24 06:05 ahouska

Please find attached zip file. shared_preload is as follow shared_preload_libraries = 'pg_squeeze,pg_cron,pg_stat_statements,plprofiler'

pg_squeeze.zip

ramkly avatar May 09 '24 10:05 ramkly

@ramkly, you have opened the second segmentation fault issue with vague details and no stack trace, which makes fixing the issues practically impossible. If you need this settled, you have to provide a stack trace or a way to reproduce the crash.

You used master build, then switched to release without providing the exact steps taken. This procedure alone might have caused the crash. Like I said, without the required details there is no way to tell.

kovmir avatar Sep 05 '24 12:09 kovmir

Hi again, finally I was able to collect core dump. please find attached screenshot of core dump. please let me know if you need more information. image

ramkly avatar Sep 06 '24 06:09 ramkly

Thanks for the backtrace. Nevertheless, it looks weird because release_task() should not really call palloc().

As I probably said earlier, something might be wrong about your binary. Can you please replace it wit the one attached here? That's what I built for PG 14 on OpenSUSE 15.3. (I had to gzip that because github does not accept files with the .so suffix.) pg_squeeze.so.gz

ahouska avatar Sep 06 '24 07:09 ahouska

Could you please share the source file, I tried to compile REL1_6 but size of pg_squeeze.so you shared is 359K, the one I compiled is 95K. Could you confirm this is the right file I download? wget https://github.com/cybertec-postgresql/pg_squeeze/archive/refs/heads/REL1_6.zip And also could you please share the steps to change the library? I did folowing steps: 1- delete pg_squeeze.so from postgresql lib directory 2- copy the library you shared to postgresql lib directory 3- restart database 4- run alter extension pg_squeeze update Please let me know if I missed anything

ramkly avatar Sep 06 '24 08:09 ramkly

please find attached screenshot of core dump

@ramkly, thank you! Next time type set logging enabled on in GDB prompt to collect the output into a text file gdb.txt.

Could you confirm this is the right file I download? wget https://github.com/cybertec-postgresql/pg_squeeze/archive/refs/heads/REL1_6.zip

No it is not. You have to use a tag, not the latest commit of an arbitrary Git branch.

I did folowing steps: 1- delete pg_squeeze.so from postgresql lib directory 2- copy the library you shared to postgresql lib directory 3- restart database 4- run alter extension pg_squeeze update Please let me know if I missed anything

Right, but did it work?

Could you please share the source file, I tried to compile REL1_6 but size of pg_squeeze.so you shared is 359K, the one I compiled is 95K.

How do you check the size?

kovmir avatar Sep 06 '24 10:09 kovmir

@kovmir the 1.6 head and the last tag should be the same, but I agree that the tag is what should be downloaded.

@ramkly I think my file is bigger because it contains debugging symbols. Does it still crash? If so, what does the backtrace look like?

ahouska avatar Sep 06 '24 15:09 ahouska

I think my file is bigger because it contains debugging symbols.

Yes. On my system -g compilation flag gives 340K library, which becomes 94K if I strip it.

kovmir avatar Sep 06 '24 16:09 kovmir

So Far it worked without crashing but it didn’t help to remove table bloatsOn 6 Sep 2024, at 7:24 PM, ahouska @.***> wrote: @kovmir the 1.6 head and the last tag should be the same, but I agree that the tag is what should be downloaded. @ramkly I think my file is bigger because it contains debugging symbols. Does it still crash? If so, what does the backtrace look like?

—Reply to this email directly, view it on GitHub, or unsubscribe.You are receiving this because you were mentioned.Message ID: @.***>

ramkly avatar Sep 06 '24 17:09 ramkly

So Far it worked without crashing but it didn’t help to remove table bloatsOn

Can you find the related record in the squeeze.log table?

ahouska avatar Sep 07 '24 11:09 ahouska

I found logs in the squeeze.error table showing that the squeeze process is sometimes unsuccessful due to an 'initial slot snapshot too large' error, while at other times, it succeeds. For instance, today I attempted to squeeze the table manually; at 9:13, it failed, but when I tried again 3 minutes later, it was successful. image and image

ramkly avatar Sep 09 '24 05:09 ramkly

  1. If there are no crashes anymore, something was probably wrong with your build, but I have no clue what it was. Please consider getting the extension binary from the openSUSE Build Service, according to https://www.postgresql.org/download/linux/suse/ (The latest OS version I see there is 15.5, so you might need to upgrade).

  2. The "initial slot snapshot too large" error is produced by PostgreSQL core, not by the extension. If the reason is too many running transactions, perhaps try to squeeze the table at a different time, when the system is not that busy.

  3. Regarding the bloat not being removed: how do you check that? Can't the problem be that the table gets bloated immediately after being squeezed? The logs you provided above show that, before 26k rows gets copied, 32k UPDATEs are performed.

ahouska avatar Sep 09 '24 10:09 ahouska

@ramkly, any news?

kovmir avatar Sep 20 '24 10:09 kovmir

Hi guys.After I compiled that version you shared, I didn’t face any issue, everything works well. RegardsAbbasOn 20 Sep 2024, at 2:02 PM, Ivan Kovmir @.***> wrote: @ramkly, any news?

—Reply to this email directly, view it on GitHub, or unsubscribe.You are receiving this because you were mentioned.Message ID: @.***>

ramkly avatar Sep 20 '24 11:09 ramkly

@ramkly, glad to hear it!

kovmir avatar Sep 20 '24 11:09 kovmir