pg_filedump icon indicating copy to clipboard operation
pg_filedump copied to clipboard

weird number interpretation

Open sandroden opened this issue 1 year ago • 5 comments

I'm starting using pg_filedump and it's possible what I report is out of my ignorance rather than a bug, sorry in that case.

I have a simple db, single table, that is a stripped version of real data:

CREATE TABLE public."rdebug" (
    job_code text,
    cons text,
    data date,
    tipo text,
    qta numeric(10,2),
    unita text,
    p_a text,
    descr text,
    cambio double precision,
    att_code text,
    spunta boolean,
    id integer NOT NULL,
    giorno_id integer,
    phase integer
);

COPY public."rdebug" (job_code, cons, data, tipo, qta, unita, p_a, descr, cambio, att_code, spunta, id, giorno_id, phase) FROM stdin;
SW/FOSS	sd	2000-01-31	O	3.00	h	\N	\N	1		t	274	463	\N
SW/FOSS	sd	2000-01-31	O	3.00	h	\N	\N	1		t	274	463	\N
SW/FOSS	sd	2007-01-10	O	3.00	h	\N	sito	99989	code	\N	3783	2045	\N
SW/FOSS	\N	\N	\N	8.00	\N	\N	sqlkit	\N	\N	\N	4962	2317	\N
\.

I inspect if with pg_filedump:

root@bluffx:/rescue/postgres# pg_filedump -D "varchar,varchar,date,varchar,numeric,varchar,varchar,varchar,numeric,varchar,bool,int,int,int" 16/dbg/base/5/16384

*******************************************************************
* PostgreSQL File/Block Formatted Dump Utility
*
* File: 16/dbg/base/5/16384
* Options used: -D 
*******************************************************************

Block    0 ********************************************************
<Header> -----
 Block Offset: 0x00000000         Offsets: Lower      40 (0x0028)
 Block: Size 8192  Version    4            Upper    7864 (0x1eb8)
 LSN:  logid      0 recoff 0x015094d0      Special  8192 (0x2000)
 Items:    4                      Free Space: 7824
 Checksum: 0x0000  Prune XID: 0x00000000  Flags: 0x0000 ()
 Length (including item array): 40

<Data> -----
 Item   1 -- Length:   84  Offset: 8104 (0x1fa8)  Flags: NORMAL
Error: unable to decode a tuple, callback #9 returned -1. Partial data: SW/FOSS	sd	2000-01-31	O	3.00	h	\N	\N
 Item   2 -- Length:   84  Offset: 8016 (0x1f50)  Flags: NORMAL
Error: unable to decode a tuple, callback #9 returned -1. Partial data: SW/FOSS	sd	2000-01-31	O	3.E9	h	\N	\N
 Item   3 -- Length:   88  Offset: 7928 (0x1ef8)  Flags: NORMAL
Error: unable to decode a tuple, callback #9 returned -1. Partial data: SW/FOSS	sd	2007-01-10	O	3.E9	h	\N	sito
 Item   4 -- Length:   60  Offset: 7864 (0x1eb8)  Flags: NORMAL
COPY: SW/FOSS	\N	\N	\N	8.E9	\N	\N	sqlkit	\N	\N	\N	4962	2317	\N


*** End of File Encountered. Last Block Read: 0 ***

quantity ( qta numeric(10,2),) should be 3.00 in the first 3 rows but I read 3.00, 3.E9, 3.E9

if I rerun it w/o any modification I get that number incremented:

root@bluffx:/rescue/postgres# pg_filedump -D "varchar,varchar,date,varchar,numeric,varchar,varchar,varchar,numeric,varchar,bool,int,int,int" 16/dbg/base/5/16384|grep SW/
Error: unable to decode a tuple, callback #9 returned -1. Partial data: SW/FOSS	sd	2000-01-31	O	3.00	h	\N	\N
Error: unable to decode a tuple, callback #9 returned -1. Partial data: SW/FOSS	sd	2000-01-31	O	3.I2	h	\N	\N
Error: unable to decode a tuple, callback #9 returned -1. Partial data: SW/FOSS	sd	2007-01-10	O	3.I2	h	\N	sito
COPY: SW/FOSS	\N	\N	\N	8.I2	\N	\N	sqlkit	\N	\N	\N	4962	2317	\N
root@bluffx:/rescue/postgres# pg_filedump -D "varchar,varchar,date,varchar,numeric,varchar,varchar,varchar,numeric,varchar,bool,int,int,int" 16/dbg/base/5/16384|grep SW/
Error: unable to decode a tuple, callback #9 returned -1. Partial data: SW/FOSS	sd	2000-01-31	O	3.00	h	\N	\N
Error: unable to decode a tuple, callback #9 returned -1. Partial data: SW/FOSS	sd	2000-01-31	O	3.F0	h	\N	\N
Error: unable to decode a tuple, callback #9 returned -1. Partial data: SW/FOSS	sd	2007-01-10	O	3.F0	h	\N	sito
COPY: SW/FOSS	\N	\N	\N	8.F0	\N	\N	sqlkit	\N	\N	\N	4962	2317	\N

Am I wrong in creating the attr list? What is the tuple that is not decoding correctly?

When I'll be able to get consistent outputs I should apply it to the real data that was dump with pg12, should I re-compile it with headers from PostgreSQL 12 or can I just use the pg_filedump I compiled on mu Ubuntu 24:04 with headers form pg16?

Thanks for any hints

sandroden avatar Sep 30 '24 08:09 sandroden

Confirmed, there is something wrong in CopyAppendNumeric().

df7cb avatar Oct 04 '24 15:10 df7cb

Since the on-disk format didn't change for decades (or else pg_upgrade wouldn't work), you can use the 16 pg_filedump also with version 12. The bug will also be there, though.

df7cb avatar Oct 04 '24 15:10 df7cb

Thanks @df7cb, is this something will be addresses in a reasonable time or I'm better off not relying on this fix? Please don't take me wrong, this is just to plan my strategies (or surrender... :-)

sandroden avatar Oct 08 '24 08:10 sandroden

I'll be looking at it, but I can't promise any deadlines.

df7cb avatar Oct 08 '24 09:10 df7cb

Probably worth mentioning that "double precision" type != "numeric" type - although in docs (https://www.postgresql.org/docs/current/datatype-numeric.html) they both are in category Numeric. As for now, pg_filedump does not mention double precision format in the list of supported types. (Although the value changing problem still persists if using "~" to skip the rest of the tuple. Hadn't figure out why yet.)

GetsuDer avatar Nov 07 '24 07:11 GetsuDer

@GetsuDer was right - trying to dump a double precision value as numeric will not work, I did not see that when looking at the issue. pg_filedump does actually support that type, it's spelled float8 or float.

I don't think there is much we can do to fix the problem of detecting wrong data types - many byte patterns in one data type will be valid in other data types.

df7cb avatar Apr 15 '25 13:04 df7cb

Forgot to mention that I do get the proper values back when replacing numeric with float:

$ ./pg_filedump -D "varchar,varchar,date,varchar,numeric,varchar,varchar,varchar,float,varchar,bool,int,int,int" /home/myon/tmp/pg_virtualenv.gPHAc4/data/17/regress/base/5/16384 

*******************************************************************
* PostgreSQL File/Block Formatted Dump Utility
*
* File: /home/myon/tmp/pg_virtualenv.gPHAc4/data/17/regress/base/5/16384
* Options used: -D 
*******************************************************************

Block    0 ********************************************************
<Header> -----
 Block Offset: 0x00000000         Offsets: Lower      40 (0x0028)
 Block: Size 8192  Version    4            Upper    7864 (0x1eb8)
 LSN:  logid      0 recoff 0x0150bd10      Special  8192 (0x2000)
 Items:    4                      Free Space: 7824
 Checksum: 0x0000  Prune XID: 0x00000000  Flags: 0x0000 ()
 Length (including item array): 40

<Data> -----
 Item   1 -- Length:   84  Offset: 8104 (0x1fa8)  Flags: NORMAL
COPY: SW/FOSS	sd	2000-01-31	O	3.00	h	\N	\N	1.000000000000		t	274	463	\N
 Item   2 -- Length:   84  Offset: 8016 (0x1f50)  Flags: NORMAL
COPY: SW/FOSS	sd	2000-01-31	O	3.00	h	\N	\N	1.000000000000		t	274	463	\N
 Item   3 -- Length:   88  Offset: 7928 (0x1ef8)  Flags: NORMAL
COPY: SW/FOSS	sd	2007-01-10	O	3.00	h	\N	sito	99989.000000000000	code	\N	3783	2045	\N
 Item   4 -- Length:   60  Offset: 7864 (0x1eb8)  Flags: NORMAL
COPY: SW/FOSS	\N	\N	\N	8.00	\N	\N	sqlkit	\N	\N	\N	4962	2317	\N


*** End of File Encountered. Last Block Read: 0 ***

df7cb avatar Apr 15 '25 13:04 df7cb