fsttable icon indicating copy to clipboard operation
fsttable copied to clipboard

fsttable should gracefully handle the deletion/modification of the file it refers to

Open martinblostein opened this issue 7 years ago • 1 comments

The table object should determine when the file it points to is deleted. At a minimum, this should happen when a query arrives that requires reading from disk, so that the inevitable error is caught. This check could also happen when ever the fsttable is acted upon (without disk read), and a warning could be thrown. I'm not sure if that is necessary.

Also, the fst file may be modified by other means, unknown to an existing fsttable object. There are a variety of ways to see whether the fstttable is up to date. A checksum is overkill. We could use Date Modified as provided by the OS, but I think that could be unreliable. Another option is to build a time-stamp into the fst metadata. This time-stamp could then be read directly, avoiding relying on the OS. Interested to hear alternative approaches to this.

martinblostein avatar Mar 28 '18 11:03 martinblostein

Setting the location to UTC on the timestamp does work, so this is not really a function of the system timezone but the way timestamps are serialized.

wbl avatar Apr 26 '22 19:04 wbl

Is the database column a timestamp or timestamptz? Can you make a small reproduction case for this?

jackc avatar Apr 27 '22 00:04 jackc

It's timestamp. I can make a small reproduction sometime: been a bit busy lately.

wbl avatar May 05 '22 22:05 wbl

The same problem here (from https://github.com/ent/ent/issues/2692):

The column created in Postgres is:

used_at timestamp with time zone NOT NULL,

The value in Postgres is saved without timezone:

2022-06-30 22:49:03.970913+00

I'm using PG in Docker and using the PG query:

show timezone

I get:

Etc/UTC

I get from Ent (using pgx stdlib) the value:

2022-07-01T00:49:03.970913+02:00

This is a mess on the clients (where an UTC value is expected).

Using pgdriver/pq I get the UTC value from DB.

Am I wrong somewhere?

frederikhors avatar Jun 27 '22 11:06 frederikhors

I tried using this connection with this code too:

postgres://postgres:postgres@localhost/project?sslmode=disable&timezone=UTC"
import (
	"database/sql"
	_ "github.com/jackc/pgx/v4/stdlib"
)

conn, err := sql.Open("pgx", "postgres://postgres:postgres@localhost/project?sslmode=disable&timezone=UTC")
//handle err

The problem is here.

I need a way to get back from DB the UTC values (that are laready stored in the DB).

frederikhors avatar Jun 27 '22 11:06 frederikhors

@jackc can you please help us here? 🙏

frederikhors avatar Jun 28 '22 13:06 frederikhors

I asked on SO too: https://stackoverflow.com/questions/72771272/how-to-setup-pgx-to-get-utc-values-from-db.

They suggest a custom type. More code, more issues!

@wbl did you find a way?

frederikhors avatar Jun 28 '22 15:06 frederikhors

Okay, I see what's going on here. The SO thread basically has the answer, but here's a little more of the explanation of why.

timestamptz doesn't store time zones. PostgreSQL always stores UTC and automatically converts on the way in and out. Except when using the binary format. Then it always sends an int64 of the number of microseconds since Y2K (similar to Unix time). There is no way to use the PostgreSQL session time zone (the text format is better, but it isn't perfect either).

That leaves us with the choice of using UTC or using the time.Local / ENV["TZ"]. It's an arbitrary choice, but its been that way for at least 5 years. I can't see breaking compatibility now.

jackc avatar Jul 01 '22 02:07 jackc

So Go time representation stores the timezone explicitly. Code that is working is setting the location to UTC on retrieval and adjusting the offsets on insertion. If the returned times are set to the UTC location, it won't break code that is working.

If we don't want to fix it via what would technically be a breaking change, adding documentation could help.

wbl avatar Jul 01 '22 03:07 wbl

@jackc I'm asking you an OPTIONAL config to return UTC as stored in DB. Please. No breaking.

frederikhors avatar Jul 01 '22 09:07 frederikhors

@jackc I'm asking you an OPTIONAL config to return UTC as stored in DB. Please. No breaking.

It already is optional. Just not with a config change. Create a new type that has pgtype.Timestamptz as its underlying type. Delegate all calls to pgtype.Timestamptz. For DecodeText and DecodeBinary run your time zone conversion code after the delegated call. See pgtype.JSONB for an example of pgx doing that style itself. pgtype.JSONB just adds a tiny bit of code to the original pgtype.JSON. Then use RegisterDataType to replace the default timestamptz data type.

jackc avatar Jul 01 '22 23:07 jackc

AFAICT a custom type works in all situations except when using the database/sql interface to scan into a time.Time, because stdlib does not respect ConnInfo when returning values from the driver.Rows interface and there's no easy way to change this behavior short of reimplementing the whole driver.Connector interface:

https://github.com/jackc/pgx/blob/d42b399be3ec4153ccd7e8b84ee50e60751a6936/stdlib/sql.go#L717-L726

univerio avatar Sep 14 '22 19:09 univerio

It already is optional. Just not with a config change. Create a new type that has pgtype.Timestamptz as its underlying type. Delegate all calls to pgtype.Timestamptz. For DecodeText and DecodeBinary run your time zone conversion code after the delegated call.

@jackc what is the minimal approach to implement the solution above in pgx v5?

I would like to wrap existing types as much as possible and just take the scan result time and convert it to UTC.

I thought the following could be a path

type timestamptzCodec struct {
	pgtype.TimestamptzCodec
}

func (t *timestamptzCodec) PlanScan(m *pgtype.Map, oid uint32, format int16, target any) pgtype.ScanPlan {
	plan := t.TimestamptzCodec.PlanScan(m, oid, format, target)
	if plan == nil {
		return nil
	}
	return &timestamptzPlan{plan}
}

type timestamptzPlan struct {
	pgtype.ScanPlan
}

func (t *timestamptzPlan) Scan(src []byte, target any) error {
	err := t.ScanPlan.Scan(src, target)
	fmt.Printf("Target info %v %T\n", target, target)
	return err
}

However target in the Scan method is of type *pgtype.timeWrapper so it cannot be cast to *time.Time and it can't be manipulated directly.

Is there a simpler solution?

dedalusj avatar Sep 28 '22 11:09 dedalusj

@dedalusj

I think what you would want to do is to create your own wrapper type that implements pgtype.TimestamptzScanner with your preferred time zone handling.

Then your choice is whether to have a custom Codec that handles *time.Time before delegating to the builtin Codec or you could insert logic in pgtype.Map.TryWrapScanPlanFuncs. For that approach you would write a function like pgtype.TryWrapBuiltinTypeScanPlan that detects time.Time and wraps it with your type. Prepend that function to the pgtype.Map.TryWrapScanPlanFuncs to give it priority over over the automatic wrapping of a *time.Time with *pgtype.timeWrapper.

jackc avatar Oct 01 '22 17:10 jackc

Just ran into this issue.

Currently if you define a timestampz field then marshal and unmarshal as normal using pgx you will get incorrect data in memory (but correct in db) unless your local timezone happens to match the timezone of the entry.

Setting the TZ environment variable does not seem to change this.

i0n avatar Oct 23 '22 12:10 i0n

Ran into this issue too. I would have expected the driver to convert time.Time values to UTC. I don't care about storing time zone, just want the same timestamp returned (e.g., Unix() is equivalent for what I stored and retrieved).

broady avatar Dec 05 '22 14:12 broady

Hey there :)

I am having the same issue..

Whats weird is that, when I use sql.Open or sqlx.Connect, everything works fine. However, when using OpenDB (passing it a driver), it always converts the time to Local on scan instead of UTC.. I am working with timestampz

Any idea?

Thanks

Emixam23 avatar Feb 27 '23 09:02 Emixam23

I seem to be running into this issue with a query that looks like this, where created_at is a timestamp column

SELECT id, created_at, title FROM sheet WHERE created_at <= $1

When I pass in time.Now() or time.Now().UTC() to the query, it's not working as expected. I'll post a workaround here if I figure it out

louisrli avatar Mar 31 '23 06:03 louisrli

@dedalusj Did you end up getting this working? I'm sure a working example could help out a lot of folks.

fantapop avatar Jun 15 '23 07:06 fantapop

Ran into this as well with created_at / updated_at columns with GORM and beyond. Interesting because, even if you aggressively set the timezone in the pg database, set it in the connection string, and override your system global time zone, and set them all to UTC... you get back timestamps that appear identical for all intents and purposes (time values are identical down to nanoseconds, tz are identical), but are not quite equal in the eyes of the compiler (although they are time.Time.Equal()). Those of us who are lazy, terrible, no-good devs who use reflect.DeepEquals() seem to be out of luck in the current situation, barring writing our own scan implementations.

astockwell avatar Oct 25 '23 20:10 astockwell

This issue was originally filed with pgx v4. I don't see making any changes there.

But think that PR https://github.com/jackc/pgx/pull/1948 would solve all the issues raised here for pgx v5.

It allows specifying the desired time.Location that should be used for scanned values on the Codec level. It also should solve the issue of the time location being nil vs. time.UTC.

		conn.TypeMap().RegisterType(&pgtype.Type{
			Name:  "timestamptz",
			OID:   pgtype.TimestamptzOID,
			Codec: &pgtype.TimestamptzCodec{ScanLocation: time.UTC},
		})

Please review and give feedback.

jackc avatar Mar 16 '24 18:03 jackc

#1948 has been merged. This should resolve this issue.

jackc avatar May 08 '24 13:05 jackc