
Differences in goals between disk.frame and fsttable

Open MarcusKlik opened this issue 6 years ago • 0 comments

Hi @phillc73, you posted a question on the differences between disk.frame and fsttable. As you say, disk.frame is a very nice package for working with large datasets split into chunks of data (stored as fst files). With fsttable, instead of working with chunks, the idea is to split all operations per column, so that only the columns required for a given operation are in memory. In addition, I would like to use fsttable as a driver for developing grouping and ordering capabilities in fst. Grouping and ordering require being able to read datasets from disk in arbitrary row order. If that is possible, there is no need to rewrite data in order to rearrange a dataset vertically.

(Obviously, these features will benefit disk.frame as well.)

So, while disk.frame limits RAM usage by splitting operations into vertical chunks, fsttable limits RAM usage by processing columns separately. For grouped operations, fsttable can use subsets of column data to limit RAM usage further. Both packages can store their results on disk again as new on-disk tables.

Hope that answers your question, thanks!

MarcusKlik avatar May 09 '19 20:05 MarcusKlik

Sorry, I don't understand what you are asking.

jackc avatar Feb 24 '24 15:02 jackc

In my codebase (I haven't been able to reproduce this in a sample yet), when the database restarts or some other issue causes the connection to drop, the connection stays acquired in the pool.

bck01215 avatar Feb 25 '24 17:02 bck01215


+	logger := logger.LoggerFromContext(ctx)
+	poolConn, err = pgPool.Acquire(ctx)

+	if err != nil {
+		logger.Errorf("failed to acquire postgres connection: %v", err)
+	}
+	defer poolConn.Release()
	for {
-		for {
-			poolConn, err = pgPool.Acquire(ctx)
-			if err != nil {
-				logger := logger.LoggerFromContext(ctx)
-				logger.Errorf("failed to acquire postgres connection: %v", err)
-				time.Sleep(time.Second)
-				continue
-			}
-			break
-		}
		conn := poolConn.Conn()

		_, err = conn.Exec(ctx, "listen "+pgx.Identifier{"sync"}.Sanitize())

I found the problematic code in our own codebase. I'm still not sure why the acquired connection stayed checked out of the pool after the connection failed.

bck01215 avatar Feb 25 '24 19:02 bck01215