presto icon indicating copy to clipboard operation
presto copied to clipboard

Support lastDataCommitTime in connector commit output

Open lliiun-z opened this issue 2 years ago • 7 comments

Previously we support lastDataCommitTime for write queries in Metastore. Here we support the lastAccessTime for read queries in Metastore. It means when we read partition information from the Metastore, we record their last access times into the QueryInputMetadata for further processing.

== NO RELEASE NOTE ==

lliiun-z avatar Jun 02 '22 12:06 lliiun-z

I started review, but I think there are some basics about the PR that I don't understand. I think my main question is: what are we trying to collect last access times for? Is that the last access times of the partitions we are overwriting if we are overwriting existing partitions on a write, or is it the previous last access times for partitions we are reading? Something else? For either of the 2 options I listed, seems like this info would be subject to a serious race condition, as something else could read the partition after you grabbed the information from the metastore.

I should have explained "lastAccessTime" more carefully. Ideally "LastDataCommitTime" should be passed from PrismMetastorePartition to hadoop.hive.partition to presto.partition. However, hadoop.hive.partition does not contain "lastDataCommitTime" field. So in presto-facebook, I assign lastDataCommitTime to lastAccessTime. Therefore, in presto, we claim we use lastAccessTime; but in presto-facebook, we know we actually use lastDataCommitTime. So for our purpose, we don't have any race conditions. To solve the confusion part, we need to either remove hadoop.hive.partition from the code path if possible (i have no ideas how hard it will be), or file a request to hadoop community to add the field.

lliiun-z avatar Jun 02 '22 17:06 lliiun-z

I am trying to see if I can use partition.parameters to store and retrieve the lastDatacommitTime. If so, we can avoid the confusion between lastDataCommitTime and lastAccessTime.

lliiun-z avatar Jun 02 '22 17:06 lliiun-z

This approach is much nicer! Can you also add some tests?

Will do.

lliiun-z avatar Jun 06 '22 19:06 lliiun-z

This approach is much nicer! Can you also add some tests?

Added some test for checking the partition params.

lliiun-z avatar Jun 06 '22 22:06 lliiun-z

I want to make sure i understand the overall goal. it seems like this collects lastDataCommitTimes for the input partitions for queries that also write, but doesn't do anything if the query only reads. And that's so we can have more information about where new partitions come from. Is that correct?

Also, is it possible to create a test with some test input partitions with some defined lastDataCommitTimes, and then check a new partition you create has those lastDataCommitTimes for its input?

  1. Correct! lastDataCommitTimes is used for partitions as the data input. If there is no write operations (e.g., add partitions, alter partitions, create tables, etc), lastDataCommitTimesForWrite will be empty.
  2. I will try to create tests.

lliiun-z avatar Jun 09 '22 05:06 lliiun-z

Also, is it possible to create a test with some test input partitions with some defined lastDataCommitTimes, and then check a new partition you create has those lastDataCommitTimes for its input?

Hi @rschlussel, I ran an integration test, and printed out the input/output times for the partitions. I don't know if we can create a unit test for that. Anyways, I put the log of the test here.

This is a query does just write. presto:bi> insert into lin_test1 values ('12', '2020-01-12'); INSERT: 1 row

Query 20220609_134250_00000_8c6jf, FINISHED, 4 nodes Splits: 30 total, 30 done (100.00%) 0:26 [0 rows, 0B] [0 rows/s, 0B/s]

2022-06-09T19:28:16.197+0545 INFO dispatcher-query-0 com.facebook.presto.hive.PrismEventListener input: [] 2022-06-09T19:28:16.198+0545 INFO dispatcher-query-0 com.facebook.presto.hive.PrismEventListener output: [1654782195]. <<<---- the time for the partition creation.

This is a read-only query. presto:bi> select * from lin_test1 where ds='2020-01-12'; c1 | ds ----+------------ 12 | 2020-01-12 (1 row)

Query 20220609_134411_00001_8c6jf, FINISHED, 1 node Splits: 5 total, 5 done (100.00%) 0:10 [1 rows, 264B] [0 rows/s, 25B/s]

2022-06-09T19:29:21.860+0545 INFO dispatcher-query-9 com.facebook.presto.hive.PrismEventListener input: [1654782195]. <<<----- the time for the partition scanned; the same as the partition creation. 2022-06-09T19:29:21.860+0545 INFO dispatcher-query-9 com.facebook.presto.hive.PrismEventListener output: []

This is a query containing both read and write. presto:bi> insert into lin_test select * from lin_test1 where ds='2020-01-12'; INSERT: 1 row

Query 20220609_134509_00002_8c6jf, FINISHED, 4 nodes Splits: 30 total, 30 done (100.00%) 0:33 [1 rows, 264B] [0 rows/s, 7B/s]

2022-06-09T19:30:43.004+0545 INFO dispatcher-query-10 com.facebook.presto.hive.PrismEventListener input: [1654782195] 2022-06-09T19:30:43.004+0545 INFO dispatcher-query-10 com.facebook.presto.hive.PrismEventListener output: [1654782341]

lliiun-z avatar Jun 09 '22 13:06 lliiun-z

Other than still needing automated tests for this feature, I don't have additional concerns.

rschlussel avatar Jun 16 '22 17:06 rschlussel

The test failure seems not related to this PR.

lliiun-z avatar Aug 26 '22 18:08 lliiun-z

@rschlussel / @lliiun-z Can we merge this, since this PR introduces ThriftMetastoreUtil.LAST_DATA_COMMIT_TIME which is used by https://github.com/facebookexternal/presto-facebook/pull/1907 which got merged today. The build is broken due to this.

ajaygeorge avatar Sep 01 '22 00:09 ajaygeorge