PHOENIX-5991 IndexRegionObserver should not overwrite mutation timestamps set by clients

Open kadirozde opened this issue 5 years ago • 2 comments

kadirozde avatar Jul 07 '20 07:07 kadirozde

@kadirozde - since HBase timestamps are ultimately derived from physical timestamps rather than, say, HLCs, we're already prone to weird effects from clock skew (such as different region servers having different clocks). This change improves that in one respect: when the client-set timestamp "wins", it'll be consistent across region servers.

I'm curious about the implications of letting the server timestamp win when the server is behind the client. What if the client clock is slightly ahead of a server? Don't we go back to inconsistent timestamps then?

It seems like a preference for either the client or the server can cause skew-related problems.

When we honor the client timestamps, we get consistent timestamps across a logical operation no matter how many region servers it involves. However, we can get weird behavior where two clients whose clocks are skewed relative to each other invert the order of operations.

When we honor the server timestamps, we can enforce an order between client operations regardless of the clients' clocks, but we can get inconsistent timestamps within each operation.
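To make the two failure modes concrete, here is a small sketch with made-up numbers. The class ClockSkewDemo and its methods are hypothetical and only model a clock as "true time plus a fixed skew"; nothing here is Phoenix or HBase code.

```java
/** Hypothetical illustration of the two skew effects; not Phoenix or HBase code. */
final class ClockSkewDemo {
    /** Timestamp read from a clock that is off from true time by a fixed skew (ms). */
    static long stamp(long trueTimeMs, long clockSkewMs) {
        return trueTimeMs + clockSkewMs;
    }

    /**
     * Client-set timestamps: write A happens first in real time, write B gapMs later.
     * Returns true when B does not receive the larger timestamp, so A would
     * shadow B even though B was written later.
     */
    static boolean clientOrderInverted(long skewA, long skewB, long gapMs) {
        long tsA = stamp(1_000L, skewA);
        long tsB = stamp(1_000L + gapMs, skewB);
        return tsA >= tsB;
    }

    /**
     * Server-set timestamps: one logical operation reaches two region servers
     * at the same true time. Returns how far apart the two assigned timestamps are.
     */
    static long serverTimestampSpread(long skewRs1, long skewRs2) {
        return Math.abs(stamp(1_000L, skewRs1) - stamp(1_000L, skewRs2));
    }

    public static void main(String[] args) {
        // Client A's clock runs 50 ms ahead; client B writes 10 ms after A.
        System.out.println("order inverted: " + clientOrderInverted(50L, 0L, 10L));
        // Two region servers whose clocks differ by 30 ms.
        System.out.println("timestamp spread (ms): " + serverTimestampSpread(0L, 30L));
    }
}
```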

I can think of use cases I've worked with in the past that strongly need consistent timestamps across a long-running operation, and others with lots of small operations where the order of operations across clients really matters, so I'm thinking this needs to be configurable.

@gjacoby126 - Thank you for the feedback. I agree with almost all of the points you made. I am not sure about having a configuration parameter for this, due to the conflict with SCN queries. I think we need to honor the client timestamps, and it will improve the current situation as you pointed out. Initially, I started with the following logic: long now = getTimestamp(firstMutation); if (now == HConstants.LATEST_TIMESTAMP) { now = EnvironmentEdgeManager.currentTimeMillis(); }

Then I noticed that the client always sets the timestamp for mutations in my ITs, so I changed the logic. I think, as you said, we should use either the client or the server timestamp. I will revert the logic, so that we use the client timestamp whenever the client sets it. What do you think?
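The fallback rule discussed here can be sketched in isolation. This is a minimal illustration, not the actual IndexRegionObserver code: the class MutationTimestampChooser, the method chooseTimestamp, and the injected serverClock are hypothetical. The serverClock stands in for EnvironmentEdgeManager.currentTimeMillis(), and the constant mirrors HConstants.LATEST_TIMESTAMP (Long.MAX_VALUE), which marks a mutation whose timestamp the client did not set.

```java
import java.util.function.LongSupplier;

/** Hypothetical sketch of "client timestamp wins when set"; not the Phoenix implementation. */
final class MutationTimestampChooser {
    // Same value as HConstants.LATEST_TIMESTAMP: means "no explicit timestamp".
    static final long LATEST_TIMESTAMP = Long.MAX_VALUE;

    /**
     * Returns the client timestamp when the client set one, otherwise the
     * server's current time. serverClock is an assumed stand-in for
     * EnvironmentEdgeManager.currentTimeMillis().
     */
    static long chooseTimestamp(long clientTimestamp, LongSupplier serverClock) {
        if (clientTimestamp == LATEST_TIMESTAMP) {
            return serverClock.getAsLong();
        }
        return clientTimestamp;
    }

    public static void main(String[] args) {
        LongSupplier serverClock = System::currentTimeMillis;
        // Client set a timestamp: it is kept as-is.
        System.out.println(chooseTimestamp(1_594_105_620_000L, serverClock));
        // Client left the timestamp unset: the server clock is used.
        System.out.println(chooseTimestamp(LATEST_TIMESTAMP, serverClock));
    }
}
```

Injecting the clock keeps the choice deterministic in tests, which matters here because the whole question is which clock a given operation ends up using.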

gjacoby126 avatar Jul 07 '20 17:07 gjacoby126

@kadirozde - good point about the SCN. I think if this is configurable, there would need to be a validation check that throws an exception if you ask for both a client-set SCN and server-side timestamps on mutations.
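Such a validation check might look roughly like the sketch below. It assumes the SCN arrives through the "CurrentSCN" connection property; the property name for server-side timestamps, the class TimestampConfigValidator, and the choice of IllegalArgumentException are all hypothetical, since no such setting exists in this discussion yet.

```java
import java.util.Properties;

/** Hypothetical validation sketch; the server-side timestamp property is invented for illustration. */
final class TimestampConfigValidator {
    // Connection property clients use to set an SCN.
    static final String CURRENT_SCN = "CurrentSCN";
    // Hypothetical property name, not an existing Phoenix setting.
    static final String SERVER_SIDE_TIMESTAMPS = "example.mutation.serverSideTimestamps";

    /** Rejects connections that ask for both a client-set SCN and server-side timestamps. */
    static void validate(Properties props) {
        boolean hasScn = props.getProperty(CURRENT_SCN) != null;
        boolean serverSide =
                Boolean.parseBoolean(props.getProperty(SERVER_SIDE_TIMESTAMPS, "false"));
        if (hasScn && serverSide) {
            throw new IllegalArgumentException(
                    "Cannot combine " + CURRENT_SCN + " with " + SERVER_SIDE_TIMESTAMPS);
        }
    }

    public static void main(String[] args) {
        Properties props = new Properties();
        props.setProperty(CURRENT_SCN, "1594105620000");
        props.setProperty(SERVER_SIDE_TIMESTAMPS, "true");
        try {
            validate(props);
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```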

I like your suggestion about a given operation deterministically using either the client or server timestamp.

Seems to me the big question is how clients and servers decide which of those two to use. If it's the client's choice to either set or not set a timestamp, that still leaves the question of how the client chooses. Same if it's the server's choice -- is it a cluster setting, a table setting, or a hard-coded default? Dependent on mutable vs immutable? Lots of implications to work through for all possibilities, and I don't have a firm opinion yet.

For example, it would be good for a particular table to be able to override the default to say whether client-side or server-side timestamps should govern. I can also see it being useful for clients to request it as well, via a connection property, as they do for SCN, but I worry a little about what happens if different clients make different choices...so I'm undecided.
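One possible precedence, purely as a sketch to reason about: a table-level override beats a per-connection request, which beats the cluster default. Everything below (TimestampPolicyResolver, TimestampSource, and the ordering itself) is hypothetical and not something settled in this thread. Letting the table setting win is one way to keep different clients from making different choices for the same table.

```java
import java.util.Optional;

/** Hypothetical precedence sketch: table override, then connection request, then cluster default. */
final class TimestampPolicyResolver {
    enum TimestampSource { CLIENT, SERVER }

    static TimestampSource resolve(Optional<TimestampSource> tableOverride,
                                   Optional<TimestampSource> connectionRequest,
                                   TimestampSource clusterDefault) {
        // A table-level setting, when present, applies to every client of that table.
        if (tableOverride.isPresent()) {
            return tableOverride.get();
        }
        // Otherwise honor the connection's request, falling back to the cluster default.
        return connectionRequest.orElse(clusterDefault);
    }

    public static void main(String[] args) {
        // The table pins SERVER even though this connection asked for CLIENT.
        System.out.println(resolve(Optional.of(TimestampSource.SERVER),
                Optional.of(TimestampSource.CLIENT), TimestampSource.CLIENT));
        // No table override: the connection's request is used.
        System.out.println(resolve(Optional.empty(),
                Optional.of(TimestampSource.CLIENT), TimestampSource.SERVER));
    }
}
```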

gjacoby126 avatar Jul 07 '20 21:07 gjacoby126