Gian Merlino

Results 161 comments of Gian Merlino

> Coordinator/Overlord is currently a single point of failure.

There is failover if the Coordinator or Overlord fails, so I wouldn't call it a SPOF. Maybe the issue is scalability....

> The one that we are actively testing in prod is the polling that hits this endpoint: `/druid-ext/basic-security/authentication/db/%s/cachedSerializedUserMap`.

That's good to know. Do you know if it's actually this API...

> Since the HTTP client simply pick 1 server at random, yes, sometimes they went to the busy Coordinator and predictably, suffered from the readTimeoutException.

Ah, got it, I see....

I agree totally that it would be a cool feature. My 2¢ on implementation: The SQL input source is built to be super generic and pull from any database over...

> I am not sure that it is possible to pick and choose the parquet files that contains the latest version in iceberg. Is it possible?

I don't know either....

> letting the overlord create k8s pods is a huge change including API, Task scheduling model and etc.

That's too bad, since one of the original goals of the TaskRunner...

Looking at the 3 comments about issues here (from @didip, @applike-ss, & @dene14) it seems to me that the issues are probably related but different. The original report by @dene14...

I think I see why the original log from @dene14 was so strange. The ReadTimeoutException is a liar! It is a static that is initialized once with a stack trace...
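To illustrate the general Java behavior behind this kind of misleading log: a `Throwable` records its stack trace when it is constructed, not when it is thrown. The sketch below is not the library's code; `StaticExceptionDemo`, `SHARED`, `slowCallerA`, and `slowCallerB` are hypothetical names made up for the demonstration. It shows that a single exception instance created in a static initializer carries the same trace no matter which code path later throws it.

```java
final class StaticExceptionDemo {
    // Hypothetical stand-in for an exception that is created once and reused.
    // Its stack trace is captured here, during static initialization.
    static final RuntimeException SHARED = new RuntimeException("read timed out");

    // Two unrelated code paths that both throw the shared instance.
    static void slowCallerA() { throw SHARED; }
    static void slowCallerB() { throw SHARED; }

    // Runs the action and returns whatever RuntimeException it throws, or null.
    static Throwable capture(Runnable action) {
        try {
            action.run();
        } catch (RuntimeException e) {
            return e;
        }
        return null;
    }

    // True if any frame in the throwable's stack trace names the given method.
    static boolean traceMentions(Throwable t, String methodName) {
        for (StackTraceElement frame : t.getStackTrace()) {
            if (frame.getMethodName().equals(methodName)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        Throwable fromA = capture(StaticExceptionDemo::slowCallerA);
        Throwable fromB = capture(StaticExceptionDemo::slowCallerB);

        // Both paths surface the very same object.
        System.out.println("same instance: " + (fromA == fromB));
        // The trace does not mention the method that actually threw.
        System.out.println("mentions slowCallerA: " + traceMentions(fromA, "slowCallerA"));
        // It points at the static initializer where the object was built.
        System.out.println("mentions <clinit>: " + traceMentions(fromA, "<clinit>"));
    }
}
```

When a log shows a trace like this, the frames tell you where the exception object was first created, which may have nothing to do with the request that actually timed out.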

I am using JvmMonitor on Java 11 too, and it works OK with these options:

```
--add-exports=java.base/jdk.internal.ref=ALL-UNNAMED
--add-exports=java.base/jdk.internal.perf=ALL-UNNAMED
--add-opens=java.base/java.lang=ALL-UNNAMED
--add-opens=jdk.management/com.sun.management.internal=ALL-UNNAMED
```

Like @dampcake, I'm not sure which one exactly is...
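One way to confirm that flags like these actually took effect in a running JVM is to ask the module system directly. The sketch below is only an illustration under assumptions: `ModuleFlagCheck` and its helper methods are hypothetical names, and it only reports which exports and opens are visible to classpath code (the unnamed module). It does not tell you which flag JvmMonitor needs.

```java
import java.util.Optional;

final class ModuleFlagCheck {
    // The unnamed module of the system class loader, i.e. code on the classpath.
    static Module classpathModule() {
        return ClassLoader.getSystemClassLoader().getUnnamedModule();
    }

    // True if the package in the named boot-layer module is exported to classpath code.
    static boolean exportedToClasspath(String moduleName, String pkg) {
        Optional<Module> module = ModuleLayer.boot().findModule(moduleName);
        return module.isPresent() && module.get().isExported(pkg, classpathModule());
    }

    // True if the package is open for deep reflection from classpath code.
    static boolean openToClasspath(String moduleName, String pkg) {
        Optional<Module> module = ModuleLayer.boot().findModule(moduleName);
        return module.isPresent() && module.get().isOpen(pkg, classpathModule());
    }

    public static void main(String[] args) {
        // Module/package pairs taken from the --add-exports options above.
        String[][] exports = {
            {"java.base", "jdk.internal.ref"},
            {"java.base", "jdk.internal.perf"},
        };
        // Module/package pairs taken from the --add-opens options above.
        String[][] opens = {
            {"java.base", "java.lang"},
            {"jdk.management", "com.sun.management.internal"},
        };
        for (String[] e : exports) {
            System.out.println("exported " + e[0] + "/" + e[1] + ": "
                + exportedToClasspath(e[0], e[1]));
        }
        for (String[] o : opens) {
            System.out.println("open " + o[0] + "/" + o[1] + ": "
                + openToClasspath(o[0], o[1]));
        }
    }
}
```

Running it once with and once without a given option shows whether that option changed anything for the process, which can help narrow down the list by elimination.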