cluster-info: disks showing host down message for tidb

Open dveeden opened this issue 4 years ago • 15 comments

Bug Report

What did you do?

Deploy TiDB with the TiDB Operator on AWS EKS. Then use the dashboard.

What did you expect to see?

Disk info for all node types

What did you see instead?

"Host information is unavailable due to instances on the host is down"

image

In the debugger it does show partitions. So the host is not down.

What version of TiDB Dashboard are you using (./tidb-dashboard --version)?

The top left corner shows "PD v5.0.1"

dveeden avatar May 14 '21 09:05 dveeden

@baurine Would you like to take a look?

breezewish avatar May 14 '21 14:05 breezewish

Sorry, I only just saw this issue. Let me have a look.

baurine avatar May 17 '21 08:05 baurine

Hi @dveeden, sorry for the late reply. I have recently been handling a similar issue.

In this case, according to the console output and the code logic, the message appears because no partitions.path matches any instances.partition_path_lower. The code expects at least one pair whose values are the same.
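The matching rule described above can be sketched as follows. This is only an illustration, not the dashboard's actual code: the field names `path` and `partition_path_lower` come from the comment, while the function name, the `address` field, and the sample values are hypothetical. It assumes both values have already been normalized the same way, so plain equality is enough.

```python
def find_matching_pairs(partitions, instances):
    """Return (address, path) for each instance whose partition_path_lower
    equals the path of some partition. All names except `path` and
    `partition_path_lower` are hypothetical."""
    known_paths = {p["path"] for p in partitions}
    return [
        (inst["address"], inst["partition_path_lower"])
        for inst in instances
        if inst.get("partition_path_lower") in known_paths
    ]


# Made-up sample data for illustration only.
partitions = [{"path": "/"}, {"path": "/var/lib/tidb"}]
healthy = [{"address": "tidb-0:4000", "partition_path_lower": "/var/lib/tidb"}]
broken = [{"address": "tidb-0:4000", "partition_path_lower": ""}]

print(find_matching_pairs(partitions, healthy))
print(find_matching_pairs(partitions, broken))
```

With the `broken` input no pair is found, which corresponds to the situation where the UI has nothing to display for the host even though partitions were reported.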

image

I will investigate why this happens. Would you like to collect some information for us if the environment is still there?

Connect to the database and run the following two SQL statements:

select * from INFORMATION_SCHEMA.CLUSTER_LOAD where TYPE='tidb';
select * from INFORMATION_SCHEMA.CLUSTER_HARDWARE where TYPE='tidb';

> Deploy TiDB with the TiDB Operator on AWS EKS.

What are the detailed steps? Maybe we can try to reproduce it.

Thanks!

baurine avatar Jun 21 '21 06:06 baurine

@baurine I see the same error message. I deployed a minimum cluster following this: https://docs.pingcap.com/tidb-in-kubernetes/stable/get-started.

Screenshot 2021-08-28 at 17 07 58

xpepermint avatar Aug 28 '21 15:08 xpepermint

> @baurine I see the same error message. I deployed a minimum cluster following this: https://docs.pingcap.com/tidb-in-kubernetes/stable/get-started.

Hi @xpepermint, yes, I found the root cause. When we deploy in the cloud, we don't pass the --log-file command-line parameter to tidb-server. It sounds weird, but it is true: the current implementation of disk info depends on this parameter, and when its value is empty, something goes wrong. Because this info is also available from Grafana, the fix is not that urgent, but we plan to refactor it.

baurine avatar Aug 29 '21 15:08 baurine

Good. I'm happy to see that you've found the bug. Is there a chance to get it to a Milestone?

xpepermint avatar Aug 29 '21 15:08 xpepermint

> Good. I'm happy to see that you've found the bug. Is there a chance to get it to a Milestone?

We haven't decided yet.

baurine avatar Aug 30 '21 00:08 baurine

Sounds like this needs some change in TiDB Operator? @baurine

breezewish avatar Sep 02 '21 15:09 breezewish

I am just curious why disk info needs to depend on the --log-file parameter. Can we remove this dependency? @breeswish

baurine avatar Sep 03 '21 05:09 baurine

> I am just curious why disk info needs to depend on the --log-file parameter. Can we remove this dependency?

@baurine I guess it is because the disk info contains info for ALL disks, while we only want to display (the most suitable) one in this case.
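One possible reading of this, sketched under assumptions: if the disk to display is chosen by finding the mount point that contains the log file, then an empty log file path leaves nothing to choose. The code below is not the dashboard's implementation. The function name, the longest-prefix rule, and the sample paths are all hypothetical.

```python
import posixpath


def pick_partition(mount_points, file_path):
    """Return the longest mount point that contains file_path, or None.
    Hypothetical selection rule: longest matching path prefix wins."""
    if not file_path:
        # No path configured (e.g. no log file given): nothing to match.
        return None
    target = posixpath.normpath(file_path)
    best = None
    for mount_point in mount_points:
        mp = posixpath.normpath(mount_point)
        # Compare on whole path components so "/var" does not match "/variable".
        prefix = mp if mp.endswith("/") else mp + "/"
        if target == mp or target.startswith(prefix):
            if best is None or len(mp) > len(best):
                best = mp
    return best


mounts = ["/", "/var/log", "/data"]  # made-up mount points
print(pick_partition(mounts, "/var/log/tidb/tidb.log"))
print(pick_partition(mounts, ""))
```

Under this assumed rule, a host with several disks shows only the one that holds the log file, and a missing path yields no disk at all.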

breezewish avatar Sep 05 '21 13:09 breezewish

Do we have further updates? @baurine

breezewish avatar Dec 29 '21 10:12 breezewish

> Do we have further updates? @baurine

Not yet. Maybe we can arrange a task for this issue in the next sprint.

baurine avatar Dec 29 '21 13:12 baurine

@baurine @breezewish any updates?

dveeden avatar Jun 14 '22 07:06 dveeden

image

This is with v6.1.0, and it shows the wrong mount point for TiFlash when running with tiup playground.

dveeden avatar Jun 14 '22 14:06 dveeden

Hi @dveeden, I will try to resolve it in the next release.

/cc @lilyjazz

baurine avatar Jun 15 '22 01:06 baurine

As discussed with the PM, we decided to fix the incorrect tooltip first, in PR #1469.

baurine avatar Jan 13 '23 10:01 baurine