TraceRCA

Question about collecting data from the train-ticket microservices

Open chiyun1111 opened this issue 2 years ago • 10 comments

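Hello! I am trying to set up the train-ticket backend mentioned in your paper so that I can collect some data myself. The system is now mostly up, but I find that none of the metrics mentioned in the paper can be collected from the traces. Could you share your experience with collecting the data? My email is [email protected] (I could not find yours); a reply here or an email would both be fine. Many thanks!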

chiyun1111 avatar Jul 15 '22 07:07 chiyun1111

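Machine metrics need to be collected with Prometheus. You can refer to https://github.com/lizeyan/train-ticket (the scripts in that repository are not the ones used for the TraceRCA work, so they may not match exactly).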

lizeyan avatar Jul 15 '22 07:07 lizeyan
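
As a rough illustration of pulling machine metrics out of Prometheus, here is a minimal sketch that uses the Prometheus HTTP API endpoint `/api/v1/query_range` and flattens the returned matrix into rows. It is not the script used for TraceRCA. The server address, the PromQL expression, and the helper names `build_query_range_url`, `parse_matrix`, and `fetch_metric` are assumptions made for this example.

```python
import json
import urllib.parse
import urllib.request


def build_query_range_url(base_url, promql, start, end, step):
    """Build a /api/v1/query_range URL (start/end are Unix timestamps)."""
    params = urllib.parse.urlencode(
        {"query": promql, "start": start, "end": end, "step": step}
    )
    return f"{base_url.rstrip('/')}/api/v1/query_range?{params}"


def parse_matrix(payload):
    """Flatten a range-query response into (labels, timestamp, value) rows."""
    if payload.get("status") != "success":
        raise RuntimeError(f"Prometheus query failed: {payload}")
    rows = []
    for series in payload["data"]["result"]:
        labels = series["metric"]
        # Each sample is [unix_timestamp, "value-as-string"].
        for timestamp, value in series["values"]:
            rows.append((labels, float(timestamp), float(value)))
    return rows


def fetch_metric(base_url, promql, start, end, step="15s"):
    """Run one range query against a Prometheus server and return flat rows."""
    url = build_query_range_url(base_url, promql, start, end, step)
    with urllib.request.urlopen(url, timeout=30) as response:
        return parse_matrix(json.load(response))
```

A call such as `fetch_metric("http://localhost:9090", "rate(container_cpu_usage_seconds_total[1m])", start, end)` would return per-container CPU usage, assuming cAdvisor metrics are being scraped; which metrics are actually available depends on the exporters deployed in the cluster.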

No, the configuration I used is in the link I gave.

Zeyan LI 李则言 E-Mail: @.*** Ph.D. student Department of Computer Science Tsinghua University Beijing, China

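On Jul 15, 2022, at 15:23, done @.***> wrote:

Thanks for the reply! Prometheus is already set up, but I am still working out how to collect the metrics I want; none of the metrics I see so far are the ones I need. Did the system you finally deployed use the original authors' final compose file, the one with Jaeger?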

lizeyan avatar Jul 15 '22 07:07 lizeyan

OK, thanks. I will take another look!

chiyun1111 avatar Jul 15 '22 07:07 chiyun1111

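Hello! I am very interested in your work on train_ticket. Following your guidance, we tried to deploy the system you provided (https://github.com/lizeyan/train-ticket), but we have not been able to get it working. Could you share some details of your setup at the time?

  1. What was the configuration of the 5 machines (how many cores, how many GB)?
  2. Which version of K8S did you use? Is Jaeger all-in-one:1.23? (We noticed that a jaeger.yaml file is defined, but we could not find where it is deployed.)
  3. Is Elasticsearch 7.5.0?

Thanks in advance!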

chiyun1111 avatar Aug 17 '22 06:08 chiyun1111

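  1. Each machine is a server with 48 cores and 64 GB.
  2. Yes, it is that jaeger.yaml file; I forgot to mention it in the README. The K8S version is 1.23.1. I simply installed the latest version at the time, and it probably does not matter much.
  3. The Elasticsearch version is the one specified in es-jaeger.yaml, 7.5.2.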

lizeyan avatar Aug 17 '22 06:08 lizeyan

OK, thank you very much!

chiyun1111 avatar Aug 17 '22 07:08 chiyun1111

I have the same problem: I want to collect train-ticket's log data, or more precisely its trace logs/span logs. I am new to microservice architecture, APM (e.g. skywalking), and elasticsearch; I do not know how to use them in detail, and I have had trouble finding answers in the official documentation. So far I can only view train-ticket's log data visually in my deployed skywalking, and I do not know how to go on to collect the logs in bulk into a dataset (like this one: https://cloud.tsinghua.edu.cn/d/8371855eddd64a8db23b/). I would therefore like to learn from your experience with data collection: which APM to choose, whether plugin scripts need to be written, how to collect the log fields I want (traceID/spanID/timestamp/...), and so on. Any pointers would be much appreciated!

Below is a trace log sample I found:

{ "name": "Hello-Greetings", "context": { "trace_id": "0x5b8aa5a2d2c872e8321cf37308d69df2", "span_id": "0x5fb397be34d26b51", }, "parent_id": "0x051581bf3cb55c13", "start_time": "2022-04-29T18:52:58.114304Z", "end_time": "2022-04-29T22:52:58.114561Z", "attributes": { "http.route": "some_route1" }, "events": [ { "name": "hey there!", "timestamp": "2022-04-29T18:52:58.114561Z", "attributes": { "event_attributes": 1 } }, { "name": "bye now!", "timestamp": "2022-04-29T18:52:58.114585Z", "attributes": { "event_attributes": 1 } } ], }

{ "name": "Hello-Salutations", "context": { "trace_id": "0x5b8aa5a2d2c872e8321cf37308d69df2", "span_id": "0x93564f51e1abe1c2", }, "parent_id": "0x051581bf3cb55c13", "start_time": "2022-04-29T18:52:58.114492Z", "end_time": "2022-04-29T18:52:58.114631Z", "attributes": { "http.route": "some_route2" }, "events": [ { "name": "hey there!", "timestamp": "2022-04-29T18:52:58.114561Z", "attributes": { "event_attributes": 1 } } ], }

{ "name": "Hello", "context": { "trace_id": "0x5b8aa5a2d2c872e8321cf37308d69df2", "span_id": "0x051581bf3cb55c13", }, "parent_id": null, "start_time": "2022-04-29T18:52:58.114201Z", "end_time": "2022-04-29T18:52:58.114687Z", "attributes": { "http.route": "some_route3" }, "events": [ { "name": "Guten Tag!", "timestamp": "2022-04-29T18:52:58.114561Z", "attributes": { "event_attributes": 1 } } ], }

Span log sample:

{ "trace_id": "7bba9f33312b3dbb8b2c2c62bb7abe2d", "parent_id": "", "span_id": "086e83747d0e381e", "name": "/v1/sys/health", "start_time": "2021-10-22 16:04:01.209458162 +0000 UTC", "end_time": "2021-10-22 16:04:01.209514132 +0000 UTC", "status_code": "STATUS_CODE_OK", "status_message": "", "attributes": { "net.transport": "IP.TCP", "net.peer.ip": "172.17.0.1", "net.peer.port": "51820", "net.host.ip": "10.177.2.152", "net.host.port": "26040", "http.method": "GET", "http.target": "/v1/sys/health", "http.server_name": "mortar-gateway", "http.route": "/v1/sys/health", "http.user_agent": "Consul Health Check", "http.scheme": "http", "http.host": "10.177.2.152:26040", "http.flavor": "1.1" }, "events": [ { "name": "", "message": "OK", "timestamp": "2021-10-22 16:04:01.209512872 +0000 UTC" } ] }

WaldenLeefx avatar Feb 17 '23 14:02 WaldenLeefx
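
For readers wondering how span records like the trace log sample above fit together, here is a minimal sketch that links spans into a tree through `parent_id` and computes a span's duration. The three records are trimmed copies of the sample (attributes and events omitted), and the helper names `render_tree` and `duration_us` are made up for this example. The timestamp parser only handles the ISO-style format of the trace log sample, not the format used in the span log sample.

```python
from collections import defaultdict
from datetime import datetime, timedelta

# Trimmed copies of the three spans in the trace log sample.
SPANS = [
    {"name": "Hello-Greetings",
     "context": {"span_id": "0x5fb397be34d26b51"},
     "parent_id": "0x051581bf3cb55c13",
     "start_time": "2022-04-29T18:52:58.114304Z",
     "end_time": "2022-04-29T22:52:58.114561Z"},
    {"name": "Hello-Salutations",
     "context": {"span_id": "0x93564f51e1abe1c2"},
     "parent_id": "0x051581bf3cb55c13",
     "start_time": "2022-04-29T18:52:58.114492Z",
     "end_time": "2022-04-29T18:52:58.114631Z"},
    {"name": "Hello",
     "context": {"span_id": "0x051581bf3cb55c13"},
     "parent_id": None,
     "start_time": "2022-04-29T18:52:58.114201Z",
     "end_time": "2022-04-29T18:52:58.114687Z"},
]


def parse_ts(text):
    """Parse timestamps shaped like 2022-04-29T18:52:58.114201Z."""
    return datetime.strptime(text, "%Y-%m-%dT%H:%M:%S.%fZ")


def duration_us(span):
    """Span duration in whole microseconds."""
    delta = parse_ts(span["end_time"]) - parse_ts(span["start_time"])
    return delta // timedelta(microseconds=1)


def render_tree(spans):
    """Return indented span names, children ordered by start time."""
    children = defaultdict(list)
    for span in spans:
        children[span["parent_id"]].append(span)

    lines = []

    def walk(span, depth):
        lines.append("  " * depth + span["name"])
        kids = children[span["context"]["span_id"]]
        for child in sorted(kids, key=lambda s: s["start_time"]):
            walk(child, depth + 1)

    for root in children[None]:  # root spans have no parent
        walk(root, 0)
    return lines
```

Here `render_tree(SPANS)` lists `Hello` as the root with the two other spans indented beneath it, and `duration_us(SPANS[2])` gives the root span's duration in microseconds.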

I simply store the data collected by the tracing tool in ES and then export it from ES in bulk.

lizeyan avatar Feb 19 '23 06:02 lizeyan

Could you share which API you used? Is it the ES Query DSL or the REST API, or some other method? Do I need Kibana?

WaldenLeefx avatar Feb 22 '23 15:02 WaldenLeefx

There is a Python library called elasticsearch; it is a thin wrapper around ES's HTTP API.

lizeyan avatar Feb 22 '23 15:02 lizeyan
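
The export approach described in the last two replies could look roughly like the sketch below, which scrolls through an index with `elasticsearch.helpers.scan` and writes one CSV row per span. This is not the author's script. It assumes spans were written by Jaeger's Elasticsearch backend, so the document field names (`traceID`, `spanID`, `references`, `process.serviceName`, `operationName`, `startTime`, `duration`) and the index pattern `jaeger-span-*` are assumptions to check against your own cluster. The helper names are made up for this example.

```python
import csv
from typing import Any, Dict, Iterable, Iterator, TextIO

FIELDS = ["trace_id", "span_id", "parent_id", "service",
          "operation", "start_time_us", "duration_us"]


def iter_span_docs(es_url: str, index_pattern: str) -> Iterator[Dict[str, Any]]:
    """Yield every document in the matching indices via the scroll API."""
    # Imported lazily so the helpers below work without the package installed.
    from elasticsearch import Elasticsearch, helpers

    client = Elasticsearch(es_url)
    query = {"query": {"match_all": {}}}
    for hit in helpers.scan(client, query=query, index=index_pattern):
        yield hit["_source"]


def flatten_span(doc: Dict[str, Any]) -> Dict[str, Any]:
    """Map one span document (assumed Jaeger layout) to a flat row."""
    references = doc.get("references") or []
    parent_id = next(
        (ref.get("spanID", "") for ref in references
         if ref.get("refType") == "CHILD_OF"),
        "",
    )
    return {
        "trace_id": doc.get("traceID", ""),
        "span_id": doc.get("spanID", ""),
        "parent_id": parent_id,
        "service": (doc.get("process") or {}).get("serviceName", ""),
        "operation": doc.get("operationName", ""),
        "start_time_us": doc.get("startTime"),
        "duration_us": doc.get("duration"),
    }


def write_spans_csv(docs: Iterable[Dict[str, Any]], out: TextIO) -> int:
    """Write flattened spans as CSV and return the number of rows written."""
    writer = csv.DictWriter(out, fieldnames=FIELDS)
    writer.writeheader()
    count = 0
    for doc in docs:
        writer.writerow(flatten_span(doc))
        count += 1
    return count
```

With a reachable cluster, something like `write_spans_csv(iter_span_docs("http://localhost:9200", "jaeger-span-*"), open("spans.csv", "w", newline=""))` would produce the dataset file; both the URL and the output path are placeholders. Kibana is not needed for this.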