VictoriaLogs

Updated VM code for VL?

Open sc7565 opened this issue 2 years ago • 9 comments

Are there plans to update the VM code in this repository to support latest code?

sc7565, Sep 02 '21 04:09

I did some work with VM-1.64.1-cluster and fixed some bugs:

  • memory leak in vmselect
  • broken labels api
  • corrupted logs within vmstorage
  • wrong direction in vmselect
  • dead connections in connPool
  • duplicated fetchDataOption resulting in an empty func result
  • unexpected limit result

But there are still many things left to do:

  • the right result for reverse-sorted time series with a specified limit
  • the right result for rollup/aggfunc
  • the right result for binaryOpFunc
  • loki parsers support

@faceair Is this project still active? Any interest in moving things forward?

mxlxm, Sep 06 '21 13:09

My initial goal was to try out the storage design using VictoriaMetrics to see if it could be used to store more types of data and to learn more about VictoriaMetrics.

Implementing the loki protocol just seemed simple enough, and I haven't actually used loki in a production environment. I'm not sure whether loki is designed to work well enough, since it doesn't support queries on high-cardinality labels. Are you guys using loki in production as a full replacement for Elasticsearch, or are you using both?

faceair, Sep 06 '21 14:09

I'm also worried about LogQL's performance when there's a lot of data.

{container="query-frontend",namespace="loki-dev"} |= "metrics.go" | logfmt | duration > 10s and throughput_mb < 500

Take this expression as an example. The leading label selector can be treated like a traditional database index, and there are many ways to optimize its performance. But the log pipeline that follows has to load the raw data, and the scanning and filtering overhead is very high, which I feel makes it more complicated than even a full-text search implementation.
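The split described above can be illustrated with a minimal Go sketch. It is not code from Loki or from this repository: the `stream` type, the in-memory inverted index, and the function names are all hypothetical, and only the `|= "metrics.go"` line filter is modeled (the `logfmt` stage and the numeric comparisons are left out). The point is that the selector is answered from the index alone, while the line filter has to read every raw line of every selected stream.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// stream is a hypothetical in-memory log stream: one label set plus raw lines.
type stream struct {
	labels map[string]string
	lines  []string
}

// buildIndex builds a toy inverted index from "name=value" to stream IDs.
func buildIndex(streams []stream) map[string][]int {
	idx := map[string][]int{}
	for id, s := range streams {
		for k, v := range s.labels {
			key := k + "=" + v
			idx[key] = append(idx[key], id)
		}
	}
	return idx
}

// selectStreams returns the IDs of streams matching every label matcher.
// It only touches the index, never the log lines.
func selectStreams(idx map[string][]int, matchers map[string]string) []int {
	counts := map[int]int{}
	for k, v := range matchers {
		for _, id := range idx[k+"="+v] {
			counts[id]++
		}
	}
	var ids []int
	for id, n := range counts {
		if n == len(matchers) {
			ids = append(ids, id)
		}
	}
	sort.Ints(ids)
	return ids
}

// scanLines applies a substring line filter. Every raw line of every
// selected stream is read, so the cost grows with the log volume.
func scanLines(streams []stream, ids []int, substr string) (matched []string, scanned int) {
	for _, id := range ids {
		for _, line := range streams[id].lines {
			scanned++
			if strings.Contains(line, substr) {
				matched = append(matched, line)
			}
		}
	}
	return matched, scanned
}

func main() {
	streams := []stream{
		{
			labels: map[string]string{"container": "query-frontend", "namespace": "loki-dev"},
			lines:  []string{"caller=metrics.go duration=12s", "caller=http.go status=200"},
		},
		{
			labels: map[string]string{"container": "ingester", "namespace": "loki-dev"},
			lines:  []string{"caller=metrics.go duration=1s"},
		},
	}
	idx := buildIndex(streams)
	ids := selectStreams(idx, map[string]string{"container": "query-frontend", "namespace": "loki-dev"})
	matched, scanned := scanLines(streams, ids, "metrics.go")
	fmt.Println("streams:", ids, "lines scanned:", scanned, "lines matched:", len(matched))
}
```

In this toy setup the selector narrows two streams down to one without reading any log line, but the filter still reads both lines of that stream to return a single match.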

faceair, Sep 06 '21 15:09

Loki's great for lightweight environments: it's simple and more efficient than ES. But yes, it's unusable under heavy load (peaking at 60w, i.e. about 600k, logs/s). The pipeline design sucks with huge data. So I'm thinking about auto-creating a label like __deduplicated_log__ from deduplicated(r.Line), and turning pipeline queries into tag filters to solve this.
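The thread never defines deduplicated(r.Line), so the Go sketch below is only one possible reading of the idea, not the proposed implementation. It assumes that "deduplicating" means replacing the variable parts of a line (here just numbers) with a placeholder, so that similar lines share one label value that a tag filter can match. The names `normalizeLine` and `labelsFor` are hypothetical.

```go
package main

import (
	"fmt"
	"regexp"
)

// variablePart matches integer and decimal numbers. Treating numbers as the
// only variable part of a log line is an assumption made for this sketch.
var variablePart = regexp.MustCompile(`[0-9]+(\.[0-9]+)?`)

// normalizeLine is a hypothetical stand-in for deduplicated(r.Line):
// it collapses lines that differ only in numbers into one template.
func normalizeLine(line string) string {
	return variablePart.ReplaceAllString(line, "<n>")
}

// labelsFor copies the stream labels and adds the derived
// __deduplicated_log__ label computed at ingestion time.
func labelsFor(base map[string]string, line string) map[string]string {
	out := make(map[string]string, len(base)+1)
	for k, v := range base {
		out[k] = v
	}
	out["__deduplicated_log__"] = normalizeLine(line)
	return out
}

func main() {
	base := map[string]string{"container": "query-frontend"}
	a := labelsFor(base, "took 12.5s for 300 rows")
	b := labelsFor(base, "took 7s for 12 rows")
	// Both lines end up under the same label value, so a query for this
	// template becomes an index lookup instead of a scan over raw lines.
	fmt.Println(a["__deduplicated_log__"] == b["__deduplicated_log__"])
	fmt.Println(a["__deduplicated_log__"])
}
```

Whether this helps depends on how many distinct templates the logs produce: a label with too many values would recreate the high-cardinality problem raised earlier in the thread.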

mxlxm, Sep 06 '21 15:09

The performance of loki is better than I thought it would be.

Also, we may need to find a few reasons to move this project forward.

  1. breakthrough in query or storage performance?

It would be more difficult to be fully compatible with the LogQL query protocol.

  2. reduce the cost of storage?

We also need to support S3 storage to reduce costs, or something like JuiceFS, but those have poor random read/write performance.

  3. reduce the resource footprint of loki itself?

The MergeSet is somewhat similar to an LSM tree in that it maintains the order of the data in the blocks by merging them multiple times. This makes sense for metric data, but for log data, where a single log entry is large, it is not very economical to sort in memory multiple times and write to disk repeatedly. With the current VictoriaLogs storage model, the resource usage is not necessarily better than loki's.

I don't really understand the __deduplicated_log__ design you just mentioned, but it looks like a local optimization, and you should be able to implement it inside loki as well.

I've recently been looking for new ideas for log data storage and querying. The goal is still efficient querying and cheap storage, but it probably won't be designed to be compatible with the loki protocol anymore.

faceair, Sep 06 '21 16:09

I've tried Loki-distributed (with memcache enabled) on the same data. It was basically unable to answer queries when the entry rate was larger than 1w/s (about 10k/s), even over a 15-minute range. And the querier would consume endless memory because it doesn't reuse buffers.

As you said, "Implementing the loki protocol just seemed simple enough", so maybe we can just implement the basic needs.

For the S3-compatible storage, we could integrate minio-go into vmstorage within VM, and backport it to VL.

For my current VL-cluster, the resource usage seems acceptable compared to loki.

mxlxm, Sep 07 '21 03:09

Great, it looks like we can still move forward with this project. You are welcome to submit your bug fixes as a PR. Also, I'll be putting together a follow-up TODO.

faceair, Sep 07 '21 07:09

I'll send the PR later. By the way, I think we shouldn't change the value type in vmstorage. How about extending vmstorage with a new field, data, to store the log data? Each log's value would default to 1, so we could benefit from most MetricsQL funcs. Besides, we could support exemplars with this extended data field.
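A rough Go sketch of that proposal, under stated assumptions: the `row` type, its field names, and the `sumOverTime` helper are hypothetical and do not reflect vmstorage's actual row layout or its rollup code. It only shows why a default value of 1 is convenient: a plain sum over a time window then equals the number of log lines in it.

```go
package main

import "fmt"

// row is a hypothetical storage row: the usual timestamp and float value,
// plus an extra Data field carrying the raw log line.
type row struct {
	TimestampMs int64
	Value       float64 // stays a float, defaults to 1 for log lines
	Data        []byte  // proposed extension holding the log payload
}

// newLogRow stores a log line with the default value 1.
func newLogRow(tsMs int64, line string) row {
	return row{TimestampMs: tsMs, Value: 1, Data: []byte(line)}
}

// sumOverTime adds up Value for rows with startMs < ts <= endMs.
// The window bounds are a simplification chosen for this sketch.
func sumOverTime(rows []row, startMs, endMs int64) float64 {
	var sum float64
	for _, r := range rows {
		if r.TimestampMs > startMs && r.TimestampMs <= endMs {
			sum += r.Value
		}
	}
	return sum
}

func main() {
	rows := []row{
		newLogRow(1000, "caller=metrics.go duration=12s"),
		newLogRow(2000, "caller=http.go status=200"),
		newLogRow(3000, "caller=metrics.go duration=1s"),
	}
	// With Value fixed at 1, the sum is simply the count of log lines.
	fmt.Println(sumOverTime(rows, 1000, 3000))
}
```

Because the numeric value stays a float, existing rollup functions that operate on values (for example sum_over_time or count_over_time in MetricsQL) would not need a new value type, while the log payload travels alongside in the extra field.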

mxlxm, Sep 07 '21 11:09

@faceair https://github.com/faceair/VictoriaLogs/pull/7 PR is ready for review.

mxlxm, Sep 07 '21 14:09