
Pluggable storage

Open petethepig opened this issue 4 years ago • 11 comments

Problem

Pyroscope uses badger, which is a key-value database. It has good performance, but it is operationally harder to use because it requires you to have a disk. It would be great if we could implement support for object storage, e.g. AWS S3. This way Pyroscope would be easier to deploy, plus it would pave the way for sharding / clustering.

Proposed Solution

Grafana recently came out with Grafana Tempo, which is a distributed tracing backend. Behind the scenes it uses tempodb, a key-value store with pluggable backends (S3, Google Cloud Storage, Azure Blob Storage), which is exactly what we need.

TODO

  • [ ] do some research, see if it's possible to use tempodb
  • [ ] see if there are any features that tempodb is missing; work around those
  • [ ] create some sort of a DB interface and replace *badger.DB references with DB interface references (see the interface sketch below this list)
  • [ ] wrap badger and tempodb with some code that implements this new DB interface.
  • [ ] add tempodb config parameters to pyroscope server config
  • [ ] make sure everything works and cover everything with specs
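
As a starting point, here is a minimal sketch of what such a DB interface could look like in Go. The interface name and method set here are hypothetical and would ultimately be driven by how *badger.DB is actually used in the codebase:

```go
package storage

// DB abstracts the key-value operations Pyroscope needs, so that the
// badger-backed implementation can later be swapped for an
// object-storage-backed one (e.g. tempodb over S3 / GCS / Azure).
type DB interface {
	Get(key []byte) ([]byte, error)
	Put(key, value []byte) error
	Delete(key []byte) error
	Close() error
}
```

Both the badger wrapper and a tempodb wrapper would then satisfy this interface, and the server config would decide which one to instantiate.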

petethepig avatar Apr 28 '21 18:04 petethepig

Since Grafana Tempo has changed its license to AGPL (https://github.com/grafana/tempo/tree/main/tempodb), we might need to try out something else for a DB.

yashrsharma44 avatar Jul 16 '21 19:07 yashrsharma44

@yashrsharma44 can you elaborate on your license concerns with Tempo moving from Apache 2.0 to AGPLv3?

I am not a lawyer, but I don't think that using an AGPLv3 package forces the whole project to comply with the AGPLv3 license; doesn't it only stipulate that changes made to tempodb itself be shared?

Reading the post about this:

It’s important to note that this change does not prevent our users from using, modifying, or providing our open source software to others — provided, however, that under the AGPL license, users have to share source code if they are modifying it and making it available to others (either as a distribution or over a network). For distributors, AGPL has the same source code sharing requirements as GPL. Those conditions are designed to encourage third parties looking to modify the software to also contribute back to the project and the community. We think this is a fairer way forward, and one that we believe will help us build a stronger community.

Tempodb looks like a great solution here, and I think having an option backed by blob storage would help Pyroscope avoid scaling bottlenecks, so I'd like to understand what the actual license concerns are and why they would prohibit using tempodb.

dalehamel avatar Mar 01 '22 16:03 dalehamel

Hi @dalehamel, thanks for raising these discussion points.

I don't have much knowledge of the implications of AGPLv3; I would have to read more about it before commenting on the consequences.

It's just that I thought that, since it is a derivative of GPL, it might lead us to change our license as well. But, alas, on skimming through the details, I might have overestimated the implications. As far as I can tell, using tempodb should work, but yeah, I am no lawyer, so I'd be curious about the finer points and implications :D

yashrsharma44 avatar Mar 01 '22 17:03 yashrsharma44

Thanks @yashrsharma44, that more or less matches my interpretation, so I think at least for now there is no obvious license issue preventing the use of tempodb.

In terms of implementation...

create some sort of a DB interface, replace *badger.DB references with DB interface references.
wrap badger and tempodb with some code that implements this new DB interface.

There don't seem to be too many references directly to badger.DB, but there are quite a few references to badger itself, e.g. badger.Txn, though it does seem tractable to do this if it can be hidden behind an interface.
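
For illustration, here is roughly what hiding badger.Txn behind such an interface might look like. This is only a sketch built on badger's public transaction API and the hypothetical DB interface sketched in the issue description, not actual Pyroscope code:

```go
package storage

import "github.com/dgraph-io/badger/v2"

// badgerDB adapts *badger.DB to the hypothetical DB interface, so the
// rest of the code never touches badger.Txn directly.
type badgerDB struct {
	db *badger.DB
}

func (b *badgerDB) Get(key []byte) ([]byte, error) {
	var value []byte
	err := b.db.View(func(txn *badger.Txn) error {
		item, err := txn.Get(key)
		if err != nil {
			return err
		}
		value, err = item.ValueCopy(nil)
		return err
	})
	return value, err
}

func (b *badgerDB) Put(key, value []byte) error {
	return b.db.Update(func(txn *badger.Txn) error {
		return txn.Set(key, value)
	})
}

func (b *badgerDB) Delete(key []byte) error {
	return b.db.Update(func(txn *badger.Txn) error {
		return txn.Delete(key)
	})
}

func (b *badgerDB) Close() error {
	return b.db.Close()
}
```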

The implementation plan @petethepig came up with makes sense to me, and I'd love to see this feature.

dalehamel avatar Mar 01 '22 18:03 dalehamel

Yeah both @yashrsharma44 and @dalehamel you make great points here... we could potentially still use tempodb, but we are a little hesitant because it may make things more complicated for us down the road...

That being said, I do want to provide an update that we have officially started working on this as of this week and have some ideas as to how we can implement this with or without tempodb.

We should be able to provide some updates (and hopefully a v1 of this solution) soon!

Rperry2174 avatar Mar 01 '22 20:03 Rperry2174

@Rperry2174 Thanks for the update! Very exciting to hear that work has started on this. I'll be keeping a close eye 👀

we could potentially still use tempodb, but we are a little hesitant because it may make things more complicated for us down the road...

I don't have any specific attachment to tempodb in terms of a solution, it is more that I think being able to store the profiling data in blob storage, and in general having pluggable backends, could help avoid issues in scaling pyroscope horizontally (such as the problem of local disk storage w/ badger mentioned in the original issue description).

Without knowing too much about how tempodb itself works, I'd naively expect that a strategy of caching things in memory and falling back to a bucket on a cache miss should scale pretty well, though of course there could be some gotchas here.
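
To make that concrete, here is a rough sketch of the read-through idea over the hypothetical DB interface sketched earlier in this issue; the cache is deliberately naive (unbounded, not concurrency-safe) and purely illustrative:

```go
package storage

// cachedDB serves reads from an in-memory cache and falls back to an
// object-storage-backed DB (e.g. an S3 implementation) on a cache miss.
type cachedDB struct {
	cache  map[string][]byte // naive in-memory cache; a real one would be bounded
	bucket DB                // object-storage-backed implementation
}

func (c *cachedDB) Get(key []byte) ([]byte, error) {
	if v, ok := c.cache[string(key)]; ok {
		return v, nil // cache hit: serve from memory
	}
	v, err := c.bucket.Get(key) // cache miss: fall back to the bucket
	if err != nil {
		return nil, err
	}
	c.cache[string(key)] = v // populate the cache for subsequent reads
	return v, nil
}
```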

dalehamel avatar Mar 01 '22 21:03 dalehamel

Hi!!,

I love Pyroscope, and I think it will become a reference point in the world of continuous profiling.

One of the serious problems with Pyroscope, once you start to use it in production profiling a lot of Java apps, is storage. I think Pyroscope needs a solution for this: maybe S3 pluggable storage, maybe #291 (storage-based retention), or maybe both.

Right now Pyroscope is expensive in terms of the storage needed to maintain a large environment. It is also easy for the storage to fill up again and again (even though time-based retention is enabled), so we have gaps in our profiled data.

Thanks a lot and keep it up. You are doing an incredible job.

Best regards!!!

netamego avatar Apr 02 '22 09:04 netamego

Maybe ClickHouse (clickhouse.com) with an S3 disk + cache would be a good solution? See the details at https://clickhouse.com/docs/en/engines/table-engines/mergetree-family/mergetree/#table_engine-mergetree-s3

Slach avatar Apr 02 '22 17:04 Slach

Yeah both @yashrsharma44 and @dalehamel you make great points here... we could potentially still use tempodb, but we are a little hesitant because it may make things more complicated for us down the road...

That being said, I do want to provide an update that we have officially started working on this as of this week and have some ideas as to how we can implement this with or without tempodb.

We should be able to provide some updates (and hopefully a v1 of this solution) soon!

Hi, @Rperry2174.

I'm excited to see you say that pluggable storage will be supported soon, maybe coming in a v1. Our team currently uses Pyroscope as a continuous profiling component, and Pyroscope has many good features. We hope that Pyroscope can support remote object storage (as Tempo, Loki, Mimir, and Thanos do) in the second half of the year, to solve the single-instance performance and storage bottlenecks of standalone Pyroscope in large-scale scenarios.

I don't know how Pyroscope plans to support this feature yet. If possible, I would be happy to participate in the discussion of the pluggable storage solution, or even in its development.

scottzhlin avatar Jul 19 '22 03:07 scottzhlin

Glad to see that you're trying to scale up Pyroscope, @scottzhlin, and I appreciate the offer to help! Just as an update for @dalehamel, @netamego, and anyone else who lands on this issue between now and when we officially support this:

As it turned out, solving this for the open source version was more complicated than we initially anticipated. We've currently solved this scalability problem for the cloud version of Pyroscope, which we've released in beta, and we have started onboarding some companies who have run into scalability issues. It's free while in beta and we'll happily work with anyone interested -- just reach out on Slack or email!

We designed this version from scratch with the aim of making it horizontally scalable and replacing the main bottleneck, badgerDB, with S3. Of course there were plenty of follow-up optimizations we had to make to store/write/read efficiently with S3, but the more important piece is that we now have a much better understanding of how one could plug in S3 (or a cloud equivalent) to make Pyroscope scale orders of magnitude better than on a single instance.

We're still making some optimizations and ironing out some other minor issues before figuring out how to apply this back to the open source version, but it will definitely take some time to figure out how to do it in a user-friendly way.
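
For anyone following along, here is a minimal sketch of what plugging S3 in behind the hypothetical DB interface sketched earlier in this issue could look like, using the AWS SDK for Go. It is an illustration only, not the actual implementation described above:

```go
package storage

import (
	"bytes"
	"io"

	"github.com/aws/aws-sdk-go/aws"
	"github.com/aws/aws-sdk-go/service/s3"
)

// s3DB stores each key as an object in a bucket, covering the Get/Put
// half of the hypothetical DB interface.
type s3DB struct {
	client *s3.S3
	bucket string
}

func (s *s3DB) Get(key []byte) ([]byte, error) {
	out, err := s.client.GetObject(&s3.GetObjectInput{
		Bucket: aws.String(s.bucket),
		Key:    aws.String(string(key)),
	})
	if err != nil {
		return nil, err
	}
	defer out.Body.Close()
	return io.ReadAll(out.Body)
}

func (s *s3DB) Put(key, value []byte) error {
	_, err := s.client.PutObject(&s3.PutObjectInput{
		Bucket: aws.String(s.bucket),
		Key:    aws.String(string(key)),
		Body:   bytes.NewReader(value),
	})
	return err
}
```

A production version would likely also need batching and caching to keep request counts and latency reasonable, since one object per key gets expensive at S3 scale.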

Rperry2174 avatar Jul 19 '22 06:07 Rperry2174

Wooow. Pyroscope Cloud is an amazing feature. Great progress!!! But most important is what @Rperry2174 said: "we now have a much better understanding of how one could plug in S3". We are very protective of our business data, so for us a perfect solution would be an S3 plugin that enables Pyroscope to use an on-premise S3 repo, or an on-premise version of Pyroscope Cloud.

Great great work guys!!

netamego avatar Jul 22 '22 18:07 netamego

Hurry up!!!! The Grafana team already supports this feature in Grafana Phlare: https://github.com/grafana/phlare

Cheap, durable profile storage: Grafana Phlare uses object storage for long-term data storage, allowing it to take advantage of this ubiquitous, cost-effective, high-durability technology. It is compatible with multiple object store implementations, including AWS S3, Google Cloud Storage, Azure Blob Storage, OpenStack Swift, as well as any S3-compatible object storage.

linthan avatar Nov 03 '22 07:11 linthan

Are there plans to support Grafana Phlare as a profile backend?

AndresPineros avatar Jan 04 '23 06:01 AndresPineros

https://grafana.com/blog/2023/03/15/pyroscope-grafana-phlare-join-for-oss-continuous-profiling/

dalehamel avatar Mar 15 '23 13:03 dalehamel

Is there a plan for when this capability will be supported?

cao2358 avatar Jun 29 '23 07:06 cao2358