
Health of cortex project

Open dims opened this issue 3 years ago • 12 comments

Grafana folks now have a new project called Mimir, here's the announcement: https://grafana.com/blog/2022/03/30/announcing-grafana-mimir/

Project stats have been dropping off, example: https://cortex.devstats.cncf.io/d/15/new-prs-in-repository-groups?orgId=1

Here's the current list of maintainers (2 from AWS, rest from Grafana as of Aug 15): https://github.com/cortexproject/cortex/blob/master/MAINTAINERS

cc @alolita @halcyondude @RichiH (as TAG observability leads) cc @alanprot @alvinlin123

Questions:

  • How do we get some fresh help for the remaining maintainers?
  • Is there a path for Cortex to Graduate? If not, what support do we provide the project to turn things around?

dims avatar Aug 15 '22 17:08 dims

Hi

Currently there are 3 active maintainers for Cortex: @alanprot, @friedrich-at-adobe, and myself (AWS and Adobe). I am trying to recruit more maintainers from different organizations. We might have more maintainers from AWS to help review PRs, but my goal is to have more maintainers from other organizations.

I am going to give a talk on Cortex at KubeCon NA 2022, and I am trying to get a booth for Cortex as well. These will generate more interest in Cortex. I think there is still a path to graduation for Cortex; things seem slower after the Mimir fork because newer maintainers like @alanprot and I have to climb a steep learning curve on the existing code base and build/release processes. We will ramp up soon.

alvinlin123 avatar Aug 15 '22 18:08 alvinlin123

I am a maintainer too. I am not from AWS

friedrich-at-adobe avatar Aug 16 '22 09:08 friedrich-at-adobe

For completeness' sake, @jtlisi is also a maintainer, but he has not been active since changing jobs.

This is mirroring other discussions about various other projects which we have had on GB, TOC and project levels over the last few months: Cortex, like other CNCF projects, has several large users and it would be best if they increased hiring and staffing for the projects they derive value from.

TOC could also make a public call for users to support Cortex.

On a positive note, July and August have seen an increase in activity.

RichiH avatar Aug 16 '22 09:08 RichiH

High level overview of Cortex vs. Mimir, side-by-side:

https://cauldron.io/compare?projects=6274&projects=6273

It gives a nice high-level, quantified, data-based summary of commits, issues, PRs/MRs, and community numbers.

The Cortex project started back in 2016. For context, here are the same reports with the timescale going back to the beginning:

https://cauldron.io/compare?projects=6274&projects=6273&tab=overview&from_date=2016-06-01&to_date=2022-05-10

halcyondude avatar Aug 16 '22 15:08 halcyondude

Thank you for all the feedback and data. As a Cortex maintainer, I am wondering what would be needed to resolve this issue. I will do anything I can to move Cortex along the CNCF graduation path.

alvinlin123 avatar Aug 17 '22 05:08 alvinlin123

@alvinlin123 at the moment, we use this to signal to folks that more contributors are needed for Cortex. So no action items for the Cortex team itself other than: please be ready to welcome folks if/when they show up.

dims avatar Aug 17 '22 12:08 dims

Got it, thanks for the response @dims

alvinlin123 avatar Aug 17 '22 20:08 alvinlin123

Concerns

  • Longevity of the Cortex project
  • Abandonment of the project by maintainers shortly after Incubation
  • Customers don't have vendor support for the Apache 2.0 based project (Cortex)

Action: Determine Adopter position(s): "Cortex vs. Mimir"

  • [ ] Amazon Web Services (AWS)
  • [ ] Aspen Mesh
  • [ ] Buoyant
  • [ ] DigitalOcean
  • [ ] Electronic Arts
  • [ ] Etsy
  • [ ] EverQuote
  • [ ] GoJek
  • [ ] GrafanaLabs
  • [ ] MayaData
  • [ ] Northflank
  • [ ] Opstrace
  • [ ] Platform9
  • [ ] REWE Digital
  • [ ] SysEleven
  • [ ] Weaveworks

Logistical Assistance & Resourcing

  • [ ] get facilitator assigned from project
  • [ ] Set up regular project SIG meetings. Could be a product or project manager
  • [ ] Reach out to contributing teams to get engineers involved; Bootstrap and diversify contributor pool

Assess and Determine what is needed to make the project sustainable

  • [ ] development / engineering resources
  • [ ] community facilitator / manager
  • [ ] Project & Product Manager

How did we get here?

  • CNCF could be lacking checks and balances for Incubation/Graduated projects.
  • Diversity of maintainers (multiple companies, multiple regions) as a requirement
  • Maintainer(s) abandoned project shortly after achieving Incubation milestone.

halcyondude avatar Sep 06 '22 14:09 halcyondude

I find it odd that we are avoiding the elephant in the room.

Is Mimir a healthy open source project?

Mimir is a fork of Cortex that uses AGPL instead of Apache 2.0 and is solely maintained by Grafana Labs developers. That license excludes all possible meaningful collaboration from other companies. So yes, you can fork Mimir and create a new AGPL Mimir fork, but you can't contribute the ideas in Mimir to Apache 2.0 projects like Prometheus, Thanos, Cortex, etc. Personally, I don't look at the code found in Mimir, to avoid facing future legal issues. It's effectively a project run by a single company, and to me that is bad health for any open source project.

I am being really careful not to say anything bad about the Grafana developers who were Cortex maintainers, they are all great people. But this was a decision that heavily affected us users and forced us to adapt.

Is Cortex a healthy open source project?

I'll let you decide that, but yes, Cortex practically lost its main developers from Grafana Labs and any users who can tolerate the AGPL license and are willing to use a project maintained by a single company. But Cortex didn't lose the other maintainers from other companies (2 from AWS and me from Adobe) or the other users that have been working with the project and cannot switch to Mimir. Switching to Thanos or Prometheus was not possible without the great features that Cortex adds. We don't have major bugs, we had a major release in July with some new experimental compactor improvements, and we are looking to have another one really soon with chunk storage removal.

This issue being open is not helping us, but maybe we need to be more public about the stuff that is happening in Cortex, how users are being helped, and the roadmap. I take this as an invitation to give the Cortex docs more love and make it obvious that yes, there is indeed a future for us.

PS. I am a Cortex maintainer. My other handle is @friedrich-at-adobe

friedrichg avatar Sep 08 '22 11:09 friedrichg

@friedrich-at-adobe that's fair. Can we take a checkpoint again, say at year end? In the meantime, please link any positive news ("more public about the stuff that is happening in Cortex") here.

dims avatar Sep 08 '22 13:09 dims

Assigning to @rochaporto for review

amye avatar Sep 05 '23 22:09 amye

@amye issue opened directly with the project for follow up: https://github.com/cortexproject/cortex/issues/5728

rochaporto avatar Jan 16 '24 16:01 rochaporto

@rochaporto Do we have a check-in date with the project for this? I'd like to make sure we've got it captured and don't miss the check-in window with them to bring this issue to closure.

TheFoxAtWork avatar Apr 16 '24 17:04 TheFoxAtWork

This is being tracked in https://github.com/cortexproject/cortex/issues/5728 directly with the project. The last update on the ticket was Feb 13th, but I've seen progress since on the open items. Will check with the project.

rochaporto avatar Apr 16 '24 20:04 rochaporto

I'd like to be involved in this discussion with the project. @rochaporto lmk whenever you are having a check-in.

alolita avatar Apr 16 '24 20:04 alolita

Following the evaluation of the project, the TOC has seen progress in multiple areas:

  • Roadmap: established and now being kept up to date
  • Maintainers: list has been updated with maintainers from two different organizations
  • Adopters: list updated showing multiple end users relying on the project in production

The TOC has the following recommendations for the project moving forward:

  • Improve community meetings, either by following the currently defined cadence of 2 weeks or by updating the cadence to what is feasible and reflecting that in the repos
  • Reach out to TAG-ContributorStrategy for recommendations on how to get more contributors potentially becoming maintainers over time
  • Perform an early check for graduation to identify other areas where the project could be improved

TAG-Observability will continue supporting the project, and will organize quarterly checks to identify areas where the TOC can further help with all the tasks above.

rochaporto avatar Apr 25 '24 17:04 rochaporto

Thanks for the summary @rochaporto

Reiterating that TAG-Observability will continue being engaged with the project and supporting it including quarterly syncs.

alolita avatar Apr 25 '24 18:04 alolita