django-dbbackup

Deciding the Project's Future

Open Archmonger opened this issue 2 years ago • 27 comments

Summary

Right now we're at a bit of an impasse. The original readme notes that django-dbbackup "...tries to use the traditional dump & restore mechanisms." In terms of the history of this project, it's possible that this used to be true. However, the current implementation appears to rely heavily on custom connectors to facilitate data dumps.

My proposal is to implement some breaking changes in order to adhere to the original project description, and to limit breaking issues caused by django-dbbackup internals.

Suggested Path Forward

The suggestions below would convert django-dbbackup to more of an "upgraded" dumpdata/loaddata rather than a completely different kind of backup engine.

  1. Utilize Django's dumpdata and loaddata for doing the heavy-lifting in terms of serializing data
    • Running dumpdata without -o/--output writes the serialized data to stdout as a character stream that we can utilize.
  2. Pass through all of Django's integrated dump/load features, such as multiple compression types and export formats
  3. Add encryption support on top of all this
  4. Add in "bonus features", such as...
    • Natively backing up to remote storage locations.
    • Post-processing scripts (probably an array within settings.py, similar to Django middleware)
    • Parallel execution of backup/restore on multiple databases by using subprocesses/threads
    • Back up/restore all databases by default, but also allow for backing up specific databases
    • Convenient helper functions for supporting scheduled backups (via Celery/Huey)
    • Automatically delete old backups beyond the configured maximum number of backups
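
As an illustration of the last bullet, here is a minimal sketch of how a retention limit could be enforced. It is not taken from django-dbbackup: the function name `prune_old_backups`, the `keep` parameter, and the `*.dump` file pattern are all assumptions made for the example, and it only handles backups stored in a local directory.

```python
from pathlib import Path
from typing import List


def prune_old_backups(backup_dir: Path, keep: int, pattern: str = "*.dump") -> List[Path]:
    """Delete the oldest backups so that at most `keep` files remain.

    Hypothetical helper: the name, the `keep` parameter and the file
    pattern are illustrative and not part of django-dbbackup.
    Returns the list of paths that were deleted.
    """
    if keep < 0:
        raise ValueError("keep must be non-negative")

    # Sort newest first, using the file modification time as the backup age.
    backups = sorted(
        backup_dir.glob(pattern),
        key=lambda path: path.stat().st_mtime,
        reverse=True,
    )

    # Everything after the first `keep` entries is over the limit.
    stale = backups[keep:]
    for path in stale:
        path.unlink()
    return stale
```

Sorting by modification time keeps the sketch independent of any file naming scheme. A real implementation would more likely parse a timestamp from the backup name and would also need to work against remote storage.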

Thoughts, Comments, and Remarks

I'm opening this up for anyone to voice their opinion on the project direction. This would be a breaking change, so if there's a general consensus that this isn't the ideal project direction then we can reassess.

Archmonger avatar Jan 19 '22 20:01 Archmonger

@Archmonger I think this is a great path forward for django-dbbackup. I think the reality is that the project hasn't been maintained for a while, so if you have a vision and the time to execute in a manner that keeps the project healthy and lets us build upon existing libraries, it would be great for this project.

johnthagen avatar Jan 20 '22 12:01 johnthagen

I likely won't have time to develop this until somewhere around April, so until then this ticket will remain open for people to voice their opinions.

Archmonger avatar Feb 12 '22 09:02 Archmonger

New future user here: I'd like to use the project to back up/restore my db/media. A completely new approach fully integrated with Django itself feels ok to me.

stunaz avatar Apr 04 '22 03:04 stunaz

Hi! I used to maintain this project years back - I think that the proposals sound great. I'm only here to show some sign of life since I'm receiving emails from Read the Docs with a warning about the project being abandoned, which I think it isn't. No one has contacted me in this regard.

benjaoming avatar Apr 04 '22 11:04 benjaoming

@benjaoming Thank you for posting! Could you please add @Archmonger and myself to have permissions to push to RTD? We are the current maintainers and would like to get out a new stable release.

johnthagen avatar Apr 04 '22 11:04 johnthagen

Hey @benjaoming sorry about that. I tried reaching out to jonathan-s and ZuluPro to gain access to the RTD. Neither has been responsive to providing access so I reached out to RTD themselves to assist.

As I found out this morning, RTD put in an abandonment check to see if they could give access to the docs for johnthagen and myself.

Let me know if you can add us as RTD maintainers, as it would be much appreciated.

Archmonger avatar Apr 04 '22 16:04 Archmonger

@johnthagen @Archmonger - absolutely! I'll just need your RTD usernames to do that :+1:

benjaoming avatar Apr 06 '22 11:04 benjaoming

Aha! It seems that this is already in order, there was already a jonathan-s added as maintainer? And I've added Archmonger supposing it's you?

benjaoming avatar Apr 06 '22 11:04 benjaoming

Thanks! I've confirmed I've been added as a maintainer. I'll add in the Jazzband bot and johnthagen as soon as I get a chance.

Archmonger avatar Apr 07 '22 01:04 Archmonger

Wishing you the best with this project, thanks for being in Jazzband :100:

benjaoming avatar Apr 07 '22 10:04 benjaoming

@benjaoming Thank you for posting and your work to help this project move along.

@Archmonger I like the ideas you've presented and wanted to voice my encouragement to act with boldness and not be too weighed down by breaking changes.

banagale avatar Apr 10 '22 22:04 banagale

I made it here because I was notified that I am no longer a collaborator on the PyPI project I created! No big deal, I haven't made any significant contribution to this for many years now. It's been amazing to see the project grow beyond anything I imagined, and I owe a huge thanks to @benjaoming and @ZuluPro for taking over when I had moved on. I'm personally fine with any direction the current maintainer wants to take this package. Since I don't really consider myself a maintainer anymore, my voice shouldn't carry much weight.

To clarify, the line "tries to use the traditional dump & restore mechanisms" was originally meant to say that we use pg_dump for Postgres, mysqldump for MySQL, etc., rather than Django's loaddata and dumpdata. The reason is that the DB-specific tools are generally better tailored to work with the database files, especially when those databases reach much larger sizes. This project was originally created because I needed an easy solution to back up database files that were several hundred GB in size, and Django's serializer was not up to the task at the time. Admittedly, I do not know if Django has made improvements here or not. But I have to imagine that using pg_dump is still far superior to Django's dumpdata (and likewise for other databases and their own tools).

Again, I'll state that I am totally fine with any direction, but I suspect we may have interpreted the phrase "traditional dump & restore" to mean opposite things.
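
To make the "traditional" reading concrete, the following is a minimal sketch of delegating a Postgres backup to the database's own dump tool through a subprocess. It is not django-dbbackup's connector code: the function names `build_pg_dump_command` and `run_pg_dump` are hypothetical, and the sketch assumes `pg_dump` is on the `PATH` and that credentials come from the environment (for example `PGPASSWORD` or a `.pgpass` file).

```python
import subprocess
from typing import List, Optional


def build_pg_dump_command(
    dbname: str,
    output_path: str,
    host: Optional[str] = None,
    port: Optional[int] = None,
    user: Optional[str] = None,
) -> List[str]:
    """Build the argument list for a pg_dump run (hypothetical helper).

    Uses pg_dump's custom archive format, which is compressed and can be
    restored with pg_restore.
    """
    cmd = ["pg_dump", "--format=custom", f"--file={output_path}"]
    if host:
        cmd.append(f"--host={host}")
    if port:
        cmd.append(f"--port={port}")
    if user:
        cmd.append(f"--username={user}")
    # The database name is the final positional argument.
    cmd.append(dbname)
    return cmd


def run_pg_dump(dbname: str, output_path: str, **connection: object) -> None:
    """Run pg_dump and raise CalledProcessError if it exits with an error."""
    # Passing a list (no shell) avoids quoting problems in names and paths.
    subprocess.run(build_pg_dump_command(dbname, output_path, **connection), check=True)
```

In this style the database tool streams the data itself, so Django's serializer is never involved. With dumpdata, by contrast, every row passes through Django's serialization framework.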

pkkid avatar Apr 11 '22 17:04 pkkid

@pkkid Thank you for sharing this valuable historical context!

johnthagen avatar Apr 11 '22 19:04 johnthagen

Apologies @pkkid!

I'm trying to get all active maintainers funneled through the Jazzband org.

Within this GitHub org, PyPI & RTD access is really only needed for emergencies; everything else is handled by the Jazzband-Bot. To limit potential security vulnerabilities (e.g. hacked PyPI accounts), I'm trying to keep that list short.

If you'd like to maintain control over the project I can put you in as a project lead. Just let me know!

Also, thanks for the context and clarification! Dumpdata is pretty solid for my use cases, but admittedly I haven't tried it on giant datasets. I'll take a stab at a side by side comparison and compare performance.

Archmonger avatar Apr 11 '22 19:04 Archmonger

Ha, no need to apologize. The project is in great hands, and I appreciate you and @johnthagen taking the reins to keep this project alive. Thank you!

pkkid avatar Apr 11 '22 20:04 pkkid

I've just seen and read this pinned issue after raising #468 yesterday (where I suggested a generic Python backup package that returned to using "traditional dump & restore mechanisms").

Is there a place for such a package at Jazzband? This would be a fork of the current project under a different name that goes in a different direction.

isedwards avatar Nov 21 '22 21:11 isedwards

Jazzband typically only hosts Django related packages, so I would doubt it.

Technically, it is fully possible to move the current connectors out of Django-DBBackup and have them be standalone.

However, if we implement the changes I suggested in this issue we would be firmly tied to Django and unable to separate any of that functionality.

Archmonger avatar Nov 21 '22 22:11 Archmonger

@isedwards I don't think you can start a new project in Jazzband based on a concept. See especially the section Viability here: https://jazzband.co/about/guidelines#viability

benjaoming avatar Nov 21 '22 22:11 benjaoming

@johnthagen I'm thinking of spinning this repo out of Jazzband. I've been hesitant to make major changes or test/CI changes due to how slowly things are moving in Jazzband, which has spiraled into practically nothing getting done.

What are your thoughts on this, and would you assist in maintaining the package under either a new or old org?

Archmonger avatar Apr 24 '23 21:04 Archmonger

I think spinning it out would be a fine idea. I'd help out with basic maintenance under a new org.

johnthagen avatar Apr 24 '23 21:04 johnthagen

It would be nice to transfer the repo so we keep the stars.

johnthagen avatar Apr 24 '23 21:04 johnthagen

It is fascinating that in this case Jazzband as an organization seems to have the same challenges that independent open source projects do.

banagale avatar Apr 25 '23 05:04 banagale

Jazzband is currently a centralized org with only one admin. So naturally, if that one admin becomes busy then things don't move forward.

Archmonger avatar Apr 25 '23 05:04 Archmonger

👋🏻 Hey folks, wondering if any progress has been made towards using Django's dumpdata/loaddata with this plugin?

WillNilges avatar Mar 31 '24 21:03 WillNilges

May I suggest that, if we want this to work with dumpdata/loaddata, we create a new database type instead of replacing what we have? As mentioned above, the original intent of the project was specifically to make it easy to use pg_dump and its counterparts for other databases. Dumpdata is a more generic solution developed by Django, but it also comes with downsides for larger projects. However, I think it might slot nicely into a new db type (maybe generically at db/django.py). Keep the old connectors and also support Django's way of doing things, leaving the choice to the user.
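
A rough sketch of what such an optional DB type could look like follows. It is an assumption-laden illustration, not django-dbbackup's connector API: the class name `DjangoDumpdataConnector` and its `create_dump`/`restore_dump` methods are hypothetical, and the `call_command` argument is injectable only so the sketch can be exercised without a configured Django project. The Django calls it relies on are `call_command("dumpdata", ...)` and `call_command("loaddata", ...)`.

```python
import io
import tempfile
from typing import Any, BinaryIO, Callable


def _django_call_command(*args: Any, **kwargs: Any) -> Any:
    # Imported lazily so this module can be loaded without Django installed.
    from django.core.management import call_command

    return call_command(*args, **kwargs)


class DjangoDumpdataConnector:
    """Hypothetical connector that backs up via dumpdata/loaddata.

    The class and method names are illustrative only; they are not taken
    from django-dbbackup.
    """

    def __init__(
        self,
        database: str = "default",
        call_command: Callable[..., Any] = _django_call_command,
    ) -> None:
        self.database = database
        self._call_command = call_command

    def create_dump(self) -> BinaryIO:
        """Serialize the database to JSON and return it as a byte stream."""
        text = io.StringIO()
        # dumpdata writes to stdout by default, so capture it in a buffer.
        self._call_command(
            "dumpdata", database=self.database, format="json", stdout=text
        )
        return io.BytesIO(text.getvalue().encode("utf-8"))

    def restore_dump(self, dump: BinaryIO) -> None:
        """Load a JSON dump back into the database."""
        # loaddata infers the format from the extension, hence the ".json"
        # suffix. Reopening the file by name while it is still open works on
        # POSIX systems but not on Windows.
        with tempfile.NamedTemporaryFile(suffix=".json") as fixture:
            fixture.write(dump.read())
            fixture.flush()
            self._call_command("loaddata", fixture.name, database=self.database)
```

Because the whole dump is held in memory here, this sketch would not suit the very large databases discussed earlier in the thread. It only shows how a Django-based backend could sit beside the existing connectors as one more choice.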

pkkid avatar Apr 01 '24 12:04 pkkid

That's a really good idea. I agree that it's best represented as an optional DB type.

I don't know when I'll have time to develop this; I've been stretched pretty thin lately.

Archmonger avatar Apr 01 '24 19:04 Archmonger

@WillNilges I agree with both @pkkid and @Archmonger: dumpdata/loaddata could be just an additional DB type in dbbackup.

ZuluPro avatar Apr 01 '24 21:04 ZuluPro