Considering a fork of this repo to get it usable again

Open MrPowers opened this issue 4 years ago • 10 comments

Thanks again for making this project.

I did some searching / emailing to see if anyone is still maintaining this project and wasn't able to find anyone.

I'm considering making a fork of this repo, at least to publish Spark 3 JAR files that work (current Spark 3 JAR files are malformed).

Open to comments / suggestions and will let this sit for a bit before making the fork. No judgement BTW, companies / people can abandon open source whenever they want.

MrPowers avatar Mar 24 '21 18:03 MrPowers

Hi, I have already forked deequ and have been fixing bugs and adding features here: https://github.com/aviatesk/deequ You can find the changes from the original deequ repository in the CHANGELOG. We build the forked deequ ourselves and so haven't released JARs, but I'd be happy to do that if there is demand.

I'm also planning to announce it as "being maintained deequ" somewhere (maybe as an issue of this repository).

Do we want to join forces there? Any suggestions or ideas are welcome.

aviatesk avatar Mar 25 '21 03:03 aviatesk

@aviatesk - glad to see you've already made great progress.

It'd be great if you could shift your fork to SBT & publish it to Maven. I can help if you're not familiar with the process.

We recently published itachi to Maven, and that project's README has good instructions.

Getting a properly built Spark 3 JAR file would be great for the community. Thank you for your efforts & keep me posted on progress ;)

MrPowers avatar Mar 25 '21 10:03 MrPowers

I think this is a great idea. I lost my push access to this repository since I left Amazon, and I don't have time in my new job to work on Deequ anymore. I'd be happy to see it continued by a community!

sscdotopen avatar Mar 25 '21 10:03 sscdotopen

I've been maintaining my own fork after merging https://github.com/awslabs/deequ/pull/299 for my own use - and would enthusiastically contribute to a new community.

ets avatar Mar 26 '21 21:03 ets

Hi all. I'm really sorry, we haven't had much availability for deequ over the last months. Thank you all for contributing to the project! We are committed to fixing the jars by mid-April. We will also review our open-source commitments internally some time soon and might have more availability for deequ in the future. Thanks again!

twollnik avatar Mar 31 '21 08:03 twollnik

Hi all. Thanks so much for your patience! We just had a team-internal discussion about our future commitments for deequ. The main takeaway is that this project is important to us and we want to keep maintaining it. The last months have been very busy for us and we are sorry to have been so inactive on the deequ front. We don’t yet have a decision on exactly how much time we can allocate for deequ, and as of now we can’t promise to implement new features. We will keep you updated in this issue when we have a more precise idea of our future availability. Please let us know in this issue if you have any questions or comments. Thanks again for all your contributions!

twollnik avatar Apr 27 '21 12:04 twollnik

Hi @twollnik, thanks a lot for your reply. It's great that you plan to maintain this project in the future. I understand the timeline for new features is not guaranteed. How about just getting it to work on Spark 3? Is this something that may happen relatively soon?

konradwudkowski avatar Apr 28 '21 12:04 konradwudkowski

Hi @konradwudkowski, you can already use deequ with Spark 3 if your workflow includes attaching JARs directly to clusters (use the JAR here: https://mvnrepository.com/artifact/com.amazon.deequ/deequ/1.1.0_spark-3.0-scala-2.12). The ETA for a Spark 3 release that can be imported with sbt or Maven is next week.

twollnik avatar Apr 28 '21 12:04 twollnik

@konradwudkowski - FYI, Deequ is working with Spark 3 now, see here: https://github.com/awslabs/deequ/issues/353#issuecomment-828753420.

MrPowers avatar Apr 29 '21 13:04 MrPowers

I have my own fork as well, which we use to extend functionality. A lot of the code is in `private[deequ]` scope, which doesn't lend itself to extension very well unless we recreate the exact same package hierarchy within our own code base.

joshivinay avatar May 17 '21 23:05 joshivinay
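
For readers unfamiliar with Scala's qualified access modifiers, here is a minimal sketch of the `private[deequ]` limitation described in the last comment. The object and method names (`InternalHelpers`, `normalize`, `MyExtension`) are hypothetical stand-ins and are not part of deequ's actual API; only the `com.amazon.deequ` package prefix is taken from the artifact coordinates mentioned above.

```scala
// Hypothetical stand-in for library code; NOT deequ's real classes.
package com.amazon.deequ.analyzers {
  object InternalHelpers {
    // private[deequ]: accessible only from code inside the
    // com.amazon.deequ package tree.
    private[deequ] def normalize(column: String): String =
      column.trim.toLowerCase
  }
}

// Extension code must be placed under the same package prefix
// to be allowed to call the helper.
package com.amazon.deequ.extensions {
  import com.amazon.deequ.analyzers.InternalHelpers

  object MyExtension {
    def describe(column: String): String =
      s"column=${InternalHelpers.normalize(column)}"
  }
}

// Application code in an unrelated package can only go through
// the extension; a direct call to the helper does not compile.
package com.example.app {
  object Main {
    def main(args: Array[String]): Unit = {
      // com.amazon.deequ.analyzers.InternalHelpers.normalize(" Id ")
      // ^ rejected by the compiler: normalize is not accessible from here
      println(com.amazon.deequ.extensions.MyExtension.describe("  Id "))
    }
  }
}
```

This is why extending such code from outside generally means either mirroring the library's package hierarchy in your own code base, as the comment describes, or maintaining a fork where the modifiers are relaxed.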