
MonetDBLite removed from CRAN

Open KnutJaegersberg opened this issue 5 years ago • 15 comments

MonetDBLite removed from CRAN - do some dependencies have check issues or something?

KnutJaegersberg avatar Apr 22 '19 07:04 KnutJaegersberg

I would like to know about this issue as well. My R package depends on MonetDBLite, and it's quite unfortunate to suddenly realise that people can't install it because MonetDBLite has disappeared from CRAN. Is there any way I can help so that a solution is available quickly?

vadimnazarov avatar Apr 26 '19 14:04 vadimnazarov

Any updates?

vadimnazarov avatar May 06 '19 17:05 vadimnazarov

@vadimnazarov I understand that the continued changes needed to track the MonetDB codebase (while keeping step with CRAN's changing requirements for checks) have led @hannesmuehleisen and team to develop a new database package at https://github.com/cwida/duckdb which promises several improvements over MonetDBLite. Hannes can no doubt provide more details but meanwhile you might want to keep an eye on duckdb or take it for a spin! Hopefully it will be on CRAN soon.

cboettig avatar May 08 '19 18:05 cboettig

I see, thank you for notifying! So did I understand you correctly: there will be no MonetDB for R, but MonetDB itself will live and thrive?

vadimnazarov avatar May 08 '19 18:05 vadimnazarov

MonetDB itself will live on, yes. Thanks @cboettig for the explanation here.

hannes avatar May 10 '19 11:05 hannes

Got it, thank you! Can't wait for duckdb on CRAN. On a side note: will there be any workaround for using MonetDB from R? What should I do if I want to connect to an existing MonetDB database rather than use the embedded database?

vadimnazarov avatar May 10 '19 11:05 vadimnazarov

I think we all need a word about this because several packages now depend on MonetDBLite and it was becoming a standard for data analysis in R (see all the examples at https://github.com/ajdamico/asdfree). In my case, I was using MonetDBLite from Python and R on a Windows platform. Also, I was using MonetDBLite not only as an embedded database but also to connect to a MonetDB Server database. So you can imagine my surprise when I updated to R 3.6 and discovered that MonetDBLite was not on CRAN anymore. Now I really don't know which features will remain in this new package and which features will be dropped forever.

palmaresk8 avatar May 24 '19 14:05 palmaresk8

I too want to express some disappointment that MonetDBLite is going away. At the same time, I'm very appreciative of those who have the skills and dedication to work on open source projects like this. Not having those skills, I can only imagine the effort it takes to maintain MonetDBLite.

The duckdb project does look exciting. Will it be as fast as MonetDBLite?

MonetDBLite and dplyr have become my preferred method for working with a dataset that's ~1.7 GB in size (just over 3 million rows and 91 columns). Even when I just want to load the whole thing into memory, I've found nothing faster than MonetDBLite (this includes vroom, data.table's fread(), and the fst package).

And for query-like data manipulations, using dplyr with MonetDBLite on disk is faster for many things than using dplyr with the data in memory. I'm also a huge fan of data.table, which is just slightly faster than MonetDBLite for the things I do.

I installed duckdb this weekend and played around with it. It's great to see that the dplyr compatibility is already working. Yet it seems to be much slower: using the same dataset, it loads the data ~7x slower than MonetDBLite.

Again, very appreciative of the time and efforts!

nilescbn avatar Jun 09 '19 02:06 nilescbn

> I installed duckdb this weekend and played around with it. It's great to see that the dplyr compatibility is already working. Yet it seems to be much slower. Using the same dataset, loads the data ~7x slower than MonetDBLite.

We have not optimized the loader yet, it will happen though.

hannes avatar Jun 11 '19 08:06 hannes

@nilescbn @Mytherin has just pushed upgrades to the CSV loader, please try again and see if the performance issue is still present.

hannes avatar Jun 13 '19 08:06 hannes

@hannesmuehleisen, my apologies, I didn't see the notification of your last message. I only noticed today as I was browsing for updates. I tried updating duckdb earlier, using remotes with build = FALSE, but the install failed this time (on both Windows 10 and Linux Mint). I will keep trying. To be clear, the performance issue I was having was related to loading the data into R from a MonetDBLite table (i.e. using dplyr's collect() function). I don't know whether that's connected to the CSV loader or not. Either way, I'm looking forward to trying it out.

nilescbn avatar Jul 07 '19 05:07 nilescbn

I'm curious: why can't we install MonetDBLite from Github directly? Is that version still working/stable?

winston-p avatar Jan 01 '20 15:01 winston-p

@winston-p you can, of course. But without it on CRAN, other users cannot publish packages to CRAN that depend on MonetDBLite. duckdb has been working well for me on Windows, Mac and Linux; I'm looking forward to seeing it on CRAN.

cboettig avatar Jan 01 '20 23:01 cboettig

@cboettig I see, thanks for explaining!

winston-p avatar Jan 02 '20 02:01 winston-p

I was able to get duckdb installed again thanks to the CRAN-like repo: cwida/duckdb#392.

Thank you for creating that @hannesmuehleisen.

Comparing speeds again, I'm seeing duckdb close the gap somewhat, but MonetDBLite is still ~2x faster at completing queries. For others who have similar questions about speed differences, these two issues may be of interest:

cwida/duckdb#407

cwida/duckdb#11

nilescbn avatar Feb 02 '20 05:02 nilescbn
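
For anyone wanting to reproduce speed comparisons like the ~7x and ~2x figures mentioned in this thread, here is a minimal timing sketch. It is not how anyone above measured their numbers. The helper names (time_call, speed_ratio) and the stand-in workload are hypothetical, and it uses only the Python standard library, so the actual database call (for example, fetching a whole table into memory from each backend) has to be supplied by the reader.

```python
import statistics
import time


def time_call(fn, repeats=5):
    """Run fn() `repeats` times; return wall-clock durations in seconds."""
    durations = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        durations.append(time.perf_counter() - start)
    return durations


def speed_ratio(first_times, second_times):
    """Ratio of median durations; 2.0 means the first is about 2x slower."""
    return statistics.median(first_times) / statistics.median(second_times)


# Stand-in workload (hypothetical): replace with the call under test,
# such as reading the same table from each database into memory.
def load_stand_in():
    return sum(range(100_000))


if __name__ == "__main__":
    times = time_call(load_stand_in, repeats=5)
    print(f"median: {statistics.median(times):.6f} s over {len(times)} runs")
```

Comparing medians over several runs, rather than a single run, reduces the effect of caching and background load on the reported ratio.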