Jovan Veljanoski comments

Results: 94 comments by Jovan Veljanoski

[FEATURE-REQUEST] Create pyarrow structs via vaex

> @JovanVeljanoski any opinions on this? How should we attach this, or do you like my code proposal? Still thinking about it. I want to do some tests, but I'm busy...

[FEATURE-REQUEST] Create pyarrow structs via vaex

I like the proposal of @maartenbreddels above. The one correction/suggestion I would make is this:

```python
df['person'] = df.struct.merge(['name', 'age'])
df['person'] = df.struct.merge({'name': 'Name', 'age': 'Age'})
```

Although I have to say...

[BUG-REPORT] Too much memory Consumption and not releasing it.

It is very hard to tell without a code example that we can reproduce. Also, there were a bunch of questions in the form that you didn't answer.

[BUG-REPORT] Too much memory Consumption and not releasing it.

I appreciate the example and the insight into what you are doing, but this is not something I can copy, paste, run, and debug. Can you provide some (fake) data...

[BUG-REPORT] Too much memory Consumption and not releasing it.

In both functions you are mixing vaex and non-vaex code (numpy/pandas). That might cause memory spikes. It might be a good idea to add a bunch of print statements and time...
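As a rough illustration of the print-and-time suggestion above, here is a minimal sketch that uses only the Python standard library. The helper name `measure` and the `results` dictionary are hypothetical, not part of vaex, and the two measured steps are stand-ins for the real vaex and numpy/pandas stages. Note that `tracemalloc` only sees allocations made through Python's allocator, so memory held by native libraries may not show up in these numbers.

```python
import time
import tracemalloc
from contextlib import contextmanager


@contextmanager
def measure(label, results=None):
    """Print wall-clock time and peak traced memory for one step.

    Hypothetical helper for illustration; not meant to be nested,
    because the inner block would stop the outer block's tracing.
    """
    tracemalloc.start()
    start = time.perf_counter()
    try:
        yield
    finally:
        elapsed = time.perf_counter() - start
        _, peak = tracemalloc.get_traced_memory()
        tracemalloc.stop()
        print(f"{label}: {elapsed:.3f} s, peak {peak / 1e6:.1f} MB")
        if results is not None:
            # Keep the raw numbers so steps can be compared afterwards.
            results[label] = (elapsed, peak)


# Stand-in steps; in practice each block would wrap one stage of the pipeline.
results = {}
with measure("build list", results):
    data = [i * 2 for i in range(100_000)]
with measure("sum", results):
    total = sum(data)
```

Wrapping each stage separately makes it easier to see which one is responsible for a spike, which is usually the first thing to establish before digging further.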

[BUG-REPORT] Too much memory Consumption and not releasing it.

How many unique values do you have in `tm_cid` and `tm_mid`?

[BUG-REPORT] Too much memory Consumption and not releasing it.

There have been a few releases since this thread was active. @ashsharma96 can you see if things have improved for your use case in the latest version?

Added test for dropna to consider hidden original columns

@maartenbreddels (if I understand this correctly) What would happen in this case? Say you have an original column without N/A values. You do some computation using that column, and overwrite it...

Iterators for ML

Hi, interesting PR. Here are my thoughts: 1. Regarding the `df.to_numpy()` proposal: something very similar actually already exists. It is called `df.to_arrays()`, where by default you get the data in the...

Output features property for Transformers

Another option for the PCA issue would be to not modify the PCA at all, but also simply not support this `features_` property in this case.