
[FEATURE-REQUEST] Vaex All Columns Dynamic Access Support

Open khanfarhan10 opened this issue 2 years ago • 2 comments

Description
I wish to combine all columns into a single column in vaex.

Something like :

df["combined"] = ",".join(df[reduced_cols])

Is your feature request related to a problem? Please describe.
There is no simple way to do this in vaex.

Additional context
This can be done in pandas using the axis argument of apply, something like:

df["combined"] = df[reduced_cols].apply(
    lambda row: ",".join(row.values.astype(str)), axis=1
)

khanfarhan10 avatar Jun 28 '22 03:06 khanfarhan10

@JovanVeljanoski any idea on this?

khanfarhan10 avatar Jun 28 '22 04:06 khanfarhan10

Hey,

Vaex in general does not support the axis argument, I believe, so most if not all operations are column-oriented (with the exception of joins, of course).

But there are relatively easy ways to accomplish what you are after. For example, the first thing that comes to mind is:

import vaex

df = vaex.example()

# Get all the columns
columns = df.get_column_names()

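# Build one string expression in a loop by casting each column to string
# and appending it (note: values are concatenated without a separator)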
expr = df[columns[0]].astype('string')
for col in columns[1:]:
    expr += df[col].astype('string')

# Assign the expression to the dataframe
df['everything'] = expr

print(df)
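
The loop above concatenates the values with no separator, whereas the original request asked for a comma-joined string. As a library-independent sketch of the same column-wise fold with a separator, here is a version that uses plain Python lists rather than vaex expressions. The `table` data and the helper names `to_str_column`, `concat_columns`, and `combine` are hypothetical and are not part of vaex.

```python
from functools import reduce

# Toy column store: each key maps to one column of values (hypothetical data).
table = {
    "x": [1, 2, 3],
    "y": [0.5, 1.5, 2.5],
    "name": ["a", "b", "c"],
}


def to_str_column(values):
    """Cast every value in a single column to str."""
    return [str(v) for v in values]


def concat_columns(left, right, sep=","):
    """Element-wise concatenation of two string columns with a separator."""
    return [l + sep + r for l, r in zip(left, right)]


def combine(table, columns, sep=","):
    """Fold the chosen columns into one string column, left to right."""
    string_columns = [to_str_column(table[c]) for c in columns]
    if not string_columns:
        return []
    return reduce(lambda acc, col: concat_columns(acc, col, sep), string_columns)


combined = combine(table, ["x", "y", "name"])
print(combined)
```

In vaex the same pattern would need the separator added to the expression inside the loop. Whether a plain string literal can be added to a string expression directly is not confirmed in this thread, so check the vaex string API before relying on it.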

JovanVeljanoski avatar Aug 07 '22 14:08 JovanVeljanoski

Closing due to inactivity. Please reopen if needed.

JovanVeljanoski avatar Aug 29 '22 06:08 JovanVeljanoski

Apologies for not replying (must have missed this!!)

I believe that perfectly answers my queries!

Since Vaex will use an expression for storing stuff I believe it is still fast!

khanfarhan10 avatar Aug 30 '22 05:08 khanfarhan10

Since Vaex will use an expression for storing stuff I believe it is still fast!

It will not use much memory, but that might make it slow. Please consult https://vaex.io/docs/guides/performance.html

Happy to hear it answers your question!

maartenbreddels avatar Aug 31 '22 10:08 maartenbreddels