Ben Epstein

Results: 38 issues by Ben Epstein

Koalas dataframes don't seem to recognize np.number as a type that columns conform to when using select_dtypes ## Recreate: ``` from pyspark.sql import SparkSession from databricks import koalas as ks...

bug

It seems that when creating a scatterplot, the marker sizes are always in pixels. The result is that when zooming into the plot, the points remain small. Is there a...

feature
P3

**Description** Since vaex provides all these great struct operations, it would be great if we could create structs in vaex directly via massive dataframes **Additional context** ``` import pyarrow as...

To support the following symmetrically:
```
import vaex
df = vaex.example()
df.export("file.json")
vaex.open("file.json")
```

Current behavior:
```
import vaex
df = vaex.from_arrays(
    id=vaex.vrange(0, 200_000)
)
299_999 in df.id  # True but wrong
```
Proposed:
```
299_999 in df.id  # False
```
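For reference, the proposed result matches what a value-based membership test gives on the same range of ids. The sketch below is plain Python, not vaex code; `contains` is a hypothetical helper used only to spell out the intended semantics, and `ids` stands in for the `id` column from the report.

```python
# Plain-Python stand-in for the column in the report: ids 0 .. 199_999.
ids = range(0, 200_000)


def contains(values, candidate):
    """Hypothetical helper: True only if candidate equals one of the stored values."""
    return any(v == candidate for v in values)


inside = contains(ids, 199_999)   # last id that is actually stored
outside = contains(ids, 299_999)  # the value from the report, beyond the range
print(inside, outside)
```

Under these assumptions the second check is false, which is the answer the issue asks `299_999 in df.id` to return.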

Helper function to concatenate many hdf5 files. Tested against hundreds of thousands of files. I could imagine using this when a user globs with a `.open` where vaex can call...

When you run SQL queries and the output is rendered as a tabledisplay, the databar and heatmap features don't work. They work fine with Pandas dataframes.

By default, if you train a PySparkML model with a dataframe that has uppercase column names, and then try to run an inference with the same column names but in...

enhancement
help wanted

I really like this project and I rely on it for a lot of work, but I often have problems picking back up when I haven't used it in a...