
Return empty column name when column schema cannot be inferred

Open vkrot-exos opened this issue 5 years ago • 1 comments

This is not a bug, but it would be nice to have the column name listed in the exception text when a column's schema cannot be inferred because all of its values are null. Sample code:

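    import databricks.koalas as ks
    import pandas as pd
    from pyspark.sql import SparkSession

    # Obtain a Spark session (the original snippet assumed one already existed).
    spark_session = SparkSession.builder.getOrCreate()

    # Column b contains only nulls.
    df = spark_session.createDataFrame([
        ('1', None,),
    ], 'a string, b string')
    kdf = df.to_koalas()

    # No return type hint, so Koalas has to infer the output schema.
    def f(pdf: pd.DataFrame):
        return pdf

    print(kdf.koalas.apply_batch(f))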

The function f has no return type hint and all values in column b are null, so the schema cannot be inferred. This code throws an exception that is a bit confusing for new users: "ValueError: can not infer schema from empty or null dataset". It would be much more user-friendly to throw something like: "ValueError: can not infer schema from column 'b' because all row values are nulls".
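To make the request concrete, here is a minimal sketch of the kind of check that could produce such a message. It is plain Python and not Koalas or PySpark code: the helper names `all_null_columns` and `check_schema_inferable` are hypothetical, and the rows are represented as a list of tuples, mirroring the input to `createDataFrame` in the sample above.

```python
from typing import List, Sequence, Tuple


def all_null_columns(rows: Sequence[Tuple], names: Sequence[str]) -> List[str]:
    """Return the names of columns whose value is None in every row.

    With no rows at all, every column is reported, which matches the
    "empty or null dataset" case of the current error message.
    """
    return [
        name
        for i, name in enumerate(names)
        if all(row[i] is None for row in rows)
    ]


def check_schema_inferable(rows: Sequence[Tuple], names: Sequence[str]) -> None:
    """Hypothetical pre-check: raise a ValueError that names the all-null columns."""
    null_cols = all_null_columns(rows, names)
    if null_cols:
        cols = ", ".join(repr(c) for c in null_cols)
        raise ValueError(
            "can not infer schema from column(s) %s "
            "because all row values are nulls" % cols
        )


if __name__ == "__main__":
    # Same data as in the sample: column b holds only nulls.
    try:
        check_schema_inferable([("1", None)], ["a", "b"])
    except ValueError as exc:
        print(exc)
```

Where such a check would live inside Koalas or PySpark is left open here; the sketch only shows that the offending column names are cheap to compute before raising.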

vkrot-exos, Oct 12 '20 20:10

Hi @vkrot-exos, thanks for the suggestion! It sounds like a good idea. Would you mind submitting a PR to modify the error message? Thanks!

ueshin, Oct 12 '20 21:10