spark-df-profiling
Running in Databricks
I have loaded a dataframe and when I run the command profile = spark_df_profiling.ProfileReport(df)
I get the following error:
__pycache__ not bottom-level directory
I have confirmed that df is loaded and looks good (it is very large, if that matters). I'm not sure where to go next with this. Any suggestions?
It occurred to me that I was running on a serverless cluster, so I tried your example code on a Standard cluster just to make sure that wasn't the cause:
Ran:
    import spark_df_profiling
    df = sqlContext.createDataFrame([["2",True,None,"8"], ["2",False,None,"8"], ["2",True,"5","7"]], ["a","b","c","d"])
    rep = spark_df_profiling.ProfileReport(df)
    displayHTML(rep.html)
Error: __pycache__ not bottom-level directory in ....
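For context, the wording of this error matches the ValueError that Python's standard-library importlib.util.source_from_cache raises when it is handed a .pyc path whose parent directory is not named __pycache__. The sketch below only reproduces that message outside Databricks. It does not show where spark_df_profiling triggers it. The helper name pycache_error_message and both paths are made up for illustration.

```python
import importlib.util


def pycache_error_message(path):
    """Return the ValueError text for a .pyc path, or None if it resolves."""
    try:
        # Maps a cached .pyc path back to its .py source path (PEP 3147).
        # It only inspects the path string; the file need not exist.
        importlib.util.source_from_cache(path)
    except ValueError as exc:
        return str(exc)
    return None


# Hypothetical path: the .pyc sits directly in the package directory.
bad = pycache_error_message("/tmp/example_pkg/module.cpython-36.pyc")

# Hypothetical path: the .pyc sits inside a __pycache__ directory.
good = pycache_error_message("/tmp/example_pkg/__pycache__/module.cpython-36.pyc")

print(bad)
print(good)
```

If the full traceback on the cluster ends in source_from_cache, the path printed after "in" should show which .pyc file is in an unexpected location.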
Hey Folks,
I created PR #22 to display renderable HTML in a Databricks notebook.
mani