Possibly add a function that's similar to pandas json_normalize
Kind of cheating, but a naive solution is to use pandas json_normalize to parse the JSON and then convert the resulting pandas DataFrame into a Spark DataFrame. The logic seems a bit too simple to justify a dedicated helper function, though.
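For reference, a minimal sketch of that naive approach. The helper names `normalize_records` and `to_spark` are hypothetical, and it assumes the JSON has already been parsed into a list of Python dicts small enough to fit on the driver. The `_` separator is only a convenience so the flattened column names contain no dots, which Spark would otherwise require quoting with backticks.

```python
import pandas as pd


def normalize_records(records):
    """Flatten nested dicts into a flat pandas DataFrame.

    `records` is assumed to be a list of already-parsed JSON objects.
    Nested keys are joined with "_" instead of the default ".".
    """
    return pd.json_normalize(records, sep="_")


def to_spark(spark, records):
    """Convert the flattened pandas DataFrame into a Spark DataFrame.

    `spark` is assumed to be an existing SparkSession. Everything is
    materialized on the driver first, so this does not scale to large data.
    """
    return spark.createDataFrame(normalize_records(records))
```

Usage would look like `to_spark(spark, [{"id": 1, "user": {"name": "a"}}])`, which should yield the columns `id` and `user_name`.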
@huynguyent - would be nice to create an implementation that's really performant and doesn't depend on pandas!
It is possible only if you know the final schema. Otherwise you need to infer the schema first somehow. And even with a known schema, the simplest solution is still to use UDFs. My first question: do we know the schema in such a case? If not, I would suggest starting with a function like infer_json_schema(col).
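To make the schema-inference idea concrete, here is a rough pure-Python sketch. It is only an illustration of the concept: this `infer_json_schema` is hypothetical, takes an iterable of JSON strings (for example a sample collected from the column) rather than a Spark column, reports Python type names rather than Spark types, and treats arrays as opaque leaves.

```python
import json


def _flatten_types(value, prefix=""):
    """Yield (dotted_path, type_name) for every leaf of a parsed JSON value."""
    if isinstance(value, dict):
        for key, child in value.items():
            path = f"{prefix}.{key}" if prefix else key
            yield from _flatten_types(child, path)
    else:
        # Lists are not descended into in this sketch; they are reported as "list".
        yield prefix, type(value).__name__


def infer_json_schema(json_strings):
    """Map each flattened path to the set of Python type names seen for it.

    A path that maps to more than one type name signals conflicting types
    across rows, which a real implementation would have to resolve.
    """
    schema = {}
    for raw in json_strings:
        for path, type_name in _flatten_types(json.loads(raw)):
            schema.setdefault(path, set()).add(type_name)
    return schema
```

The resulting mapping of paths to types is the piece that would need to be known up front before nested fields can be pulled out into flat columns.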