spark-dgraph-connector

Add language support for wide node mode

Open EnricoMi opened this issue 4 years ago • 2 comments

In wide mode, the node source has one column per predicate. With language strings, every language of every predicate needs its own column, and those columns must be known upfront. Configuration could provide the set of languages for those predicates, but that is not very convenient. Ideally, the Zero service would tell us which languages exist for each predicate that has the @lang directive. From that, we could derive the output DataFrame schema and configure the encoder.
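To illustrate the schema derivation, here is a minimal sketch in plain Python. It is not the connector's code. The function name `wide_columns`, the `subject` column, the `predicate@lang` column naming (borrowed from Dgraph's query syntax), and the extra column for untagged values are all assumptions made for the example.

```python
from typing import Dict, List


def wide_columns(predicates: Dict[str, bool],
                 languages: Dict[str, List[str]]) -> List[str]:
    """Derive wide-mode column names (hypothetical helper).

    predicates maps a predicate name to whether it has the @lang directive.
    languages maps a @lang predicate to the language tags known upfront.
    """
    columns = ["subject"]  # assumed id column of the wide node table
    for name in sorted(predicates):
        # Every predicate gets a plain column; for @lang predicates this
        # column is assumed to hold the untagged value.
        columns.append(name)
        if predicates[name]:
            # One extra column per known language of this predicate.
            columns.extend(f"{name}@{lang}"
                           for lang in sorted(languages.get(name, [])))
    return columns


cols = wide_columns({"name": True, "age": False}, {"name": ["en", "de"]})
print(cols)
```

The width of the result grows with the number of languages per predicate, which is why the languages have to be known before the schema can be fixed.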

EnricoMi, Nov 16 '20 18:11

I asked in the Dgraph forum for a feature that would expose the existing languages per predicate upfront: https://discuss.dgraph.io/t/list-of-existing-languages-per-predicate/11479

EnricoMi, Nov 16 '20 19:11

Alternatively, in wide mode the column of a predicate with @lang could have the type Map[String, T], where each key is a language tag that maps to the respective value of type T. This would not require any upfront information on existing languages, and the table would neither explode in width nor become sparse.
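The map-based layout can be sketched in plain Python, with a dict standing in for Map[String, T]. This is only an illustration under assumptions, not the connector's implementation: the helper `to_wide_rows`, the triple layout `(subject, predicate, language, value)`, and the empty-string key for untagged values are all hypothetical.

```python
from typing import Dict, Iterable, Optional, Set, Tuple

# (subject uid, predicate, language tag or None, value): assumed input layout
Triple = Tuple[int, str, Optional[str], str]


def to_wide_rows(triples: Iterable[Triple],
                 lang_predicates: Set[str]) -> Dict[int, dict]:
    """Fold triples into one row per subject (hypothetical helper).

    Predicates listed in lang_predicates become a single map column keyed
    by language tag; all other predicates stay plain scalar columns.
    """
    rows: Dict[int, dict] = {}
    for subject, predicate, lang, value in triples:
        row = rows.setdefault(subject, {})
        if predicate in lang_predicates:
            # Untagged values are stored under "" (an assumption).
            row.setdefault(predicate, {})[lang or ""] = value
        else:
            row[predicate] = value
    return rows


rows = to_wide_rows(
    [
        (1, "name", "en", "Alice"),
        (1, "name", "de", "Alicia"),
        (1, "age", None, "30"),
        (2, "name", "fr", "Luc"),
    ],
    {"name"},
)
print(rows)
```

Note that the language `fr` appears only for subject 2, yet no column has to be added for it: the set of columns depends on the predicates alone, not on the languages found in the data.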

EnricoMi, Nov 05 '21 08:11