polars
feat(python): allow schema definition from 2-Lists and not only 2-Tuples
closes https://github.com/pola-rs/polars/issues/12178.
As pointed out in the issue, constructing a DataFrame with a schema of type `List[List[str, pl.dtype]]` failed silently.
This PR:
- processes lists similarly to tuples when using `_unpack_schema`
- adds a unit test to `_unpack_schema`
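
For readers skimming the change, here is a minimal sketch of the idea of handling 2-lists the same way as 2-tuples. It is not the actual `_unpack_schema` code: the helper name `split_schema_entry` is hypothetical, and treating a bare string as a column name without a dtype is an assumption made for illustration.

```python
from __future__ import annotations

from collections.abc import Sequence
from typing import Any


def split_schema_entry(entry: Any) -> tuple[Any, Any]:
    """Return (name, dtype) for one schema entry (hypothetical helper)."""
    # Check for str first: a string is itself a Sequence, so without this
    # guard a 2-character name such as "ab" would be unpacked as a pair.
    if isinstance(entry, str):
        return entry, None  # assumption: bare name, dtype left unspecified
    # Lists and tuples of length 2 are handled identically.
    if isinstance(entry, Sequence) and len(entry) == 2:
        name, dtype = entry
        return name, dtype
    raise TypeError(f"unsupported schema entry: {entry!r}")


schema = [["foo", int], ("bar", str), "ab"]
pairs = [split_schema_entry(e) for e in schema]
```

The explicit `str` check comes before the generic `Sequence` check so that the order of the branches, rather than the length of the name, decides how a string is interpreted.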
Remarks:
- Because strings are also sequences, we have to check that the `Sequence` is not a `str` instance (see 2nd commit). I wanted to coerce lists and tuples into a dictionary, but `dict([("foo", int), "ab"])` results in `{"foo": int, "a": "b"}`, which is a weird edge case.
- I'm not sure I understand the intended behavior of `lookup_names`. Should it affect only `column_dtypes`, or also `column_names`?
In: _unpack_schema(schema=[("foo", int), ("bar", str)], lookup_names=[None, "barbar"])
Out: (['foo', 'bar'], {'foo': Int64, 'barbar': Utf8})
- If more tests are required, please let me know.
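
The `dict` edge case from the first remark can be reproduced in plain Python, without polars. The variable names below are illustrative only.

```python
# A schema mixing a (name, dtype) pair with a bare 2-character column name.
entries = [("foo", int), "ab"]

# dict() accepts any iterable of 2-item iterables, and "ab" qualifies:
# it is silently split into key "a" and value "b".
coerced = dict(entries)

# A name whose length is not 2 is rejected instead, so the outcome
# depends on the length of the column name.
try:
    dict([("foo", int), "abc"])
    rejected = False
except ValueError:
    rejected = True
```

This is why a blanket `dict(...)` coercion is unsafe here: a 2-character column name is silently misread, while any other length raises an error.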