Transformers4Rec
[BUG] `cannot pickle 'mappingproxy' object` when using `TabularFeatures` `create_categorical`
Bug description
I hit a bug when simply creating a schema programmatically. Can you help me with this?
Thanks.
Steps/Code to reproduce bug
import merlin_standard_lib as msl
from merlin_standard_lib import Schema
from transformers4rec.torch.features.tabular import TabularFeatures

features_schema = Schema(
    [msl.ColumnSchema.create_categorical("language", num_items=149)]
)
a = TabularFeatures.from_schema(features_schema)
This raises `TypeError: cannot pickle 'mappingproxy' object`, coming from:
/home/mdenadai/miniconda3/envs/gnn/lib/python3.9/site-packages/transformers4rec/torch/features/tabular.py:175 in from_schema

  172                     **kwargs,
  173                 )
  174             else:
❱ 175                 maybe_continuous_module = cls.CONTINUOUS_MODULE_CLASS.from_schema(
  176                     schema, tags=continuous_tags, **kwargs
  177                 )
  178         if categorical_tags:

/home/mdenadai/miniconda3/envs/gnn/lib/python3.9/site-packages/transformers4rec/torch/tabular/base.py:190 in from_schema

  187         -------
  188         Optional[TabularModule]
  189         """
❱ 190         schema_copy = deepcopy(schema)
  191         if tags:
  192             schema_copy = schema_copy.select_by_tag(tags)
This happens even when I just do:
from copy import deepcopy

import merlin_standard_lib as msl
from merlin_standard_lib import Schema
from transformers4rec.torch.features.tabular import TabularFeatures

deepcopy(Schema([msl.ColumnSchema.create_categorical("language", num_items=149)]))
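For context, the error message itself can be reproduced with the standard library alone. The sketch below is not the Merlin code path. It only assumes that some object reachable from the schema holds a `types.MappingProxyType` (a read-only view over a dict), which `copy.deepcopy` cannot reduce. `Holder` and `deepcopy_error` are hypothetical names used for illustration.

```python
import copy
import types


class Holder:
    """Hypothetical stand-in for an object that stores a mappingproxy."""

    def __init__(self):
        # A read-only view over a dict; vars(SomeClass) has this type too.
        self.view = types.MappingProxyType({"language": 149})


def deepcopy_error(obj):
    """Return the TypeError message raised by deepcopy, or None on success."""
    try:
        copy.deepcopy(obj)
    except TypeError as exc:
        return str(exc)
    return None


# The object holding a mappingproxy fails; a plain dict copies fine.
print(deepcopy_error(Holder()))
print(deepcopy_error({"language": 149}))
```

If the first call reports the same `mappingproxy` message as the traceback above, that is consistent with the failure coming from a proxy attribute inside the schema object rather than from `TabularFeatures` itself.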
Environment details
- Transformers4Rec version: 23.6.0
- Platform: unix
- Python version: 3.9
It seems that if I remove `int_domain` from `ColumnSchema`, everything can be copied:
class ColumnSchema(Feature):
    @classmethod
    def create_categorical(
        cls,
        name: str,
        num_items: int,
        shape: Optional[Union[Tuple[int, ...], List[int]]] = None,
        value_count: Optional[Union[ValueCount, ValueCountList]] = None,
        min_index: int = 0,
        tags: Optional[TagsType] = None,
        **kwargs,
    ) -> "ColumnSchema":
        _tags: List[str] = [t.value for t in TagSet(tags or [])]
        extra = _parse_shape_and_value_count(shape, value_count)
        int_domain = IntDomain(name=name, min=min_index, max=num_items, is_categorical=True)
        _tags = list(set(_tags + [Tags.CATEGORICAL.value]))
        extra["type"] = FeatureType.INT
        return cls(name=name, int_domain=int_domain, **extra, **kwargs).with_tags(_tags)
The problem goes away with betterproto 2.0, possibly because of https://github.com/danielgtaylor/python-betterproto/pull/339. However, upgrading creates a dependency clash:
ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts. merlin-core 23.6.0 requires betterproto<2.0.0, but you have betterproto 2.0.0b6 which is incompatible.
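To confirm which versions are actually installed before and after changing the pin, a small standard-library check can help. This is only a sketch using `importlib.metadata`. The helper name `installed_version` is hypothetical, and the package names are taken from the pip error above.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package):
    """Return the installed version string of `package`, or None if absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


# Report the packages involved in the conflict.
for name in ("betterproto", "merlin-core", "transformers4rec"):
    print(f"{name}: {installed_version(name) or 'not installed'}")
```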
Thanks for the detailed bug report and the fix.
You can try updating the dependencies in requirements.txt; there's a reasonable chance that it'll work. We're unfortunately not able to update our containers at this time but if you can test that it's working we'd love a PR with your solution.