modnet
complex compositions take very long to featurize
I would like to run modnet on a dataset that contains compositions with very complex stoichiometries. One example would be C100H3815Br21I279N2185Pb100.
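To give a sense of scale, here is a minimal sketch in plain Python (it does not use modnet or pymatgen; `parse_formula` is a hypothetical helper that only handles flat formulas without parentheses). It counts the atoms in the formula above and checks whether the stoichiometric coefficients share a common integer divisor.

```python
import re
from functools import reduce
from math import gcd


def parse_formula(formula):
    """Parse a flat formula such as 'CH3NH3PbI3' into element counts.

    Hypothetical helper for illustration only: no parentheses,
    no fractional coefficients.
    """
    counts = {}
    for element, amount in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        # A missing coefficient means one atom of that element.
        counts[element] = counts.get(element, 0) + (int(amount) if amount else 1)
    return counts


formula = "C100H3815Br21I279N2185Pb100"
counts = parse_formula(formula)

# Total number of atoms in one formula unit.
total_atoms = sum(counts.values())

# Greatest common divisor of all coefficients; 1 means the formula
# cannot be reduced to smaller integers.
common_divisor = reduce(gcd, counts.values())

print(counts)
print("atoms per formula unit:", total_atoms)
print("common divisor:", common_divisor)
```

For this formula the sketch reports 6500 atoms per formula unit and a common divisor of 1, so the composition cannot be simplified by integer reduction.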
The following example reproduces the problem:
import pandas as pd
from modnet.models import MODNetModel
from modnet.preprocessing import MODData
from pymatgen.core import Composition

data = {
    'composition': ['Cu2ZnSnSe4', 'Cu2ZnSnS4', 'CsPbI3', 'CH3NH3PbI3', 'C100H3815Br21I279N2185Pb100'],
    'target': [1.0, 1.5, 1.78, 1.6, 1.63],
}
df_simple = pd.DataFrame(data)
# Convert the formula strings to pymatgen Composition objects
df_simple["composition"] = df_simple["composition"].map(Composition)

data = MODData(
    materials=df_simple["composition"],  # you can provide composition objects to MODData
    targets=df_simple["target"],  # you can provide target values to MODData
    target_names=["target"],
)
data.featurize()
Am I doing something wrong here? Is there a workaround to get these complex compositions through the featurizer more quickly?
Thanks!