[QST] Train a Ranking Model Using a Pre-Trained Model
❓ Questions & Help
Details
I want to build a ranking model using pre-trained embeddings.
- I can train a model using embedding lookups, but then the model's input is an id-based feature.
- At inference time I want to pass embeddings to the model directly instead of id-based features. How can I do this?
I followed this tutorial and had no problems training the DCN model. Now that training is complete, I want to change the model's input from ids to embeddings.
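To make the goal concrete, here is a minimal NumPy sketch of the idea. It is not Merlin code: `query_table`, `dense_weights`, and both function names are hypothetical stand-ins, and the single matrix product only represents "whatever layers come after the embedding lookup".

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical frozen embedding table: 2000 ids, 64 dimensions each.
query_table = rng.random((2000, 64))
# Hypothetical stand-in for the layers that follow the embedding lookup.
dense_weights = rng.random((64, 1))


def score_from_embeddings(query_vectors: np.ndarray) -> np.ndarray:
    """Everything downstream of the lookup: consumes vectors, not ids."""
    return query_vectors @ dense_weights


def score_from_ids(query_ids: np.ndarray) -> np.ndarray:
    """Id-based entry point: look up rows, then reuse the same downstream layers."""
    return score_from_embeddings(query_table[query_ids])


ids = np.array([3, 17, 42])
scores_by_id = score_from_ids(ids)
scores_by_vector = score_from_embeddings(query_table[ids])
```

With a frozen table both paths produce the same scores, so what I am looking for is a way to expose the vector-based entry point of the trained model.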
My code
import os

import numpy as np
import nvtabular as nvt
from nvtabular import ops

import merlin.models.tf as mm
from merlin.schema import Tags
cat_features = ["query", "title"] >> ops.Categorify(
    dtype="int32",
    out_path="../data/categories",
    freq_threshold={"query": 0, "title": 0},
)
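from merlin.models.utils.example_utils import workflow_fit_transform

train_path = os.path.join("../data/train.parquet")
valid_path = os.path.join("../data/val.parquet")
output_path = os.path.join("../data/integration")
# `output` is the workflow graph built from cat_features (its definition is not shown here).
workflow_fit_transform(output, train_path, valid_path, output_path)
# `train`, used below, is the transformed training dataset (loading not shown here).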
# Random placeholder matrices of shape (num_rows, embedding_dim)
query_embs = np.random.random((2000, 64))
title_embs = np.random.random((2000, 64))
embed_dims = {
    "query": query_embs.shape[1],
    "title": title_embs.shape[1],
}
embeddings_init = {
    "query": mm.TensorInitializer(query_embs),
    "title": mm.TensorInitializer(title_embs),
}
embeddings_block = mm.Embeddings(
    train.schema.select_by_tag(Tags.CATEGORICAL),
    infer_embedding_sizes=True,
    embeddings_initializer=embeddings_init,
    trainable={"query": False, "title": False},
    dim=embed_dims,
)
input_block = mm.InputBlockV2(train.schema, categorical=embeddings_block)
# `target_column` is the name of the binary label column (defined elsewhere).
model = mm.DCNModel(
    train.schema,
    depth=2,
    input_block=input_block,
    deep_block=mm.MLPBlock([64, 32]),
    prediction_tasks=mm.BinaryOutput(target_column),
)
model.compile(optimizer="adam")
model.fit(train, batch_size=1024, epochs=10)
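One caveat about the code above: as far as I understand, row i of the matrix given to `mm.TensorInitializer` becomes the embedding for encoded id i, so real pre-trained vectors would have to be ordered by the ids that `Categorify` assigns. A minimal sketch of that alignment follows. `build_aligned_table`, `vocab`, and `start_index` are hypothetical names, and the number of reserved ids (null, out-of-vocabulary) is an assumption that depends on the NVTabular version; it should be checked against the category files written under `out_path`.

```python
import numpy as np


def build_aligned_table(vocab, pretrained, dim, start_index=1):
    """Return a (start_index + len(vocab), dim) matrix whose row i holds the
    vector for the category encoded as id i.

    vocab       -- raw category values in encoded-id order (hypothetical input)
    pretrained  -- dict mapping raw value -> 1-D vector of length `dim`
    start_index -- ASSUMED count of reserved ids placed before real categories
    """
    table = np.zeros((start_index + len(vocab), dim), dtype=np.float32)
    for offset, value in enumerate(vocab):
        vector = pretrained.get(value)
        if vector is not None:  # categories without a vector keep the zero row
            table[start_index + offset] = vector
    return table


# Toy data for illustration only.
vocab = ["shoes", "laptop", "phone"]
pretrained = {
    "laptop": np.full(4, 0.5, dtype=np.float32),
    "shoes": np.full(4, 0.1, dtype=np.float32),
}
query_embs = build_aligned_table(vocab, pretrained, dim=4)
```

The resulting matrix could then replace the random `query_embs` above, provided its row count matches the cardinality recorded in the schema.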