ml-commons
[BUG] Generating embeddings for arrays of objects is broken starting 2.17
What is the bug? Following the tutorial at https://opensearch.org/docs/2.17/ml-commons-plugin/tutorials/generate-embeddings/ produces the results below. The title_embedding is not ingested into each books entry as a new field; it ends up under _source._ingest._value instead. These results do not match what the tutorial shows.
{
  "docs": [
    {
      "doc": {
        "_index": "my_books",
        "_id": "1",
        "_source": {
          "_ingest": {
            "_value": {
              "title_embedding": [
                0.009794682,
                0.04060341,
                0.016146386,
                ...
                -0.03778624
              ]
            }
          },
          "books": [
            {
              "title": "first book",
              "description": "This is first book"
            },
            {
              "title": "second book",
              "description": "This is second book"
            }
          ]
        },
        "_ingest": {
          "_value": null,
          "timestamp": "2025-03-14T22:02:43.240620757Z"
        }
      }
    }
  ]
}
How can one reproduce the bug? Follow the tutorial step by step; the simulate response will match the output above rather than the tutorial's.
What is the expected behavior? The results shown in the tutorial are expected: the new embeddings are ingested into each entry of books.
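To make the difference between the observed and expected output easy to check, here is a minimal Python sketch that inspects a simulate response and flags documents where the embedding did not land inside the array items. The function name `find_misplaced_embeddings` and its parameters are hypothetical helpers written for this report, not part of ml-commons, and the embedding values in the sample are shortened placeholders taken from the response above.

```python
from typing import Any, Dict, List


def find_misplaced_embeddings(
    response: Dict[str, Any],
    array_field: str = "books",
    embedding_field: str = "title_embedding",
) -> List[str]:
    """Return the _id of each simulated doc whose embeddings are misplaced.

    Hypothetical helper: a doc is flagged if any item in `array_field`
    lacks `embedding_field`, or if `embedding_field` shows up under
    _source._ingest._value (the symptom described in this report).
    """
    bad_ids = []
    for entry in response.get("docs", []):
        doc = entry.get("doc", {})
        source = doc.get("_source", {})
        items = source.get(array_field, [])

        # Expected behavior: every array item carries its own embedding.
        missing = any(embedding_field not in item for item in items)

        # Observed behavior: the embedding leaks into _source._ingest._value.
        leaked_value = (source.get("_ingest") or {}).get("_value") or {}
        leaked = embedding_field in leaked_value

        if missing or leaked:
            bad_ids.append(doc.get("_id"))
    return bad_ids


# Shortened copy of the response from this report (embedding values truncated).
buggy_response = {
    "docs": [
        {
            "doc": {
                "_index": "my_books",
                "_id": "1",
                "_source": {
                    "_ingest": {"_value": {"title_embedding": [0.009794682, 0.04060341]}},
                    "books": [
                        {"title": "first book", "description": "This is first book"},
                        {"title": "second book", "description": "This is second book"},
                    ],
                },
                "_ingest": {"_value": None, "timestamp": "2025-03-14T22:02:43.240620757Z"},
            }
        }
    ]
}

print(find_misplaced_embeddings(buggy_response))  # → ['1']
```

A response in which each books entry has its own title_embedding and nothing is left under _source._ingest._value would yield an empty list, which is what the tutorial output implies.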
What is your host/environment?
- OS: [e.g. iOS]
- Version [e.g. 22]
- Plugins
Do you have any screenshots? If applicable, add screenshots to help explain your problem.
Do you have any additional context? Add any other context about the problem.