ml-commons

[BUG] ML model not deployed error on neural search

Open vibrantvarun opened this issue 9 months ago • 5 comments

What is the bug? While running the BWC tests and integration tests, we hit an "ML model not deployed yet" error, which causes the neural search Gradle check to fail. The tests are trying to deploy a local model.

{"error":{"root_cause":[{"type":"illegal_argument_exception","reason":"Model not ready yet. Please deploy the model first."}],"type":"illegal_argument_exception","reason":"Model not ready yet. Please deploy the model first."},"status":400}
        at __randomizedtesting.SeedInfo.seed([9F9DD61772480865:FD846F909289E8B8]:0)
        at app//org.opensearch.client.RestClient.convertResponse(RestClient.java:385)
        at app//org.opensearch.client.RestClient.performRequest(RestClient.java:355)
        at app//org.opensearch.client.RestClient.performRequest(RestClient.java:330)
        at app//org.opensearch.neuralsearch.BaseNeuralSearchIT.search(BaseNeuralSearchIT.java:513)
        at app//org.opensearch.neuralsearch.BaseNeuralSearchIT.search(BaseNeuralSearchIT.java:467)
        at app//org.opensearch.neuralsearch.BaseNeuralSearchIT.search(BaseNeuralSearchIT.java:455)
        at app//org.opensearch.neuralsearch.processor.NormalizationProcessorIT.testResultProcessor_whenOneShardAndQueryMatches_thenSuccessful(NormalizationProcessorIT.java:105)
  1> [2024-04-30T12:47:29,358][INFO ][o.o.n.p.NormalizationProcessorIT] [testResultProcessor_whenDefaultProcessorConfigAndQueryMatches_thenSuccessful] before test
  1> [2024-04-30T12:47:37,253][INFO ][o.o.n.p.NormalizationProcessorIT] [testResultProcessor_whenDefaultProcessorConfigAndQueryMatches_thenSuccessful] after test
  1> [2024-04-30T12:47:37,258][INFO ][o.o.n.p.NormalizationProcessorIT] [testQueryMatches_whenMultipleShards_thenSuccessful] before test
  1> [2024-04-30T12:47:44,931][INFO ][o.o.n.p.NormalizationProcessorIT] [testQueryMatches_whenMultipleShards_thenSuccessful] after test
  2> NOTE: leaving temporary files on disk at: /local/home/varunudr/opensearch-spatial/neural-search/build/testrun/integTest/temp/org.opensearch.neuralsearch.processor.NormalizationProcessorIT_9F9DD61772480865-001
  2> NOTE: test params are: codec=Asserting(Lucene99): {}, docValues:{}, maxPointsInLeafNode=1583, maxMBSortInHeap=7.164599990698049, sim=Asserting(RandomSimilarity(queryNorm=true): {}), locale=en, timezone=America/Phoenix
  2> NOTE: Linux 5.10.214-180.855.amzn2int.x86_64 amd64/Amazon.com Inc. 21.0.2 (64-bit)/cpus=8,threads=4,free=383863208,total=536870912
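
The error message suggests that the search request reached a model that was not yet in a deployed state. One possible way for a test to guard against this (not necessarily how the underlying problem should be fixed) is to poll the model state before searching. The following is a minimal sketch under stated assumptions: `ModelStateWaiter`, `waitUntilDeployed`, and the state supplier are hypothetical names used for illustration, and in a real test the supplier might wrap a `GET /_plugins/_ml/models/<model_id>` call and read the `model_state` field.

```java
import java.util.Iterator;
import java.util.List;
import java.util.function.Supplier;

// Hypothetical helper, not part of ml-commons or neural-search.
public class ModelStateWaiter {

    /**
     * Polls stateSupplier until it reports "DEPLOYED" or maxAttempts is used up.
     * Returns true if the deployed state was observed, false otherwise.
     */
    public static boolean waitUntilDeployed(Supplier<String> stateSupplier, int maxAttempts, long sleepMillis) {
        for (int attempt = 0; attempt < maxAttempts; attempt++) {
            if ("DEPLOYED".equals(stateSupplier.get())) {
                return true;
            }
            // Sleep only if another attempt will follow.
            if (attempt < maxAttempts - 1) {
                try {
                    Thread.sleep(sleepMillis);
                } catch (InterruptedException e) {
                    // Restore the interrupt flag and give up waiting.
                    Thread.currentThread().interrupt();
                    return false;
                }
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // Fake state sequence standing in for repeated model-state lookups (assumption).
        Iterator<String> states = List.of("DEPLOYING", "DEPLOYING", "DEPLOYED").iterator();
        boolean ready = waitUntilDeployed(states::next, 5, 10L);
        System.out.println("ready=" + ready);
    }
}
```

With a helper along these lines, a test would only issue the search once `waitUntilDeployed` returns true, and could fail with a clearer message otherwise.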

How can one reproduce the bug? Steps to reproduce the behavior:

  1. Clone neural search and run ./gradlew check

What is the expected behavior? Tests should pass

What is your host/environment? Linux, Windows

vibrantvarun, Apr 30 '24 21:04

This PR should fix the issue: https://github.com/opensearch-project/ml-commons/pull/2389

Zhangxunmt, May 01 '24 02:05

Can we just run ./gradlew check against the main branch? Or does it need to run against 2.14? I don't have permission to pull any branch other than main.

Zhangxunmt, May 01 '24 06:05

You can run ./gradlew check on main

vibrantvarun, May 01 '24 16:05

Verified that both "./gradlew check" and "./gradlew :qa:rolling-upgrade:testRollingUpgrade" pass in my local environment with the latest pull of the neural search repo.

Zhangxunmt, May 01 '24 17:05

"./gradlew bwcTestSuite -Dbwc.version=2.14.0-SNAPSHOT" runs successfully on my dev desktop. Varun also verified it on his dev desktop. So let's verify again in the next CI run and close this issue if there are no more problems.

Zhangxunmt, May 02 '24 01:05

Resolving this ticket as there are no more issues. @vibrantvarun

Zhangxunmt, May 06 '24 21:05