interOpNumThreads and intraOpNumThreads thread count discrepancy
Hello Team. I am testing a language model called al-mpnet using the DJL OnnxRuntime engine. I have created the criteria in this manner:
Criteria.builder()
.setTypes(classOf[String], classOf[Array[Number]])
.optTranslator(translator)
.optModelUrls(model_file_path)
.optModelName(model_file_name)
.optEngine(runtime)
.optOption("mapLocation", "true")
.optProgress(new ProgressBar())
.optOption("interOpNumThreads", "1")
.optOption("intraOpNumThreads", "1")
.build()
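For context, my understanding is that these two options map to ONNX Runtime's session options. A rough equivalent using the ONNX Runtime Java API directly ("model.onnx" below is a placeholder for the actual model path):

import ai.onnxruntime.{OrtEnvironment, OrtSession}

// Direct ONNX Runtime equivalent of the two DJL options above.
val env  = OrtEnvironment.getEnvironment()
val opts = new OrtSession.SessionOptions()
opts.setInterOpNumThreads(1)  // threads for running independent graph nodes in parallel
opts.setIntraOpNumThreads(1)  // threads used inside a single operator
val session = env.createSession("model.onnx", opts)  // placeholder path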
On printing the criteria I can see that intraOpNumThreads = 1 and interOpNumThreads = 1, as shown below:
Criteria is --> Criteria:
Application: UNDEFINED
Input: class java.lang.String
Output: class [Ljava.lang.Number;
Engine: OnnxRuntime
ModelZoo: ai.djl.localmodelzoo
Options: {"intraOpNumThreads":"1","mapLocation":"true","interOpNumThreads":"1"}
But in the console I am getting the following:
[main] INFO ai.djl.pytorch.engine.PtEngine - PyTorch graph executor optimizer is enabled, this may impact your inference latency and throughput. See: https://docs.djl.ai/docs/development/inference_performance_optimization.html#graph-executor-optimization
[main] INFO ai.djl.pytorch.engine.PtEngine - Number of inter-op threads is 6
[main] INFO ai.djl.pytorch.engine.PtEngine - Number of intra-op threads is 6
Now my confusion is which configuration is actually being applied, since these log lines come from ai.djl.pytorch.engine.PtEngine rather than the OnnxRuntime engine.
I am seeing very high CPU utilisation when running inference with this model, so I want to restrict the number of threads.
I am using Scala, and this is what my build.sbt looks like:
libraryDependencies += "ai.djl.aws" % "aws-ai" % "0.22.1"
libraryDependencies += "ai.djl" % "api" % "0.22.1"
libraryDependencies += "ai.djl.onnxruntime" % "onnxruntime-engine" % "0.22.1"
libraryDependencies += "org.slf4j" % "slf4j-simple" % "2.0.5"
libraryDependencies += "ai.djl.pytorch" % "pytorch-engine" % "0.22.1"
libraryDependencies += "ai.djl.pytorch" % "pytorch-model-zoo" % "0.22.1"
libraryDependencies += "au.com.bytecode" % "opencsv" % "2.4"
libraryDependencies += "ai.djl.huggingface" % "tokenizers" % "0.22.1"
libraryDependencies += "org.json4s" %% "json4s-core" % "3.6.0-M2"
libraryDependencies += "org.json4s" %% "json4s-native" % "3.6.0-M2"
libraryDependencies += "org.json4s" %% "json4s-jackson" % "3.6.0-M2"
libraryDependencies += "org.scalatest" %% "scalatest" % "3.2.14" % "test"
libraryDependencies += "org.scalatestplus" %% "mockito-3-4" % "3.2.10.0" % "test"
cc: @frankfliu
You are using a single OMP thread for the OnnxRuntime engine, but your PyTorch engine is using the default OMP threading. You can set OMP threading for the PyTorch engine with the following code:
System.setProperty("ai.djl.pytorch.num_threads", "1");
System.setProperty("ai.djl.pytorch.num_interop_threads", "1");