
interOpNumThreads and intraOpNumThreads thread count discrepancy

Open AbhishekBose opened this issue 2 years ago • 3 comments

Hello Team. I am testing a language model called al-mpnet using the DJL OnnxRuntime engine. I have created the criteria in this manner:

Criteria.builder()
  .setTypes(classOf[String], classOf[Array[Number]])
  .optTranslator(translator)
  .optModelUrls(model_file_path)
  .optModelName(model_file_name)
  .optEngine(runtime)
  .optOption("mapLocation", "true")
  .optProgress(new ProgressBar())
  .optOption("interOpNumThreads", "1")
  .optOption("intraOpNumThreads", "1")
  .build()

On printing the criteria I can see that intraOpNumThreads = 1 and interOpNumThreads = 1, as shown below:

(Criteria is -->,Criteria:
	Application: UNDEFINED
	Input: class java.lang.String
	Output: class [Ljava.lang.Number;
	Engine: OnnxRuntime
	ModelZoo: ai.djl.localmodelzoo
	Options: {"intraOpNumThreads":"1","mapLocation":"true","interOpNumThreads":"1"}

But in the console I am getting the following:

[main] INFO ai.djl.pytorch.engine.PtEngine - PyTorch graph executor optimizer is enabled, this may impact your inference latency and throughput. See: https://docs.djl.ai/docs/development/inference_performance_optimization.html#graph-executor-optimization
[main] INFO ai.djl.pytorch.engine.PtEngine - Number of inter-op threads is 6
[main] INFO ai.djl.pytorch.engine.PtEngine - Number of intra-op threads is 6

Now my confusion is: which configuration is actually being applied?

I am seeing seriously high CPU utilisation when running inference with this model, hence I want to restrict the number of threads.

AbhishekBose avatar Jul 17 '23 08:07 AbhishekBose

I am using Scala, and this is what my build.sbt looks like:

libraryDependencies += "ai.djl.aws" % "aws-ai" % "0.22.1"
libraryDependencies += "ai.djl" % "api" % "0.22.1"
libraryDependencies += "ai.djl.onnxruntime" % "onnxruntime-engine" % "0.22.1"
libraryDependencies += "org.slf4j" % "slf4j-simple" % "2.0.5"
libraryDependencies += "ai.djl.pytorch" % "pytorch-engine" % "0.22.1"
libraryDependencies += "ai.djl.pytorch" % "pytorch-model-zoo" % "0.22.1"
libraryDependencies += "au.com.bytecode" % "opencsv" % "2.4"
libraryDependencies += "ai.djl.huggingface" % "tokenizers" % "0.22.1"
libraryDependencies += "org.json4s" %% "json4s-core" % "3.6.0-M2"
libraryDependencies += "org.json4s" %% "json4s-native" % "3.6.0-M2"
libraryDependencies += "org.json4s" %% "json4s-jackson" % "3.6.0-M2"
libraryDependencies += "org.scalatest" %% "scalatest" % "3.2.14" % "test"
libraryDependencies += "org.scalatestplus" %% "mockito-3-4" % "3.2.10.0" % "test"

AbhishekBose avatar Jul 17 '23 08:07 AbhishekBose

cc: @frankfliu

AbhishekBose avatar Jul 17 '23 08:07 AbhishekBose

You are using a single OMP thread for the OnnxRuntime engine, but your PyTorch engine is using the default OMP threading. You can set OMP threading for the PyTorch engine with the following code:

System.setProperty("ai.djl.pytorch.num_threads", "1");
System.setProperty("ai.djl.pytorch.num_interop_threads", "1");
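In Scala this might look like the sketch below. The helper object and method names are illustrative (not part of DJL's API); the key point is that these system properties need to be set before the PyTorch engine is first initialized, e.g. at the very start of your main method, for them to take effect:

```scala
// Illustrative helper: object/method names are assumptions, not DJL API.
// Call this before any DJL engine is loaded so the PyTorch native
// library picks up the thread limits at initialization time.
object PtThreadConfig {
  def configure(): Unit = {
    // Limit PyTorch's intra-op (per-operator) thread pool to 1 thread
    System.setProperty("ai.djl.pytorch.num_threads", "1")
    // Limit PyTorch's inter-op (parallel-operator) thread pool to 1 thread
    System.setProperty("ai.djl.pytorch.num_interop_threads", "1")
  }
}
```

These properties only affect the PyTorch engine; the OnnxRuntime engine keeps reading `interOpNumThreads`/`intraOpNumThreads` from the criteria options as you already have them.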

frankfliu avatar Jul 17 '23 14:07 frankfliu